optim-1.4.1/INDEX

optimization >> Optimization
Optimization
 minimize nelder_mead_min d2_min nrm fmin line_min powell fmins
 adsmax mdsmax nmsmax bfgsmin samin battery cg_min de_min nonlin_min
 brent_line_min
Data fitting
 expfit wpolyfit leasqr nonlin_residmin nonlin_curvefit
 LinearRegression polyfitinf wsolve
Optimization statistics
 residmin_stat curvefit_stat polyconf
Zero finding
 vfzero
Compatibility
 quadprog = try Yinyu Ye's code
 linprog
Numerical derivatives
 dfdp dcdp dfpdp dfxpdp cdiff deriv numgradient numhessian jacobs
Pivoting
 cpiv_bard gjp
Tests
 test_min_1 test_min_2 test_min_3 test_min_4 test_nelder_mead_min_1
 test_nelder_mead_min_2 poly_2_ex test_minimize_1 optim_problems
 wrap_f_dfdp test_wpolyfit
Examples
 bfgsmin_example rosenbrock samin_example
Documentation
 optim_doc

optim-1.4.1/inst/wrap_f_dfdp.m

%% Copyright (C) 2010 Olaf Till
%%
%% This program is free software; you can redistribute it and/or modify it under
%% the terms of the GNU General Public License as published by the Free Software
%% Foundation; either version 3 of the License, or (at your option) any later
%% version.
%%
%% This program is distributed in the hope that it will be useful, but WITHOUT
%% ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
%% FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
%% details.
%%
%% You should have received a copy of the GNU General Public License along with
%% this program; if not, see <http://www.gnu.org/licenses/>.

%% [ret1, ret2] = wrap_f_dfdp (f, dfdp, varargin)
%%
%% f and dfdp should be the objective function (or "model function" in
%% curve fitting) and its jacobian, respectively, of an optimization
%% problem. ret1: f (varargin{:}), ret2: dfdp (varargin{:}). ret2 is
%% only computed if more than one output argument is given. This
%% manner of calling f and dfdp is needed by some optimization
%% functions.

function [ret1, ret2] = wrap_f_dfdp (f, dfdp, varargin)

  if (nargin < 3)
    print_usage ();
  end

  if (ischar (f))
    f = str2func (f);
  end
  if (ischar (dfdp))
    dfdp = str2func (dfdp);
  end

  ret1 = f (varargin{:});

  if (nargout > 1)
    ret2 = dfdp (varargin{:});
  end

end

optim-1.4.1/inst/linprog.m

## Copyright (C) 2009 Luca Favatella
##
## This program is free software; you can redistribute it and/or modify it under
## the terms of the GNU General Public License as published by the Free Software
## Foundation; either version 3 of the License, or (at your option) any later
## version.
##
## This program is distributed in the hope that it will be useful, but WITHOUT
## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
## details.
##
## You should have received a copy of the GNU General Public License along with
## this program; if not, see <http://www.gnu.org/licenses/>.
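
## A minimal usage sketch (assuming the GLPK backend that this wrapper
## calls is available): minimize x(1) + 2*x(2) subject to x >= 0, with
## the non-negativity written in A*x <= b form as -x <= 0.
##
##   f = [1; 2];
##   A = [-1, 0; 0, -1];
##   b = [0; 0];
##   x = linprog (f, A, b)   # yields [0; 0], the obvious minimizer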
## -*- texinfo -*-
## @deftypefn{Function File} {@var{x} =} linprog (@var{f}, @var{A}, @var{b})
## @deftypefnx{Function File} {@var{x} =} linprog (@var{f}, @var{A}, @var{b}, @var{Aeq}, @var{beq})
## @deftypefnx{Function File} {@var{x} =} linprog (@var{f}, @var{A}, @var{b}, @var{Aeq}, @var{beq}, @var{lb}, @var{ub})
## @deftypefnx{Function File} {[@var{x}, @var{fval}] =} linprog (@dots{})
## Solve a linear problem.
##
## Finds
##
## @example
## min (f' * x)
## @end example
##
## (both f and x are column vectors) subject to
##
## @example
## @group
## A * x <= b
## Aeq * x = beq
## lb <= x <= ub
## @end group
## @end example
##
## If not specified, @var{Aeq} and @var{beq} default to empty matrices.
##
## If not specified, the lower bound @var{lb} defaults to minus infinity
## and the upper bound @var{ub} defaults to infinity.
##
## @c Will be cut out in optim's info file and replaced with the same
## @c references explicitly there, since references to core Octave
## @c functions are not automatically transformed from here to there.
## @c BEGIN_CUT_TEXINFO
## @seealso{glpk}
## @c END_CUT_TEXINFO
## @end deftypefn

function [x fval] = linprog (f, A, b, Aeq = [], beq = [], lb = [], ub = [])

  if (((nargin != 3) && (nargin != 5) && (nargin != 7)) || (nargout > 2))
    print_usage ();
  endif

  nr_f = rows(f);

  # Sanitize A and b
  if (isempty (A) && isempty (b))
    A = zeros (0, nr_f);
    b = zeros (rows (A), 1);
  endif
  nr_A = rows (A);

  if (columns (f) != 1)
    error ("f must be a column vector");
  elseif (columns (A) != nr_f)
    error ("columns (A) != rows (f)");
  elseif (size (b) != [nr_A 1])
    error ("size (b) != [(rows (A)) 1]");
  else

    ## Sanitize Aeq
    if (isempty (Aeq))
      Aeq = zeros (0, nr_f);
    endif
    if (columns (Aeq) != nr_f)
      error ("columns (Aeq) != rows (f)");
    endif

    ## Sanitize beq
    if (isempty (beq))
      beq = zeros (0, 1);
    endif
    nr_Aeq = rows (Aeq);
    if (size (beq) != [nr_Aeq 1])
      error ("size (beq) != [(rows (Aeq)) 1]");
    endif

    ## Sanitize lb
    if (isempty (lb))
      lb = - Inf (nr_f, 1);
    endif
    if (size (lb) != [nr_f 1])
      error ("size (lb) != [(rows (f)) 1]");
    endif

    ## Sanitize ub
    if (isempty (ub))
      ub = Inf (nr_f, 1);
    endif
    if (size (ub) != [nr_f 1])
      error ("size (ub) != [(rows (f)) 1]");
    endif

    ## Call glpk
    ctype = [(repmat ("U", nr_A, 1)); (repmat ("S", nr_Aeq, 1))];
    [x(1:nr_f, 1) fval(1, 1)] = glpk (f, [A; Aeq], [b; beq], lb, ub, ctype);

  endif

endfunction

%!test
%! f = [1; -1];
%! A = [];
%! b = [];
%! Aeq = [1, 0];
%! beq = [2];
%! lb = [0; Inf];
%! ub = [-Inf; 0];
%! x_exp = [2; 0];
%! assert (linprog (f, A, b, Aeq, beq, lb, ub), x_exp);

%!shared f, A, b, lb, ub, x_exp, fval_exp
%! f = [21 25 31 34 23 19 32 36 27 25 19]';
%!
%! A1 = [ 1 0 0 0 1 0 0 1 0 0 0;
%!        0 1 0 0 0 1 0 0 1 0 0;
%!        0 0 1 0 0 0 0 0 0 1 0;
%!        0 0 0 1 0 0 1 0 0 0 1];
%! A2 = [ 1 1 1 1 0 0 0 0 0 0 0;
%!        0 0 0 0 1 1 1 0 0 0 0;
%!        0 0 0 0 0 0 0 1 1 1 1];
%! A = [-A1; A2];
%!
%! b1 = [40; 50; 50; 70];
%! b2 = [100; 60; 50];
%! b = [-b1; b2];
%!
%! lb = zeros (rows (f), 1);
%! ub = Inf (rows (f), 1);
%!
%! x_exp = [40 0 50 10 0 50 10 0 0 0 50]';
%! fval_exp = f' * x_exp;
%!
%!test
%! Aeq = [];
%! beq = [];
%! [x_obs fval_obs] = linprog (f, A, b, Aeq, beq, lb, ub);
%! assert ([x_obs; fval_obs], [x_exp; fval_exp]);
%!
%!test
%! Aeq = zeros (1, rows (f));
%! beq = 0;
%! assert (linprog (f, A, b, Aeq, beq, lb, ub), x_exp);

optim-1.4.1/inst/optim_problems.m

%% Copyright (C) 2007 Paul Kienzle (sort-based lookup in ODE solver)
%% Copyright (C) 2009 Thomas Treichl (ode23 code)
%% Copyright (C) 2010-2012 Olaf Till
%%
%% This program is free software; you can redistribute it and/or modify it under
%% the terms of the GNU General Public License as published by the Free Software
%% Foundation; either version 3 of the License, or (at your option) any later
%% version.
%%
%% This program is distributed in the hope that it will be useful, but WITHOUT
%% ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
%% FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
%% details.
%%
%% You should have received a copy of the GNU General Public License along with
%% this program; if not, see <http://www.gnu.org/licenses/>.

%% Problems for testing optimizers. Documentation is in the code.

function ret = optim_problems ()

%% As little external code as possible is called. This leads to some
%% duplication of external code. The advantages are that these
%% problems thus do not change with evolving external code, and that
%% optimization results in Octave can be compared with those in Matlab
%% without influence of differences in external code (e.g. ODE
%% solvers). Even calling 'interp1 (..., ..., ..., 'linear')' is
%% avoided by using an internal subfunction, although this is possibly
%% too cautious.
%%
%% For cross-program comparison of optimizers, the code of these
%% problems is intended to be Matlab compatible.
%%
%% External data may be loaded, which should be supplied in the
%% 'private/' subdirectory. Use the variable 'ddir', which contains
%% the path to this directory.

%% Note: The difficulty of problems with dynamic models often
%% decisively depends on the accuracy of the used ODE(DAE)-solver.

%% Description of the returned structure
%%
%% According to 3 classes of problems, there are (or should be) three
%% fields: 'curve' (curve fitting), 'general' (general optimization),
%% and 'zero' (zero finding). The subfields are labels for the
%% particular problems.
%%
%% Under the label fields, there are subfields mostly identical
%% between the 3 classes of problems (some may contain empty values):
%%
%% .f: handle of an internally defined objective function (argument:
%% column vector of parameters), meant for minimization, or of a
%% 'model function' (arguments: independents, column vector of
%% parameters) in the case of curve fitting, where .f should return a
%% matrix of equal dimensions as .data.y below.
%%
%% .dfdp: handle of internally defined function for jacobian of
%% objective function or 'model function', respectively.
%%
%% .hessian: (not for curve fitting) handle of internally defined
%% function for hessian of objective function. Not always supplied.
%%
%% .init_p: initial parameters, column vector
%%
%% possibly .init_p_b: two-column matrix of ranges to choose initial
%% parameters from
%%
%% possibly .init_p_f: handle of internally defined function which
%% returns a column vector of initial parameters unique to the index
%% given as function argument; given '0' as function argument,
%% .init_p_f returns the maximum index
%%
%% .result.p: parameters of best known result
%%
%% possibly .result.obj: value of objective function for .result.p (or
%% sum of squared residuals in curve fitting).
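%%
%% A sketch of how the fields above fit together (using the
%% .general.rosenbrock problem defined further below): evaluate the
%% objective at the starting point and at the stored best result:
%%
%%   probs = optim_problems ();
%%   prob = probs.general.rosenbrock;
%%   f0 = prob.f (prob.init_p);    % objective at .init_p
%%   fb = prob.f (prob.result.p);  % objective at .result.p (here 0)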
%% %% .data.x: matrix of independents (curve fitting) %% %% .data.y: matrix of observations, dimensions may be independent of %% .data.x (curve fitting) %% %% .data.wt: matrix of weights, same dimensions as .data.y (curve %% fitting) %% %% .data.cov: covariance matrix of .data.y(:) (not necessarily a %% diagonal matrix, which could be expressed in .data.wt) %% %% .strict_inequc, .non_strict_inequc, .equc: 'strict' inequality %% constraints (<, >), 'non-strict' inequality constraints (<=, >=), %% and equality constraints, respectively. Subfields are: .bounds %% (except in equality constraints): two-column matrix of ranges; %% .linear: cell-array {m, v}, meaning linear constraints m.' * %% parameters + v >|>=|== 0; .general: handle of internally defined %% function h with h (p) >|>=|== 0; possibly .general_dcdp: handle of %% internally defined function (argument: parameters) returning the %% jacobian of the constraints given in .general. For the sake of %% optimizers which can exploit this, the function in subfield %% .general may accept a logical index vector as an optional second %% argument, returning only the indexed constraint values. %% Please keep the following list of problems current. %% %% .curve.p_1, .curve.p_2, .curve.p_3_d2: from 'Comparison of gradient %% methods for the solution of nonlinear parameter estimation %% problems' (1970), Yonathan Bard, Siam Journal on Numerical Analysis %% 7(1), 157--186. The numbering of problems is the same as in the %% article. Since Bard used strict bounds, testing optimizers which %% used penalization for bounds, the bounds are changed here to allow %% testing with non-strict bounds (<= or >=). .curve.p_3_d2 involves %% dynamic modeling. These are not necessarily difficult problems. %% %% .curve.p_3_d2_noweights: problem .curve.p_3_d2 equivalently %% re-formulated without weights. %% %% .curve.p_r: A seemingly more difficult 'real life' problem with %% dynamic modeling. To assess optimizers, .init_p_f should be used %% with 1:64. There should be two groups of results, indicating the %% presence of two local minima. Olaf Till %% %% .....schittkowski_...: Klaus Schittkowski: 'More test examples for %% nonlinear programming codes.' Lecture Notes in Economics and %% Mathematical Systems 282, Berlin 1987. The published problems are %% numbered from 201 to 395 and may appear here under the fields %% .curve, .general, or .zero. %% %% .general.schittkowski_281: 10 parameters, unconstrained. Hessian %% supplied. %% %% .general.schittkowski_289: 30 parameters, unconstrained. %% %% .general.schittkowski_327 and %% %% .curve.schittkowski_327: Two parameters, one general inequality %% constraint, two bounds. The best solution given in the publication %% seems not very good (it probably has been achieved with general %% minimization, not curve fitting) and has been replaced here by a %% better (leasqr). %% %% .curve.schittkowski_372 and %% %% .general.schittkowski_372: 9 parameters, 12 general inequality %% constraints, 6 bounds. Infeasible initial parameters %% (.curve.schittkowski_372.init_p_f(1) provides a set of more or less %% feasible parameters). leasqr sticks at the (feasible) initial %% values. sqp has no problems. %% %% .curve.schittkowski_373: 9 parameters, 6 equality constraints. %% Infeasible initial parameters (.curve.schittkowski_373.init_p_f(1) %% provides a set of more or less feasible parameters). leasqr sticks %% at the (feasible) initial values. sqp has no problems. %% %% .general.schittkowski_391: 30 parameters, unconstrained. 
The best %% solution given in the publication seems not very good, obviously %% the used routine had not managed to get very far from the starting %% parameters; it has been replaced here by a better (Octaves %% fminunc). The result still varies widely (without much changes in %% objective function) with changes of starting values. Maybe not a %% very good test problem, no well defined minimum ... %% %% .general.rosenbrock: 2D Rosenbrock function. Hessian supplied. The %% parameters a and b of the Rosenbrock function are set to 1 and 100, %% respectively. %% needed for some anonymous functions if (exist ('ifelse') ~= 5) ifelse = @ scalar_ifelse; end if (~exist ('OCTAVE_VERSION')) NA = NaN; end %% determine the directory of this functions file fdir = fileparts (mfilename ('fullpath')); %% data directory ddir = sprintf ('%s%sprivate%s', fdir, filesep, filesep); ret.curve.p_1.dfdp = []; ret.curve.p_1.init_p = [1; 1; 1]; ret.curve.p_1.data.x = cat (2, ... (1:15).', ... (15:-1:1).', ... [(1:8).'; (7:-1:1).']); ret.curve.p_1.data.y = [.14; .18; .22; .25; .29; .32; .35; .39; ... .37; .58; .73; .96; 1.34; 2.10; 4.39]; ret.curve.p_1.data.wt = []; ret.curve.p_1.data.cov = []; ret.curve.p_1.result.p = [.08241040; 1.133033; 2.343697]; ret.curve.p_1.strict_inequc.bounds = [0, 100; 0, 100; 0, 100]; ret.curve.p_1.strict_inequc.linear = []; ret.curve.p_1.strict_inequc.general = []; ret.curve.p_1.non_strict_inequc.bounds = ... [eps, 100; eps, 100; eps, 100]; ret.curve.p_1.non_strict_inequc.linear = []; ret.curve.p_1.non_strict_inequc.general = []; ret.curve.p_1.equc.linear = []; ret.curve.p_1.equc.general = []; ret.curve.p_1.f = @ f_1; ret.curve.p_2.dfdp = []; ret.curve.p_2.init_p = [0; 0; 0; 0; 0]; ret.curve.p_2.data.x = [.871, .643, .550; ... .228, .669, .854; ... .528, .229, .170; ... .110, .354, .337; ... .911, .056, .493; ... .476, .154, .918; ... .655, .421, .077; ... .649, .140, .199; ... .995, .045, NA; ... .130, .016, .195; ... .823, .690, .690; ... .768, .992, .389; ... .203, .740, .120; ... .302, .519, .221; ... .991, .450, .249; ... .224, .030, .502; ... .428, .127, .772; ... .552, .494, .110; ... .461, .824, .714; ... .799, .494, .295]; ret.curve.p_2.data.y = zeros (20, 3); ret.curve.p_2.data.wt = []; ret.curve.p_2.data.cov = []; ret.curve.p_2.data.misc = [4.36, 5.21, 5.35; ... 4.99, 3.30, 3.10; ... 1.67, NA, 2.75; ... 2.17, 1.48, 1.49; ... 2.98, 4.69, 4.23; ... 4.46, 3.87, 3.15; ... 1.79, 3.18, 3.57; ... 1.71, 3.13, 3.07; ... 3.07, 5.01, 4.58; ... 0.94, 0.93, 0.74; ... 4.97, 5.37, 5.35; ... 4.32, 4.85, 5.46; ... 2.17, 1.78, 2.43; ... 2.22, 2.18, 2.44; ... 2.88, 4.90, 5.11; ... 2.29, 1.94, 1.46; ... 3.76, 3.39, 2.71; ... 1.99, 2.93, 3.31; ... 4.95, 4.08, 4.19; ... 2.96, 4.26, 4.48]; ret.curve.p_2.result.p = [.9925145; 2.005293; 3.999732; ... 2.680371; .4977683]; % from maximum % likelyhood optimization ret.curve.p_2.strict_inequc.bounds = []; ret.curve.p_2.strict_inequc.linear = []; ret.curve.p_2.strict_inequc.general = []; ret.curve.p_2.non_strict_inequc.bounds = []; ret.curve.p_2.non_strict_inequc.linear = []; ret.curve.p_2.non_strict_inequc.general = []; ret.curve.p_2.equc.linear = []; ret.curve.p_2.equc.general = []; ret.curve.p_2.f = @ (x, p) f_2 (x, p, ret.curve.p_2.data.misc); ret.curve.p_3_d2.dfdp = []; ret.curve.p_3_d2.init_p = [.01; .01; .001; .001; .02; .001]; ret.curve.p_3_d2.data.x = [0; 12.5; 25; 37.5; 50; ... 62.5; 75; 87.5; 100]; ret.curve.p_3_d2.data.y=[1 1 0 0 0 ; ... .945757 .961201 .494861 .154976 .111485; ... .926486 .928762 .690492 .314501 .236263; ... 
.917668 .915966 .751806 .709300 .311747; ... .928987 .917542 .771559 1.19224 .333096; ... .927782 .920075 .780903 1.68815 .340324; ... .925304 .912330 .790539 2.19539 .356787; ... .925083 .917684 .783933 2.74211 .358283; ... .917277 .907529 .779259 3.20025 .361969]; ret.curve.p_3_d2.data.y(:, 3) = ... ret.curve.p_3_d2.data.y(:, 3) / 10; ret.curve.p_3_d2.data.y(:, 4:5) = ... ret.curve.p_3_d2.data.y(:, 4:5) / 1000; ret.curve.p_3_d2.data.wt = repmat ([.1, .1, 1, 10, 100], 9, 1); ret.curve.p_3_d2.data.cov = []; ret.curve.p_3_d2.result.p = [.6358247e-2; ... .6774551e-1; ... .5914274e-4; ... .4944010e-3; ... .1018828; ... .4210526e-3]; ret.curve.p_3_d2.strict_inequc.bounds = [0, 1; ... 0, 1; ... 0, .1; ... 0, .1; ... 0, 2; ... 0, .1]; ret.curve.p_3_d2.strict_inequc.linear = []; ret.curve.p_3_d2.strict_inequc.general = []; ret.curve.p_3_d2.non_strict_inequc.bounds = [eps, 1; ... eps, 1; ... eps, .1; ... eps, .1; ... eps, 2; ... eps, .1]; ret.curve.p_3_d2.non_strict_inequc.linear = []; ret.curve.p_3_d2.non_strict_inequc.general = []; ret.curve.p_3_d2.equc.linear = []; ret.curve.p_3_d2.equc.general = []; ret.curve.p_3_d2.f = @ f_3; ret.curve.p_3_d2_noweights = ret.curve.p_3_d2; ret.curve.p_3_d2_noweights.data.wt = []; ret.curve.p_3_d2_noweights.data.y(:, 1:2) = ... ret.curve.p_3_d2_noweights.data.y(:, 1:2) * .1; ret.curve.p_3_d2_noweights.data.y(:, 4) = ... ret.curve.p_3_d2_noweights.data.y(:, 4) * 10; ret.curve.p_3_d2_noweights.data.y(:, 5) = ... ret.curve.p_3_d2_noweights.data.y(:, 5) * 100; ret.curve.p_3_d2_noweights.f = @ f_3_noweights; ret.curve.p_r.dfdp = []; ret.curve.p_r.init_p = [.3; .03; .003; .7; 1000; .0205]; ret.curve.p_r.init_p_b = [.3, .5; ... .03, .05; ... .003, .005; ... .7, .9; ... 1000, 1300; ... .0205, .023]; ret.curve.p_r.init_p_f = @ (id) pc2 (ret.curve.p_r.init_p_b, id); hook.ns = [84; 84; 85; 86; 84; 84; 84; 84]; xb = [0.2, 0.8640; ... 0.2, 0.5320; ... 0.2, 0.4856; ... 0.2, 0.4210; ... 0.2, 0.3328; ... 0.2, 0.2996; ... 0.2, 0.2664; ... 0.2, 0.2498]; ns = cat (1, 0, cumsum (hook.ns)); x = zeros (ns(end), 1); for id = 1:8 x(ns(id) + 1 : ns(id + 1)) = ... linspace (xb(id, 1), xb(id, 2), hook.ns(id)).'; end hook.t = x; ret.curve.p_r.data.x = x; ret.curve.p_r.data.y = ... load (sprintf ('%soptim_problems_p_r_y.data', ddir)); ret.curve.p_r.data.wt = []; ret.curve.p_r.data.cov = []; ret.curve.p_r.result.p = [4.742909e-01; ... 3.837951e-02; ... 3.652570e-03; ... 7.725986e-01; ... 1.180967e+03; ... 2.107000e-02]; ret.curve.p_r.result.obj = 0.2043396; ret.curve.p_r.strict_inequc.bounds = []; ret.curve.p_r.strict_inequc.linear = []; ret.curve.p_r.strict_inequc.general = []; ret.curve.p_r.non_strict_inequc.bounds = []; ret.curve.p_r.non_strict_inequc.linear = []; ret.curve.p_r.non_strict_inequc.general = []; ret.curve.p_r.equc.linear = []; ret.curve.p_r.equc.general = []; hook.mc = [2.0019999999999999e-01, 1.9939999999999999e-01, ... 1.9939999999999999e-01, 1.9780000000000000e-01, ... 2.0080000000000001e-01, 1.9960000000000000e-01, ... 1.9960000000000000e-01, 1.9980000000000001e-01; ... ... 2.0060000000000000e-01, 2.0160000000000000e-01, ... 2.0200000000000001e-01, 2.0200000000000001e-01, ... 2.0180000000000001e-01, 2.0899999999999999e-01, ... 2.0860000000000001e-01, 2.0820000000000000e-01; ... ... 2.1999144799999999e-02, 2.1998803099999999e-02, ... 2.2000449599999999e-02, 2.2000024399999998e-02, ... 2.1998160999999999e-02, 2.1999289000000002e-02, ... 2.1998038800000001e-02, 2.2000270999999998e-02; ... ... -6.8806551999999986e-03, -1.3768898999999999e-02, ... 
-1.6065479000000001e-02, -2.0657919600000001e-02, ... -3.4479971099999999e-02, -4.5934394099999998e-02, ... -6.9011619100000005e-02, -9.1971348400000000e-02; ... ... 2.3383865100000002e-02, 2.4768462500000001e-02, ... 2.5231915899999999e-02, 2.6155515300000001e-02, ... 2.8933514200000000e-02, 3.1235568599999999e-02, ... 3.5874086299999997e-02, 4.0490560699999997e-02; ... ... -1.8240616806039459e+05, -1.6895474269973661e+03, ... -8.1072652464694931e+02, -7.0113302985566395e+02, ... 1.0929964862867249e+04, 3.5665776039585688e+02, ... 5.7400262910547769e+02, 9.1737316974342252e+02; ... ... 1.0965398741890911e+05, 1.0131334821116490e+03, ... 4.8504892529762208e+02, 4.1801020186158411e+02, ... -6.6178457662355086e+03, -2.2103886018172699e+02, ... -3.5529578864017282e+02, -5.6690686490678263e+02; ... ... -2.1972917026209168e+04, -2.0250659086265861e+02, ... -9.6733175964156985e+01, -8.3069683020988421e+01, ... 1.3356173243752210e+03, 4.5610806266307627e+01, ... 7.3229009073208331e+01, 1.1667126232349770e+02; ... ... 1.4676952576063929e+03, 1.3514357622838521e+01, ... 6.4524906786197480e+00, 5.5245948033669476e+00, ... -8.9827382090060922e+01, -3.1118708128841241e+00, ... -5.0039950796246986e+00, -7.9749636293721071e+00]; ret.curve.p_r.f = @ (x, p) f_r (x, p, hook); ret.general.schittkowski_281.dfdp = ... @ (p) schittkowski_281_dfdp (p); ret.general.schittkowski_281.hessian = ... @ (p) schittkowski_281_hessian (p); ret.general.schittkowski_281.init_p = zeros (10, 1); ret.general.schittkowski_281.result.p = ones (10, 1); % 'theoretically' ret.general.schittkowski_281.result.obj = 0; % 'theoretically' ret.general.schittkowski_281.strict_inequc.bounds = []; ret.general.schittkowski_281.strict_inequc.linear = []; ret.general.schittkowski_281.strict_inequc.general = []; ret.general.schittkowski_281.non_strict_inequc.bounds = []; ret.general.schittkowski_281.non_strict_inequc.linear = []; ret.general.schittkowski_281.non_strict_inequc.general = []; ret.general.schittkowski_281.equc.linear = []; ret.general.schittkowski_281.equc.general = []; ret.general.schittkowski_281.f = ... @ (p) (sum (((1:10).') .^ 3 .* (p - 1) .^ 2)) ^ (1 / 3); ret.general.schittkowski_289.dfdp = ... @ (p) exp (- sum (p .^ 2) / 60) / 30 * p; ret.general.schittkowski_289.init_p = [-1.03; 1.07; -1.10; 1.13; ... -1.17; 1.20; -1.23; 1.27; ... -1.30; 1.33; -1.37; 1.40; ... -1.43; 1.47; -1.50; 1.53; ... -1.57; 1.60; -1.63; 1.67; ... -1.70; 1.73; -1.77; 1.80; ... -1.83; 1.87; -1.90; 1.93; ... -1.97; 2.00]; ret.general.schittkowski_289.result.p = zeros (30, 1); % 'theoretically' ret.general.schittkowski_289.result.obj = 0; % 'theoretically' ret.general.schittkowski_289.strict_inequc.bounds = []; ret.general.schittkowski_289.strict_inequc.linear = []; ret.general.schittkowski_289.strict_inequc.general = []; ret.general.schittkowski_289.non_strict_inequc.bounds = []; ret.general.schittkowski_289.non_strict_inequc.linear = []; ret.general.schittkowski_289.non_strict_inequc.general = []; ret.general.schittkowski_289.equc.linear = []; ret.general.schittkowski_289.equc.general = []; ret.general.schittkowski_289.f = @ (p) 1 - exp (- sum (p .^ 2) / 60); ret.general.rosenbrock.f = @ (p) (1 - p(1))^2 + 100 * (p(2) - ... p(1)^2)^2; ret.general.rosenbrock.dfdp = @ (p) ... [- 2 * (1 - p(1)) - 400 * p(1) * (p(2) - p(1)^2); ... 200 * (p(2) - p(1)^2)]; ret.general.rosenbrock.hessian = @ (p) ... [2 + 1200 * p(1)^2 - 400 * p(2), ... - 400 * p(1); ... - 400 * p(1), ... 
200];
ret.general.rosenbrock.init_p = [-10; -10]; # arbitrary, take what you want
ret.general.rosenbrock.result.p = [1; 1]; # exact solution
ret.general.rosenbrock.result.obj = 0; # exact solution
ret.general.rosenbrock.strict_inequc.bounds = [];
ret.general.rosenbrock.strict_inequc.linear = [];
ret.general.rosenbrock.strict_inequc.general = [];
ret.general.rosenbrock.non_strict_inequc.bounds = [];
ret.general.rosenbrock.non_strict_inequc.linear = [];
ret.general.rosenbrock.non_strict_inequc.general = [];

ret.curve.schittkowski_327.dfdp = ...
    @ (x, p) [1 - exp (-p(2) * (x - 8)), ...
              (.49 - p(1)) * (8 - x) .* exp (-p(2) * (x - 8))];
ret.curve.schittkowski_327.init_p = [.42; 5];
ret.curve.schittkowski_327.data.x = ...
    [8; 8; 10; 10; 10; 10; 12; 12; 12; 12; 14; 14; 14; 16; 16; 16; ...
     18; 18; 20; 20; 20; 22; 22; 22; 24; 24; 24; 26; 26; 26; 28; ...
     28; 30; 30; 30; 32; 32; 34; 36; 36; 38; 38; 40; 42];
ret.curve.schittkowski_327.data.y= ...
    [.49; .49; .48; .47; .48; .47; .46; .46; .45; .43; .45; .43; ...
     .43; .44; .43; .43; .46; .45; .42; .42; .43; .41; .41; .40; ...
     .42; .40; .40; .41; .40; .41; .41; .40; .40; .40; .38; .41; ...
     .40; .40; .41; .38; .40; .40; .39; .39];
ret.curve.schittkowski_327.data.wt = [];
ret.curve.schittkowski_327.data.cov = [];
%% This result was given by Schittkowski. No constraint is active
%% here. The second parameter is unchanged from initial value.
%%
%% ret.curve.schittkowski_327.result.p = [.4219; 5];
%% ret.curve.schittkowski_327.result.obj = .0307986;
%%
%% This is the result of leasqr of Octave Forge. The general
%% constraint is active here. Both parameters are different from
%% initial value. The value of the objective function is better.
%%
ret.curve.schittkowski_327.result.p = [.4199227; 1.2842958];
ret.curve.schittkowski_327.result.obj = .0284597;
ret.curve.schittkowski_327.strict_inequc.bounds = [];
ret.curve.schittkowski_327.strict_inequc.linear = [];
ret.curve.schittkowski_327.strict_inequc.general = [];
ret.curve.schittkowski_327.non_strict_inequc.bounds = [.4, Inf; ...
                                                       .4, Inf];
ret.curve.schittkowski_327.non_strict_inequc.linear = [];
ret.curve.schittkowski_327.non_strict_inequc.general = ...
    @ (p, varargin) apply_idx_if_given ...
    (-.09 - p(1) * p(2) + .49 * p(2), varargin{:});
ret.curve.schittkowski_327.equc.linear = [];
ret.curve.schittkowski_327.equc.general = [];
ret.curve.schittkowski_327.f = ...
    @ (x, p) p(1) + (.49 - p(1)) * exp (-p(2) * (x - 8));

ret.general.schittkowski_327.init_p = [.42; 5];
ret.general.schittkowski_327.data.x = ...
    [8; 8; 10; 10; 10; 10; 12; 12; 12; 12; 14; 14; 14; 16; 16; 16; ...
     18; 18; 20; 20; 20; 22; 22; 22; 24; 24; 24; 26; 26; 26; 28; ...
     28; 30; 30; 30; 32; 32; 34; 36; 36; 38; 38; 40; 42];
ret.general.schittkowski_327.data.y= ...
    [.49; .49; .48; .47; .48; .47; .46; .46; .45; .43; .45; .43; ...
     .43; .44; .43; .43; .46; .45; .42; .42; .43; .41; .41; .40; ...
     .42; .40; .40; .41; .40; .41; .41; .40; .40; .40; .38; .41; ...
     .40; .40; .41; .38; .40; .40; .39; .39];
x = ret.general.schittkowski_327.data.x;
y = ret.general.schittkowski_327.data.y;
ret.general.schittkowski_327.dfdp = ...
    @ (p) cat (2, ...
               2 * sum ((exp (-p(2) * (x - 8)) - 1) .* ...
                        (y + (p(1) - .49) * ...
                         exp (-p(2) * (x - 8)) - p(1))), ...
               2 * (p(1) - .49) * ...
               sum ((8 - x) .* exp (-p(2) * (x - 8)) .* ...
                    (y + (p(1) - .49) * ...
                     exp (-p(2) * (x - 8)) - p(1))));
%% This result was given by Schittkowski. No constraint is active
%% here. The second parameter is unchanged from initial value.
%% %% ret.general.schittkowski_327.result.p = [.4219; 5]; %% ret.general.schittkowski_327.result.obj = .0307986; %% %% This is the result of leasqr of Octave Forge. The general %% constraint is active here. Both parameters are different from %% initial value. The value of the objective function is better. sqp %% gives a similar result. ret.general.schittkowski_327.result.p = [.4199227; 1.2842958]; ret.general.schittkowski_327.result.obj = .0284597; ret.general.schittkowski_327.strict_inequc.bounds = []; ret.general.schittkowski_327.strict_inequc.linear = []; ret.general.schittkowski_327.strict_inequc.general = []; ret.general.schittkowski_327.non_strict_inequc.bounds = [.4, Inf; ... .4, Inf]; ret.general.schittkowski_327.non_strict_inequc.linear = []; ret.general.schittkowski_327.non_strict_inequc.general = ... @ (p, varargin) apply_idx_if_given ... (-.09 - p(1) * p(2) + .49 * p(2), varargin{:}); ret.general.schittkowski_327.equc.linear = []; ret.general.schittkowski_327.equc.general = []; ret.general.schittkowski_327.f = ... @ (p) sumsq (y - p(1) - (.49 - p(1)) * exp (-p(2) * (x - 8))); ret.curve.schittkowski_372.dfdp = ... @ (x, p) cat (2, zeros (6, 3), eye (6)); %% given by Schittkowski, not feasible ret.curve.schittkowski_372.init_p = [300; -100; -.1997; -127; ... -151; 379; 421; 460; 426]; %% computed with sqp and a constant objective function, (almost) %% feasible ret.curve.schittkowski_372.init_p_f = @ (id) ... ifelse (id == 0, 1, [2.951277e+02; ... -1.058720e+02; ... -9.535824e-02; ... 2.421108e+00; ... 3.191822e+00; ... 3.790000e+02; ... 4.210000e+02; ... 4.600000e+02; ... 4.260000e+02]); ret.curve.schittkowski_372.data.x = (1:6).'; % any different numbers ret.curve.schittkowski_372.data.y= zeros (6, 1); ret.curve.schittkowski_372.data.wt = []; ret.curve.schittkowski_372.data.cov = []; %% recomputed with sqp (i.e. not with curve fitting) ret.curve.schittkowski_372.result.p = [5.2330557804078126e+02; ... -1.5694790476454301e+02; ... -1.9966450018535931e-01; ... 2.9607990282984435e+01; ... 8.6615541706550545e+01; ... 4.7326722338555498e+01; ... 2.6235616534580515e+01; ... 2.2915996663200740e+01; ... 3.9470733973874445e+01]; ret.curve.schittkowski_372.result.obj = 13390.1; ret.curve.schittkowski_372.strict_inequc.bounds = []; ret.curve.schittkowski_372.strict_inequc.linear = []; ret.curve.schittkowski_372.strict_inequc.general = []; ret.curve.schittkowski_372.non_strict_inequc.bounds = [-Inf, Inf; ... -Inf, Inf; ... -Inf, Inf; ... 0, Inf; ... 0, Inf; ... 0, Inf; ... 0, Inf; ... 0, Inf; ... 0, Inf]; ret.curve.schittkowski_372.non_strict_inequc.linear = []; ret.curve.schittkowski_372.non_strict_inequc.general = ... @ (p, varargin) apply_idx_if_given ... (cat (1, p(1) + p(2) * exp (-5 * p(3)) + p(4) - 127, ... p(1) + p(2) * exp (-3 * p(3)) + p(5) - 151, ... p(1) + p(2) * exp (-p(3)) + p(6) - 379, ... p(1) + p(2) * exp (p(3)) + p(7) - 421, ... p(1) + p(2) * exp (3 * p(3)) + p(8) - 460, ... p(1) + p(2) * exp (5 * p(3)) + p(9) - 426, ... -p(1) - p(2) * exp (-5 * p(3)) + p(4) + 127, ... -p(1) - p(2) * exp (-3 * p(3)) + p(5) + 151, ... -p(1) - p(2) * exp (-p(3)) + p(6) + 379, ... -p(1) - p(2) * exp (p(3)) + p(7) + 421, ... -p(1) - p(2) * exp (3 * p(3)) + p(8) + 460, ... -p(1) - p(2) * exp (5 * p(3)) + p(9) + 426), ... varargin{:}); ret.curve.schittkowski_372.equc.linear = []; ret.curve.schittkowski_372.equc.general = []; ret.curve.schittkowski_372.f = @ (x, p) p(4:9); ret.curve.schittkowski_373.dfdp = ... 
@ (x, p) cat (2, zeros (6, 3), eye (6)); %% not feasible ret.curve.schittkowski_373.init_p = [300; -100; -.1997; -127; ... -151; 379; 421; 460; 426]; %% feasible ret.curve.schittkowski_373.init_p_f = @ (id) ... ifelse (id == 0, 1, [2.5722721227695763e+02; ... -1.5126681606092043e+02; ... 8.3101871447778766e-02; ... -3.0390506000425454e+01; ... 1.1661334225083069e+01; ... 2.6097719374430665e+02; ... 3.2814725183082305e+02; ... 3.9686840023267564e+02; ... 3.9796353824451995e+02]); ret.curve.schittkowski_373.data.x = (1:6).'; % any different numbers ret.curve.schittkowski_373.data.y= zeros (6, 1); ret.curve.schittkowski_373.data.wt = []; ret.curve.schittkowski_373.data.cov = []; ret.curve.schittkowski_373.result.p = [523.31; ... -156.95; ... -.2; ... 29.61; ... -86.62; ... 47.33; ... 26.24; ... 22.92; ... -39.47]; ret.curve.schittkowski_373.result.obj = 13390.1; ret.curve.schittkowski_373.strict_inequc.bounds = []; ret.curve.schittkowski_373.strict_inequc.linear = []; ret.curve.schittkowski_373.strict_inequc.general = []; ret.curve.schittkowski_373.non_strict_inequc.bounds = []; ret.curve.schittkowski_373.non_strict_inequc.linear = []; ret.curve.schittkowski_373.non_strict_inequc.general = []; ret.curve.schittkowski_373.equc.linear = []; ret.curve.schittkowski_373.equc.general = ... @ (p, varargin) apply_idx_if_given ... (cat (1, p(1) + p(2) * exp (-5 * p(3)) + p(4) - 127, ... p(1) + p(2) * exp (-3 * p(3)) + p(5) - 151, ... p(1) + p(2) * exp (-p(3)) + p(6) - 379, ... p(1) + p(2) * exp (p(3)) + p(7) - 421, ... p(1) + p(2) * exp (3 * p(3)) + p(8) - 460, ... p(1) + p(2) * exp (5 * p(3)) + p(9) - 426), ... varargin{:}); ret.curve.schittkowski_373.f = @ (x, p) p(4:9); ret.general.schittkowski_372.dfdp = ... @ (p) cat (2, zeros (1, 3), 2 * p(4:9)); %% not feasible ret.general.schittkowski_372.init_p = [300; -100; -.1997; -127; ... -151; 379; 421; 460; 426]; %% recomputed with sqp ret.general.schittkowski_372.result.p = [5.2330557804078126e+02; ... -1.5694790476454301e+02; ... -1.9966450018535931e-01; ... 2.9607990282984435e+01; ... 8.6615541706550545e+01; ... 4.7326722338555498e+01; ... 2.6235616534580515e+01; ... 2.2915996663200740e+01; ... 3.9470733973874445e+01]; ret.general.schittkowski_372.result.obj = 13390.1; ret.general.schittkowski_372.strict_inequc.bounds = []; ret.general.schittkowski_372.strict_inequc.linear = []; ret.general.schittkowski_372.strict_inequc.general = []; ret.general.schittkowski_372.non_strict_inequc.bounds = [-Inf, Inf; ... -Inf, Inf; ... -Inf, Inf; ... 0, Inf; ... 0, Inf; ... 0, Inf; ... 0, Inf; ... 0, Inf; ... 0, Inf]; ret.general.schittkowski_372.non_strict_inequc.linear = []; ret.general.schittkowski_372.non_strict_inequc.general = ... @ (p, varargin) apply_idx_if_given ... (cat (1, p(1) + p(2) * exp (-5 * p(3)) + p(4) - 127, ... p(1) + p(2) * exp (-3 * p(3)) + p(5) - 151, ... p(1) + p(2) * exp (-p(3)) + p(6) - 379, ... p(1) + p(2) * exp (p(3)) + p(7) - 421, ... p(1) + p(2) * exp (3 * p(3)) + p(8) - 460, ... p(1) + p(2) * exp (5 * p(3)) + p(9) - 426, ... -p(1) - p(2) * exp (-5 * p(3)) + p(4) + 127, ... -p(1) - p(2) * exp (-3 * p(3)) + p(5) + 151, ... -p(1) - p(2) * exp (-p(3)) + p(6) + 379, ... -p(1) - p(2) * exp (p(3)) + p(7) + 421, ... -p(1) - p(2) * exp (3 * p(3)) + p(8) + 460, ... -p(1) - p(2) * exp (5 * p(3)) + p(9) + 426), ... 
varargin{:}); ret.general.schittkowski_372.equc.linear = []; ret.general.schittkowski_372.equc.general = []; ret.general.schittkowski_372.f = @ (p) sumsq (p(4:9)); ret.general.schittkowski_391.dfdp = []; ret.general.schittkowski_391.init_p = ... -2.8742711 * alpha_391 (zeros (30, 1), 1:30); %% computed with fminunc (Octave) ret.general.schittkowski_391.result.p = [-1.1986682e+18; ... -1.1474574e+07; ... -1.3715802e+07; ... -1.0772255e+07; ... -1.0634232e+07; ... -1.0622915e+07; ... -8.8775399e+06; ... -8.8201496e+06; ... -9.7729975e+06; ... -1.0431808e+07; ... -1.0415089e+07; ... -1.0350400e+07; ... -1.0325094e+07; ... -1.0278561e+07; ... -1.0275751e+07; ... -1.0276546e+07; ... -1.0292584e+07; ... -1.0289350e+07; ... -1.0192566e+07; ... -1.0058577e+07; ... -1.0096341e+07; ... -1.0242386e+07; ... -1.0615831e+07; ... -1.1142096e+07; ... -1.1617283e+07; ... -1.2005738e+07; ... -1.2282117e+07; ... -1.2301260e+07; ... -1.2051365e+07; ... -1.1704693e+07]; ret.general.schittkowski_391.result.obj = -5.1615468e+20; ret.general.schittkowski_391.strict_inequc.bounds = []; ret.general.schittkowski_391.strict_inequc.linear = []; ret.general.schittkowski_391.strict_inequc.general = []; ret.general.schittkowski_391.non_strict_inequc.bounds = []; ret.general.schittkowski_391.non_strict_inequc.linear = []; ret.general.schittkowski_391.non_strict_inequc.general = []; ret.general.schittkowski_391.equc.linear = []; ret.general.schittkowski_391.equc.general = []; ret.general.schittkowski_391.f = @ (p) sum (alpha_391 (p, 1:30)); function ret = f_1 (x, p) ret = p(1) + x(:, 1) ./ (p(2) * x(:, 2) + p(3) * x(:, 3)); function ret = f_2 (x, p, y) y(3, 2) = p(4); x(9, 3) = p(5); p = p(:); mp = cat (2, p([1, 2, 3]), p([3, 1, 2]), p([3, 2, 1])); ret = x * mp - y; function ret = f_3 (x, p) ret = fixed_step_rk4 (x.', [1, 1, 0, 0, 0], 1, ... @ (x, t) f_3_xdot (x, t, p)); ret = ret.'; function ret = f_3_noweights (x, p) ret = fixed_step_rk4 (x.', [.1, .1, 0, 0, 0], .2, ... @ (x, t) f_3_xdot_noweights (x, t, p)); ret = ret.'; function ret = f_3_xdot (x, t, p) ret = zeros (5, 1); tp = p(2) * x(3) - p(1) * x(1) * x(2); ret(1) = tp; ret(2) = tp - p(4) * x(2) * x(3) + p(5) * x(5) - p(6) * x(2) * x(4); ret(3) = - tp - p(3) * x(3) - p(4) * x(2) * x(3); ret(4) = p(3) * x(3) + p(5) * x(5) - p(6) * x(2) * x(4); ret(5) = p(4) * x(2) * x(3) - p(5) * x(5) + p(6) * x(2) * x(4); function ret = f_3_xdot_noweights (x, t, p) x(1:2) = x(1:2) / .1; x(4) = x(4) / 10; x(5) = x(5) / 100; ret = f_3_xdot (x, t, p); ret(1:2) = ret(1:2) * .1; ret(4) = ret(4) * 10; ret(5) = ret(5) * 100; function ret = f_r (x, p, hook) n = size (hook.mc, 2); ns = cat (1, 0, cumsum (hook.ns)); xdhook.p = p; ret = zeros (1, ns(end)); %% temporary variables dls = p(3) ^ 2; dmhp = p(5) * dls / p(4); mhp = dmhp / 2; %% for id = 1:n xdhook.c = hook.mc(:, id); l = xdhook.c(3); x0 = mhp - sqrt (max (0, mhp ^ 2 + dls + (p(6) - l) * dmhp)); ids = ns(id) + 1; ide = ns(id + 1); tp = odeset (); %% necessary in Matlab (7.1) tp.OutputSave = []; tp.Refine = 0; %% tp.RelTol = 1e-7; tp.AbsTol = 1e-7; [cx, Xcx] = essential_ode23 (@ (t, X) f_r_xdot (X, t, xdhook), ... x([ids, ide]).', x0, tp); X = lin_interp (cx.', Xcx.', x(ids:ide).'); X = X.'; [discarded, lr] = ... 
f_r_xdot (X, hook.t(ids:ide), xdhook); ret(ids:ide) = max (0, lr - p(6) - X) * p(5); end ret = ret.'; function [ret, l] = f_r_xdot (x, t, hook) %% keep this working with non-scalar x and t p = hook.p; c = hook.c; idl = t <= c(1); idg = t >= c(2); idb = ~ (idl | idg); l = zeros (size (t)); l(idl) = c(3); l(idg) = c(4) * t(idg) + c(5); l(idb) = polyval (c(6:9), t(idb)); dls = max (1e-6, l - p(6) - x); tf = x / p(3); ido = tf >= 1; idx = ~ido; ret(ido) = 0; ret(idx) = - ((p(4) + p(1)) * p(2)) ./ ... ((p(5) * dls(idx)) ./ (1 - tf(idx) .^ 2) + p(1)) + p(2); function ret = alpha_391 (p, id) %% for .general.schittkowski_391; id is a numeric index(-vector) %% into p p = p(:); n = size (p, 1); nid = length (id); id = reshape (id, 1, nid); v = sqrt (repmat (p .^ 2, 1, nid) + 1 ./ ((1:n).') * id); log_v = log (v); ret = 420 * p(id) + (id(:) - 15) .^ 3 + ... sum (v .* (sin (log_v) .^ 5 + cos (log_v) .^ 5)).'; function ret = schittkowski_281_dfdp (p) tp = (sum (((1:10).') .^ 3 .* (p - 1) .^ 2)) ^ (- 2 / 3) / 3; ret = 2 * ((1:10).') .^ 3 .* (p - 1) * tp; function ret = schittkowski_281_hessian (p) k3 = ((1:10).^3).'; p = p(:); p_1 = p - 1; s_k_p = sum (k3 .* p_1.^2); ret = - 8 / 9 * s_k_p^(- 5 / 3) * ... (k3 * k3.') .* (p_1 * p_1.') + ... diag (2 / 3 * s_k_p^(- 2 / 3) * k3); function state = fixed_step_rk4 (t, x0, step, f) %% minimalistic fourth order ODE-solver, as said to be a popular one %% by Wikipedia (to make these optimization tests self-contained; %% for the same reason 'lookup' and even 'interp1' are not used %% here) n = ceil ((t(end) - t(1)) / step) + 1; m = length (x0); tstate = zeros (m, n); tstate(:, 1) = x0; tt = linspace (t(1), t(1) + step * (n - 1), n); for id = 1 : n - 1 k1 = f (tstate(:, id), tt(id)); k2 = f (tstate(:, id) + .5 * step * k1, tt(id) + .5 * step); k3 = f (tstate(:, id) + .5 * step * k2, tt(id) + .5 * step); k4 = f (tstate(:, id) + step * k3, tt(id + 1)); tstate(:, id + 1) = tstate(:, id) + ... (step / 6) * (k1 + 2 * k2 + 2 * k3 + k4); end state = lin_interp (tt, tstate, t); function ret = pc2 (p, id) %% a combination out of 2 possible values for each parameter r = size (p, 1); n = 2 ^ r; if (id < 0 || id > n) error ('no parameter set for this index'); end if (id == 0) % return maximum id ret = n; return; end idx = dec2bin (id - 1, r) == '1'; nidx = ~idx; ret = zeros (r, 1); ret(nidx) = p(nidx, 1); ret(idx) = p(idx, 2); function [varargout] = essential_ode23 (vfun, vslot, vinit, vodeoptions) %% This code is taken from the ode23 solver of Thomas Treichl %% , some flexibility of the %% interface has been removed. The idea behind this duplication is %% to have a fixed version of the solver here which runs both in %% Octave and Matlab. %% Some of the option treatment has been left out. 
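%% A call sketch, mirroring the use in f_r above (the options struct
%% comes from odeset; only the fields read by this trimmed-down solver
%% matter):
%%
%%   tp = odeset ();
%%   tp.OutputSave = []; tp.Refine = 0; tp.AbsTol = 1e-7;
%%   [t, X] = essential_ode23 (@ (t, x) -x, [0, 1], 1, tp);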
if (length (vslot) > 2) vstepsizefixed = true; else vstepsizefixed = false; end if (strcmp (vodeoptions.NormControl, 'on')) vnormcontrol = true; else vnormcontrol = false; end if (~isempty (vodeoptions.NonNegative)) if (isempty (vodeoptions.Mass)) vhavenonnegative = true; else vhavenonnegative = false; end else vhavenonnegative = false; end if (isempty (vodeoptions.OutputFcn) && nargout == 0) vodeoptions.OutputFcn = @odeplot; vhaveoutputfunction = true; elseif (isempty (vodeoptions.OutputFcn)) vhaveoutputfunction = false; else vhaveoutputfunction = true; end if (~isempty (vodeoptions.OutputSel)) vhaveoutputselection = true; else vhaveoutputselection = false; end if (isempty (vodeoptions.OutputSave)) vodeoptions.OutputSave = 1; end if (vodeoptions.Refine > 0) vhaverefine = true; else vhaverefine = false; end if (isempty (vodeoptions.InitialStep) && ~vstepsizefixed) vodeoptions.InitialStep = (vslot(1,2) - vslot(1,1)) / 10; vodeoptions.InitialStep = vodeoptions.InitialStep / ... 10^vodeoptions.Refine; end if (isempty (vodeoptions.MaxStep) && ~vstepsizefixed) vodeoptions.MaxStep = (vslot(1,2) - vslot(1,1)) / 10; end if (~isempty (vodeoptions.Events)) vhaveeventfunction = true; else vhaveeventfunction = false; end if (~isempty (vodeoptions.Mass) && ... isnumeric (vodeoptions.Mass) && ... numel (size (vodeoptions.Mass)) < 3) vhavemasshandle = false; vmass = vodeoptions.Mass; elseif (isa (vodeoptions.Mass, 'function_handle')) vhavemasshandle = true; else vhavemasshandle = false; end if (strcmp (vodeoptions.MStateDependence, 'none')) vmassdependence = false; else vmassdependence = true; end %% Starting the initialisation of the core solver ode23 vtimestamp = vslot(1,1); %% timestamp = start time vtimelength = length (vslot); %% length needed if fixed steps vtimestop = vslot(1,vtimelength); %% stop time = last value vdirection = sign (vtimestop); %% Flag for direction to solve if (~vstepsizefixed) vstepsize = vodeoptions.InitialStep; vminstepsize = (vtimestop - vtimestamp) / (1/eps); else %% If step size is given then use the fixed time steps vstepsize = vslot(1,2) - vslot(1,1); vminstepsize = sign (vstepsize) * eps; end vretvaltime = vtimestamp; %% first timestamp output vretvalresult = vinit; %% first solution output %% Initialize the OutputFcn if (vhaveoutputfunction) if (vhaveoutputselection) vretout = ... vretvalresult(vodeoptions.OutputSel); else vretout = vretvalresult; end feval (vodeoptions.OutputFcn, vslot.', ... vretout.', 'init'); end %% Initialize the EventFcn if (vhaveeventfunction) odepkg_event_handle (vodeoptions.Events, vtimestamp, ... vretvalresult.', 'init'); end vpow = 1/3; %% 20071016, reported by Luis Randez va = [ 0, 0, 0; %% The Runge-Kutta-Fehlberg 2(3) coefficients 1/2, 0, 0; %% Coefficients proved on 20060827 -1, 2, 0]; %% See p.91 in Ascher & Petzold vb2 = [0; 1; 0]; %% 2nd and 3rd order vb3 = [1/6; 2/3; 1/6]; %% b-coefficients vc = sum (va, 2); %% The solver main loop - stop if the endpoint has been reached vcntloop = 2; vcntcycles = 1; vu = vinit; vk = vu.' * zeros(1,3); vcntiter = 0; vunhandledtermination = true; vcntsave = 2; while ((vdirection * (vtimestamp) < vdirection * (vtimestop)) && ... (vdirection * (vstepsize) >= vdirection * (vminstepsize))) %% Hit the endpoint of the time slot exactely if ((vtimestamp + vstepsize) > vdirection * vtimestop) %% if (((vtimestamp + vstepsize) > vtimestop) || ... 
%% (abs(vtimestamp + vstepsize - vtimestop) < eps)) vstepsize = vtimestop - vdirection * vtimestamp; end %% Estimate the three results when using this solver for j = 1:3 vthetime = vtimestamp + vc(j,1) * vstepsize; vtheinput = vu.' + vstepsize * vk(:,1:j-1) * va(j,1:j-1).'; if (vhavemasshandle) %% Handle only the dynamic mass matrix, if (vmassdependence) %% constant mass matrices have already vmass = feval ... %% been set before (if any) (vodeoptions.Mass, vthetime, vtheinput); else %% if (vmassdependence == false) vmass = feval ... %% then we only have the time argument (vodeoptions.Mass, vthetime); end vk(:,j) = vmass \ feval ... (vfun, vthetime, vtheinput); else vk(:,j) = feval ... (vfun, vthetime, vtheinput); end end %% Compute the 2nd and the 3rd order estimation y2 = vu.' + vstepsize * (vk * vb2); y3 = vu.' + vstepsize * (vk * vb3); if (vhavenonnegative) vu(vodeoptions.NonNegative) = abs (vu(vodeoptions.NonNegative)); y2(vodeoptions.NonNegative) = abs (y2(vodeoptions.NonNegative)); y3(vodeoptions.NonNegative) = abs (y3(vodeoptions.NonNegative)); end vSaveVUForRefine = vu; %% Calculate the absolute local truncation error and the %% acceptable error if (~vstepsizefixed) if (~vnormcontrol) vdelta = abs (y3 - y2); vtau = max (vodeoptions.RelTol * abs (vu.'), ... vodeoptions.AbsTol); else vdelta = norm (y3 - y2, Inf); vtau = max (vodeoptions.RelTol * max (norm (vu.', Inf), ... 1.0), ... vodeoptions.AbsTol); end else %% if (vstepsizefixed == true) vdelta = 1; vtau = 2; end %% If the error is acceptable then update the vretval variables if (all (vdelta <= vtau)) vtimestamp = vtimestamp + vstepsize; vu = y3.'; %% MC2001: the higher order estimation as 'local %% extrapolation' Save the solution every vodeoptions.OutputSave %% steps if (mod (vcntloop-1,vodeoptions.OutputSave) == 0) vretvaltime(vcntsave,:) = vtimestamp; vretvalresult(vcntsave,:) = vu; vcntsave = vcntsave + 1; end vcntloop = vcntloop + 1; vcntiter = 0; %% Call plot only if a valid result has been found, therefore %% this code fragment has moved here. Stop integration if plot %% function returns false if (vhaveoutputfunction) for vcnt = 0:vodeoptions.Refine %% Approximation between told %% and t if (vhaverefine) %% Do interpolation vapproxtime = (vcnt + 1) * vstepsize / ... (vodeoptions.Refine + 2); vapproxvals = vSaveVUForRefine.' + vapproxtime * (vk * ... vb3); vapproxtime = (vtimestamp - vstepsize) + vapproxtime; else vapproxvals = vu.'; vapproxtime = vtimestamp; end if (vhaveoutputselection) vapproxvals = vapproxvals(vodeoptions.OutputSel); end vpltret = feval (vodeoptions.OutputFcn, vapproxtime, ... vapproxvals, []); if vpltret %% Leave refinement loop break; end end if (vpltret) %% Leave main loop vunhandledtermination = false; break; end end %% Call event only if a valid result has been found, therefore %% this code fragment has moved here. Stop integration if %% veventbreak is true if (vhaveeventfunction) vevent = ... odepkg_event_handle (vodeoptions.Events, vtimestamp, ... vu(:), []); if (~isempty (vevent{1}) && vevent{1} == 1) vretvaltime(vcntloop-1,:) = vevent{3}(end,:); vretvalresult(vcntloop-1,:) = vevent{4}(end,:); vunhandledtermination = false; break; end end end %% If the error is acceptable ... 
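%% Explanatory note: vpow = 1/3 matches the order of the embedded 2(3)
%% pair, and the factor 0.8 below is a safety margin, so an accepted
%% step is sized to produce roughly 0.8^3, i.e. about half, of the
%% allowed local error.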
%% Update the step size for the next integration step if (~vstepsizefixed) %% 20080425, reported by Marco Caliari vdelta cannot be negative %% (because of the absolute value that has been introduced) but %% it could be 0, then replace the zeros with the maximum value %% of vdelta vdelta(find (vdelta == 0)) = max (vdelta); %% It could happen that max (vdelta) == 0 (ie. that the original %% vdelta was 0), in that case we double the previous vstepsize vdelta(find (vdelta == 0)) = max (vtau) .* (0.4 ^ (1 / vpow)); if (vdirection == 1) vstepsize = min (vodeoptions.MaxStep, ... min (0.8 * vstepsize * (vtau ./ vdelta) .^ ... vpow)); else vstepsize = max (vodeoptions.MaxStep, ... max (0.8 * vstepsize * (vtau ./ vdelta) .^ ... vpow)); end else %% if (vstepsizefixed) if (vcntloop <= vtimelength) vstepsize = vslot(vcntloop) - vslot(vcntloop-1); else %% Get out of the main integration loop break; end end %% Update counters that count the number of iteration cycles vcntcycles = vcntcycles + 1; %% Needed for cost statistics vcntiter = vcntiter + 1; %% Needed to find iteration problems %% Stop solving because the last 1000 steps no successful valid %% value has been found if (vcntiter >= 5000) error (['Solving has not been successful. The iterative', ... ' integration loop exited at time t = %f before endpoint at', ... ' tend = %f was reached. This happened because the iterative', ... ' integration loop does not find a valid solution at this time', ... ' stamp. Try to reduce the value of ''InitialStep'' and/or', ... ' ''MaxStep'' with the command ''odeset''.\n'], vtimestamp, vtimestop); end end %% The main loop %% Check if integration of the ode has been successful if (vdirection * vtimestamp < vdirection * vtimestop) if (vunhandledtermination == true) error ('OdePkg:InvalidArgument', ... ['Solving has not been successful. The iterative', ... ' integration loop exited at time t = %f', ... ' before endpoint at tend = %f was reached. This may', ... ' happen if the stepsize grows smaller than defined in', ... ' vminstepsize. Try to reduce the value of ''InitialStep'' and/or', ... ' ''MaxStep'' with the command ''odeset''.\n'], vtimestamp, vtimestop); else warning ('OdePkg:InvalidArgument', ... ['Solver has been stopped by a call of ''break'' in', ... ' the main iteration loop at time t = %f before endpoint at', ... ' tend = %f was reached. This may happen because the @odeplot', ... ' function returned ''true'' or the @event function returned ''true''.'], ... vtimestamp, vtimestop); end end %% Postprocessing, do whatever when terminating integration %% algorithm if (vhaveoutputfunction) %% Cleanup plotter feval (vodeoptions.OutputFcn, vtimestamp, ... vu.', 'done'); end if (vhaveeventfunction) %% Cleanup event function handling odepkg_event_handle (vodeoptions.Events, vtimestamp, ... vu.', 'done'); end %% Save the last step, if not already saved if (mod (vcntloop-2,vodeoptions.OutputSave) ~= 0) vretvaltime(vcntsave,:) = vtimestamp; vretvalresult(vcntsave,:) = vu; end varargout{1} = vretvaltime; %% Time stamps are first output argument varargout{2} = vretvalresult; %% Results are second output argument function yi = lin_interp (x, y, xi) %% Actually interp1 with 'linear' should behave equally in Octave %% and Matlab, but having this subset of functionality here is being %% on the safe side. n = size (x, 2); m = size (y, 1); %% This elegant lookup is from an older version of 'lookup' by Paul %% Kienzle, and had been suggested by Kai Habel . 
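%% How the lookup works: after sorting the concatenation [x, xi],
%% cumsum (p <= n) counts, at every sorted position, how many grid
%% points x(j) sort at or before it; scattered back through the
%% permutation p and restricted to the xi entries, idx(k) is thus the
%% index of the grid interval containing xi(k).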
[v, p] = sort ([x, xi]); idx(p) = cumsum (p <= n); idx = idx(n + 1 : n + size (xi, 2)); %% idx(idx == n) = n - 1; yi = y(:, idx) + ... repmat (xi - x(idx), m, 1) .* ... (y(:, idx + 1) - y(:, idx)) ./ ... repmat (x(idx + 1) - x(idx), m, 1); function ret = apply_idx_if_given (ret, idx) if (nargin > 1) ret = ret(idx); end function fval = scalar_ifelse (cond, tval, fval) %% needed for some anonymous functions, builtin ifelse only available %% in Octave > 3.2; we need only the scalar case here if (cond) fval = tval; end %!demo %! p_t = optim_problems ().curve.p_1; %! global verbose; %! verbose = false; %! [cy, cp, cvg, iter] = leasqr (p_t.data.x, p_t.data.y, p_t.init_p, p_t.f) %! disp (p_t.result.p) %! sumsq (cy - p_t.data.y) optim-1.4.1/inst/test_min_4.m0000644000175000017500000000635112477523726014515 0ustar olafolaf## Copyright (C) 2002 Etienne Grossmann ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## test_bfgs - Test that bfgs works ## ## Check that bfgs treats struct options correctly ## ## Sets 'ok' to 1 if success, 0 otherwise ## The name of the optimizing function if ! exist ("optim_func"), optim_func = "bfgsmin"; end ok = 1; cnt = 0; if ! exist ("verbose"), verbose = 0; end N = 2; ## Make test reproducible ## x0 = randn(N,1) ; ## y0 = randn(N,1) ; x0 = (1:N)'/N; y0 = (N:-1:1)'/N; function v = ff(x,y,t) A = [1 -1;1 1]; M = A'*diag([100,1])*A; v = (x(1:2) - y(1:2))'*M*(x(1:2)-y(1:2)) + 1; endfunction function dv = dff(x,y,t) if nargin < 3, t = 1; end if t == 1, N = length (x); else N = length (y); end A = [1 -1;1 1]; M = A'*diag([100,1])*A; dv = 2*(x(1:2)-y(1:2))'*M; if N>2, dv = [dv, zeros(1,N-2)]; end if t == 2, dv = -dv; end endfunction if verbose printf ("\n Testing that %s accepts struct control variable\n\n",... optim_func); printf ([" Set 'optim_func' to the name of the optimization\n",... " function you want to test (must have same synopsis\n",... 
" as 'bfgsmin')\n\n"]); printf (" Nparams = N = %i\n",N); fflush (stdout); end ## Plain run, just to make sure ###################################### ## Minimum wrt 'x' is y0 ## [xlev,vlev,nlev] = feval (optim_func, "ff", "dff", {x0,y0,1}); ## ctl.df = "dff"; [xlev,vlev,nlev] = feval (optim_func, "ff", {x0,y0,1}); cnt++; if max (abs (xlev-y0)) > 100*sqrt (eps) if verbose printf ("Error is too big : %8.3g\n", max (abs (xlev-y0))); end ok = 0; elseif verbose, printf ("ok %i\n",cnt); end ## Minimize wrt 2nd arg ############################################## ## Minimum wrt 'y' is x0 ## ctl = struct ("narg", 2,"df","dff"); ## ctl = [nan,nan,2]; ## [xlev,vlev,nlev] = feval (optim_func, "ff", list (x0,y0,2),ctl); [xlev,vlev,nlev] = feval (optim_func, "ff", {x0,y0,2},{inf,0,1,2}); cnt++; if max (abs (xlev-x0)) > 100*sqrt (eps) if verbose printf ("Error is too big : %8.3g\n", max (abs (xlev-x0))); end ok = 0; elseif verbose, printf ("ok %i\n",cnt); end ## Set the verbose option ############################################ ## Minimum wrt 'x' is y0 ## ctl = struct ("narg", 1,"verbose",verbose, "df", "dff"); ## ctl = [nan,nan,2]; ## [xlev,vlev,nlev] = feval (optim_func, "ff", "dff", {x0,y0,1},ctl); [xlev,vlev,nlev] = feval (optim_func, "ff", {x0,y0,1},{inf,1,1,1}); cnt++; if max (abs (xlev-y0)) > 100*sqrt (eps) if verbose printf ("Error is too big : %8.3g\n", max (abs (xlev-y0))); end ok = 0; elseif verbose, printf ("ok %i\n",cnt); end if verbose && ok printf ( "All tests ok\n"); end optim-1.4.1/inst/nmsmax.m0000644000175000017500000001636612477523726013762 0ustar olafolaf%% Copyright (C) 2002 N.J.Higham %% Copyright (C) 2003 Andy Adler %% %% This program is free software; you can redistribute it and/or modify it under %% the terms of the GNU General Public License as published by the Free Software %% Foundation; either version 3 of the License, or (at your option) any later %% version. %% %% This program is distributed in the hope that it will be useful, but WITHOUT %% ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or %% FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more %% details. %% %% You should have received a copy of the GNU General Public License along with %% this program; if not, see . %%NMSMAX Nelder-Mead simplex method for direct search optimization. %% [x, fmax, nf] = NMSMAX(FUN, x0, STOPIT, SAVIT) attempts to %% maximize the function FUN, using the starting vector x0. %% The Nelder-Mead direct search method is used. %% Output arguments: %% x = vector yielding largest function value found, %% fmax = function value at x, %% nf = number of function evaluations. %% The iteration is terminated when either %% - the relative size of the simplex is <= STOPIT(1) %% (default 1e-3), %% - STOPIT(2) function evaluations have been performed %% (default inf, i.e., no limit), or %% - a function value equals or exceeds STOPIT(3) %% (default inf, i.e., no test on function values). %% The form of the initial simplex is determined by STOPIT(4): %% STOPIT(4) = 0: regular simplex (sides of equal length, the default) %% STOPIT(4) = 1: right-angled simplex. %% Progress of the iteration is not shown if STOPIT(5) = 0 (default 1). %% STOPIT(6) indicates the direction (ie. minimization or %% maximization.) Default is 1, maximization. %% set STOPIT(6)=-1 for minimization %% If a non-empty fourth parameter string SAVIT is present, then %% `SAVE SAVIT x fmax nf' is executed after each inner iteration. %% NB: x0 can be a matrix. 
In the output argument, in SAVIT saves, %% and in function calls, x has the same shape as x0. %% NMSMAX(fun, x0, STOPIT, SAVIT, P1, P2,...) allows additional %% arguments to be passed to fun, via feval(fun,x,P1,P2,...). %% References: %% N. J. Higham, Optimization by direct search in matrix computations, %% SIAM J. Matrix Anal. Appl, 14(2): 317-333, 1993. %% C. T. Kelley, Iterative Methods for Optimization, Society for Industrial %% and Applied Mathematics, Philadelphia, PA, 1999. % From Matrix Toolbox % Copyright (C) 2002 N.J.Higham % www.maths.man.ac.uk/~higham/mctoolbox % Modifications for octave by A.Adler 2003 function [x, fmax, nf] = nmsmax(fun, x, stopit, savit, varargin) x0 = x(:); % Work with column vector internally. n = length(x0); % Set up convergence parameters etc. if (nargin < 3 || isempty(stopit)) stopit(1) = 1e-3; end tol = stopit(1); % Tolerance for cgce test based on relative size of simplex. if length(stopit) == 1, stopit(2) = inf; end % Max no. of f-evaluations. if length(stopit) == 2, stopit(3) = inf; end % Default target for f-values. if length(stopit) == 3, stopit(4) = 0; end % Default initial simplex. if length(stopit) == 4, stopit(5) = 1; end % Default: show progress. trace = stopit(5); if length(stopit) == 5, stopit(6) = 1; end % Default: maximize dirn= stopit(6); if nargin < 4, savit = []; end % File name for snapshots. V = [zeros(n,1) eye(n)]; f = zeros(n+1,1); V(:,1) = x0; f(1) = dirn*feval(fun,x,varargin{:}); fmax_old = f(1); if trace, fprintf('f(x0) = %9.4e\n', f(1)), end k = 0; m = 0; % Set up initial simplex. scale = max(norm(x0,inf),1); if stopit(4) == 0 % Regular simplex - all edges have same length. % Generated from construction given in reference [18, pp. 80-81] of [1]. alpha = scale / (n*sqrt(2)) * [ sqrt(n+1)-1+n sqrt(n+1)-1 ]; V(:,2:n+1) = (x0 + alpha(2)*ones(n,1)) * ones(1,n); for j=2:n+1 V(j-1,j) = x0(j-1) + alpha(1); x(:) = V(:,j); f(j) = dirn*feval(fun,x,varargin{:}); end else % Right-angled simplex based on co-ordinate axes. alpha = scale*ones(n+1,1); for j=2:n+1 V(:,j) = x0 + alpha(j)*V(:,j); x(:) = V(:,j); f(j) = dirn*feval(fun,x,varargin{:}); end end nf = n+1; how = 'initial '; [temp,j] = sort(f); j = j(n+1:-1:1); f = f(j); V = V(:,j); alpha = 1; beta = 1/2; gamma = 2; while 1 %%%%%% Outer (and only) loop. k = k+1; fmax = f(1); if fmax > fmax_old if ~isempty(savit) x(:) = V(:,1); eval(['save ' savit ' x fmax nf']) end end if trace fprintf('Iter. %2.0f,', k) fprintf([' how = ' how ' ']); fprintf('nf = %3.0f, f = %9.4e (%2.1f%%)\n', nf, fmax, ... 100*(fmax-fmax_old)/(abs(fmax_old)+eps)) end fmax_old = fmax; %%% Three stopping tests from MDSMAX.M % Stopping Test 1 - f reached target value? if fmax >= stopit(3) msg = ['Exceeded target...quitting\n']; break % Quit. end % Stopping Test 2 - too many f-evals? if nf >= stopit(2) msg = ['Max no. of function evaluations exceeded...quitting\n']; break % Quit. end % Stopping Test 3 - converged? This is test (4.3) in [1]. v1 = V(:,1); size_simplex = norm(V(:,2:n+1)-v1(:,ones(1,n)),1) / max(1, norm(v1,1)); if size_simplex <= tol msg = sprintf('Simplex size %9.4e <= %9.4e...quitting\n', ... size_simplex, tol); break % Quit. end % One step of the Nelder-Mead simplex algorithm % NJH: Altered function calls and changed CNT to NF. % Changed each `fr < f(1)' type test to `>' for maximization % and re-ordered function values after sort. 
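% Explanatory note: the coefficients set before the loop are the
% classical Nelder-Mead choices: alpha = 1 (reflection), beta = 1/2
% (contraction), gamma = 2 (expansion). vbar is the centroid of the n
% best vertices; the worst vertex V(:,n+1) is reflected through it,
% and the outcome selects between expansion, contraction and a shrink
% toward the best vertex.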
vbar = (sum(V(:,1:n)')/n)'; % Mean value vr = (1 + alpha)*vbar - alpha*V(:,n+1); x(:) = vr; fr = dirn*feval(fun,x,varargin{:}); nf = nf + 1; vk = vr; fk = fr; how = 'reflect, '; if fr > f(n) if fr > f(1) ve = gamma*vr + (1-gamma)*vbar; x(:) = ve; fe = dirn*feval(fun,x,varargin{:}); nf = nf + 1; if fe > f(1) vk = ve; fk = fe; how = 'expand, '; end end else vt = V(:,n+1); ft = f(n+1); if fr > ft vt = vr; ft = fr; end vc = beta*vt + (1-beta)*vbar; x(:) = vc; fc = dirn*feval(fun,x,varargin{:}); nf = nf + 1; if fc > f(n) vk = vc; fk = fc; how = 'contract,'; else for j = 2:n V(:,j) = (V(:,1) + V(:,j))/2; x(:) = V(:,j); f(j) = dirn*feval(fun,x,varargin{:}); end nf = nf + n-1; vk = (V(:,1) + V(:,n+1))/2; x(:) = vk; fk = dirn*feval(fun,x,varargin{:}); nf = nf + 1; how = 'shrink, '; end end V(:,n+1) = vk; f(n+1) = fk; [temp,j] = sort(f); j = j(n+1:-1:1); f = f(j); V = V(:,j); end %%%%%% End of outer (and only) loop. % Finished. if trace, fprintf(msg), end x(:) = V(:,1); optim-1.4.1/inst/nelder_mead_min.m0000644000175000017500000002616712477523726015561 0ustar olafolaf## Copyright (C) 2002-2008 Etienne Grossmann ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## [x0,v,nev] = nelder_mead_min (f,args,ctl) - Nelder-Mead minimization ## ## Minimize 'f' using the Nelder-Mead algorithm. This function is inspired ## from the that found in the book "Numerical Recipes". ## ## ARGUMENTS ## --------- ## f : string : Name of function. Must return a real value ## args : list : Arguments passed to f. ## or matrix : f's only argument ## ctl : vector : (Optional) Control variables, described below ## or struct ## ## RETURNED VALUES ## --------------- ## x0 : matrix : Local minimum of f ## v : real : Value of f in x0 ## nev : number : Number of function evaluations ## ## CONTROL VARIABLE : (optional) may be named arguments (i.e. "name",value ## ------------------ pairs), a struct, or a vector of length <= 6, where ## NaN's are ignored. Default values are written . ## OPT. VECTOR ## NAME POS ## ftol,f N/A : Stopping criterion : stop search when values at simplex ## vertices are all alike, as tested by ## ## f > (max_i (f_i) - min_i (f_i)) /max(max(|f_i|),1) ## ## where f_i are the values of f at the vertices. <10*eps> ## ## rtol,r N/A : Stop search when biggest radius of simplex, using ## infinity-norm, is small, as tested by : ## ## ctl(2) > Radius <10*eps> ## ## vtol,v N/A : Stop search when volume of simplex is small, tested by ## ## ctl(2) > Vol ## ## crit,c ctl(1) : Set one stopping criterion, 'ftol' (c=1), 'rtol' (c=2) ## or 'vtol' (c=3) to the value of the 'tol' option. <1> ## ## tol, t ctl(2) : Threshold in termination test chosen by 'crit' <10*eps> ## ## narg ctl(3) : Position of the minimized argument in args <1> ## maxev ctl(4) : Maximum number of function evaluations. This number ## may be slightly exceeded. 
## isz ctl(5) : Size of initial simplex, which is : <1> ## ## { x + e_i | i in 0..N } ## ## Where x == args{narg} is the initial value ## e_0 == zeros (size (x)), ## e_i(j) == 0 if j != i and e_i(i) == ctl(5) ## e_i has same size as x ## ## Set ctl(5) to the distance you expect between the starting ## point and the minimum. ## ## rst ctl(6) : When a minimum is found the algorithm restarts next to ## it until the minimum does not improve anymore. ctl(6) is ## the maximum number of restarts. Set ctl(6) to zero if ## you know the function is well-behaved or if you don't ## mind not getting a true minimum. <0> ## ## verbose, v Be more or less verbose (quiet=0) <0> function [x,v,nev] = nelder_mead_min (f, args, varargin) verbose = 0; # Default control variables ftol = rtol = 10*eps; # Stop either by likeness of values or vtol = nan; # radius, but don't care about volume. crit = 0; # Stopping criterion ctl(1) tol = 10*eps; # Stopping test's threshold ctl(2) narg = 1; # Position of minimized arg ctl(3) maxev = inf; # Max num of func evaluations ctl(4) isz = 1; # Initial size ctl(5) rst = 0; # Max # of restarts if nargin >= 3, # Read control arguments va_arg_cnt = 1; if nargin > 3, ctl = struct (varargin{:}); else ctl = varargin{va_arg_cnt++}; end if isnumeric (ctl) if length (ctl)>=1 && !isnan (ctl(1)), crit = ctl(1); end if length (ctl)>=2 && !isnan (ctl(2)), tol = ctl(2); end if length (ctl)>=3 && !isnan (ctl(3)), narg = ctl(3); end if length (ctl)>=4 && !isnan (ctl(4)), maxev = ctl(4); end if length (ctl)>=5 && !isnan (ctl(5)), isz = ctl(5); end if length (ctl)>=6 && !isnan (ctl(6)), rst = ctl(6); end else if isfield (ctl, "crit") && ! isnan (ctl.crit ), crit = ctl.crit ; end if isfield (ctl, "tol") && ! isnan (ctl.tol ), tol = ctl.tol ; end if isfield (ctl, "ftol") && ! isnan (ctl.ftol ), ftol = ctl.ftol ; end if isfield (ctl, "rtol") && ! isnan (ctl.rtol ), rtol = ctl.rtol ; end if isfield (ctl, "vtol") && ! isnan (ctl.vtol ), vtol = ctl.vtol ; end if isfield (ctl, "narg") && ! isnan (ctl.narg ), narg = ctl.narg ; end if isfield (ctl,"maxev") && ! isnan (ctl.maxev), maxev = ctl.maxev; end if isfield (ctl, "isz") && ! isnan (ctl.isz ), isz = ctl.isz ; end if isfield (ctl, "rst") && ! isnan (ctl.rst ), rst = ctl.rst ; end if isfield(ctl,"verbose")&& !isnan(ctl.verbose),verbose=ctl.verbose;end end end if crit == 1, ftol = tol; elseif crit == 2, rtol = tol; elseif crit == 3, vtol = tol; elseif crit, error ("crit is %i. Should be 1,2 or 3.\n"); end if iscell (args) x = args{1}; else # Single argument x = args; args = {args}; endif if narg > length (args) # Check error ("nelder_mead_min : narg==%i, length (args)==%i\n", narg, length (args)); end [R,C] = size (x); N = R*C; # Size of argument x = x(:); # Initial simplex u = isz * eye (N+1,N) + ones(N+1,1)*x'; y = zeros (N+1,1); for i = 1:N+1, y(i) = feval (f, args{1:narg-1},reshape(u(i,:),R,C),args{narg+1:end}); end ; nev = N+1; [ymin,imin] = min(y); ymin0 = ymin; ## y nextprint = 0 ; v = nan; while nev <= maxev, ## ymin, ymax, ymx2 : lowest, highest and 2nd highest function values ## imin, imax, imx2 : indices of vertices with these values [ymin,imin] = min(y); [ymax,imax] = max(y) ; y(imax) = ymin ; [ymx2,imx2] = max(y) ; y(imax) = ymax ; ## ymin may be > ymin0 after restarting ## if ymin > ymin0 , ## "nelder-mead : Whoa 'downsimplex' Should be renamed 'upsimplex'" ## keyboard ## end # Compute stopping criterion done = 0; if ! isnan (ftol), done |= ((max(y)-min(y)) / max(1,max(abs(y))) < ftol); end if ! 
isnan (rtol), done |= (2*max (max (u) - min (u)) < rtol); end if ! isnan (vtol) done |= (abs (det (u(1:N,:)-ones(N,1)*u(N+1,:)))/factorial(N) < vtol); end ## [ 2*max (max (u) - min (u)), abs (det (u(1:N,:)-ones(N,1)*u(N+1,:)))/factorial(N);\ ## rtol, vtol] # Eventually print some info if verbose && nev > nextprint && ! done printf("nev=%-5d imin=%-3d ymin=%-8.3g done=%i\n",... nev,imin,ymin,done) ; nextprint = nextprint + 100 ; end if done # Termination test if (rst > 0) && (isnan (v) || v > ymin) rst--; if verbose if isnan (v), printf ("Restarting next to minimum %10.3e\n",ymin); else printf ("Restarting next to minimum %10.3e\n",ymin-v); end end # Keep best minimum x = reshape (u(imin,:), R, C) ; v = ymin ; jumplen = 10 * max (max (u) - min (u)); u += jumplen * randn (size (u)); for i = 1:N+1, y(i) = ... feval (f, args{1:narg-1},reshape(u(i,:),R,C),args{narg+1:length(args)}); end nev += N+1; [ymin,imin] = min(y); [ymax,imax] = max(y); y(imax) = ymin; [ymx2,imx2] = max(y); y(imax) = ymax ; else if isnan (v), x = reshape (u(imin,:), R, C) ; v = ymin ; end if verbose, printf("nev=%-5d imin=%-3d ymin=%-8.3g done=%i. Done\n",... nev,imin,ymin,done) ; end return end end ## [ y' u ] tra = 0 ; # 'trace' debug var contains flags if verbose > 1, str = sprintf (" %i : %10.3e --",done,ymin); end # Look for a new point xsum = sum(u) ; # Consider reflection of worst vertice # around centroid. ## f1 = (1-(-1))/N = 2/N; ## f2 = f1 - (-1) = 2/N + 1 = (N+2)/N xnew = (2*xsum - (N+2)*u(imax,:)) / N; ## xnew = (2*xsum - N*u(imax,:)) / N; ynew = feval (f, args{1:narg-1},reshape(xnew,R,C),args{narg+1:length(args)}); nev++; if ynew <= ymin , # Reflection is good tra += 1 ; if verbose > 1 str = [str,sprintf(" %3i : %10.3e good refl >>",nev,ynew-ymin)]; end y(imax) = ynew; u(imax,:) = xnew ; ## ymin = ynew; ## imin = imax; xsum = sum(u) ; ## f1 = (1-2)/N = -1/N ## f2 = f1 - 2 = -1/N - 2 = -(2*N+1)/N xnew = ( -xsum + (2*N+1)*u(imax,:) ) / N; ynew = feval (f, args{1:narg-1},reshape(xnew,R,C),args{narg+1:length(args)}); nev++; if ynew <= ymin , # expansion improves tra += 2 ; ## 'expanded reflection' y(imax) = ynew ; u(imax,:) = xnew ; xsum = sum(u) ; if verbose > 1 str = [str,sprintf(" %3i : %10.3e expd refl",nev,ynew-ymin)]; end else tra += 4 ; ## 'plain reflection' ## Updating of y and u has already been done if verbose > 1 str = [str,sprintf(" %3i : %10.3e plain ref",nev,ynew-ymin)]; end end # Reflexion is really bad elseif ynew >= ymax , tra += 8 ; if verbose > 1 str = [str,sprintf(" %3i : %10.3e intermedt >>",nev,ynew-ymin)]; end ## look for intermediate point # Bring worst point closer to centroid ## f1 = (1-0.5)/N = 0.5/N ## f2 = f1 - 0.5 = 0.5*(1 - N)/N xnew = 0.5*(xsum + (N-1)*u(imax,:)) / N; ynew = feval (f, args{1:narg-1},reshape(xnew,R,C),args{narg+1:length(args)}); nev++; if ynew >= ymax , # New point is even worse. Contract whole # simplex nev += N + 1 ; ## u0 = u; u = (u + ones(N+1,1)*u(imin,:)) / 2; ## keyboard ## Code that doesn't care about value of empty_list_elements_ok if imin == 1 , ii = 2:N+1; elseif imin == N+1, ii = 1:N; else ii = [1:imin-1,imin+1:N+1]; end for i = ii y(i) = ... 
ynew = feval (f, args{1:narg-1},reshape(u(i,:),R,C),args{narg+1:length(args)}); end ## 'contraction' tra += 16 ; if verbose > 1 str = [str,sprintf(" %3i contractn",nev)]; end else # Replace highest point y(imax) = ynew ; u(imax,:) = xnew ; xsum = sum(u) ; ## 'intermediate' tra += 32 ; if verbose > 1 str = [str,sprintf(" %3i : %10.3e intermedt",nev,ynew-ymin)]; end end else # Reflexion is neither good nor bad y(imax) = ynew ; u(imax,:) = xnew ; xsum = sum(u) ; ## 'plain reflection (2)' tra += 64 ; if verbose > 1 str = [str,sprintf(" %3i : %10.3e keep refl",nev,ynew-ymin)]; end end if verbose > 1, printf ("%s\n",str); end end if verbose >= 0 printf ("nelder_mead : Too many iterations. Returning\n"); end if isnan (v) || v > ymin, x = reshape (u(imin,:), R, C) ; v = ymin ; end optim-1.4.1/inst/dfpdp.m0000644000175000017500000000400712477523726013541 0ustar olafolaf## Copyright (C) 2010-2013 Olaf Till ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## function jac = dfpdp (p, func[, hook]) ## ## Returns Jacobian of func (p) with respect to p with finite ## differencing. The optional argument hook is a structure which can ## contain the following fields at the moment: ## ## hook.f: value of func(p) for p as given in the arguments ## ## hook.diffp: positive vector of fractional steps from given p in ## finite differencing (actual steps may be smaller if bounds are ## given). The default is .001 * ones (size (p)). ## ## hook.diff_onesided: logical vector, indexing elements of p for ## which only one-sided differences should be computed (faster); even ## if not one-sided, differences might not be exactly central if ## bounds are given. The default is false (size (p)). ## ## hook.fixed: logical vector, indexing elements of p for which zero ## should be returned instead of the guessed partial derivatives ## (useful in optimization if some parameters are not optimized, but ## are 'fixed'). ## ## hook.lbound, hook.ubound: vectors of lower and upper parameter ## bounds (or -Inf or +Inf, respectively) to be respected in finite ## differencing. The consistency of bounds is not checked. function ret = dfpdp (varargin) ## This is an interface to __dfdp__.m. if (ischar (varargin{2})) varargin{2} = str2func (varargin{2}); endif ret = __dfdp__ (varargin{:}); endfunction optim-1.4.1/inst/nonlin_residmin.m0000644000175000017500000001027512477523726015637 0ustar olafolaf## Copyright (C) 2010-2014 Olaf Till ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. 
##
## You should have received a copy of the GNU General Public License along with
## this program; if not, see <http://www.gnu.org/licenses/>.

## -*- texinfo -*-
## @deftypefn {Function File} {[@var{p}, @var{resid}, @var{cvg}, @var{outp}] =} nonlin_residmin (@var{f}, @var{pin})
## @deftypefnx {Function File} {[@var{p}, @var{resid}, @var{cvg}, @var{outp}] =} nonlin_residmin (@var{f}, @var{pin}, @var{settings})
## Frontend for nonlinear minimization of residuals returned by a model
## function.
##
## The functions supplied by the user have a minimal interface; any
## additionally needed constants (e.g. observed values) can be supplied
## by wrapping the user functions into anonymous functions.
##
## The following description applies to usage with vector-based
## parameter handling. Differences in usage for structure-based
## parameter handling will be explained separately.
##
## @var{f}: function returning the array of residuals. It gets a column
## vector of real parameters as argument. In gradient determination,
## this function may be called with an informational second argument,
## whose content depends on the function for gradient determination.
##
## @var{pin}: real column vector of initial parameters.
##
## @var{settings}: structure whose fields stand for optional settings
## referred to below. The fields can be set by @code{optimset()}.
##
## The returned values are the column vector of final parameters
## @var{p}, the final array of residuals @var{resid}, an integer
## @var{cvg} indicating if and how optimization succeeded or failed, and
## a structure @var{outp} with additional information, currently with
## the fields: @code{niter}, the number of iterations, and
## @code{user_interaction}, information on user stops (see settings).
## The backend may define additional fields. @var{cvg} is greater than
## zero for success and less than or equal to zero for failure; its
## possible values depend on the used backend and currently can be
## @code{0} (maximum number of iterations exceeded), @code{2} (parameter
## change less than specified precision in two consecutive iterations),
## @code{3} (improvement in objective function -- e.g. sum of squares --
## less than specified), or @code{-1} (algorithm aborted by a user
## function).
##
## @c The following block will be cut out in the package info file.
## @c BEGIN_CUT_TEXINFO
##
## For settings, type @code{optim_doc ("nonlin_residmin")}.
##
## For description of structure-based parameter handling, type
## @code{optim_doc ("parameter structures")}.
##
## For description of individual backends (currently only one), type
## @code{optim_doc ("residual optimization")} and choose the backend in
## the menu.
##
## @c END_CUT_TEXINFO
##
## @seealso {nonlin_curvefit}
## @end deftypefn

## PKG_ADD: __all_opts__ ("nonlin_residmin");

function [p, resid, cvg, outp] = nonlin_residmin (varargin)

  if (nargin == 1)
    p = __nonlin_residmin__ (varargin{1});
    return;
  endif

  if (nargin < 2 || nargin > 3)
    print_usage ();
  endif

  if (nargin == 2)
    varargin{3} = struct ();
  endif

  varargin{4} = struct ();

  [p, resid, cvg, outp] = __nonlin_residmin__ (varargin{:});

endfunction

%!demo
%! ## Example for linear inequality constraints
%! ## (see also the same example in 'demo nonlin_curvefit')
%!
%! ## independents
%! indep = 1:5;
%! ## residual function:
%! f = @ (p) p(1) * exp (p(2) * indep) - [1, 2, 4, 7, 14];
%! ## initial values:
%! init = [.25; .25];
%! ## linear constraints, A.' * parametervector + B >= 0
%! A = [1; -1]; B = 0; # p(1) >= p(2);
%! settings = optimset ("inequc", {A, B});
%!
%! ## start optimization
%! [p, residuals, cvg, outp] = nonlin_residmin (f, init, settings)
optim-1.4.1/inst/polyfitinf.m0000644000175000017500000006545612477523726014636 0ustar olafolaf## Copyright (c) 1998-2011 Andrew V. Knyazev
## All rights reserved.
##
## Redistribution and use in source and binary forms, with or without
## modification, are permitted provided that the following conditions are met:
##
## 1 Redistributions of source code must retain the above copyright notice,
##   this list of conditions and the following disclaimer.
## 2 Redistributions in binary form must reproduce the above copyright
##   notice, this list of conditions and the following disclaimer in the
##   documentation and/or other materials provided with the distribution.
## 3 Neither the name of the author nor the names of its contributors may be
##   used to endorse or promote products derived from this software without
##   specific prior written permission.
##
## THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ''AS IS''
## AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
## IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
## ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE FOR
## ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
## DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
## SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
## CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
## OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
## OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

% function [A,REF,HMAX,H,R,EQUAL] = polyfitinf(M,N,K,X,Y,EPSH,MAXIT,REF0)
%
%   Best polynomial approximation in discrete uniform norm
%
%   INPUT VARIABLES:
%
%   M       : degree of the fitting polynomial
%   N       : number of data points
%   X(N)    : x-coordinates of data points
%   Y(N)    : y-coordinates of data points
%   K       : character of the polynomial:
%                   K = 0 : mixed parity polynomial
%                   K = 1 : odd polynomial  ( X(1) must be >  0 )
%                   K = 2 : even polynomial ( X(1) must be >= 0 )
%   EPSH    : tolerance for leveling. A useful value for 24-bit
%             mantissa is EPSH = 2.0E-7
%   MAXIT   : upper limit for number of exchange steps
%   REF0(M2): initial alternating set ( N-vector ). This is an
%             OPTIONAL argument. The length M2 is given by:
%                   M2 = M + 2                   , if K = 0
%                   M2 = integer part of (M+3)/2 , if K = 1
%                   M2 = 2 + M/2 (M must be even), if K = 2
%
%   OUTPUT VARIABLES:
%
%   A       : polynomial coefficients of the best approximation
%             in order of increasing powers:
%             p*(x) = A(1) + A(2)*x + A(3)*x^2 + ...
%   REF     : selected alternating set of points
%   HMAX    : maximum deviation ( uniform norm of p* - f )
%   H       : pointwise approximation errors
%   R       : total number of iterations
%   EQUAL   : success or failure of algorithm
%                   EQUAL=1 : successful
%                   EQUAL=0 : convergence not achieved
%                   EQUAL=-1: input error
%                   EQUAL=-2: algorithm failure
%
%   Relies on function EXCH, provided below.
% % Example: % M = 5; N = 10000; K = 0; EPSH = 10^-12; MAXIT = 10; % X = linspace(-1,1,N); % uniformly spaced nodes on [-1,1] % k=1; Y = abs(X).^k; % the function Y to approximate % [A,REF,HMAX,H,R,EQUAL] = polyfitinf(M,N,K,X,Y,EPSH,MAXIT); % p = polyval(A,X); plot(X,Y,X,p) % p is the best approximation % % Note: using an even value of M, e.g., M=2, in the example above, makes % the algorithm to fail with EQUAL=-2, because of collocation, which % appears because both the appriximating function and the polynomial are % even functions. The way aroung it is to approximate only the right half % of the function, setting K = 2 : even polynomial. For example: % % N = 10000; K = 2; EPSH = 10^-12; MAXIT = 10; X = linspace(0,1,N); % for i = 1:2 % k = 2*i-1; Y = abs(X).^k; % for j = 1:4 % M = 2^j; % [~,~,HMAX] = polyfitinf(M,N,K,X,Y,EPSH,MAXIT); % approxerror(i,j) = HMAX; % end % end % disp('Table 3.1 from Approximation theory and methods, M.J.D.POWELL, p. 27'); % disp(' '); % disp(' n K=1 K=3'); % disp(' '); format short g; % disp([(2.^(1:4))' approxerror']); % % ALGORITHM: % % Computation of the polynomial that best approximates the data (X,Y) % in the discrete uniform norm, i.e. the polynomial with the minimum % value of max{ | p(x_i) - y_i | , x_i in X } . That polynomial, also % known as minimax polynomial, is obtained by the exchange algorithm, % a finite iterative process requiring, at most, % n % ( ) iterations ( usually p = M + 2. See also function EXCH ). % p % since this number can be very large , the routine may not converge % within MAXIT iterations . The other possibility of failure occurs % when there is insufficient floating point precision for the input % data chosen. % % CREDITS: This routine was developed and modified as % computer assignments in Approximation Theory courses by % Prof. Andrew Knyazev, University of Colorado Denver, USA. % % Team Fall 98 (Revision 1.0): % Chanchai Aniwathananon % Crhistopher Mehl % David A. Duran % Saulo P. Oliveira % % Team Spring 11 (Revision 1.1): Manuchehr Aminian % % The algorithm and the comments are based on a FORTRAN code written % by Joseph C. Simpson. The code is available on Netlib repository: % http://www.netlib.org/toms/501 % See also: Communications of the ACM, V14, pp.355-356(1971) % % NOTES: % % 1) A may contain the collocation polynomial % 2) If MAXIT is exceeded, REF contains a new reference set % 3) M, EPSH and REF can be altered during the execution % 4) To keep consistency to the original code , EPSH can be % negative. However, the use of REF0 is *NOT* determined by % EPSH< 0, but only by its inclusion as an input parameter. % % Some parts of the code can still take advantage of vectorization. % % Revision 1.0 from 1998 is a direct human translation of % the FORTRAN code http://www.netlib.org/toms/501 % Revision 1.1 is a clean-up and technical update. % Tested on MATLAB Version 7.11.0.584 (R2010b) and % GNU Octave Version 3.2.4 % $Revision: 1.1 $ $Date: 2011/08/3 $ % ************************************ beginning of POLYFITINF function [A,REF,HMAX,H,R,EQUAL] = polyfitinf(M,N,K,X,Y,EPSH,MAXIT,REF0) % Preassign output variables A,REF,HMAX,H,R,EQUAL in case of error return A = []; REF = []; HMAX = []; H = []; R = 0; EQUAL = -2; %%%% end preassignment % Setting M with respect to K MOLD = M; switch K case 1 K0 = 0; K1 = 1; Q1 = 1; Q2 = 2; M = (M-Q1)/2; case 2 K0 = 0; K1 = 0; Q1 = 0; Q2 = 2; % If the user has input odd M, but wants an even polynomial, % subtract 1 from M to prevent errors later. 
The outputs should be % mathematically equivalent. if mod(M,2) == 1 M = M-1; end M = (M-Q1)/2; otherwise if (K ~= 0) warning('polyfitinf:MixedParity','Using mixed parity polynomial...'); end K0 = 1; K1 = 0; Q1 = 0; Q2 = 1; end P = M + 2; % Check input data consistency if ( (length(X) ~= N) || (length(Y) ~= N) ) error('Input Error: check data lengths'); end if (P > N) error('Input Error: insufficient data points'); end if (M < 0) error('Input Error: insufficient degree'); end if ( (K == 2) && (X(1) < 0) ) || ( (K == 1) && (X(1) <= 0) ) error('Input Error: X(1) inconsistent with parity'); end if any(diff(X)<0) error('Input Error: Abscissae out of order'); end ITEMP = MOLD + 1; A = zeros(1,ITEMP); ITEMP = P + 2; Z = zeros(1,ITEMP); Z(1) = 0; Z(ITEMP) = N + 1; EPSH = abs(EPSH); % Read initial reference set into Z, if available. if (nargin == 8) J = 0; Z(2:(P+1))= REF0(1:P); % Check if REF is monotonically increasing if ( any(diff(REF0) < 0) || any(REF0 > J) ) error('Input Error : Bad initial reference set'); end else % Loads Z with the points closest to the Chebychev abscissas X1 = X(1); XE = X(N); % Setting parity-dependent parameters if (K0 == 1) XA = XE + X1; XE = XE - X1; Q = pi/(M + 1.0); else XA = 0.; XE = XE + XE; ITEMP = 2*(M+1) + Q1; Q = pi/(ITEMP); end % Calculate the J-th Chebyshev abcissa and load Z(J+1) % with the appropriate index from the data abcissas for JJ = 1:P J = P + 1 - JJ; X1 = XA + XE*( cos(Q*(P-J)) ); ITEMP = J + 2; R = Z(ITEMP); HIGH = R - 1; FLAG = 1; if (HIGH >= 2) II = 2; while ( (II <= HIGH) && (FLAG == 1) ) I = HIGH + 2 - II; ITEMP = I - 1; % If the Chebyschev abscissa is bracketed by % two input abcissas, get out of the while loop if (X(I)+X(ITEMP) <= X1) FLAG = 0; end II = II + 1; end end if (FLAG == 1) I = 1; end ITEMP = J + 1; if (I < R) Z(ITEMP) = I; else Z(ITEMP) = R - 1; end end % If the lower Chebyshev abcissas are less than X(1), % load the lower elements of Z with the lowest points IND = find(Z(2:end) >= (1:(length(Z)-1))); try TEMP = IND(1); % If IND is empty, do nothing. catch exception % The catch will be that IND is an empty array. if strcmpi(exception.identifier,'MATLAB:badsubscript') % This will be the exception. Do nothing. end end if TEMP~=1 Z(2:TEMP) = (1:(TEMP-1))'; end end % M1 entry. Initialize variables to prepare for exchange iteration ITEMP = M + 1; % Zero the AA array AA = zeros(1,ITEMP); % Load H with the ordinates and XX(I) with the abscissas if the % polynomial is mixed . If it is even or odd , load XX with the % squares of the abscissas. H(1:N) = Y(1:N); if (K0 <=0) XX(1:N) = X(1:N).^2; else XX(1:N) = X(1:N); end B1 = 0; B2 = 0; B3 = 0; R = -1; T = 0.; % Iteration entry. R is the iteration index C = zeros(1,P); D = zeros(1,P); DAA = zeros(1,M+1); FLAG = 1; while ( (R < MAXIT) && (FLAG == 1) ) R = R + 1; % LABEL 350 %S = 1.; % Computation of div. 
differences schemes if (K1 > 0) % If the polynomial is mixed or even: %for I = 1:P % S = -S; % ITEMP = I + 1; % J = Z(ITEMP); % Q = X(J); % C(I) = (H(J) + S*T)/Q; % D(I) = S/Q; %end I = (1:P); S = (-1).^I; ITEMP = I+1; J = Z(ITEMP); C(I) = (H(J) + S*T)./X(J); D(I) = S./Q; clear I ITEMP S J else % If the polynomial is odd: %for I = 1:P % S = -S; % ITEMP = I + 1; % ITEMP = Z(ITEMP); % C(I) = H(ITEMP) + S*T; % D(I) = S; %end I = (1:P); S = (-1).^I; ITEMP = I+1; C(I) = H( Z(ITEMP) ) + S.*T; D(I) = S; clear I ITEMP S end for I = 2:P for JJ = I:P J = P + I - JJ; ITEMP = J + 1; ITEMP = Z(ITEMP); QD = XX(ITEMP); ITEMP = 2 + J - I; ITEMP = Z(ITEMP); QD = QD - XX(ITEMP); ITEMP = J - 1; C(J) = (C(J)-C(ITEMP))/QD; D(J) = (D(J)-D(ITEMP))/QD; end end DT = -C(P)/D(P); T = T + DT; % Computation of polynomial coefficients HIGH = M + 1; for II = 1:HIGH I = HIGH - II; ITEMP = I + 1; DAA(ITEMP) = C(ITEMP) + DT*D(ITEMP); ITEMP = I + 2; ITEMP = Z(ITEMP); QD = XX(ITEMP); LOW = I + 1; if (M >= LOW) DAA(LOW:M) = DAA(LOW:M) - QD*DAA(((LOW:M)+1)); end end AA(1:HIGH) = AA(1:HIGH) + DAA(1:HIGH); % Evaluation of the polynomial to get the approximation errors MAXX = 0.; H = zeros(1,N); for I = 1:N SD = AA(HIGH); QD = XX(I); if (M > 0) for J = M:-1:1 SD = SD*QD + AA(J); end end if (K1 > 0) % If the polynomial is odd, multiply SD by X(I) SD = SD*X(I); end QD = Y(I) - SD; H(I) = Y(I) - SD; if (abs(QD) > MAXX) % Load MAXX with the largest magnitude % of the approximation array MAXX = abs(QD); end end % Test for alternating signs ITEMP = Z(2); if (H(ITEMP) == 0.) % This represents a case where the polynomial % exactly predicts a data point warning('polyfitinf:Collocation','Collocation has occured.'); if (B3 > 0) B3 = -1; FLAG = 0; else B3 = 1; if (EPSH < MAXX) warning('polyfitinf:AnotherTry','1 more attempt with middle points'); LOW = (N+1)/2 - (P+1)/2 + 1; HIGH = LOW + P; Z(LOW:HIGH) = ( (LOW:HIGH) -1); else disp('Normal Exit.'); FLAG = 0; end end else if (H(ITEMP) > 0.) J = -1; else J = 1; end I = 2; FLAG2 = 1; while ( (I <= P) && (FLAG2 == 1) ) ITEMP = I + 1; ITEMP = Z(ITEMP); if (H(ITEMP) == 0.) J = 0; warning('polyfitinf:Collocation','Collocation has occured.'); if (B3 > 0) B3 = -1; FLAG = 0; else B3 = 1; if (EPSH < MAXX) warning('polyfitinf:AnotherTry','1 more attempt with middle points'); LOW = (N+1)/2 - (P+1)/2 + 1; HIGH = LOW + P; Z(LOW:HIGH) = ( (LOW:HIGH) -1); else disp('Normal Exit.'); FLAG = 0; end end FLAG2 = 0; else if (H(ITEMP) < 0) JJ = -1; else JJ = 1; end if (J~=JJ) % Error entry: bad accuracy for calculation B1 = 1; FLAG2 = 0; FLAG = 0; else J = -J; end end I = I + 1; end % end of while % Search for another reference if (FLAG2*FLAG == 1) [H,Z,EQUAL] = exch(N, P, EPSH, H, Z); if (EQUAL > 0) FLAG = 0; else if (R >= MAXIT) B2 = 1; FLAG = 0; end end end end % end of if over H(ITEMP) end; % end of iteration loop % M2 entry; load output variables and return HIGH = M + 1; % Load the coefficients into A array A(Q1 + Q2*(((1:HIGH)-1)) + 1) = AA(1:HIGH); % Load REF with the final reference points REF(1:P) = Z((1:P) + 1); HMAX = MAXX; if (B3 < 0) EQUAL = -2; warning('polyfitinf:Collocation','polyfitinf terminates'); end if (B1 > 0) EQUAL = -2; warning('polyfitinf:NoAlternatingSigns','Alternating signs not observed'); end if (B2 > 0) EQUAL = 0; warning('polyfitinf:MaxIterationsReached','MAXIT was reached, current ref. set saved in REF.'); end % Reverse the order of A to make it compatible with MATLAB'S polyval() function. 
A = A(end:-1:1); endfunction % ****************************************** end of POLYFITINF function [H,Z,EQUAL] = exch(N, P, EPSH, H, Z) % function [H,Z,EQUAL] = exch(N, P, EPSH, H, Z) % % EXCH: exchange algorithm % % INPUT VARIABLES: % N : number of data points % P : number of reference points % EPSH : tolerance for leveling. % Z : old reference indices % % OUTPUT VARIABLES: % H : pointwise approximation errors % Z : new reference indices % EQUAL : EQUAL=1 : normal exchange % EQUAL=0 : old and new references are equal % % CREDITS: This routine was developed and modified as % computer assignments in Approximation Theory courses by % Prof. Andrew Knyazev, University of Colorado Denver, USA. % % Team Fall 98 (Revision 1.0): % Chanchai Aniwathananon % Crhistopher Mehl % David A. Duran % Saulo P. Oliveira % % Team Spring 11 (Revision 1.1): Manuchehr Aminian % % The algorithm and the comments are based on a FORTRAN code written % by Joseph C. Simpson. The code is available on Netlib repository: % http://www.netlib.org/toms/501 % See also: Communications of the ACM, V14, pp.355-356(1971) % % Revision 1.0 from 1998 is a direct human translation of % the FORTRAN code http://www.netlib.org/toms/501 % Revision 1.1 is a clean-up and technical update. % Tested on MATLAB Version 7.11.0.584 (R2010b) and % GNU Octave Version 3.2.4 % License: BSD % Copyright 1998-2011 Andrew V. Knyazev % $Revision: 1.1 $ $Date: 2011/05/17 $ % ************************************ beginning of exch EQUAL = 0; L = 0; ITEMP = Z(2); % SIG is arbitrarily chosen equal to the sign of the input % point. This will be adjusted later if necessary. if (H(ITEMP) <= 0) SIG = 1.; else SIG = -1.; end % The next loop prescans Z to insure it is a proper choice, i.e % resets Z if necessary so that maximum error points are chosen, % given the sign convention mentioned above. In order to work % properly, this section requires Z(1) = 0 and Z(P+2) = N + 1 . for I = 1:P MAXX = 0.; SIG = -SIG; ITEMP = I + 2; ZE = Z(ITEMP) - 1; LOW = Z(I) + 1; % Scan the open point interval containing only the 1th initial % reference point. In the interval pick the point with largest % magnitude and correct sign. Most of the sorting occurs in % this section. SIG contains the sign assumed for H(I). for J = LOW:ZE if (SIG*(H(J)-MAXX) > 0) MAXX = H(J); INDEX = J; end end ITEMP = I + 1; ITEMP = Z(ITEMP); MAXL = abs(MAXX); % If the MAX error is significantly greater than the % input point, switch to this point. if (abs( MAXX - H(ITEMP) )/MAXL > EPSH) ITEMP = I + 1; Z(ITEMP) = INDEX; L = 1; end end % MAXL = 0.; MAXR = 0.; ITEMP = P + 1; LOW = Z(ITEMP) + 1; % if (LOW <= N) % Find the error with largest abs value and proper sign % from among the points above the last reference point. % This section is necessary because the set of points % chosen may begin with the wrong sign alternation. for J = LOW:N if (SIG*(MAXR-H(J)) > 0) MAXR = H(J); INDR = J; end end end % Find the error with largest abs value and proper sign % from among the points below the 1st reference point. % This section is necessary by the same reason as above. ITEMP = Z(2); HZ1 = H(ITEMP); HIGH = ITEMP -1; if (HIGH > 0) if (HZ1 < 0) SIG = -1.; elseif (HZ1 == 0) SIG = 0.; else SIG = 1.; end for J = 1:HIGH if (SIG*(MAXL-H(J)) > 0) MAXL = H(J); INDL = J; end end end % MAXL and MAXR contain the magnitude of the significant % errors outside the reference point set. If either is % zero, the reference point set extends to the end point % on that side of the interval. 
MAXL = abs(MAXL); MAXR = abs(MAXR); HZ1 = abs(HZ1); ITEMP = P + 1; ITEMP = Z(ITEMP); HZP = abs(H(ITEMP)); % L = 0 implies that the previous prescan did not change % any points. If L = 0 and MAXL, MAXR are not significant % if compared with upper and lower reference point errors, % respectively, use the EQUAL exit. FLAG1 = 1; if (L == 0) if ( (MAXL == 0) || (EPSH >= (MAXL-HZP)/MAXL) ) if ( (MAXR == 0) || (EPSH >= (MAXR-HZ1)/MAXR) ) FLAG1 = 0; EQUAL = 1; end end end if ( (MAXL == 0) && (MAXR == 0) ) FLAG1 = 0; end if ( (MAXL > MAXR) && (MAXL <= HZP) && (MAXR < HZ1) ) FLAG1 = 0; end if ( (MAXL <= MAXR) && (MAXR <= HZ1) && (MAXL < HZP) ) FLAG1 = 0; end % If a point outside the present reference set must be % included, (i.e. the sign of the 1st point previously % assumed is wrong) shift to the side of largest % relative error first. if (FLAG1 == 1) FLAG2 = 1; if ( (MAXL > MAXR) && (MAXL > HZP) ) FLAG2 = 0; end if ( (MAXL <= MAXR) && (MAXR <= HZ1) ) FLAG2 = 0; end if (FLAG2 == 1) % SHR entry. This section inserts a point from % above the prescan point set INDEX = Z(2); % shift point set down, dropping the lowest point Z(2:P) = Z((2:P)+1); ITEMP = P + 1; % add the next high point Z(ITEMP)=INDR; % if MAXL > 0 replace reference points from the left, % stopping the 1st time the candidate for replacement % is greater than in magnitude than the prospective % replacee. Alternation of signs is preserved because % non-replacement immediately terminates the process. if (MAXL > 0) I = 2; FLAG3 = 1; while ( (I <= P) && (FLAG3 == 1) ) ITEMP = Z(I); if ( abs(H(INDL)) >= abs(H(ITEMP)) ) J = Z(I); Z(I) = INDL; INDL = INDEX; INDEX = J; else FLAG3 = 0; end I = I + 1; end end else % SHL entry. This section inserts a point from below the % prescan point set. ITEMP = P + 1 ; INDEX = Z(ITEMP); Z((2:P)+1) = Z(2:P); % store lowest point in Z(2) Z(2) = INDL; % if MAXR > 0 replace reference points from the right, % stopping the 1st time the candidate for replacement % is greater than in magnitude than the prospective % replacee. if (MAXR > 0) II = 2; FLAG3 = 1; while ( (II <= P) && (FLAG3 == 1) ) I = P + 2 - II; ITEMP = I + 1; HIGH = Z(ITEMP); if ( abs(H(INDR)) >= abs(H(HIGH)) ) J = Z(ITEMP); Z(ITEMP) = INDR; INDR = INDEX; INDEX = J; else FLAG3 = 0; end II = II + 1; end end end end endfunction % ****************************************** end of exch optim-1.4.1/inst/fmin.m0000644000175000017500000000200312477523726013367 0ustar olafolaf## Copyright (C) 2001 Paul Kienzle ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . function ret=fmin(varargin) persistent warned = false; if (! warned) warned = true; warning ("Octave:deprecated-function", "`fmin' has been deprecated, and will be removed in the future. 
Use `fminbnd' directly instead."); endif ## alias for fminbnd ret = fminbnd(varargin{:}); endfunction optim-1.4.1/inst/cg_min.m0000644000175000017500000002074412477523726013706 0ustar olafolaf## Copyright (C) 2002 Etienne Grossmann ## Copyright (C) 2009 Levente Torok ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## -*- texinfo -*- ## @deftypefn {Function File} {[@var{x0},@var{v},@var{nev}]} cg_min ( @var{f},@var{df},@var{args},@var{ctl} ) ## NonLinear Conjugate Gradient method to minimize function @var{f}. ## ## @subheading Arguments ## @itemize @bullet ## @item @var{f} : string : Name of function. Return a real value ## @item @var{df} : string : Name of f's derivative. Returns a (R*C) x 1 vector ## @item @var{args}: cell : Arguments passed to f.@* ## @item @var{ctl} : 5-vec : (Optional) Control variables, described below ## @end itemize ## ## @subheading Returned values ## @itemize @bullet ## @item @var{x0} : matrix : Local minimum of f ## @item @var{v} : real : Value of f in x0 ## @item @var{nev} : 1 x 2 : Number of evaluations of f and of df ## @end itemize ## ## @subheading Control Variables ## @itemize @bullet ## @item @var{ctl}(1) : 1 or 2 : Select stopping criterion amongst : ## @item @var{ctl}(1)==0 : Default value ## @item @var{ctl}(1)==1 : Stopping criterion : Stop search when value doesn't ## improve, as tested by @math{ ctl(2) > Deltaf/max(|f(x)|,1) } ## where Deltaf is the decrease in f observed in the last iteration ## (each iteration consists R*C line searches). ## @item @var{ctl}(1)==2 : Stopping criterion : Stop search when updates are small, ## as tested by @math{ ctl(2) > max @{ dx(i)/max(|x(i)|,1) | i in 1..N @}} ## where dx is the change in the x that occured in the last iteration. ## @item @var{ctl}(2) : Threshold used in stopping tests. Default=10*eps ## @item @var{ctl}(2)==0 : Default value ## @item @var{ctl}(3) : Position of the minimized argument in args Default=1 ## @item @var{ctl}(3)==0 : Default value ## @item @var{ctl}(4) : Maximum number of function evaluations Default=inf ## @item @var{ctl}(4)==0 : Default value ## @item @var{ctl}(5) : Type of optimization: ## @item @var{ctl}(5)==1 : "Fletcher-Reves" method ## @item @var{ctl}(5)==2 : "Polak-Ribiere" (Default) ## @item @var{ctl}(5)==3 : "Hestenes-Stiefel" method ## @end itemize ## ## @var{ctl} may have length smaller than 4. Default values will be used if ctl is ## not passed or if nan values are given. ## @subheading Example: ## ## function r=df( l ) b=[1;0;-1]; r = -( 2*l@{1@} - 2*b + rand(size(l@{1@}))); endfunction @* ## function r=ff( l ) b=[1;0;-1]; r = (l@{1@}-b)' * (l@{1@}-b); endfunction @* ## ll = @{ [10; 2; 3] @}; @* ## ctl(5) = 3; @* ## [x0,v,nev]=cg_min( "ff", "df", ll, ctl ) @* ## ## Comment: In general, BFGS method seems to be better performin in many cases but requires more computation per iteration ## See also http://en.wikipedia.org/wiki/Nonlinear_conjugate_gradient. 
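##
## For reference, the direction update coefficient @code{beta = nu / de}
## is computed from the current and previous negated gradients
## (@code{dxn}, @code{dxn_1}) and the previous search direction
## (@code{lxn}); this restates what the code below implements:
## @itemize @bullet
## @item Fletcher-Reeves : @code{nu = dxn' * dxn}, @code{de = dxn_1' * dxn_1}
## @item Polak-Ribiere : @code{nu = (dxn - dxn_1)' * dxn}, @code{de = dxn_1' * dxn_1}
## @item Hestenes-Stiefel : @code{nu = (dxn - dxn_1)' * dxn}, @code{de = (dxn - dxn_1)' * lxn}
## @end itemize
## @code{beta} is clipped below at zero, and the new search direction is
## the current negated gradient plus @code{beta} times the previous
## direction.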
## @seealso{bfgsmin} ## @end deftypefn function [x,v,nev] = cg_min (f, dfn, args, ctl) verbose = 0; crit = 1; # Default control variables tol = 10*eps; narg = 1; maxev = inf; method = 2; if nargin >= 4, # Read arguments if !isnan (ctl(1)) && ctl(1) ~= 0, crit = ctl(1); end if length (ctl)>=2 && !isnan (ctl(2)) && ctl(2) ~= 0, tol = ctl(2); end if length (ctl)>=3 && !isnan (ctl(3)) && ctl(3) ~= 0, narg = ctl(3); end if length (ctl)>=4 && !isnan (ctl(4)) && ctl(4) ~= 0, maxev = ctl(4); end if length (ctl)>=5 && !isnan (ctl(5)) && ctl(5) ~= 0, method= ctl(5); end end if iscell (args), # List of arguments x = args{narg}; else # Single argument x = args; args = {args}; end if narg > length (args), # Check error ("cg_min : narg==%i, length (args)==%i\n", narg, length (args)); end [R, C] = size(x); N = R*C; x = reshape (x,N,1) ; nev = [0, 0]; v = feval (f, args); nev(1)++; dxn = lxn = dxn_1 = -feval( dfn, args ); nev(2)++; done = 0; ## TEMP ## tb = ts = zeros (1,100); # Control params for line search ctlb = [10*sqrt(eps), narg, maxev]; if crit == 2, ctlb(1) = tol; end x0 = x; v0 = v; nline = 0; while nev(1) <= maxev , ## xprev = x ; ctlb(3) = maxev - nev(1); # Update # of evals ## wiki alg 4. [alpha, vnew, nev0] = brent_line_min (f, dxn, args, ctlb); nev += nev0; ## wiki alg 5. x = x + alpha * dxn; if nline >= N, if crit == 1, done = tol > (v0 - vnew) / max (1, abs (v0)); else done = tol > norm ((x-x0)(:)); end nline = 1; x0 = x; v0 = vnew; else nline++; end if done || nev(1) >= maxev, return end if vnew > v + eps , printf("cg_min: step increased cost function\n"); keyboard end # if abs(1-(x-xprev)'*dxn/norm(dxn)/norm(x-xprev))>1000*eps, # printf("cg_min: step is not in the right direction\n"); # keyboard # end # update x at the narg'th position of args cellarray args{narg} = reshape (x, R, C); v = feval (f, args); nev(1)++; if verbose, printf("cg_min : nev=%4i, v=%8.3g\n",nev(1),v) ; end ## wiki alg 1: dxn = -feval (dfn, args); nev(2)++; # wiki alg 2: switch method case 1 # Fletcher-Reenves method nu = dxn' * dxn; de = dxn_1' * dxn_1; case 2 # Polak-Ribiere method nu = (dxn-dxn_1)' * dxn; de = dxn_1' * dxn_1; case 3 # Hestenes-Stiefel method nu = (dxn-dxn_1)' * dxn; de = (dxn-dxn_1)' * lxn; otherwise error("No method like this"); endswitch if nu == 0, return endif if de == 0, error("Numerical instability!"); endif beta = nu / de; beta = max( 0, beta ); ## wiki alg 3. update dxn, lxn, point dxn_1 = dxn; dxn = lxn = dxn_1 + beta*lxn ; end if verbose, printf ("cg_min: Too many evaluatiosn!\n"); end endfunction %!demo %! P = 15; # Number of parameters %! R = 20; # Number of observations (must have R >= P) %! %! obsmat = randn (R, P); %! truep = randn (P, 1); %! xinit = randn (P, 1); %! obses = obsmat * truep; %! %! msq = @(x) mean (x (!isnan(x)).^2); %! ff = @(x) msq (obses - obsmat * x{1}) + 1; %! dff = @(x) 2 / rows (obses) * obsmat.' * (-obses + obsmat * x{1}); %! %! tic; %! [xlev,vlev,nlev] = cg_min (ff, dff, xinit) ; %! toc; %! %! printf (" Costs : init=%8.3g, final=%8.3g, best=%8.3g\n", ... %! ff ({xinit}), vlev, ff ({truep})); %! %! if (max (abs (xlev-truep)) > 100*sqrt (eps)) %! printf ("Error is too big : %8.3g\n", max (abs (xlev-truep))); %! else %! printf ("All tests ok\n"); %! endif %!demo %! N = 1 + floor (30 * rand ()); %! truemin = randn (N, 1); %! offset = 100 * randn (); %! metric = randn (2 * N, N); %! metric = metric.' * metric; %! %! if (N > 1) %! [u,d,v] = svd (metric); %! d = (0.1+[0:(1/(N-1)):1]).^2; %! metric = u * diag (d) * u.'; %! endif %! %! 
testfunc = @(x) sum((x{1}-truemin)'*metric*(x{1}-truemin)) + offset; %! dtestf = @(x) metric' * 2*(x{1}-truemin); %! %! xinit = 10 * randn (N, 1); %! %! [x, v, niter] = cg_min (testfunc, dtestf, xinit); %! %! if (any (abs (x-truemin) > 100 * sqrt(eps))) %! printf ("NOT OK 1\n"); %! else %! printf ("OK 1\n"); %! endif %! %! if (v-offset > 1e-8) %! printf ("NOT OK 2\n"); %! else %! printf ("OK 2\n"); %! endif %! %! printf ("nev=%d N=%d errx=%8.3g errv=%8.3g\n",... %! niter (1), N, max (abs (x-truemin)), v-offset); %!demo %! P = 2; # Number of parameters %! R = 3; # Number of observations %! %! obsmat = randn (R, P); %! truep = randn (P, 1); %! xinit = randn (P, 1); %! %! obses = obsmat * truep; %! %! msq = @(x) mean (x (!isnan(x)).^2); %! ff = @(xx) msq (xx{3} - xx{2} * xx{1}) + 1; %! dff = @(xx) 2 / rows(xx{3}) * xx{2}.' * (-xx{3} + xx{2}*xx{1}); %! %! tic; %! x = {xinit, obsmat, obses}; %! [xlev, vlev, nlev] = cg_min (ff, dff, x); %! toc; %! %! xinit_ = {xinit, obsmat, obses}; %! xtrue_ = {truep, obsmat, obses}; %! printf (" Costs : init=%8.3g, final=%8.3g, best=%8.3g\n", ... %! ff (xinit_), vlev, ff (xtrue_)); %! %! if (max (abs(xlev-truep)) > 100*sqrt (eps)) %! printf ("Error is too big : %8.3g\n", max (abs (xlev-truep))); %! else %! printf ("All tests ok\n"); %! endif optim-1.4.1/inst/mdsmax.m0000644000175000017500000001665512477523726013751 0ustar olafolaf%% Copyright (C) 2002 N.J.Higham %% Copyright (C) 2003 Andy Adler %% %% This program is free software; you can redistribute it and/or modify it under %% the terms of the GNU General Public License as published by the Free Software %% Foundation; either version 3 of the License, or (at your option) any later %% version. %% %% This program is distributed in the hope that it will be useful, but WITHOUT %% ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or %% FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more %% details. %% %% You should have received a copy of the GNU General Public License along with %% this program; if not, see . %%MDSMAX Multidirectional search method for direct search optimization. %% [x, fmax, nf] = MDSMAX(FUN, x0, STOPIT, SAVIT) attempts to %% maximize the function FUN, using the starting vector x0. %% The method of multidirectional search is used. %% Output arguments: %% x = vector yielding largest function value found, %% fmax = function value at x, %% nf = number of function evaluations. %% The iteration is terminated when either %% - the relative size of the simplex is <= STOPIT(1) %% (default 1e-3), %% - STOPIT(2) function evaluations have been performed %% (default inf, i.e., no limit), or %% - a function value equals or exceeds STOPIT(3) %% (default inf, i.e., no test on function values). %% The form of the initial simplex is determined by STOPIT(4): %% STOPIT(4) = 0: regular simplex (sides of equal length, the default), %% STOPIT(4) = 1: right-angled simplex. %% Progress of the iteration is not shown if STOPIT(5) = 0 (default 1). %% If a non-empty fourth parameter string SAVIT is present, then %% `SAVE SAVIT x fmax nf' is executed after each inner iteration. %% NB: x0 can be a matrix. In the output argument, in SAVIT saves, %% and in function calls, x has the same shape as x0. %% MDSMAX(fun, x0, STOPIT, SAVIT, P1, P2,...) allows additional %% arguments to be passed to fun, via feval(fun,x,P1,P2,...). %% %% This implementation uses 2n^2 elements of storage (two simplices), where x0 %% is an n-vector. 
It is based on the algorithm statement in [2, sec.3], %% modified so as to halve the storage (with a slight loss in readability). %% %% References: %% [1] V. J. Torczon, Multi-directional search: A direct search algorithm for %% parallel machines, Ph.D. Thesis, Rice University, Houston, Texas, 1989. % [2] V. J. Torczon, On the convergence of the multidirectional search %% algorithm, SIAM J. Optimization, 1 (1991), pp. 123-145. %% [3] N. J. Higham, Optimization by direct search in matrix computations, %% SIAM J. Matrix Anal. Appl, 14(2): 317-333, 1993. %% [4] N. J. Higham, Accuracy and Stability of Numerical Algorithms, %% Second edition, Society for Industrial and Applied Mathematics, %% Philadelphia, PA, 2002; sec. 20.5. % From Matrix Toolbox % Copyright (C) 2002 N.J.Higham % www.maths.man.ac.uk/~higham/mctoolbox % Modifications for octave by A.Adler 2003 function [x, fmax, nf] = mdsmax(fun, x, stopit, savit, varargin) x0 = x(:); % Work with column vector internally. n = length(x0); mu = 2; % Expansion factor. theta = 0.5; % Contraction factor. % Set up convergence parameters etc. if nargin < 3 stopit(1) = 1e-3; elseif isempty(stopit) stopit(1) = 1e-3; endif tol = stopit(1); % Tolerance for cgce test based on relative size of simplex. if length(stopit) == 1, stopit(2) = inf; end % Max no. of f-evaluations. if length(stopit) == 2, stopit(3) = inf; end % Default target for f-values. if length(stopit) == 3, stopit(4) = 0; end % Default initial simplex. if length(stopit) == 4, stopit(5) = 1; end % Default: show progress. trace = stopit(5); if length(stopit) == 5, stopit(6) = 1; end % Default: maximize dirn= stopit(6); if nargin < 4, savit = []; end % File name for snapshots. V = [zeros(n,1) eye(n)]; T = V; f = zeros(n+1,1); ft = f; V(:,1) = x0; f(1) = dirn*feval(fun,x,varargin{:}); fmax_old = f(1); if trace, fprintf('f(x0) = %9.4e\n', f(1)), end k = 0; m = 0; % Set up initial simplex. scale = max(norm(x0,inf),1); if stopit(4) == 0 % Regular simplex - all edges have same length. % Generated from construction given in reference [18, pp. 80-81] of [1]. alpha = scale / (n*sqrt(2)) * [ sqrt(n+1)-1+n sqrt(n+1)-1 ]; V(:,2:n+1) = (x0 + alpha(2)*ones(n,1)) * ones(1,n); for j=2:n+1 V(j-1,j) = x0(j-1) + alpha(1); x(:) = V(:,j); f(j) = dirn*feval(fun,x,varargin{:}); end else % Right-angled simplex based on co-ordinate axes. alpha = scale*ones(n+1,1); for j=2:n+1 V(:,j) = x0 + alpha(j)*V(:,j); x(:) = V(:,j); f(j) = dirn*feval(fun,x,varargin{:}); end end nf = n+1; size = 0; % Integer that keeps track of expansions/contractions. flag_break = 0; % Flag which becomes true when ready to quit outer loop. while 1 %%%%%% Outer loop. k = k+1; % Find a new best vertex x and function value fmax = f(x). [fmax,j] = max(f); V(:,[1 j]) = V(:,[j 1]); v1 = V(:,1); if ~isempty(savit), x(:) = v1; eval(['save ' savit ' x fmax nf']), end f([1 j]) = f([j 1]); if trace fprintf('Iter. %2.0f, inner = %2.0f, size = %2.0f, ', k, m, size) fprintf('nf = %3.0f, f = %9.4e (%2.1f%%)\n', nf, fmax, ... 100*(fmax-fmax_old)/(abs(fmax_old)+eps)) end fmax_old = fmax; % Stopping Test 1 - f reached target value? if fmax >= stopit(3) msg = ['Exceeded target...quitting\n']; break % Quit. end m = 0; while 1 %%% Inner repeat loop. m = m+1; % Stopping Test 2 - too many f-evals? if nf >= stopit(2) msg = ['Max no. of function evaluations exceeded...quitting\n']; flag_break = 1; break % Quit. end % Stopping Test 3 - converged? This is test (4.3) in [1]. 
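% For reference: the relative size is measured in the matrix 1-norm, so
% size_simplex below is the largest 1-norm distance of any vertex from
% the best vertex v1, max over j = 2:n+1 of norm(V(:,j) - v1, 1),
% scaled by max(1, norm(v1, 1)).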
size_simplex = norm(V(:,2:n+1)- v1(:,ones(1,n)),1) / max(1, norm(v1,1)); if size_simplex <= tol msg = sprintf('Simplex size %9.4e <= %9.4e...quitting\n', ... size_simplex, tol); flag_break = 1; break % Quit. end for j=2:n+1 % ---Rotation (reflection) step. T(:,j) = 2*v1 - V(:,j); x(:) = T(:,j); ft(j) = dirn*feval(fun,x,varargin{:}); end nf = nf + n; replaced = ( max(ft(2:n+1)) > fmax ); if replaced for j=2:n+1 % ---Expansion step. V(:,j) = (1-mu)*v1 + mu*T(:,j); x(:) = V(:,j); f(j) = dirn*feval(fun,x,varargin{:}); end nf = nf + n; % Accept expansion or rotation? if max(ft(2:n+1)) > max(f(2:n+1)) V(:,2:n+1) = T(:,2:n+1); f(2:n+1) = ft(2:n+1); % Accept rotation. else size = size + 1; % Accept expansion (f and V already set). end else for j=2:n+1 % ---Contraction step. V(:,j) = (1+theta)*v1 - theta*T(:,j); x(:) = V(:,j); f(j) = dirn*feval(fun,x,varargin{:}); end nf = nf + n; replaced = ( max(f(2:n+1)) > fmax ); % Accept contraction (f and V already set). size = size - 1; end if replaced, break, end if (trace && rem(m, 10) == 0) fprintf(' ...inner = %2.0f...\n', m); end end %%% Of inner repeat loop. if flag_break, break, end end %%%%%% Of outer loop. % Finished. if trace, fprintf(msg), end x(:) = v1; optim-1.4.1/inst/residmin_stat.m0000644000175000017500000000537412477523726015321 0ustar olafolaf## Copyright (C) 2011-2014 Olaf Till ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## -*- texinfo -*- ## @deftypefn {Function File} {@var{info} =} residmin_stat (@var{f}, @var{p}, @var{settings}) ## Frontend for computation of statistics for a residual-based ## minimization. ## ## @var{settings} is a structure whose fields can be set by ## @code{optimset}. With @var{settings} the computation of certain ## statistics is requested by setting the fields ## @code{ret_} to @code{true}. The respective ## statistics will be returned in a structure as fields with name ## @code{}. Depending on the requested statistic and ## on the additional information provided in @var{settings}, @var{f} and ## @var{p} may be empty. Otherwise, @var{f} is the model function of an ## optimization (the interface of @var{f} is described e.g. in ## @code{nonlin_residmin}, please see there), and @var{p} is a real ## column vector with parameters resulting from the same optimization. ## ## Currently, the following statistics (or general information) can be ## requested: ## ## @code{dfdp}: Jacobian of model function with respect to parameters. ## ## @code{covd}: Covariance matrix of data (typically guessed by applying ## a factor to the covariance matrix of the residuals). ## ## @code{covp}: Covariance matrix of final parameters. ## ## @code{corp}: Correlation matrix of final parameters. ## ## @c The following block will be cut out in the package info file. ## @c BEGIN_CUT_TEXINFO ## ## For further settings, type @code{optim_doc ("residmin_stat")}. ## ## For desription of structure-based parameter handling, type ## @code{optim_doc ("parameter structures")}. 
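##
## A minimal usage sketch (the fit itself is assumed to have been done
## before, e.g. with @code{nonlin_residmin}, yielding the model function
## @var{f} and the final parameters @var{p}; depending on the backend,
## additional settings may be needed for some statistics):
##
## @example
## settings = optimset ("ret_covp", true, "ret_corp", true);
## info = residmin_stat (f, p, settings);
## info.covp   # covariance matrix of the final parameters
## info.corp   # correlation matrix of the final parameters
## @end example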
## ## For backend information, type @code{optim_doc ("residual ## optimization")} and choose the backends type in the menu. ## ## @c END_CUT_TEXINFO ## ## @seealso {curvefit_stat} ## @end deftypefn ## PKG_ADD: __all_opts__ ("residmin_stat"); function ret = residmin_stat (varargin) if (nargin == 1) ret = __residmin_stat__ (varargin{1}); return; endif if (nargin != 3) print_usage (); endif varargin{4} = struct (); ret = __residmin_stat__ (varargin{:}); endfunction optim-1.4.1/inst/dfdp.m0000644000175000017500000000511412477523726013361 0ustar olafolaf## Copyright (C) 1992-1994 Richard Shrager ## Copyright (C) 1992-1994 Arthur Jutan ## Copyright (C) 1992-1994 Ray Muzic ## Copyright (C) 2010-2013 Olaf Till ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## function prt = dfdp (x, f, p, dp, func[, bounds]) ## numerical partial derivatives (Jacobian) df/dp for use with leasqr ## --------INPUT VARIABLES--------- ## x=vec or matrix of indep var(used as arg to func) x=[x0 x1 ....] ## f=func(x,p) vector initialsed by user before each call to dfdp ## p= vec of current parameter values ## dp= fractional increment of p for numerical derivatives ## dp(j)>0 central differences calculated ## dp(j)<0 one sided differences calculated ## dp(j)=0 sets corresponding partials to zero; i.e. holds p(j) fixed ## func=function (string or handle) to calculate the Jacobian for, ## e.g. to calc Jacobian for function expsum prt=dfdp(x,f,p,dp,'expsum') ## bounds=two-column-matrix of lower and upper bounds for parameters ## If no 'bounds' options is specified to leasqr, it will call ## dfdp without the 'bounds' argument. ##----------OUTPUT VARIABLES------- ## prt= Jacobian Matrix prt(i,j)=df(i)/dp(j) ##================================ ## ## dfxpdp is more general and is meant to be used instead of dfdp in ## optimization. function prt = dfdp (x, f, p, dp, func, bounds) ## This is just an interface. The original code has been moved to ## __dfdp__.m, which is used with two different interfaces by ## leasqr.m. ## if (ischar (varargin{5})) ## varargin{5} = @ (p) str2func (varargin{5}) (varargin{1}, p); ## else ## varargin{5} = @ (p) varargin{5} (varargin{1}, p); ## endif if (ischar (func)) func = @ (p) str2func (func) (x, p); else func = @ (p) func (x, p); endif hook.f = f; if (nargin > 5) hook.lbound = bounds(:, 1); hook.ubound = bounds(:, 2); endif hook.diffp = abs (dp); hook.fixed = dp == 0; hook.diff_onesided = dp < 0; prt = __dfdp__ (p, func, hook); endfunction optim-1.4.1/inst/nrm.m0000644000175000017500000000261212477523726013240 0ustar olafolaf## Copyright (C) 2000 Ben Sapp ## Copyright (C) 2002 Paul Kienzle ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. 
## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## -*- texinfo -*- ## @deftypefn {Function File} {@var{xmin} =} nrm (@var{f},@var{x0}) ## Using @var{x0} as a starting point find a minimum of the scalar ## function @var{f}. The Newton-Raphson method is used. ## @end deftypefn ## Reference: David G Luenberger's Linear and Nonlinear Programming function x = nrm(f,x,varargin) velocity = 1; acceleration = 1; h = 0.01; while(abs(velocity) > 0.0001) fx = feval(f,x,varargin{:}); fxph = feval(f,x+h,varargin{:}); fxmh = feval(f,x-h,varargin{:}); velocity = (fxph - fxmh)/(2*h); acceleration = (fxph - 2*fx + fxmh)/(h^2); x = x - velocity/abs(acceleration); endwhile endfunction optim-1.4.1/inst/polyconf.m0000644000175000017500000001265112477523726014301 0ustar olafolaf## Author: Paul Kienzle ## This program is granted to the public domain. ## [y,dy] = polyconf(p,x,s) ## ## Produce prediction intervals for the fitted y. The vector p ## and structure s are returned from polyfit or wpolyfit. The ## x values are where you want to compute the prediction interval. ## ## polyconf(...,['ci'|'pi']) ## ## Produce a confidence interval (range of likely values for the ## mean at x) or a prediction interval (range of likely values ## seen when measuring at x). The prediction interval tells ## you the width of the distribution at x. This should be the same ## regardless of the number of measurements you have for the value ## at x. The confidence interval tells you how well you know the ## mean at x. It should get smaller as you increase the number of ## measurements. Error bars in the physical sciences usually show ## a 1-alpha confidence value of erfc(1/sqrt(2)), representing ## one standandard deviation of uncertainty in the mean. ## ## polyconf(...,1-alpha) ## ## Control the width of the interval. If asking for the prediction ## interval 'pi', the default is .05 for the 95% prediction interval. ## If asking for the confidence interval 'ci', the default is ## erfc(1/sqrt(2)) for a one standard deviation confidence interval. ## ## Example: ## [p,s] = polyfit(x,y,1); ## xf = linspace(x(1),x(end),150); ## [yf,dyf] = polyconf(p,xf,s,'ci'); ## plot(xf,yf,'g-;fit;',xf,yf+dyf,'g.;;',xf,yf-dyf,'g.;;',x,y,'xr;data;'); ## plot(x,y-polyval(p,x),';residuals;',xf,dyf,'g-;;',xf,-dyf,'g-;;'); function [y,dy] = polyconf(p,x,varargin) alpha = s = []; typestr = 'pi'; for i=1:length(varargin) v = varargin{i}; if isstruct(v), s = v; elseif ischar(v), typestr = v; elseif isscalar(v), alpha = v; else s = []; end end if (nargout>1 && (isempty(s)||nargin<3)) || nargin < 2 print_usage; end if isempty(s) y = polyval(p,x); else ## For a polynomial fit, x is the set of powers ( x^n ; ... ; 1 ). n=length(p)-1; k=length(x(:)); if columns(s.R) == n, ## fit through origin A = (x(:) * ones (1, n)) .^ (ones (k, 1) * (n:-1:1)); p = p(1:n); else A = (x(:) * ones (1, n+1)) .^ (ones (k, 1) * (n:-1:0)); endif y = dy = x; [y(:),dy(:)] = confidence(A,p,s,alpha,typestr); end end %!test %! # data from Hocking, RR, "Methods and Applications of Linear Models" %! temperature=[40;40;40;45;45;45;50;50;50;55;55;55;60;60;60;65;65;65]; %! 
%! strength=[66.3;64.84;64.36;69.70;66.26;72.06;73.23;71.4;68.85;75.78;72.57;76.64;78.87;77.37;75.94;78.82;77.13;77.09];
%! [p,s] = polyfit(temperature,strength,1);
%! [y,dy] = polyconf(p,40,s,0.05,'ci');
%! assert([y,dy],[66.15396825396826,1.71702862681486],200*eps);
%! [y,dy] = polyconf(p,40,s,0.05,'pi');
%! assert(dy,4.45345484470743,200*eps);

## [y,dy] = confidence(A,p,s)
##
## Produce prediction intervals for the fitted y. The vector p
## and structure s are returned from wsolve. The matrix A is
## the set of observation values at which to evaluate the
## confidence interval.
##
## confidence(...,['ci'|'pi'])
##
## Produce a confidence interval (range of likely values for the
## mean at x) or a prediction interval (range of likely values
## seen when measuring at x). The prediction interval tells
## you the width of the distribution at x. This should be the same
## regardless of the number of measurements you have for the value
## at x. The confidence interval tells you how well you know the
## mean at x. It should get smaller as you increase the number of
## measurements. Error bars in the physical sciences usually show
## a 1-alpha confidence value of erfc(1/sqrt(2)), representing
## one standard deviation of uncertainty in the mean.
##
## confidence(...,1-alpha)
##
## Control the width of the interval. If asking for the prediction
## interval 'pi', the default is .05 for the 95% prediction interval.
## If asking for the confidence interval 'ci', the default is
## erfc(1/sqrt(2)) for a one standard deviation confidence interval.
##
## Confidence intervals for linear systems are given by:
##    x' p +/- sqrt( Finv(1-a,1,df) var(x' p) )
## where for confidence intervals,
##    var(x' p) = sigma^2 (x' inv(A'A) x)
## and for prediction intervals,
##    var(x' p) = sigma^2 (1 + x' inv(A'A) x)
##
## Rather than A'A we have R from the QR decomposition of A, but
## R'R equals A'A. Note that R is not upper triangular since we
## have already multiplied it by the permutation matrix, but it
## is invertible. Rather than forming the product R'R which is
## ill-conditioned, we can rewrite x' inv(A'A) x as the equivalent
##    x' inv(R) inv(R') x = t t', for t = x' inv(R)
## Since x is a vector, t t' is the inner product sumsq(t).
## Note that LAPACK allows us to do this simultaneously for many
## different x using sqrt(sumsq(X/R,2)), with each x on a different row.
##
## Note: sqrt(F(1-a;1,df)) = T(1-a/2;df)
##
## For non-linear systems, use x = dy/dp and ignore the y output.
function [y,dy] = confidence(A,p,S,alpha,typestr)
  if nargin < 4, alpha = []; end
  if nargin < 5, typestr = 'ci'; end
  y = A*p(:);
  switch typestr,
    case 'ci', pred = 0; default_alpha=erfc(1/sqrt(2));
    case 'pi', pred = 1; default_alpha=0.05;
    otherwise, error("use 'ci' or 'pi' for interval type");
  end
  if isempty(alpha), alpha = default_alpha; end
  s = tinv(1-alpha/2,S.df)*S.normr/sqrt(S.df);
  dy = s*sqrt(pred+sumsq(A/S.R,2));
end
optim-1.4.1/inst/cdiff.m0000644000175000017500000001271412477523726013523 0ustar olafolaf## Copyright (C) 2002 Etienne Grossmann
##
## This program is free software; you can redistribute it and/or modify it under
## the terms of the GNU General Public License as published by the Free Software
## Foundation; either version 3 of the License, or (at your option) any later
## version.
##
## This program is distributed in the hope that it will be useful, but WITHOUT
## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
## FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## c = cdiff (func,wrt,N,dfunc,stack,dx) - Code for num. differentiation ## = "function df = dfunc (var1,..,dvar,..,varN) .. endfunction ## ## Returns a string of octave code that defines a function 'dfunc' that ## returns the derivative of 'func' with respect to it's 'wrt'th ## argument. ## ## The derivatives are obtained by symmetric finite difference. ## ## dfunc()'s return value is in the same format as that of ndiff() ## ## func : string : name of the function to differentiate ## ## wrt : int : position, in argument list, of the differentiation ## variable. Default:1 ## ## N : int : total number of arguments taken by 'func'. ## If N=inf, dfunc will take variable argument list. ## Default:wrt ## ## dfunc : string : Name of the octave function that returns the ## derivatives. Default:['d',func] ## ## stack : string : Indicates whether 'func' accepts vertically ## (stack="rstack") or horizontally (stack="cstack") ## arguments. Any other string indicates that 'func' ## does not allow stacking. Default:'' ## ## dx : real : Step used in the symmetric difference scheme. ## Default:10*sqrt(eps) ## ## See also : ndiff, eval, todisk function c = cdiff (func,wrt,nargs,dfunc,stack,dx) if nargin<2, wrt = 1 ; end if nargin<3, nargs = wrt ; end if nargin<4 || strcmp(dfunc,""), dfunc = ["d",func] ; if exist(dfunc)>=2, printf(["cdiff : Warning : name of derivative not specified\n",... " and canonic name '%s' is already taken\n"],... dfunc); ## keyboard end end if nargin<5, stack = "" ; end if nargin<6, dx = 10*sqrt(eps) ; end ## verbose = 0 ; ## build argstr = "var1,..,dvar,...var_nargs" if isfinite (nargs) argstr = sprintf("var%i,",1:nargs); else argstr = [sprintf("var%i,",1:wrt),"...,"]; end argstr = strrep(argstr,sprintf("var%i",wrt),"dvar") ; argstr = argstr(1:length(argstr)-1) ; if strcmp("cstack",stack) , # Horizontal stacking ################ calstr = "reshape (kron(ones(1,2*ps), dvar(:))+[-dx*eye(ps),dx*eye(ps)], sz.*[1,2*ps])"; calstr = strrep(argstr,"dvar",calstr) ; calstr = sprintf("%s(%s)",func,calstr) ; calstr = sprintf(strcat(" res = %s;\n", " pr = prod (size(res)) / (2*ps);\n", " res = reshape (res,pr,2*ps);\n", " df = (res(:,ps+1:2*ps)-res(:,1:ps)) / (2*dx);\n"), calstr) ; elseif strcmp("rstack",stack), # Vertical stacking ################## calstr = "kron(ones(2*ps,1),dvar)+dx*[-dv;dv]" ; calstr = strrep(argstr,"dvar",calstr) ; calstr = sprintf("%s(%s)",func,calstr) ; calstr = sprintf(strcat(" dv = kron (eye(sz(2)), eye(sz(1))(:));\n",... " res = %s;\n",... " sr = size(res)./[2*ps,1];\n",... " pr = prod (sr);\n",... " df = (res(sr(1)*ps+1:2*sr(1)*ps,:)-res(1:sr(1)*ps,:))/(2*dx);\n",... " scramble = reshape (1:pr,sr(2),sr(1))';\n",... " df = reshape (df',pr,ps)(scramble(:),:);\n"),... calstr) ; ## sayif(verbose,"cdiff : calstr='%s'\n",calstr) ; else # No stacking ######################## calstr = sprintf("%s (%s)",func,argstr) ; ## "func(var1,dvar%sdv(:,%d:%d),...,varN)," ## calstr = strrep(calstr,"dvar","dvar%sdv(:,(i-1)*sz(2)+1:i*sz(2))")(:)'; calstr = strrep(calstr,"dvar","dvar%sdv")(:)'; ## func(..,dvar+dv(:,1:sz(2)),..) - func(..) calstr = strcat(calstr,"-",calstr) ; ## strcat(calstr,"-",calstr) ; calstr = sprintf(calstr,"+","-") ; tmp = calstr ; ## sayif(verbose,"cdiff : calstr='%s'\n",calstr) ; calstr = sprintf(strcat(" dv = zeros (sz); dv(1) = dx;\n",... " df0 = %s;\n",... " sr = size (df0);\n",... 
" df = zeros(prod (sr),ps); df(:,1) = df0(:);\n",... " for i = 2:ps,\n",... " dv(i) = dx; dv(i-1) = 0;\n",... " df(:,i) = (%s)(:);\n",... " end;\n",... " df ./= 2*dx;\n" ), calstr, tmp) ; ## sayif(verbose,"cdiff : calstr='%s'\n",calstr) ; ## "func(var1,reshape(dvar(1:NV,1),SZ1,SZ2),...,varN)," , ## "func(var1,reshape(dvar(1:NV,2),SZ1,SZ2),...,varN)," , ... ## "func(var1,reshape(dvar(1:NV,NP),SZ1,SZ2),...,varN)" ## sayif(verbose,"cdiff : calstr='%s'\n",calstr) ; end argstr = strrep (argstr, "...", "varargin"); calstr = strrep (calstr, "...", "varargin{:}"); c = sprintf(strcat("function df = %s (%s)\n",... " ## Numerical differentiation of '%s' wrt to it's %d'th argument\n",... " ## This function has been written by 'cdiff()'\n",... " dx = %e;\n",... " sz = size (dvar);\n",... " ps = prod (sz);\n",... "%s",... "endfunction\n"),... dfunc,argstr,... func,wrt,... dx,... calstr) ; optim-1.4.1/inst/LinearRegression.m0000644000175000017500000001142012477523726015714 0ustar olafolaf## Copyright (C) 2007-2013 Andreas Stahel ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## -*- texinfo -*- ##@deftypefn {Function File} {[@var{p},@var{e_var},@var{r},@var{p_var},@var{y_var}] =} LinearRegression (@var{F},@var{y}) ##@deftypefnx {Function File} {[@var{p},@var{e_var},@var{r},@var{p_var},@var{y_var}] =} LinearRegression (@var{F},@var{y},@var{w}) ## ## ## general linear regression ## ## determine the parameters p_j (j=1,2,...,m) such that the function ## f(x) = sum_(i=1,...,m) p_j*f_j(x) is the best fit to the given values y_i = f(x_i) ## ## parameters: ## @itemize ## @item @var{F} is an n*m matrix with the values of the basis functions at ## the support points. In column j give the values of f_j at the points ## x_i (i=1,2,...,n) ## @item @var{y} is a column vector of length n with the given values ## @item @var{w} is n column vector of of length n vector with the weights of data points ##@end itemize ## ## return values: ## @itemize ## @item @var{p} is the vector of length m with the estimated values of the parameters ## @item @var{e_var} is the estimated variance of the difference between fitted and measured values ## @item @var{r} is the weighted norm of the residual ## @item @var{p_var} is the estimated variance of the parameters p_j ## @item @var{y_var} is the estimated variance of the dependend variables ##@end itemize ## ## Caution: ## do NOT request @var{y_var} for large data sets, as a n by n matrix is ## generated ## ## @c Will be cut out in optims info file and replaced with the same ## @c refernces explicitely there, since references to core Octave ## @c functions are not automatically transformed from here to there. 
## @c BEGIN_CUT_TEXINFO
## @seealso{regress,leasqr,nonlin_curvefit,polyfit,wpolyfit,expfit}
## @c END_CUT_TEXINFO
## @end deftypefn

function [p,e_var,r,p_var,y_var] = LinearRegression (F,y,weight)

  if (nargin < 2 || nargin >= 4)
    usage ('wrong number of arguments in [p,e_var,r,p_var,y_var] = LinearRegression(F,y)');
  endif

  [rF, cF] = size (F);
  [ry, cy] = size (y);
  if (rF != ry || cy > 1)
    error ('LinearRegression: incorrect matrix dimensions');
  endif

  if (nargin == 2)  % set uniform weights if not provided
    weight = ones (size (y));
  endif

  wF = diag (weight) * F;  % this is now efficient with the diagonal matrix
  %wF = F;
  %for j = 1:cF
  %  wF(:,j) = weight.*F(:,j);
  %end

  [Q,R] = qr (wF,0);  % estimate the values of the parameters
  p = R \ (Q' * (weight.*y));

  # Compute the residual vector and its weighted norm
  residual = F * p - y;
  r = norm (weight .* residual);
  # Variance of the weighted residuals
  e_var = sum ((residual.^2) .* (weight.^4)) / (rF-cF);

  # Compute variance of parameters, only if requested
  if nargout > 3
    M = inv (R) * Q' * diag(weight);
    # compute variance of the dependent variable, only if requested
    if nargout > 4
      %% WARNING the nonsparse matrix M2 is of size rF by rF, rF = number of data points
      M2 = F * M;
      M2 = M2 .* M2;  % square each entry in the matrix M2
      y_var = e_var ./ (weight.^4) + M2 * (e_var./(weight.^4));  % variance of y values
    endif
    M = M .* M;  % square each entry in the matrix M
    p_var = M * (e_var./(weight.^4));  % variance of the parameters
  endif

endfunction

%!demo
%! n = 100;
%! x = sort(rand(n,1)*5-1);
%! y = 1+0.05*x + 0.1*randn(size(x));
%! F = [ones(n,1),x(:)];
%! [p,e_var,r,p_var,y_var] = LinearRegression(F,y);
%! yFit = F*p;
%! figure()
%! plot(x,y,'+b',x,yFit,'-g',x,yFit+1.96*sqrt(y_var),'--r',x,yFit-1.96*sqrt(y_var),'--r')
%! title('straight line by linear regression')
%! legend('data','fit','+/-95%')
%! grid on

%!demo
%! n = 100;
%! x = sort(rand(n,1)*5-1);
%! y = 1+0.5*sin(x) + 0.1*randn(size(x));
%! F = [ones(n,1),sin(x(:))];
%! [p,e_var,r,p_var,y_var] = LinearRegression(F,y);
%! yFit = F*p;
%! figure()
%! plot(x,y,'+b',x,yFit,'-g',x,yFit+1.96*sqrt(y_var),'--r',x,yFit-1.96*sqrt(y_var),'--r')
%! title('y = p1 + p2*sin(x) by linear regression')
%! legend('data','fit','+/-95%')
%! grid on
optim-1.4.1/inst/battery.m0000644000175000017500000000323712477523726014122 0ustar olafolaf## Copyright (C) 2004 Michael Creel
##
## This program is free software; you can redistribute it and/or modify it under
## the terms of the GNU General Public License as published by the Free Software
## Foundation; either version 3 of the License, or (at your option) any later
## version.
##
## This program is distributed in the hope that it will be useful, but WITHOUT
## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
## FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more
## details.
##
## You should have received a copy of the GNU General Public License along with
## this program; if not, see <http://www.gnu.org/licenses/>.

## battery.m: repeatedly call bfgs using a battery of
## start values, to attempt to find global min
## of a nonconvex function
##
## INPUTS:
## func: function to minimize
## args: args of function
## minarg: argument to minimize w.r.t.
(usually = 1) ## startvals: kxp matrix of values to try for sure (don't include all zeros, that's automatic) ## max iters per start value ## number of additional random start values to try ## # OUTPUT: theta - the best value found - NOT iterated to convergence function theta = battery(func, args, minarg, startvals, maxiters) # setup [k,trials] = size(startvals); bestobj = inf; besttheta = zeros(k,1); bfgscontrol = {maxiters,0,0,1}; # now try the supplied start values, and optionally the random start values for i = 1:trials args{minarg} = startvals(:,i); [theta, obj_value, convergence] = bfgsmin (func, args, bfgscontrol); if obj_value < bestobj besttheta = theta; bestobj = obj_value; endif endfor theta = besttheta; endfunction optim-1.4.1/inst/wpolyfit.m0000644000175000017500000001742112477523726014325 0ustar olafolaf## Author: Paul Kienzle ## This program is granted to the public domain. ## -*- texinfo -*- ## @deftypefn {Function File} {[@var{p}, @var{s}] =} wpolyfit (@var{x}, @var{y}, @var{dy}, @var{n}) ## Return the coefficients of a polynomial @var{p}(@var{x}) of degree ## @var{n} that minimizes ## @c ######################################################## ## @c These lines must be written without space at start to work around ## @c a bug in html generation. ##@iftex ##@tex ##$$ ##\sum_{i=1}^N (p(x_i) - y_i)^2 ##$$ ##@end tex ##@end iftex ##@ifnottex ##@code{sumsq (p(x(i)) - y(i))}, ##@end ifnottex ## @c ######################################################## ## to best fit the data in the least squares sense. The standard error ## on the observations @var{y} if present are given in @var{dy}. ## ## The returned value @var{p} contains the polynomial coefficients ## suitable for use in the function polyval. The structure @var{s} returns ## information necessary to compute uncertainty in the model. ## ## To compute the predicted values of y with uncertainty use ## @example ## [y,dy] = polyconf(p,x,s,'ci'); ## @end example ## You can see the effects of different confidence intervals and ## prediction intervals by calling the wpolyfit internal plot ## function with your fit: ## @example ## feval('wpolyfit:plt',x,y,dy,p,s,0.05,'pi') ## @end example ## Use @var{dy}=[] if uncertainty is unknown. ## ## You can use a chi^2 test to reject the polynomial fit: ## @example ## p = 1-chi2cdf(s.normr^2,s.df); ## @end example ## p is the probability of seeing a chi^2 value higher than that which ## was observed assuming the data are normally distributed around the fit. ## If p < 0.01, you can reject the fit at the 1% level. ## ## You can use an F test to determine if a higher order polynomial ## improves the fit: ## @example ## [poly1,S1] = wpolyfit(x,y,dy,n); ## [poly2,S2] = wpolyfit(x,y,dy,n+1); ## F = (S1.normr^2 - S2.normr^2)/(S1.df-S2.df)/(S2.normr^2/S2.df); ## p = 1-f_cdf(F,S1.df-S2.df,S2.df); ## @end example ## p is the probability of observing the improvement in chi^2 obtained ## by adding the extra parameter to the fit. If p < 0.01, you can reject ## the lower order polynomial at the 1% level. ## ## You can estimate the uncertainty in the polynomial coefficients ## themselves using ## @example ## dp = sqrt(sumsq(inv(s.R'))'/s.df)*s.normr; ## @end example ## but the high degree of covariance amongst them makes this a questionable ## operation. ## @end deftypefn ## ## @deftypefn {Function File} {[@var{p}, @var{s}, @var{mu}] =} wpolyfit (...) ## ## If an additional output @code{mu = [mean(x),std(x)]} is requested then ## the @var{x} values are centered and normalized prior to computing the fit. 
## This will give more stable numerical results. To compute a predicted ## @var{y} from the returned model use ## @code{y = polyval(p, (x-mu(1))/mu(2)} ## @end deftypefn ## ## @deftypefn {Function File} {} wpolyfit (...) ## ## If no output arguments are requested, then wpolyfit plots the data, ## the fitted line and polynomials defining the standard error range. ## ## Example ## @example ## x = linspace(0,4,20); ## dy = (1+rand(size(x)))/2; ## y = polyval([2,3,1],x) + dy.*randn(size(x)); ## wpolyfit(x,y,dy,2); ## @end example ## @end deftypefn ## ## @deftypefn {Function File} {} wpolyfit (..., 'origin') ## ## If 'origin' is specified, then the fitted polynomial will go through ## the origin. This is generally ill-advised. Use with caution. ## ## Hocking, RR (2003). Methods and Applications of Linear Models. ## New Jersey: John Wiley and Sons, Inc. ## ## @c Will be cut out in optims info file and replaced with the same ## @c refernces explicitely there, since references to core Octave ## @c functions are not automatically transformed from here to there. ## @c BEGIN_CUT_TEXINFO ## @seealso{polyfit} ## @c END_CUT_TEXINFO ## @seealso{polyconf} ## @end deftypefn function [p_out, s, mu] = wpolyfit (varargin) ## strip 'origin' of the end args = length(varargin); if args>0 && ischar(varargin{args}) origin = varargin{args}; args--; else origin=''; endif ## strip polynomial order off the end if args>0 n = varargin{args}; args--; else n = []; end ## interpret the remainder as x,y or x,y,dy or [x,y] or [x,y,dy] if args == 3 x = varargin{1}; y = varargin{2}; dy = varargin{3}; elseif args == 2 x = varargin{1}; y = varargin{2}; dy = []; elseif args == 1 A = varargin{1}; [nr,nc]=size(A); if all(nc!=[2,3]) error("wpolyfit expects vectors x,y,dy or matrix [x,y,dy]"); endif dy = []; if nc == 3, dy = A(:,3); endif y = A(:,2); x = A(:,1); else usage ("wpolyfit (x, y [, dy], n [, 'origin'])"); end if (length(origin) == 0) through_origin = 0; elseif strcmp(origin,'origin') through_origin = 1; else error ("wpolyfit: expected 'origin' but found '%s'", origin) endif if any(size (x) != size (y)) error ("wpolyfit: x and y must be vectors of the same size"); endif if length(dy)>1 && length(y) != length(dy) error ("wpolyfit: dy must be a vector the same length as y"); endif if (! (isscalar (n) && n >= 0 && ! isinf (n) && n == round (n))) error ("wpolyfit: n must be a nonnegative integer"); endif if nargout == 3 mu = [mean(x), std(x)]; x = (x - mu(1))/mu(2); endif k = length (x); ## observation matrix if through_origin ## polynomial through the origin y = ax + bx^2 + cx^3 + ... A = (x(:) * ones(1,n)) .^ (ones(k,1) * (n:-1:1)); else ## polynomial least squares y = a + bx + cx^2 + dx^3 + ... A = (x(:) * ones (1, n+1)) .^ (ones (k, 1) * (n:-1:0)); endif [p,s] = wsolve(A,y(:),dy(:)); if through_origin p(n+1) = 0; endif if nargout == 0 good_fit = 1-chi2cdf(s.normr^2,s.df); printf("Polynomial: %s [ p(chi^2>observed)=%.2f%% ]\n", polyout(p,'x'), good_fit*100); plt(x,y,dy,p,s,'ci'); else p_out = p'; endif function plt(x,y,dy,p,s,varargin) if iscomplex(p) # XXX FIXME XXX how to plot complex valued functions? # Maybe using hue for phase and saturation for magnitude # e.g., Frank Farris (Santa Cruz University) has this: # http://www.maa.org/pubs/amm_complements/complex.html # Could also look at the book # Visual Complex Analysis by Tristan Needham, Oxford Univ. 
Press # but for now we punt return end ## decorate the graph grid('on'); xlabel('abscissa X'); ylabel('data Y'); title('Least-squares Polynomial Fit with Error Bounds'); ## draw fit with estimated error bounds xf = linspace(min(x),max(x),150)'; [yf,dyf] = polyconf(p,xf,s,varargin{:}); plot(xf,yf+dyf,"g.;;", xf,yf-dyf,"g.;;", xf,yf,"g-;fit;"); ## plot the data hold on; if (isempty(dy)) plot(x,y,"x;data;"); else if isscalar(dy), dy = ones(size(y))*dy; end errorbar (x, y, dy, "~;data;"); endif hold off; if strcmp(deblank(input('See residuals? [y,n] ','s')),'y') clf; if (isempty(dy)) plot(x,y-polyval(p,x),"x;data;"); else errorbar(x,y-polyval(p,x),dy, '~;data;'); endif hold on; grid on; ylabel('Residuals'); xlabel('abscissa X'); plot(xf,dyf,'g.;;',xf,-dyf,'g.;;'); hold off; endif %!demo % #1 %! x = linspace(0,4,20); %! dy = (1+rand(size(x)))/2; %! y = polyval([2,3,1],x) + dy.*randn(size(x)); %! wpolyfit(x,y,dy,2); %!demo % #2 %! x = linspace(-i,+2i,20); %! noise = ( randn(size(x)) + i*randn(size(x)) )/10; %! P = [2-i,3,1+i]; %! y = polyval(P,x) + noise; %! wpolyfit(x,y,2) %!demo %! pin = [3; -1; 2]; %! x = -3:0.1:3; %! y = polyval (pin, x); %! %! ## Poisson weights %! # dy = sqrt (abs (y)); %! ## Uniform weights in [0.5,1] %! dy = 0.5 + 0.5 * rand (size (y)); %! %! y = y + randn (size (y)) .* dy; %! printf ("Original polynomial: %s\n", polyout (pin, 'x')); %! wpolyfit (x, y, dy, length (pin)-1); optim-1.4.1/inst/optim_doc.m0000644000175000017500000000506112477523726014422 0ustar olafolaf## Copyright (C) 2014 Olaf Till ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## -*- texinfo -*- ## @deftypefn {Function File} {} optim_doc () ## @deftypefnx {Function File} {} optim_doc (@var{keyword}) ## Show optim package documentation. ## ## Runs the info viewer Octave is configured with on the documentation ## in info format of the installed optim package. Without argument, the ## top node of the documentation is displayed. With an argument, the ## respective index entry is searched for and its node displayed. ## ## @end deftypefn function optim_doc (keyword) if ((nargs = nargin ()) > 1) print_usage () endif ## locate installed documentation persistent infopath = ""; if (isempty (infopath)) [local_list, global_list] = pkg ("list"); if (! isempty (idx = ... find (strcmp ("optim", {structcat(1, local_list{:}).name}), 1))) idir = local_list{idx}.dir; elseif (! isempty (idx = ... find (strcmp ("optim", {structcat(1, global_list{:}).name}), 1))) idir = global_list{idx}.dir; else error ("no installed optim package found"); endif infopath = fullfile (idir, "doc/", "optim.info"); ## allow for .gz if (! 
exist (infopath, "file")) infopath = strcat (infopath, ".gz"); endif endif ## display info INFO = info_program (); if (nargs) error_hint = ", maybe the keyword was not found in the index"; status = system (sprintf ("%s %s --index-search \"%s\"", INFO, infopath, keyword)); else error_hint = ""; status = system (sprintf ("%s %s", INFO, infopath)); endif if (status) if (status == 127) error ("unable to find info program `%s'", INFO); else error ("info program `%s' returned error code %i%s", INFO, status, error_hint); endif endif endfunction optim-1.4.1/inst/test_wpolyfit.m0000644000175000017500000003753712477523726015376 0ustar olafolaf## Author: Paul Kienzle ## This program is granted to the public domain. ## Tests for wpolyfit. ## ## Test cases are taken from the NIST Statistical Reference Datasets ## http://www.itl.nist.gov/div898/strd/ 1; function do_test(n,x,y,p,dp,varargin) [myp,s] = wpolyfit(x,y,n,varargin{:}); %if length(varargin)==0, [myp,s] = polyfit(x,y,n); else return; end mydp = sqrt(sumsq(inv(s.R'))'/s.df)*s.normr; if length(varargin)>0, mydp = [mydp;0]; end %origin %[svdp,j,svddp] = svdfit(x,y,n); disp('parameter certified value rel. error'); [myp(:), p, abs((myp(:)-p)./p)] %, svdp, abs((svdp-p)./p) ] disp('p-error certified value rel. error'); [mydp(:), dp, abs((mydp(:) - dp)./dp)] %, svdp, abs((svddp - dp)./dp)] input('Press to proceed to the next test'); endfunction ## x y dy data = [0.0013852 0.2144023 0.0020470 0.0018469 0.2516856 0.0022868 0.0023087 0.3070443 0.0026362 0.0027704 0.3603186 0.0029670 0.0032322 0.4260864 0.0033705 0.0036939 0.4799956 0.0036983 ]; x=data(:,1); y=data(:,2); dy=data(:,3); wpolyfit(x,y,dy,1); disp('computing parameter uncertainty from monte carlo simulation...'); fflush(stdout); n=100; p=zeros(2,n); for i=1:n, p(:,i)=(polyfit(x,y+randn(size(y)).*dy,1)).'; end printf("%15s %15s\n", "Coefficient", "Error"); printf("%15g %15g\n", [mean(p'); std(p')]); input('Press to see some sample regression lines: '); t = [x(1), x(length(x))]; [p,s] = wpolyfit(x,y,dy,1); dp=sqrt(sumsq(inv(s.R'))'/s.df)*s.normr; hold off; for i=1:15, plot(t,polyval(p(:)+randn(size(dp)).*dp,t),'-g;;'); hold on; end errorbar(x,y,dy,"~b;;"); [yf,dyf]=polyconf(p,x,s,0.05,'ci'); plot(x,yf-dyf,"-r;;",x,yf+dyf,'-r;95% confidence interval;') hold off; input('Press to continue with the tests: '); ##Procedure: Linear Least Squares Regression ##Reference: Filippelli, A., NIST. ##Model: Polynomial Class ## 11 Parameters (B0,B1,...,B10) ## ## y = B0 + B1*x + B2*(x**2) + ... 
+ B9*(x**9) + B10*(x**10) + e ##Data: ## y x data = [ 0.8116 -6.860120914 0.9072 -4.324130045 0.9052 -4.358625055 0.9039 -4.358426747 0.8053 -6.955852379 0.8377 -6.661145254 0.8667 -6.355462942 0.8809 -6.118102026 0.7975 -7.115148017 0.8162 -6.815308569 0.8515 -6.519993057 0.8766 -6.204119983 0.8885 -5.853871964 0.8859 -6.109523091 0.8959 -5.79832982 0.8913 -5.482672118 0.8959 -5.171791386 0.8971 -4.851705903 0.9021 -4.517126416 0.909 -4.143573228 0.9139 -3.709075441 0.9199 -3.499489089 0.8692 -6.300769497 0.8872 -5.953504836 0.89 -5.642065153 0.891 -5.031376979 0.8977 -4.680685696 0.9035 -4.329846955 0.9078 -3.928486195 0.7675 -8.56735134 0.7705 -8.363211311 0.7713 -8.107682739 0.7736 -7.823908741 0.7775 -7.522878745 0.7841 -7.218819279 0.7971 -6.920818754 0.8329 -6.628932138 0.8641 -6.323946875 0.8804 -5.991399828 0.7668 -8.781464495 0.7633 -8.663140179 0.7678 -8.473531488 0.7697 -8.247337057 0.77 -7.971428747 0.7749 -7.676129393 0.7796 -7.352812702 0.7897 -7.072065318 0.8131 -6.774174009 0.8498 -6.478861916 0.8741 -6.159517513 0.8061 -6.835647144 0.846 -6.53165267 0.8751 -6.224098421 0.8856 -5.910094889 0.8919 -5.598599459 0.8934 -5.290645224 0.894 -4.974284616 0.8957 -4.64454848 0.9047 -4.290560426 0.9129 -3.885055584 0.9209 -3.408378962 0.9219 -3.13200249 0.7739 -8.726767166 0.7681 -8.66695597 0.7665 -8.511026475 0.7703 -8.165388579 0.7702 -7.886056648 0.7761 -7.588043762 0.7809 -7.283412422 0.7961 -6.995678626 0.8253 -6.691862621 0.8602 -6.392544977 0.8809 -6.067374056 0.8301 -6.684029655 0.8664 -6.378719832 0.8834 -6.065855188 0.8898 -5.752272167 0.8964 -5.132414673 0.8963 -4.811352704 0.9074 -4.098269308 0.9119 -3.66174277 0.9228 -3.2644011]; ##Certified values: ## p dP target = [ -1467.48961422980 298.084530995537 -2772.17959193342 559.779865474950 -2316.37108160893 466.477572127796 -1127.97394098372 227.204274477751 -354.478233703349 71.6478660875927 -75.1242017393757 15.2897178747400 -10.8753180355343 2.23691159816033 -1.06221498588947 0.221624321934227 -0.670191154593408E-01 0.142363763154724E-01 -0.246781078275479E-02 0.535617408889821E-03 -0.402962525080404E-04 0.896632837373868E-05]; if 1 disp("Filippelli, A., NIST."); do_test(10, data(:,2),data(:,1),flipud(target(:,1)),flipud(target(:,2))); endif ##Procedure: Linear Least Squares Regression ## ##Reference: Pontius, P., NIST. ## Load Cell Calibration. ## ##Model: Quadratic Class ## 3 Parameters (B0,B1,B2) ## y = B0 + B1*x + B2*(x**2) ##Data: y x data = [ ... .11019 150000 .21956 300000 .32949 450000 .43899 600000 .54803 750000 .65694 900000 .76562 1050000 .87487 1200000 .98292 1350000 1.09146 1500000 1.20001 1650000 1.30822 1800000 1.41599 1950000 1.52399 2100000 1.63194 2250000 1.73947 2400000 1.84646 2550000 1.95392 2700000 2.06128 2850000 2.16844 3000000 .11052 150000 .22018 300000 .32939 450000 .43886 600000 .54798 750000 .65739 900000 .76596 1050000 .87474 1200000 .98300 1350000 1.09150 1500000 1.20004 1650000 1.30818 1800000 1.41613 1950000 1.52408 2100000 1.63159 2250000 1.73965 2400000 1.84696 2550000 1.95445 2700000 2.06177 2850000 2.16829 3000000 ]; ## Certified Regression Statistics ## ## Standard Deviation ## Estimate of Estimate target = [ ... 0.673565789473684E-03 0.107938612033077E-03 0.732059160401003E-06 0.157817399981659E-09 -0.316081871345029E-14 0.486652849992036E-16 ]; if 1 disp("Pontius, P., NIST"); do_test(2, data(:,2),data(:,1),flipud(target(:,1)),flipud(target(:,2))); endif #Procedure: Linear Least Squares Regression #Reference: Eberhardt, K., NIST. 
#Model: Linear Class # 1 Parameter (B1) # # y = B1*x + e #Data: y x data =[... 130 60 131 61 132 62 133 63 134 64 135 65 136 66 137 67 138 68 139 69 140 70 ]; # Certified Regression Statistics # # Standard Deviation # Estimate of Estimate target = [ ... 0 0 2.07438016528926 0.165289256198347E-01 ]; if 1 disp("Eberhardt, K., NIST"); do_test(1, data(:,2),data(:,1),flipud(target(:,1)),flipud(target(:,2)),'origin'); endif #Reference: Wampler, R. H. (1970). # A Report of the Accuracy of Some Widely-Used Least # Squares Computer Programs. # Journal of the American Statistical Association, 65, 549-565. # #Model: Polynomial Class # 6 Parameters (B0,B1,...,B5) # # y = B0 + B1*x + B2*(x**2) + B3*(x**3)+ B4*(x**4) + B5*(x**5) # # Certified Regression Statistics # # Standard Deviation # Parameter Estimate of Estimate target = [... 1.00000000000000 0.000000000000000 1.00000000000000 0.000000000000000 1.00000000000000 0.000000000000000 1.00000000000000 0.000000000000000 1.00000000000000 0.000000000000000 1.00000000000000 0.000000000000000 ]; #Data: y x data = [... 1 0 6 1 63 2 364 3 1365 4 3906 5 9331 6 19608 7 37449 8 66430 9 111111 10 177156 11 271453 12 402234 13 579195 14 813616 15 1118481 16 1508598 17 2000719 18 2613660 19 3368421 20 ]; if 1 disp("Wampler1"); do_test(5, data(:,2),data(:,1),flipud(target(:,1)),flipud(target(:,2))); endif ##Reference: Wampler, R. H. (1970). ## A Report of the Accuracy of Some Widely-Used Least ## Squares Computer Programs. ## Journal of the American Statistical Association, 65, 549-565. ##Model: Polynomial Class ## 6 Parameters (B0,B1,...,B5) ## ## y = B0 + B1*x + B2*(x**2) + B3*(x**3)+ B4*(x**4) + B5*(x**5) ## ## Certified Regression Statistics ## Standard Deviation ## Parameter Estimate of Estimate target = [ ... 1.00000000000000 0.000000000000000 0.100000000000000 0.000000000000000 0.100000000000000E-01 0.000000000000000 0.100000000000000E-02 0.000000000000000 0.100000000000000E-03 0.000000000000000 0.100000000000000E-04 0.000000000000000 ]; #Data: y x data = [ ... 1.00000 0 1.11111 1 1.24992 2 1.42753 3 1.65984 4 1.96875 5 2.38336 6 2.94117 7 3.68928 8 4.68559 9 6.00000 10 7.71561 11 9.92992 12 12.75603 13 16.32384 14 20.78125 15 26.29536 16 33.05367 17 41.26528 18 51.16209 19 63.00000 20 ]; if 1 disp("Wampler2"); do_test(5, data(:,2),data(:,1),flipud(target(:,1)),flipud(target(:,2))); endif ##Reference: Wampler, R. H. (1970). ## A Report of the Accuracy of Some Widely-Used Least ## Squares Computer Programs. ## Journal of the American Statistical Association, 65, 549-565. ## ##Model: Polynomial Class ## 6 Parameters (B0,B1,...,B5) ## ## y = B0 + B1*x + B2*(x**2) + B3*(x**3)+ B4*(x**4) + B5*(x**5) ## ## Certified Regression Statistics ## ## Standard Deviation ## Parameter Estimate of Estimate target = [... 1.00000000000000 2152.32624678170 1.00000000000000 2363.55173469681 1.00000000000000 779.343524331583 1.00000000000000 101.475507550350 1.00000000000000 5.64566512170752 1.00000000000000 0.112324854679312 ]; #Data: y x data = [ ... 760. 0 -2042. 1 2111. 2 -1684. 3 3888. 4 1858. 5 11379. 6 17560. 7 39287. 8 64382. 9 113159. 10 175108. 11 273291. 12 400186. 13 581243. 14 811568. 15 1121004. 16 1506550. 17 2002767. 18 2611612. 19 3369180. 
20 ]; if 1 disp("Wampler3"); do_test(5, data(:,2),data(:,1),flipud(target(:,1)),flipud(target(:,2))); endif ##Model: Polynomial Class ## 6 Parameters (B0,B1,...,B5) ## ## y = B0 + B1*x + B2*(x**2) + B3*(x**3)+ B4*(x**4) + B5*(x**5) ## ## Certified Regression Statistics ## ## Standard Deviation ## Parameter Estimate of Estimate target = [... 1.00000000000000 215232.624678170 1.00000000000000 236355.173469681 1.00000000000000 77934.3524331583 1.00000000000000 10147.5507550350 1.00000000000000 564.566512170752 1.00000000000000 11.2324854679312 ]; #Data: y x data = [... 75901 0 -204794 1 204863 2 -204436 3 253665 4 -200894 5 214131 6 -185192 7 221249 8 -138370 9 315911 10 -27644 11 455253 12 197434 13 783995 14 608816 15 1370781 16 1303798 17 2205519 18 2408860 19 3444321 20 ]; if 1 disp("Wampler4"); do_test(5, data(:,2),data(:,1),flipud(target(:,1)),flipud(target(:,2))); endif ##Model: Polynomial Class ## 6 Parameters (B0,B1,...,B5) ## ## y = B0 + B1*x + B2*(x**2) + B3*(x**3)+ B4*(x**4) + B5*(x**5) ## ## Certified Regression Statistics ## ## Standard Deviation ## Parameter Estimate of Estimate target = [... 1.00000000000000 21523262.4678170 1.00000000000000 23635517.3469681 1.00000000000000 7793435.24331583 1.00000000000000 1014755.07550350 1.00000000000000 56456.6512170752 1.00000000000000 1123.24854679312 ]; ##Data: y x data = [ ... 7590001 0 -20479994 1 20480063 2 -20479636 3 25231365 4 -20476094 5 20489331 6 -20460392 7 18417449 8 -20413570 9 20591111 10 -20302844 11 18651453 12 -20077766 13 21059195 14 -19666384 15 26348481 16 -18971402 17 22480719 18 -17866340 19 10958421 20 ]; if 1 disp("Wampler5"); do_test(5, data(:,2),data(:,1),flipud(target(:,1)),flipud(target(:,2))); endif optim-1.4.1/inst/test_min_1.m0000644000175000017500000000511612477523726014510 0ustar olafolaf## Copyright (C) 2002 Etienne Grossmann ## Copyright (C) 2004 Michael Creel ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## Test an optimization function with the same synopsis as bfgs.m if ! exist ("optim_func"), optim_func = "bfgsmin"; end ok = 1; if ! exist ("verbose"), verbose = 0; end if verbose printf ("\n Testing '%s' on a quadratic programming problem\n\n",... optim_func); printf ([" Set 'optim_func' to the name of the optimization\n",... " function you want to test (must have same synopsis\n",... " as 'bfgs')\n\n"]); end N = 1+floor(30*rand(1)) ; global truemin ; truemin = randn(N,1) ; global offset ; offset = 10*randn(1) ; global metric ; metric = randn(2*N,N) ; metric = metric'*metric ; if N>1, [u,d,v] = svd(metric); d = (0.1+[0:(1/(N-1)):1]).^2 ; metric = u*diag(d)*u' ; end function v = testfunc(x) global offset ; global truemin ; global metric ; v = sum((x-truemin)'*metric*(x-truemin))+offset ; end function df = dtestf(x) global truemin ; global metric ; df = 2*(x-truemin)'*metric ; end xinit = 10*randn(N,1) ; if verbose, printf ([" Dimension is %i\n",... " Condition is %f\n"],... 
	  N, cond (metric));
  fflush (stdout);
end

## [x,v,niter] = feval (optim_func, "testfunc","dtestf", xinit);
ctl.df = "dtestf";
if strcmp(optim_func,"bfgsmin")
  ctl = {-1,2,1,1};
  xinit2 = {xinit};
else
  xinit2 = xinit;
endif
[x,v,niter] = feval (optim_func, "testfunc", xinit2, ctl);

if verbose
  printf ("nev=%d  N=%d  errx=%8.3g  errv=%8.3g\n",...
	  niter(1),N,max(abs( x-truemin )),v-offset);
end

if any (abs (x-truemin) > 1e-4)
  ok = 0;
  if verbose, printf ("not ok 1 (best argument is wrong)\n"); end
elseif verbose, printf ("ok 1\n");
end

if v-offset > 1e-8
  ok = 0;
  if verbose, printf ("not ok 2 (best function value is wrong)\n"); end
elseif verbose, printf ("ok 2\n");
end

if verbose
  if ok, printf ("All tests ok\n");
  else printf ("Whoa!! Some test(s) failed\n");
  end
end
optim-1.4.1/inst/rosenbrock.m0000644000175000017500000000250612477523726014615 0ustar olafolaf## Copyright (C) 2004 Michael Creel
##
## This program is free software; you can redistribute it and/or modify it under
## the terms of the GNU General Public License as published by the Free Software
## Foundation; either version 3 of the License, or (at your option) any later
## version.
##
## This program is distributed in the hope that it will be useful, but WITHOUT
## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
## FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more
## details.
##
## You should have received a copy of the GNU General Public License along with
## this program; if not, see <http://www.gnu.org/licenses/>.

## Rosenbrock function - used to create example obj. fns.
##
## Function value and gradient vector of the rosenbrock function
## The minimizer is at the vector (1,1,..,1),
## and the minimized value is 0.

function [obj_value, gradient] = rosenbrock(x);
  dimension = length(x);
  obj_value = sum(100*(x(2:dimension)-x(1:dimension-1).^2).^2 + (1-x(1:dimension-1)).^2);
  if nargout > 1
    gradient = zeros(dimension, 1);
    gradient(1:dimension-1) = - 400*x(1:dimension-1).*(x(2:dimension)-x(1:dimension-1).^2) - 2*(1-x(1:dimension-1));
    gradient(2:dimension) = gradient(2:dimension) + 200*(x(2:dimension)-x(1:dimension-1).^2);
  endif
endfunction
optim-1.4.1/inst/deriv.m0000644000175000017500000001016712477523726013561 0ustar olafolaf## Copyright (C) 2000 Ben Sapp
## Copyright (C) 2011 Joaquín Ignacio Aramendía
## Copyright (C) 2011 Carnë Draug
##
## This program is free software; you can redistribute it and/or modify it under
## the terms of the GNU General Public License as published by the Free Software
## Foundation; either version 3 of the License, or (at your option) any later
## version.
##
## This program is distributed in the hope that it will be useful, but WITHOUT
## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
## FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more
## details.
##
## You should have received a copy of the GNU General Public License along with
## this program; if not, see <http://www.gnu.org/licenses/>.

## -*- texinfo -*-
## @deftypefn {Function File} {@var{dx} =} deriv (@var{f}, @var{x0})
## @deftypefnx {Function File} {@var{dx} =} deriv (@var{f}, @var{x0}, @var{h})
## @deftypefnx {Function File} {@var{dx} =} deriv (@var{f}, @var{x0}, @var{h}, @var{O})
## @deftypefnx {Function File} {@var{dx} =} deriv (@var{f}, @var{x0}, @var{h}, @var{O}, @var{N})
## Calculate derivative of function @var{f}.
##
## @var{f} must be a function handle or the name of a function that takes @var{x0}
## and returns a variable of equal length and orientation. @var{x0} must be a
## numeric vector or scalar.
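##
## A minimal example call (a sketch, using the defaults for @var{h},
## @var{O} and @var{N} described below):
##
## @example
## dx = deriv (@@sin, 0);   # first derivative of sin at 0, approx. cos (0) == 1
## @end example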
##
## @var{h} defines the step taken for the derivative calculation. Defaults to 1e-7.
##
## @var{O} defines the order of the calculation. Supported values are 2 (h^2 order)
## or 4 (h^4 order). Defaults to 2.
##
## @var{N} defines the derivative order. Defaults to the 1st derivative of the
## function. Can be up to the 4th derivative.
##
## Reference: Numerical Methods for Mathematics, Science, and Engineering by
## John H. Mathews.
## @end deftypefn

function dx = deriv (f, x0, h = 0.0000001, O = 2, N = 1)
  if (ischar(f))
    f = str2func(f); # let's also support a string with str2func
  endif
  if (nargin < 2)
    error ("Not enough arguments.");
  elseif (!isa (f, 'function_handle'))
    error ("The first argument 'f' must be a function handle.");
  elseif (!isvector (x0) || !isnumeric (x0))
    ## a scalar is 1x1 therefore counts as a vector too
    error ("The second argument 'x0' must be a numeric vector.");
  elseif (!isscalar (h) || !isnumeric (h))
    error ("The third argument 'h' must be a scalar.");
  elseif (!isscalar (O) || !isnumeric (O))
    error ("The fourth argument 'O' must be a scalar.");
  elseif (O != 2 && O != 4)
    error ("Only order 2 or 4 is supported.");
  elseif (!isscalar (N) || !isnumeric (N))
    error ("The fifth argument 'N' must be a scalar.");
  elseif ((N > 4) || (N < 1))
    error("Only 1st,2nd,3rd or 4th order derivatives are acceptable.");
  elseif (nargin > 5)
    warning("Ignoring arguments beyond the 5th.");
  endif

  switch O
    case (2)
      switch N
        case (1)
          dx = (feval(f,x0+h)-feval(f,x0-h))/(2*h);
        case (2)
          dx = (feval(f,x0+h)-2*feval(f,x0)+feval(f,x0-h))/(h^2);
        case (3)
          dx = (feval(f,x0+2*h)-2*feval(f,x0+h)+2*feval(f,x0-h)-feval(f,x0-2*h))/(2*h^3);
        case (4)
          dx = (feval(f,x0+2*h)-4*feval(f,x0+h)+6*feval(f,x0)-4*feval(f,x0-h)+feval(f,x0-2*h))/(h^4);
        otherwise
          error("Only 1st,2nd,3rd or 4th order derivatives are acceptable.");
      endswitch
    case (4)
      switch N
        case (1)
          dx = (-feval(f,x0+2*h)+8*feval(f,x0+h)-8*feval(f,x0-h)+feval(f,x0-2*h))/(12*h);
        case (2)
          dx = (-feval(f,x0+2*h)+16*feval(f,x0+h)-30*feval(f,x0)+16*feval(f,x0-h)-feval(f,x0-2*h))/(12*h^2);
        case (3)
          dx = (-feval(f,x0+3*h)+8*feval(f,x0+2*h)-13*feval(f,x0+h)+13*feval(f,x0-h)-8*feval(f,x0-2*h)+feval(f,x0-3*h))/(8*h^3);
        case (4)
          dx = (-feval(f,x0+3*h)+12*feval(f,x0+2*h)-39*feval(f,x0+h)+56*feval(f,x0)-39*feval(f,x0-h)+12*feval(f,x0-2*h)-feval(f,x0-3*h))/(6*h^4);
        otherwise
          error("Only 1st,2nd,3rd or 4th order derivatives are acceptable.");
      endswitch
    otherwise
      error ("Only order 2 or 4 is supported.");
  endswitch
endfunction
optim-1.4.1/inst/test_nelder_mead_min_2.m0000644000175000017500000000773012477523726017034 0ustar olafolaf## Copyright (C) 2002 Etienne Grossmann
##
## This program is free software; you can redistribute it and/or modify it under
## the terms of the GNU General Public License as published by the Free Software
## Foundation; either version 3 of the License, or (at your option) any later
## version.
##
## This program is distributed in the hope that it will be useful, but WITHOUT
## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
## FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more
## details.
##
## You should have received a copy of the GNU General Public License along with
## this program; if not, see <http://www.gnu.org/licenses/>.

## Checks whether the function 'nelder_mead_min' accepts options properly

ok = 1;
cnt = 1;

if ! exist ("verbose"), verbose = 0; end
if ! exist ("inspect"), inspect = 0; end

if verbose, printf (["test_nelder_mead_2\n",...
" Check whether nelder_mead_min accepts options properly\n\n"]); end N = 2; x1 = zeros (1,N); small = 1e-3; vol = (small^N) / factorial (N); ## Define simple 2D function : [x,y] -> x^2, start from [0,0] ## function c = my_func (x) c = x(1)^2; end ###################################################################### ## Test using volume ################################################# ## Choose vtol and initial simplex so that algo should stop immediately. ctl = struct ("verbose",verbose, "isz",small, "vtol",vol*1.01, "rst",0); [x2,v,nev] = nelder_mead_min ("my_func", x1, ctl); if nev != N+1 if verbose || inspect, printf ("not ok %i\n",cnt); end if inspect, keyboard; end ok = 0 ; else if verbose, printf ("ok %i\n",cnt); end end cnt++; ## Choose vtol and initial simplex so that algo should stop after one ## iteration (should be a reflexion and a tentative extension). Total is 5 ## evaluations. ctl = struct ("verbose",verbose, "isz",small, "vtol",vol*0.99, "rst",0); x1 = [0,0]; [x2,v,nev] = nelder_mead_min ("my_func", x1, ctl); if nev != N+3 if verbose || inspect, printf ("not ok %i\n",cnt); end if inspect, keyboard; end ok = 0 ; else if verbose, printf ("ok %i\n",cnt); end end cnt++; ###################################################################### ## Test using radius ################################################# ## Choose rtol and initial simplex so that algo stops immediately. ctl = struct ("verbose",verbose, "isz",small, "rtol",small*2.01, "rst",0); [x2,v,nev] = nelder_mead_min ("my_func", x1, ctl); if nev != N+1 if verbose || inspect, printf ("not ok %i\n",cnt); end if inspect, keyboard; end ok = 0 ; else if verbose, printf ("ok %i\n",cnt); end end cnt++; ## Choose rtol and initial simplex so that algo does not stop immediately. ctl = struct ("verbose",verbose, "isz",small, "rtol",small*1.99, "rst",0); [x2,v,nev] = nelder_mead_min ("my_func", x1, ctl); if nev <= N+1 if verbose || inspect, printf ("not ok %i\n",cnt); end if inspect, keyboard; end ok = 0 ; else if verbose, printf ("ok %i\n",cnt); end end cnt++; ###################################################################### ## Test using values ################################################# ## Choose rtol and initial simplex so that algo should stop immediately. ctl = struct ("verbose",verbose, "isz",small, "ftol",1.01*small^2, "rst",0); [x2,v,nev] = nelder_mead_min ("my_func", x1, ctl); if nev != N+1 if verbose || inspect, printf ("not ok %i\n",cnt); end if inspect, keyboard; end ok = 0 ; else if verbose, printf ("ok %i\n",cnt); end end cnt++; ## Choose rtol and initial simplex so that algo does not stop immediately. ctl = struct ("verbose",verbose, "isz",small, "ftol",0.99*small^2, "rst",0); [x2,v,nev] = nelder_mead_min ("my_func", x1, ctl); if nev <= N+1 if verbose || inspect, printf ("not ok %i\n",cnt); end if inspect, keyboard; end ok = 0 ; else if verbose printf ("ok %i\n",cnt); end end cnt++; cnt--; if verbose && ok printf ("All %i tests ok\n", cnt); end optim-1.4.1/inst/adsmax.m0000644000175000017500000001362612477523726013730 0ustar olafolaf%% Copyright (C) 2002 N.J.Higham %% Copyright (C) 2003 Andy Adler %% %% This program is free software; you can redistribute it and/or modify it under %% the terms of the GNU General Public License as published by the Free Software %% Foundation; either version 3 of the License, or (at your option) any later %% version. 
%% %% This program is distributed in the hope that it will be useful, but WITHOUT %% ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or %% FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more %% details. %% %% You should have received a copy of the GNU General Public License along with %% this program; if not, see . %%ADSMAX Alternating directions method for direct search optimization. %% [x, fmax, nf] = ADSMAX(FUN, x0, STOPIT, SAVIT, P) attempts to %% maximize the function FUN, using the starting vector x0. %% The alternating directions direct search method is used. %% Output arguments: %% x = vector yielding largest function value found, %% fmax = function value at x, %% nf = number of function evaluations. %% The iteration is terminated when either %% - the relative increase in function value between successive %% iterations is <= STOPIT(1) (default 1e-3), %% - STOPIT(2) function evaluations have been performed %% (default inf, i.e., no limit), or %% - a function value equals or exceeds STOPIT(3) %% (default inf, i.e., no test on function values). %% Progress of the iteration is not shown if STOPIT(5) = 0 (default 1). %% If a non-empty fourth parameter string SAVIT is present, then %% `SAVE SAVIT x fmax nf' is executed after each inner iteration. %% By default, the search directions are the co-ordinate directions. %% The columns of a fifth parameter matrix P specify alternative search %% directions (P = EYE is the default). %% NB: x0 can be a matrix. In the output argument, in SAVIT saves, %% and in function calls, x has the same shape as x0. %% ADSMAX(fun, x0, STOPIT, SAVIT, P, P1, P2,...) allows additional %% arguments to be passed to fun, via feval(fun,x,P1,P2,...). %% Reference: %% N. J. Higham, Optimization by direct search in matrix computations, %% SIAM J. Matrix Anal. Appl, 14(2): 317-333, 1993. %% N. J. Higham, Accuracy and Stability of Numerical Algorithms, %% Second edition, Society for Industrial and Applied Mathematics, %% Philadelphia, PA, 2002; sec. 20.5. % From Matrix Toolbox % Copyright (C) 2002 N.J.Higham % www.maths.man.ac.uk/~higham/mctoolbox % Modifications for octave by A.Adler 2003 function [x, fmax, nf] = adsmax(f, x, stopit, savit, P, varargin) x0 = x(:); % Work with column vector internally. n = length(x0); mu = 1e-4; % Initial percentage change in components. nstep = 25; % Max number of times to double or decrease h. % Set up convergence parameters. if nargin < 3 stopit(1) = 1e-3; elseif isempty(stopit) stopit(1) = 1e-3; endif tol = stopit(1); % Required rel. increase in function value over one iteration. if length(stopit) == 1, stopit(2) = inf; end % Max no. of f-evaluations. if length(stopit) == 2, stopit(3) = inf; end % Default target for f-values. if length(stopit) < 5, stopit(5) = 1; end % Default: show progress. trace = stopit(5); if length(stopit) == 5, stopit(6) = 1; end % Default: maximize dirn= stopit(6); if nargin < 4, savit = []; end % File name for snapshots. % Matrix of search directions. if nargin < 5 P = eye(n); elseif isempty(P) P = eye(n); else if ~isequal(size(P),[n n]) % Check for common error. error('P must be of dimension the number of elements in x0.') end end fmax = dirn*feval(f,x,varargin{:}); nf = 1; if trace, fprintf('f(x0) = %9.4e\n', fmax), end steps = zeros(n,1); it = 0; y = x0; while 1 % Outer loop. it = it+1; if trace, fprintf('Iter %2.0f (nf = %2.0f)\n', it, nf), end fmax_old = fmax; for i=1:n % Loop over search directions. 
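      % NB: 'pi' below is a local variable holding the i-th search
      % direction; it shadows the built-in constant pi in this function.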
pi = P(:,i); flast = fmax; yi = y; h = sign(pi'*yi)*norm(pi.*yi)*mu; % Initial step size. if h == 0, h = max(norm(yi,inf),1)*mu; end y = yi + h*pi; x(:) = y; fnew = dirn*feval(f,x,varargin{:}); nf = nf + 1; if fnew > fmax fmax = fnew; if fmax >= stopit(3) if trace fprintf('Comp. = %2.0f, steps = %2.0f, f = %9.4e*\n', i,0,fmax) fprintf('Exceeded target...quitting\n') end x(:) = y; return end h = 2*h; lim = nstep; k = 1; else h = -h; lim = nstep+1; k = 0; end for j=1:lim y = yi + h*pi; x(:) = y; fnew = dirn*feval(f,x,varargin{:}); nf = nf + 1; if fnew <= fmax, break, end fmax = fnew; k = k + 1; if fmax >= stopit(3) if trace fprintf('Comp. = %2.0f, steps = %2.0f, f = %9.4e*\n', i,j,fmax) fprintf('Exceeded target...quitting\n') end x(:) = y; return end h = 2*h; end steps(i) = k; y = yi + 0.5*h*pi; if k == 0, y = yi; end if trace fprintf('Comp. = %2.0f, steps = %2.0f, f = %9.4e', i, k, fmax) fprintf(' (%2.1f%%)\n', 100*(fmax-flast)/(abs(flast)+eps)) end if nf >= stopit(2) if trace fprintf('Max no. of function evaluations exceeded...quitting\n') end x(:) = y; return end if (fmax > flast && ~isempty (savit)) x(:) = y; eval(['save ' savit ' x fmax nf']) end end % Loop over search directions. if isequal(steps,zeros(n,1)) if trace, fprintf('Stagnated...quitting\n'), end x(:) = y; return end if fmax-fmax_old <= tol*abs(fmax_old) if trace, fprintf('Function values ''converged''...quitting\n'), end x(:) = y; return end end %%%%%% Of outer loop. optim-1.4.1/inst/bfgsmin.m0000644000175000017500000001331612477523726014074 0ustar olafolaf## Copyright (C) 2006 Michael Creel ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## bfgsmin: bfgs or limited memory bfgs minimization of function ## ## Usage: [x, obj_value, convergence, iters] = bfgsmin(f, args, control) ## ## The function must be of the form ## [value, return_2,..., return_m] = f(arg_1, arg_2,..., arg_n) ## By default, minimization is w.r.t. arg_1, but it can be done ## w.r.t. any argument that is a vector. Numeric derivatives are ## used unless analytic derivatives are supplied. See bfgsmin_example.m ## for methods. ## ## Arguments: ## * f: name of function to minimize (string) ## * args: a cell array that holds all arguments of the function ## The argument with respect to which minimization is done ## MUST be a vector ## * control: an optional cell array of 1-8 elements. If a cell ## array shorter than 8 elements is provided, the trailing elements ## are provided with default values. ## * elem 1: maximum iterations (positive integer, or -1 or Inf for unlimited (default)) ## * elem 2: verbosity ## 0 = no screen output (default) ## 1 = only final results ## 2 = summary every iteration ## 3 = detailed information ## * elem 3: convergence criterion ## 1 = strict (function, gradient and param change) (default) ## 0 = weak - only function convergence required ## * elem 4: arg in f_args with respect to which minimization is done (default is first) ## * elem 5: (optional) Memory limit for lbfgs. 
If it's a positive integer ## then lbfgs will be used. Otherwise ordinary bfgs is used ## * elem 6: function change tolerance, default 1e-10 ## * elem 7: parameter change tolerance, default 1e-6 ## * elem 8: gradient tolerance, default 1e-5 ## ## Returns: ## * x: the minimizer ## * obj_value: the value of f() at x ## * convergence: 1 if normal conv, other values if not ## * iters: number of iterations performed ## ## Example: see bfgsmin_example.m function [parameter, obj, convergence, iters] = bfgsmin(f, f_args, control) # check number and types of arguments if ((nargin < 2) || (nargin > 3)) usage("bfgsmin: you must supply 2 or 3 arguments"); endif if (!ischar(f)) usage("bfgsmin: first argument must be a string holding the objective function name"); endif if (!iscell(f_args)) usage("bfgsmin: second argument must be a cell array of function arguments"); endif if (nargin > 2) if (!iscell(control)) usage("bfgsmin: 3rd argument must be a cell array of 1-8 elements"); endif if (length(control) > 8) usage("bfgsmin: 3rd argument must be a cell array of 1-8 elements"); endif else control = {}; endif # provide defaults for missing controls if (length(control) == 0) control{1} = -1; endif # limit on iterations if (length(control) == 1) control{2} = 0; endif # level of verbosity if (length(control) == 2) control{3} = 1; endif # strong (function, gradient and parameter change) convergence required? if (length(control) == 3) control{4} = 1; endif # argument with respect to which minimization is done if (length(control) == 4) control{5} = 0; endif # memory for lbfgs: 0 uses ordinary bfgs if (length(control) == 5) control{6} = 1e-10; endif # tolerance for function convergence if (length(control) == 6) control{7} = 1e-6; endif # tolerance for parameter convergence if (length(control) == 7) control{8} = 1e-5; endif # tolerance for gradient convergence # validity checks on all controls tmp = control{1}; if (((tmp != Inf) && (tmp != -1)) && (tmp > 0 && (mod(tmp,1) != 0))) usage("bfgsmin: 1st element of 3rd argument (iteration limit) must be Inf or positive integer"); endif tmp = control{2}; if ((tmp < 0) || (tmp > 3) || (mod(tmp,1) != 0)) usage("bfgsmin: 2nd element of 3rd argument (verbosity level) must be 0, 1, 2, or 3"); endif tmp = control{3}; if ((tmp != 0) && (tmp != 1)) usage("bfgsmin: 3rd element of 3rd argument (strong/weak convergence) must be 0 (weak) or 1 (strong)"); endif tmp = control{4}; if ((tmp < 1) || (tmp > length(f_args)) || (mod(tmp,1) != 0)) usage("bfgsmin: 4th element of 3rd argument (arg with respect to which minimization is done) must be an integer that indicates one of the elements of f_args"); endif tmp = control{5}; if ((tmp < 0) || (mod(tmp,1) != 0)) usage("bfgsmin: 5th element of 3rd argument (memory for lbfgs) must be zero (regular bfgs) or a positive integer"); endif tmp = control{6}; if (tmp < 0) usage("bfgsmin: 6th element of 3rd argument (tolerance for function convergence) must be a positive real number"); endif tmp = control{7}; if (tmp < 0) usage("bfgsmin: 7th element of 3rd argument (tolerance for parameter convergence) must be a positive real number"); endif tmp = control{8}; if (tmp < 0) usage("bfgsmin: 8th element of 3rd argument (tolerance for gradient convergence) must be a positive real number"); endif # check that the parameter we minimize w.r.t.
is a vector minarg = control{4}; theta = f_args{minarg}; theta = theta(:); if (!isvector(theta)) usage("bfgsmin: minimization must be done with respect to a vector of parameters"); endif f_args{minarg} = theta; # now go ahead and do the minimization [parameter, obj, convergence, iters] = __bfgsmin(f, f_args, control); endfunction optim-1.4.1/inst/de_min.m0000644000175000017500000004260512477523726013705 0ustar olafolaf## Copyright (C) 1996, 1997 R. Storn ## Copyright (C) 2009-2010 Christian Fischer ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## de_min: global optimisation using differential evolution ## ## Usage: [x, obj_value, nfeval, convergence] = de_min(fcn, control) ## ## minimization of a user-supplied function with respect to x(1:D), ## using the differential evolution (DE) method based on an algorithm ## by Rainer Storn (http://www.icsi.berkeley.edu/~storn/code.html) ## See: http://www.softcomputing.net/tevc2009_1.pdf ## ## ## Arguments: ## --------------- ## fcn string : Name of function. Must return a real value ## control vector : (Optional) Control variables, described below ## or struct ## ## Returned values: ## ---------------- ## x vector : parameter vector of best solution ## obj_value scalar : objective function value of best solution ## nfeval scalar : number of function evaluations ## convergence : 1 = best below value to reach (VTR) ## 0 = population has reached defined quality (tol) ## -1 = some values are close to constraints/boundaries ## -2 = max number of iterations reached (maxiter) ## -3 = max number of functions evaluations reached (maxnfe) ## ## Control variable: (optional) may be named arguments (i.e. "name",value ## ---------------- pairs), a struct, or a vector, where ## NaN's are ignored. ## ## XVmin : vector of lower bounds of initial population ## *** note: by default these are no constraints *** ## XVmax : vector of upper bounds of initial population ## constr : 1 -> enforce the bounds not just for the initial population ## const : data vector (remains fixed during the minimization) ## NP : number of population members ## F : difference factor from interval [0, 2] ## CR : crossover probability constant from interval [0, 1] ## strategy : 1 --> DE/best/1/exp 7 --> DE/best/1/bin ## 2 --> DE/rand/1/exp 8 --> DE/rand/1/bin ## 3 --> DE/target-to-best/1/exp 9 --> DE/target-to-best/1/bin ## 4 --> DE/best/2/exp 10--> DE/best/2/bin ## 5 --> DE/rand/2/exp 11--> DE/rand/2/bin ## 6 --> DEGL/SAW/exp else DEGL/SAW/bin ## refresh : intermediate output will be produced after "refresh" ## iterations. No intermediate output will be produced ## if refresh is < 1 ## VTR : Stopping criterion: "Value To Reach" ## de_min will stop when obj_value <= VTR. ## Use this if you know which value you expect. ## tol : Stopping criterion: "tolerance" ## stops if (best-worst)/max(1,worst) < tol ## This stops basically if the whole population is "good". 
## maxnfe : maximum number of function evaluations ## maxiter : maximum number of iterations (generations) ## ## The algorithm seems to work well only if [XVmin,XVmax] covers the ## region where the global minimum is expected. ## DE is also somewhat sensitive to the choice of the ## difference factor F. A good initial guess is to choose F from ## interval [0.5, 1], e.g. 0.8. ## CR, the crossover probability constant from interval [0, 1] ## helps to maintain the diversity of the population and is ## rather uncritical but affects strongly the convergence speed. ## If the parameters are correlated, high values of CR work better. ## The reverse is true for no correlation. ## Experiments suggest that /bin likes to have a slightly ## larger CR than /exp. ## The number of population members NP is also not very critical. A ## good initial guess is 10*D. Depending on the difficulty of the ## problem NP can be lower than 10*D or must be higher than 10*D ## to achieve convergence. ## ## Default Values: ## --------------- ## XVmin = [-2]; ## XVmax = [ 2]; ## constr= 0; ## const = []; ## NP = 10 *D ## F = 0.8; ## CR = 0.9; ## strategy = 12; ## refresh = 0; ## VTR = -Inf; ## tol = 1.e-3; ## maxnfe = 1e6; ## maxiter = 1000; ## ## ## Example to find the minimum of the Rosenbrock saddle: ## ---------------------------------------------------- ## Define f as: ## function result = f(x); ## result = 100 * (x(2) - x(1)^2)^2 + (1 - x(1))^2; ## end ## Then type: ## ## ctl.XVmin = [-2 -2]; ## ctl.XVmax = [ 2 2]; ## [x, obj_value, nfeval, convergence] = de_min (@f, ctl); ## ## Keywords: global-optimisation optimisation minimisation function [bestmem, bestval, nfeval, convergence] = de_min(fcn, varargin) ## Default options XVmin = [-2 ]; XVmax = [ 2 ]; constr= 0; const = []; NP = 0; # NP will be set later F = 0.8; CR = 0.9; strategy = 12; refresh = 0; VTR = -Inf; tol = 1.e-3; maxnfe = 1e6; maxiter = 1000; ## ------------ Check input variables (ctl) -------------------------------- if nargin >= 2, # Read control arguments va_arg_cnt = 1; if nargin > 2, ctl = struct (varargin{:}); else ctl = varargin{va_arg_cnt++}; end if isnumeric (ctl) if length (ctl)>=1 && !isnan (ctl(1)), XVmin = ctl(1); end if length (ctl)>=2 && !isnan (ctl(2)), XVmax = ctl(2); end if length (ctl)>=3 && !isnan (ctl(3)), constr = ctl(3); end if length (ctl)>=4 && !isnan (ctl(4)), const = ctl(4); end if length (ctl)>=5 && !isnan (ctl(5)), NP = ctl(5); end if length (ctl)>=6 && !isnan (ctl(6)), F = ctl(6); end if length (ctl)>=7 && !isnan (ctl(7)), CR = ctl(7); end if length (ctl)>=8 && !isnan (ctl(8)), strategy = ctl(8); end if length (ctl)>=9 && !isnan (ctl(9)), refresh = ctl(9); end if length (ctl)>=10&& !isnan (ctl(10)), VTR = ctl(10); end if length (ctl)>=11&& !isnan (ctl(11)), tol = ctl(11); end if length (ctl)>=12&& !isnan (ctl(12)), maxnfe = ctl(12); end if length (ctl)>=13&& !isnan (ctl(13)), maxiter = ctl(13); end else if isfield (ctl,"XVmin") && !isnan (ctl.XVmin), XVmin=ctl.XVmin; end if isfield (ctl,"XVmax") && !isnan (ctl.XVmax), XVmax=ctl.XVmax; end if isfield (ctl,"constr")&& !isnan (ctl.constr), constr=ctl.constr; end if isfield (ctl,"const") && !isnan (ctl.const), const=ctl.const; end if isfield (ctl, "NP" ) && ! isnan (ctl.NP ), NP = ctl.NP ; end if isfield (ctl, "F" ) && ! isnan (ctl.F ), F = ctl.F ; end if isfield (ctl, "CR" ) && ! isnan (ctl.CR ), CR = ctl.CR ; end if isfield (ctl, "strategy") && ! isnan (ctl.strategy), strategy = ctl.strategy ; end if isfield (ctl, "refresh") && ! 
isnan (ctl.refresh), refresh = ctl.refresh ; end if isfield (ctl, "VTR") && ! isnan (ctl.VTR ), VTR = ctl.VTR ; end if isfield (ctl, "tol") && ! isnan (ctl.tol ), tol = ctl.tol ; end if isfield (ctl, "maxnfe") && ! isnan (ctl.maxnfe) maxnfe = ctl.maxnfe; end if isfield (ctl, "maxiter") && ! isnan (ctl.maxiter) maxiter = ctl.maxiter; end end end ## set dimension D and population size NP D = length (XVmin); if (NP == 0); NP = 10 * D; end ## -------- do a few sanity checks -------------------------------- if (length (XVmin) != length (XVmax)) error("Length of upper and lower bounds does not match.") end if (NP < 5) error("Population size NP must be bigger than 5.") end if ((F <= 0) || (F > 2)) error("Difference Factor F out of range (0,2].") end if ((CR < 0) || (CR > 1)) error("CR value out of range [0,1].") end if (maxiter <= 0) error("maxiter must be positive.") end if (maxnfe <= 0) error("maxnfe must be positive.") end refresh = floor(abs(refresh)); ## ----- Initialize population and some arrays -------------------------- pop = zeros(NP,D); # initialize pop ## pop is a matrix of size NPxD. It will be initialized with ## random values between the min and max values of the parameters for i = 1:NP pop(i,:) = XVmin + rand (1,D) .* (XVmax - XVmin); end ## initialise the weighting factors between 0.0 and 1.0 w = rand (NP,1); wi = w; popold = zeros (size (pop)); # toggle population val = zeros (1, NP); # create and reset the "cost array" bestmem = zeros (1, D); # best population member ever bestmemit = zeros (1, D); # best population member in iteration nfeval = 0; # number of function evaluations ## ------ Evaluate the best member after initialization ------------------ ibest = 1; # start with first population member val(1) = feval (fcn, [pop(ibest,:) const]); bestval = val(1); # best objective function value so far bestw = w(1); # weighting of best design so far for i = 2:NP # check the remaining members val(i) = feval (fcn, [pop(i,:) const]); if (val(i) < bestval) # if member is better ibest = i; # save its location bestval = val(i); bestw = w(i); end end nfeval = nfeval + NP; bestmemit = pop(ibest,:); # best member of current iteration bestvalit = bestval; # best value of current iteration bestmem = bestmemit; # best member ever ## ------ DE - Minimization --------------------------------------- ## popold is the population which has to compete. It is static ## through one iteration. pop is the newly emerging population. bm_n= zeros (NP, D); # initialize bestmember matrix in neighbourh. lpm1= zeros (NP, D); # initialize local population matrix 1 lpm2= zeros (NP, D); # initialize local population matrix 2 rot = 0:1:NP-1; # rotating index array (size NP) rotd= 0:1:D-1; # rotating index array (size D) iter = 1; while ((iter < maxiter) && (nfeval < maxnfe) && (bestval > VTR) && ...
((abs (max (val) - bestval) / max (1, abs (max (val))) > tol))) popold = pop; # save the old population wold = w; # save the old weighting factors ind = randperm (4); # index pointer array a1 = randperm (NP); # shuffle locations of vectors rt = rem (rot + ind(1), NP); # rotate indices by ind(1) positions a2 = a1(rt+1); # rotate vector locations rt = rem (rot + ind(2), NP); a3 = a2(rt+1); rt = rem (rot +ind(3), NP); a4 = a3(rt+1); rt = rem (rot + ind(4), NP); a5 = a4(rt+1); pm1 = popold(a1,:); # shuffled population 1 pm2 = popold(a2,:); # shuffled population 2 pm3 = popold(a3,:); # shuffled population 3 w1 = wold(a4); # shuffled weightings 1 w2 = wold(a5); # shuffled weightings 2 bm = repmat (bestmemit, NP, 1); # population filled with the best member # of the last iteration bw = repmat (bestw, NP, 1); # the same for the weighting of the best mui = rand (NP, D) < CR; # mask for intermediate population # all random numbers < CR are 1, 0 otherwise if (strategy > 6) st = strategy - 6; # binomial crossover else st = strategy; # exponential crossover mui = sort (mui'); # transpose, collect 1's in each column for i = 1:NP n = floor (rand * D); if (n > 0) rtd = rem (rotd + n, D); mui(:,i) = mui(rtd+1,i); #rotate column i by n endif endfor mui = mui'; # transpose back endif mpo = mui < 0.5; # inverse mask to mui if (st == 1) # DE/best/1 ui = bm + F*(pm1 - pm2); # differential variation elseif (st == 2) # DE/rand/1 ui = pm3 + F*(pm1 - pm2); # differential variation elseif (st == 3) # DE/target-to-best/1 ui = popold + F*(bm-popold) + F*(pm1 - pm2); elseif (st == 4) # DE/best/2 pm4 = popold(a4,:); # shuffled population 4 pm5 = popold(a5,:); # shuffled population 5 ui = bm + F*(pm1 - pm2 + pm3 - pm4); # differential variation elseif (st == 5) # DE/rand/2 pm4 = popold(a4,:); # shuffled population 4 pm5 = popold(a5,:); # shuffled population 5 ui = pm5 + F*(pm1 - pm2 + pm3 - pm4); # differential variation else # DEGL/SAW ## The DEGL/SAW method is more complicated. ## We have to generate a neighbourhood first. ## The neighbourhood size is 10% of NP and at least 1. nr = max (1, ceil ((0.1*NP -1)/2)); # neighbourhood radius ## FIXME: I don't know how to vectorise this. - if possible for i = 1:NP neigh_ind = i-nr:i+nr; # index range of neighbourhood neigh_ind = neigh_ind + ((neigh_ind <= 0)-(neigh_ind > NP))*NP; # do wrap around [x, ix] = min (val(neigh_ind)); # find the local best and its index bm_n(i,:) = popold(neigh_ind(ix),:); # copy the data from the local best neigh_ind(nr+1) = []; # remove "i" pq = neigh_ind(randperm (length (neigh_ind))); # permutation of the remaining ind. 
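# pq is a random permutation of the neighbourhood indices with the
# current member i removed, so pq(1) and pq(2) below select two
# distinct random neighbours (the points p and q of the DEGL/SAW
# donor).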
lpm1(i,:) = popold(pq(1),:); # create the local pop member matrix lpm2(i,:) = popold(pq(2),:); # for the random point p,q endfor ## calculate the new weihting factors wi = wold + F*(bw - wold) + F*(w1 - w2); # use DE/target-to-best/1/nocross # for optimisation of weightings ## fix bounds for weightings o = ones (NP, 1); wi = sort ([0.05*o, wi, 0.95*o],2)(:,2); # sort and take the second column ## fill weighting matrix wm = repmat (wi, 1, D); li = popold + F*(bm_n- popold) + F*(lpm1 - lpm2); gi = popold + F*(bm - popold) + F*(pm1 - pm2); ui = wm.*gi + (1-wm).*li; # combine global and local part endif ## crossover ui = popold.*mpo + ui.*mui; ## enforce initial bounds/constraints if specified if (constr == 1) for i = 1:NP ui(i,:) = max (ui(i,:), XVmin); ui(i,:) = min (ui(i,:), XVmax); end end ## ----- Select which vectors are allowed to enter the new population ------ for i = 1:NP tempval = feval (fcn, [ui(i,:) const]); # check cost of competitor if (tempval <= val(i)) # if competitor is better pop(i,:) = ui(i,:); # replace old vector with new one val(i) = tempval; # save value in "cost array" w(i) = wi(i); # save the weighting factor ## we update bestval only in case of success to save time if (tempval <= bestval) # if competitor better than the best one ever bestval = tempval; # new best value bestmem = ui(i,:); # new best parameter vector ever bestw = wi(i); # save best weighting end end endfor #---end for i = 1:NP nfeval = nfeval + NP; # increase number of function evaluations bestmemit = bestmem; # freeze the best member of this iteration for the # coming iteration. This is needed for some of the # strategies. ## ---- Output section ---------------------------------------------------- if (refresh > 0) if (rem (iter, refresh) == 0) printf ('Iteration: %d, Best: %8.4e, Worst: %8.4e\n', ... iter, bestval, max(val)); for n = 1:D printf ('x(%d) = %e\n', n, bestmem(n)); end end end iter = iter + 1; endwhile #---end while ((iter < maxiter) ... ## check that all variables are well within bounds/constraints boundsOK = 1; for i = 1:NP range = XVmax - XVmin; if (ui(i,:) < XVmin + 0.01*range) boundsOK = 0; end if (ui(i,:) > XVmax - 0.01*range) boundsOK = 0; end end ## create the convergence result if (bestval <= VTR) convergence = 1; elseif (abs (max (val) - bestval) / max (1, abs (max (val))) <= tol) convergence = 0; elseif (boundsOK == 0) convergence = -1; elseif (iter >= maxiter) convergence = -2; elseif (nfeval >= maxnfe) convergence = -3; end endfunction %!function result = f(x); %! result = 100 * (x(2) - x(1)^2)^2 + (1 - x(1))^2; %!test %! tol = 1.0e-4; %! ctl.tol = 0.0; %! ctl.VTR = 1.0e-6; %! ctl.XVmin = [-2 -2]; %! ctl.XVmax = [ 2 2]; %! rand("state", 11) %! [x, obj_value, nfeval, convergence] = de_min (@f, ctl); %! assert (convergence == 1); %! assert (f(x) == obj_value); %! assert (obj_value < ctl.VTR); %!demo %! ## define a simple example function %! f = @(x) peaks(x(1), x(2)); %! ## plot the function to see where the minimum might be %! peaks() %! ## first we set the region where we expect the minimum %! ctl.XVmin = [-3 -3]; %! ctl.XVmax = [ 3 3]; %! ## and solve it with de_min %! 
[x, obj_value, nfeval, convergence] = de_min (f, ctl) optim-1.4.1/inst/line_min.m0000644000175000017500000000557612477523726014252 0ustar olafolaf## Copyright (C) 2000 Ben Sapp ## Copyright (C) 2002 Etienne Grossmann ## Copyright (C) 2011 Nir Krakauer nkrakauer@ccny.cuny.edu ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## [a,fx,nev] = line_min (f, dx, args, narg, h, nev_max) - Minimize f() along dx ## ## INPUT ---------- ## f : string : Name of minimized function ## dx : matrix : Direction along which f() is minimized ## args : cell : Arguments of f ## narg : integer : Position of minimized variable in args. Default=1 ## h : scalar : Step size to use for centered finite difference ## approximation of first and second derivatives. Default=1E-3. ## nev_max : integer : Maximum number of function evaluations. Default=30 ## ## OUTPUT --------- ## a : scalar : Value for which f(x+a*dx) is a minimum (*) ## fx : scalar : Value of f(x+a*dx) at minimum (*) ## nev : integer : Number of function evaluations ## ## (*) The notation f(x+a*dx) assumes that args == {x}. ## ## Reference: David G Luenberger's Linear and Nonlinear Programming function [a,fx,nev] = line_min (f, dx, args, narg, h, nev_max) velocity = 1; acceleration = 1; if (nargin < 4) narg = 1; endif if (nargin < 5) h = 0.001; endif if (nargin < 6) nev_max = 30; endif nev = 0; x = args{narg}; a = 0; min_velocity_change = 0.000001; while (abs (velocity) > min_velocity_change && nev < nev_max) fx = feval (f,args{1:narg-1}, x+a*dx, args{narg+1:end}); fxph = feval (f,args{1:narg-1}, x+(a+h)*dx, args{narg+1:end}); fxmh = feval (f,args{1:narg-1}, x+(a-h)*dx, args{narg+1:end}); if (nev == 0) fx0 = fx; endif velocity = (fxph - fxmh)/(2*h); acceleration = (fxph - 2*fx + fxmh)/(h^2); if abs(acceleration) <= eps, acceleration = 1; end # Don't do div by zero # Use abs(accel) to avoid problems due to # concave function a = a - velocity/abs(acceleration); nev += 3; endwhile fx = feval (f, args{1:narg-1}, x+a*dx, args{narg+1:end}); nev++; if fx >= fx0 # if no improvement, return the starting value a = 0; fx = fx0; endif if (nev >= nev_max) disp ("line_min: maximum number of function evaluations reached") endif endfunction ## Rem : Although not clear from the code, the returned a always seems to ## correspond to (nearly) optimal fx. optim-1.4.1/inst/jacobs.m0000644000175000017500000001545512477523726013716 0ustar olafolaf## Copyright (C) 2011 Fotios Kasolis ## Copyright (C) 2013 Olaf Till ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. 
## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## -*- texinfo -*- ## @deftypefn {Function File} {Df =} jacobs (@var{x}, @var{f}) ## @deftypefnx {Function File} {Df =} jacobs (@var{x}, @var{f}, @var{hook}) ## Calculate the jacobian of a function using the complex step method. ## ## Let @var{f} be a user-supplied function. Given a point @var{x} at ## which we seek for the Jacobian, the function @command{jacobs} returns ## the Jacobian matrix @code{d(f(1), @dots{}, df(end))/d(x(1), @dots{}, ## x(n))}. The function uses the complex step method and thus can be ## applied to real analytic functions. ## ## The optional argument @var{hook} is a structure with additional options. @var{hook} ## can have the following fields: ## @itemize @bullet ## @item ## @code{h} - can be used to define the magnitude of the complex step and defaults ## to 1e-20; steps larger than 1e-3 are not allowed. ## @item ## @code{fixed} - is a logical vector internally usable by some optimization ## functions; it indicates for which elements of @var{x} no gradient should be ## computed, but zero should be returned. ## @end itemize ## ## For example: ## ## @example ## @group ## f = @@(x) [x(1)^2 + x(2); x(2)*exp(x(1))]; ## Df = jacobs ([1, 2], f) ## @end group ## @end example ## @end deftypefn function Df = jacobs (x, f, hook) if ( (nargin < 2) || (nargin > 3) ) print_usage (); endif if (ischar (f)) f = str2func (f, "global"); endif n = numel (x); default_h = 1e-20; max_h = 1e-3; # must be positive if (nargin > 2) if (isfield (hook, "h")) if (! (isscalar (hook.h))) error ("complex step magnitude must be a scalar"); endif if (abs (hook.h) > max_h) warning ("complex step magnitude larger than allowed, set to %e", ... max_h) h = max_h; else h = hook.h; endif else h = default_h; endif if (isfield (hook, "fixed")) if (numel (hook.fixed) != n) error ("index of fixed parameters has wrong dimensions"); endif fixed = hook.fixed(:); else fixed = false (n, 1); endif if (isfield (hook, 'parallel_local')) parallel_local = hook.parallel_local; else parallel_local = false; end else h = default_h; fixed = false (n, 1); parallel_local = false; endif if (all (fixed)) error ("all elements of 'x' are fixed"); endif x = repmat (x(:), 1, n) + h * 1i * eye (n); idx = find (! fixed).'; if (parallel_local) ## symplicistic approach, fork for each computation and leave all ## scheduling to kernel; otherwise arguments would have to be passed ## over pipes, not sure whether this would be faster n_childs = sum (! fixed); child_data = zeros (n_childs, 4); # pipe descriptor for reading, # pid, line number, parameter number child_data(:, 3) = (1 : n_childs).'; active_childs = true (n_childs, 1); unwind_protect ready = false; lerrm = lasterr (); lasterr (""); cid = 0; for count = idx cid++; child_data(cid, 4) = count; [pd1, pd2, err, msg] = pipe (); if (err) error ("could not create pipe: %s", msg); endif child_data(cid, 1) = pd1; if ((pid = fork ()) == 0) ## child pclose (pd1); unwind_protect tp = imag (f (x(:, count))(:) / h); __bw_psend__ (pd2, tp); unwind_protect_cleanup pclose (pd2); __internal_exit__ (); end_unwind_protect ## end child elseif (pid > 0) child_data(cid, 2) = pid; pclose (pd2); else error ("could not fork"); endif endfor first = true; while (any (active_childs)) [~, act] = select (child_data(active_childs, 1), [], [], -1); act_idx = child_data(active_childs, 3)(act); for id = act_idx.' 
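## read one result from each child whose pipe has data ready; the
## first column received determines the output dimension of 'f' and
## hence the number of rows of Df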
res = __bw_prcv__ (child_data(id, 1)); if (isnumeric (res)) error ("child closed pipe without sending"); endif res = res.psend_var; pclose (child_data(id, 1)); child_data(id, 1) = 0; waitpid (child_data(id, 2)); child_data(id, 2) = 0; active_childs(id) = false; if (first) dim = numel (res); Df = zeros (dim, n); first = false; endif Df(:, child_data(id, 4)) = res; endfor endwhile ready = true; # try/catch would not handle ctrl-c unwind_protect_cleanup if (! ready) for (id = 1 : n_childs) if (child_data(id, 1)) pclose (child_data(id, 1)); endif if (child_data(id, 2)) kill (child_data(id, 2), 9); waitpid (child_data(id, 2)); endif endfor nerrm = lasterr (); error ("no success, last error message: %s", nerrm); endif lasterr (lerrm); end_unwind_protect else # not parallel ## after first evaluation, dimensionness of 'f' is known t_Df = imag (f (x(:, idx(1)))(:)); dim = numel (t_Df); Df = zeros (dim, n); Df(:, idx(1)) = t_Df; for count = idx(2:end) Df(:, count) = imag (f (x(:, count))(:)); endfor Df /= h; endif endfunction %!assert (jacobs (1, @(x) x), 1) %!assert (jacobs (6, @(x) x^2), 12) %!assert (jacobs ([1; 1], @(x) [x(1)^2; x(1)*x(2)]), [2, 0; 1, 1]) %!assert (jacobs ([1; 2], @(x) [x(1)^2 + x(2); x(2)*exp(x(1))]), [2, 1; 2*exp(1), exp(1)]) %% Test input validation %!error jacobs () %!error jacobs (1) %!error jacobs (1, 2, 3, 4) %!error jacobs (@sin, 1, [1, 1]) %!error jacobs (@sin, 1, ones(2, 2)) %!demo %! # Relative error against several h-values %! k = 3:20; h = 10 .^ (-k); x = 0.3*pi; %! err = zeros (1, numel (k)); %! for count = 1 : numel (k) %! err(count) = abs (jacobs (x, @sin, struct ("h", h(count))) - cos (x)) / abs (cos (x)) + eps; %! endfor %! loglog (h, err); grid minor; %! xlabel ("h"); ylabel ("|Df(x) - cos(x)| / |cos(x)|") %! title ("f(x)=sin(x), f'(x)=cos(x) at x = 0.3pi") optim-1.4.1/inst/d2_min.m0000644000175000017500000003024012477523726013612 0ustar olafolaf## Copyright (C) 2002 Etienne Grossmann ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## [x,v,nev,h,args] = d2_min(f,d2f,args,ctl,code) - Newton-like minimization ## ## Minimize f(x) using 1st and 2nd derivatives. Any function w/ second ## derivatives can be minimized, as in Newton. f(x) decreases at each ## iteration, as in Levenberg-Marquardt. This function is inspired from the ## Levenberg-Marquardt algorithm found in the book "Numerical Recipes". ## ## ARGUMENTS : ## f : string : Cost function's name ## ## d2f : string : Name of function returning the cost (1x1), its ## differential (1xN) and its second differential or it's ## pseudo-inverse (NxN) (see ctl(5) below) : ## ## [v,dv,d2v] = d2f (x). ## ## args : list : f and d2f's arguments. By default, minimize the 1st ## or matrix : argument. ## ## ctl : vector : Control arguments (see below) ## or struct ## ## code : string : code will be evaluated after each outer loop that ## produced some (any) improvement. 
Variables visible from ## "code" include "x", the best parameter found, "v" the ## best value and "args", the list of all arguments. All can ## be modified. This option can be used to re-parameterize ## the argument space during optimization ## ## CONTROL VARIABLE ctl : (optional). May be a struct or a vector of length ## ---------------------- 5 or less where NaNs are ignored. Default values ## are written . ## FIELD VECTOR ## NAME POS ## ## ftol, f N/A : Stop search when value doesn't improve, as tested by ## ## f > Deltaf/max(|f(x)|,1) ## ## where Deltaf is the decrease in f observed in the last ## iteration. <10*sqrt(eps)> ## ## utol, u N/A : Stop search when updates are small, as tested by ## ## u > max { dx(i)/max(|x(i)|,1) | i in 1..N } ## ## where dx is the change in the x that occured in the last ## iteration. ## ## dtol, d N/A : Stop search when derivative is small, as tested by ## ## d > norm (dv) ## ## crit, c ctl(1) : Set one stopping criterion, 'ftol' (c=1), 'utol' (c=2) ## or 'dtol' (c=3) to the value of by the 'tol' option. <1> ## ## tol, t ctl(2) : Threshold in termination test chosen by 'crit' <10*eps> ## ## narg, n ctl(3) : Position of the minimized argument in args <1> ## maxev,m ctl(4) : Maximum number of function evaluations ## maxout,m : Maximum number of outer loops ## id2f, i ctl(5) : 0 if d2f returns the 2nd derivatives, 1 if <0> ## it returns its pseudo-inverse. ## ## verbose, v N/A : Be more or less verbose (quiet=0) <0> function [xbest,vbest,nev,hbest,args] = d2_min (f,d2f,args,ctl,code) persistent warned = false; if (! warned) warned = true; warning ("Octave:deprecated-function", "The standalone function `d2_min' has been deprecated, and will be removed in the future. The algorithm can now be used with the frontend `nonlin_min'."); endif maxout = inf; maxinner = 30 ; tcoeff = 0.5 ; # Discount on total weight ncoeff = 0.5 ; # Discount on weight of newton ocoeff = 1.5 ; # Factor for outwards searching report = 0 ; # Never report verbose = 0 ; # Be quiet prudent = 1 ; # Check coherence of d2f and f? niter = 0 ; crit = 0; # Default control variables ftol = 10 * sqrt (eps); dtol = eps; utol = tol = nan; narg = 1; maxev = inf; id2f = 0; if nargin >= 4 # Read arguments if isnumeric (ctl) if length (ctl)>=1 && !isnan (ctl(1)), crit = ctl(1); end if length (ctl)>=2 && !isnan (ctl(2)), tol = ctl(2); end if length (ctl)>=3 && !isnan (ctl(3)), narg = ctl(3); end if length (ctl)>=4 && !isnan (ctl(4)), maxev = ctl(4); end if length (ctl)>=5 && !isnan (ctl(5)), id2f = ctl(5); end elseif isstruct (ctl) if isfield (ctl, "crit") , crit = ctl.crit ; end if isfield (ctl, "tol") , tol = ctl.tol ; end if isfield (ctl, "narg") , narg = ctl.narg ; end if isfield (ctl, "maxev") , maxev = ctl.maxev ; end if isfield (ctl, "maxout") , maxout = ctl.maxout ; end if isfield (ctl, "id2f") , id2f = ctl.id2f ; end if isfield (ctl, "verbose"), verbose = ctl.verbose; end if isfield (ctl, "code") , code = ctl.code ; end else error ("The 'ctl' argument should be either a vector or a struct"); end end if crit == 1, ftol = tol; elseif crit == 2, utol = tol; elseif crit == 3, dtol = tol; elseif crit, error ("crit is %i. 
Should be 1,2 or 3.\n"); end if nargin < 5, code = "" ; end if iscell (args) # List of arguments x = args{narg}; else # Single argument x = args; args = {args}; end ############################## Checking ############################## if narg > length (args) error ("d2_min : narg==%i, length (args)==%i\n", narg, length (args)); end if tol <= 0 printf ("d2_min : tol=%8.3g <= 0\n",tol) ; end if !ischar (d2f) || !ischar (f) printf ("d2_min : f and d2f should be strings!\n"); end sz = size (x); N = prod (sz); v = feval (f, args{:}); nev = [1,0]; if prudent && (! isnumeric (v) || isnan (v) || any (size (v)>1)) error ("Function '%s' returns inadequate output", f); end xbest = x = x(:); vold = vbest = nan ; # Values of f hbest = nan ; # Inv. Hessian if verbose printf ( "d2_min : Initially, v=%8.3g\n",v); end while niter <= maxout niter += 1; if nev(1) > maxev, break; end; [v,d,h] = feval (d2f, args{1:narg-1},reshape(x,sz),args{narg+1:end}); nev(2)++; if prudent && niter <= 1 && ... (! isnumeric (v) || isnan (v) || any (size (v)>1) || ... ! isnumeric (d) || length (d(:)) != N || ... ! isnumeric (h) || any (size (h) != N)) error ("Function '%s' returns inadequate output", d2f); end if ! id2f, h = pinv (h); end d = d(:); if prudent v2 = feval (f, args{1:narg-1},reshape(x,sz),args{narg+1:end}); nev(1)++; if abs(v2-v) > 0.001 * sqrt(eps) * max (abs(v2), 1) printf ("d2_min : f and d2f disagree %8.3g\n",abs(v2-v)); end end xbest = x ; if ! isnan (vbest) # Check that v ==vbest if abs (vbest - v) > 1000*eps * max (vbest, 1) printf ("d2_min : vbest changed at beginning of outer loop\n"); end end vold = vbest = v ; hbest = h ; if length (code), abest = args; end # Eventually stash all args if verbose || (report && rem(niter,max(report,1)) == 1) printf ("d2_min : niter=%d, v=%8.3g\n",niter,v ); end if norm (d) < dtol # Check for small derivative if verbose || report printf ("d2_min : Exiting because of low gradient\n"); end break; # Exit outer loop end dnewton = -h*d ; # Newton step # Heuristic for negative hessian if dnewton'*d > 0, dnewton = -100*d; end wn = 1 ; # Weight of Newton step wt = 1 ; # Total weight ninner = 0; done_inner = 0; # 0=not found. 1=Ready to quit inner. # ########################################## while ninner < maxinner, # Inner loop ############################### ninner += 1; # Proposed step dx = wt*(wn*dnewton - (1-wn)*d) ; xnew = x+dx ; if verbose printf (["Weight : total=%8.3g, newtons's=%8.3g vbest=%8.3g ",... "Norm:Newton=%8.3g, deriv=%8.3g\n"],... wt,wn,vbest,norm(wt*wn*dnewton),norm(wt*(1-wn)*d)); end if any(isnan(xnew)) printf ("d2_min : Whoa!! any(isnan(xnew)) (1)\n"); end vnew = feval (f, args{1:narg-1},reshape(xnew,sz),args{narg+1:end}); nev(1)++; if vnew= maxinner # There was a problem if verbose printf ( "d2_min : Too many inner loops (vnew=%8.3g)\n",vnew); end # ########################################## else # Look for improvement along dbest wn = ocoeff ; xnew = x+wn*dbest; if any(isnan(xnew)), printf ("d2_min : Whoa!! 
any(isnan(xnew)) (2)\n"); end vnew = feval (f, args{1:narg-1},reshape(xnew,sz),args{narg+1:end}); nev(1)++; while vnew < vbest, vbest = vnew; # Stash best values wbest = wn; xbest = xnew; wn = wn*ocoeff ; xnew = x+wn*dbest; vnew = feval (f, args{1:narg-1},reshape(xnew,sz),args{narg+1:length(args)}); if verbose printf ( "Looking farther : v = %8.3g\n",vnew); end nev(1)++; end end # End of improving along dbest # ########################################## if verbose || rem(niter,max(report,1)) == 1 if vold, if verbose printf ("d2_min : Inner loop : vbest=%8.5g, vbest/vold=%8.5g\n",... vbest,vbest/vold); end else if verbose printf ( "d2_min : Inner loop : vbest=%8.5g, vold=0\n", vbest); end end end if vbest < vold ## "improvement found" if prudent tmpv = feval (f, args{1:narg-1},reshape(xbest,sz),args{2:end}); nev(1)++; if abs (tmpv-vbest) > eps printf ("d2_min : Whoa! Value at xbest changed by %g\n",... abs(tmpv-vbest)); end end v = vbest; x = xbest; if ! isempty (code) if verbose printf ("d2_min : Going to eval (\"%s\")\n",code); end xstash = xbest; astash = abest; args = abest; # Here : added 2001/11/07. Is that right? x = xbest; eval (code, "printf (\"code fails\\n\");"); xbest = x; abest = args; # Check whether eval (code) changes value if prudent tmpv = feval (f, args{1:narg-1},reshape(x,sz),args{2:end}); nev(1)++; if abs (tmpv-vbest) > max (min (100*eps,0.00001*abs(vbest)), eps) , printf ("d2_min : Whoa! Value changes by %g after eval (code)\n",... abs (tmpv-vbest)); end end end end if ! isnan (ftol) && ftol > (vold-vbest)/max(vold,1), if verbose || report printf ("d2_min : Quitting, niter=%-3d v=%8.3g, ",niter,v); if vold, printf ("v/vold=%8.3g \n",v/vold); else printf ("vold =0 \n",v); end end break ; # out of outer loop end if ! isnan (utol) && utol > max (abs (wbest*dbest)) / max(abs (xbest),1) if verbose || report printf ("d2_min : Quitting, niter=%-3d v=%8.3g, ",niter,v); if vold, printf ("v/vold=%8.3g \n",v/vold); else printf ("vold =0 \n",v); end end break ; # out of outer loop end end # End of outer loop ################## xbest = reshape (xbest, sz); if length (code) args = abest; args(narg) = xbest; end if niter > maxout if verbose printf ( "d2_min : Outer loop lasts forever\n"); end end # One last check if prudent err = feval (f, args{1:narg-1},reshape(xbest,sz),args{2:end}); nev(1)++; if abs (err-vbest) > eps, printf ("d2_min : Whoa!! xbest does not eval to vbest\n"); printf (" : %8.3e - %8.3e = %8.3e != 0\n",err,vbest,err-vbest); end end optim-1.4.1/inst/vfzero.m0000644000175000017500000003033612477523726013763 0ustar olafolaf## Copyright (C) 2008, 2009 VZLU Prague, a.s. ## Copyright (C) 2010 Olaf Till ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . 
## -*- texinfo -*- ## @deftypefn {Function File} {} vfzero (@var{fun}, @var{x0}) ## @deftypefnx {Function File} {} vfzero (@var{fun}, @var{x0}, @var{options}) ## @deftypefnx {Function File} {[@var{x}, @var{fval}, @var{info}, @var{output}] =} vfzero (@dots{}) ## A variant of @code{fzero}. Finds a zero of a vector-valued ## multivariate function where each output element only depends on the ## input element with the same index (so the Jacobian is diagonal). ## ## @var{fun} should be a handle or name of a function returning a column ## vector. @var{x0} should be a two-column matrix, each row specifying ## two points which bracket a zero of the respective output element of ## @var{fun}. ## ## If @var{x0} is a single-column matrix then several nearby and distant ## values are probed in an attempt to obtain a valid bracketing. If ## this is not successful, the function fails. @var{options} is a ## structure specifying additional options. Currently, @code{vfzero} ## recognizes these options: @code{"FunValCheck"}, @code{"OutputFcn"}, ## @code{"TolX"}, @code{"MaxIter"}, @code{"MaxFunEvals"}. For a ## description of these options, see optimset. ## ## On exit, the function returns @var{x}, the approximate zero and ## @var{fval}, the function value thereof. @var{info} is a column vector ## of exit flags that can have these values: ## ## @itemize ## @item 1 The algorithm converged to a solution. ## ## @item 0 Maximum number of iterations or function evaluations has been ## reached. ## ## @item -1 The algorithm has been terminated from user output function. ## ## @item -5 The algorithm may have converged to a singular point. ## @end itemize ## ## @var{output} is a structure containing runtime information about the ## @code{fzero} algorithm. Fields in the structure are: ## ## @itemize ## @item iterations Number of iterations through loop. ## ## @item nfev Number of function evaluations. ## ## @item bracketx A two-column matrix with the final bracketing of the ## zero along the x-axis. ## ## @item brackety A two-column matrix with the final bracketing of the ## zero along the y-axis. ## @end itemize ## @end deftypefn ## This is essentially the ACM algorithm 748: Enclosing Zeros of ## Continuous Functions due to Alefeld, Potra and Shi, ACM Transactions ## on Mathematical Software, Vol. 21, No. 3, September 1995. Although ## the workflow should be the same, the structure of the algorithm has ## been transformed non-trivially; instead of the authors' approach of ## sequentially calling building blocks subprograms we implement here a ## FSM version using one interior point determination and one bracketing ## per iteration, thus reducing the number of temporary variables and ## simplifying the algorithm structure. Further, this approach reduces ## the need for external functions and error handling. The algorithm has ## also been slightly modified. ## Author: Jaroslav Hajek ## PKG_ADD: __all_opts__ ("vfzero"); function [x, fval, info, output] = vfzero (fun, x0, options = struct ()) ## Get default options if requested. if (nargin == 1 && ischar (fun) && strcmp (fun, 'defaults')) x = optimset ("MaxIter", Inf, "MaxFunEvals", Inf, "TolX", 1e-8, ... 
"OutputFcn", [], "FunValCheck", "off"); return; endif if (nargin < 2 || nargin > 3) print_usage (); endif if (ischar (fun)) fun = str2func (fun, "global"); endif ## TODO ## displev = optimget (options, "Display", "notify"); funvalchk = strcmpi (optimget (options, "FunValCheck", "off"), "on"); outfcn = optimget (options, "OutputFcn"); tolx = optimget (options, "TolX", 1e-8); maxiter = optimget (options, "MaxIter", Inf); maxfev = optimget (options, "MaxFunEvals", Inf); nx = rows (x0); ## fun may assume a certain length of x, so we will always call it ## with the full-length x, even if only some elements are needed persistent mu = 0.5; if (funvalchk) ## Replace fun with a guarded version. fun = @(x) guarded_eval (fun, x); endif ## The default exit flag if exceeded number of iterations. info = zeros (nx, 1); niter = 0; nfev = 0; x = fval = fc = a = fa = b = fb = aa = c = u = fu = NaN (nx, 1); bracket_ready = false (nx, 1); eps = eps (class (x0)); ## Prepare... a = x0(:, 1); fa = fun (a)(:); nfev = 1; if (columns (x0) > 1) b = x0(:, 2); fb = fun (b)(:); nfev += 1; else ## Try to get b. aa(idx = a == 0) = 1; aa(! idx) = a(! idx); for tb = [0.9*aa, 1.1*aa, aa-1, aa+1, 0.5*aa 1.5*aa, -aa, 2*aa, -10*aa, 10*aa] tfb = fun (tb)(:); nfev += 1; idx = ! bracket_ready & sign (fa) .* sign (tfb) <= 0; bracket_ready |= idx; b(idx) = tb(idx); fb(idx) = tfb(idx); if (all (bracket_ready)) break; endif endfor endif tp = a(idx = b < a); a(idx) = b(idx); b(idx) = tp; tp = fa(idx); fa(idx) = fb(idx); fb(idx) = tp; if (! all (sign (fa) .* sign (fb) <= 0)) error ("fzero:bracket", "vfzero: not a valid initial bracketing"); endif slope0 = (fb - fa) ./ (b - a); idx = fa == 0; b(idx) = a(idx); fb(idx) = fa(idx); idx = (! idx & fb == 0); a(idx) = b(idx); fa(idx) = fb(idx); itype = ones (nx, 1); idx = abs (fa) < abs (fb); u(idx) = a(idx); fu(idx) = fa(idx); u(! idx) = b(! idx); fu(! idx) = fb(! idx); d = e = u; fd = fe = fu; mba = mu * (b - a); not_ready = true (nx, 1); while (niter < maxiter && nfev < maxfev && any (not_ready)) ## itype == 1 type1idx = not_ready & itype == 1; ## The initial test. idx = b - a <= 2*(2 * eps * abs (u) + tolx) & type1idx; x(idx) = u(idx); fval(idx) = fu(idx); info(idx) = 1; not_ready(idx) = false; type1idx &= not_ready; exclidx = type1idx; ## Secant step. idx = type1idx & ... (tidx = abs (fa) <= 1e3*abs (fb) & abs (fb) <= 1e3*abs (fa)); c(idx) = u(idx) - (a(idx) - b(idx)) ./ (fa(idx) - fb(idx)) .* fu(idx); ## Bisection step. idx = type1idx & ! tidx; c(idx) = 0.5*(a(idx) + b(idx)); d(type1idx) = u(type1idx); fd(type1idx) = fu(type1idx); itype(type1idx) = 5; ## itype == 2 or 3 type23idx = not_ready & ! exclidx & (itype == 2 | itype == 3); exclidx |= type23idx; uidx = cellfun (@ (x) length (unique (x)), ... num2cell ([fa, fb, fd, fe], 2)) == 4; oidx = sign (c - a) .* sign (c - b) > 0; ## Inverse cubic interpolation. idx = type23idx & (uidx & ! oidx); q11 = (d(idx) - e(idx)) .* fd(idx) ./ (fe(idx) - fd(idx)); q21 = (b(idx) - d(idx)) .* fb(idx) ./ (fd(idx) - fb(idx)); q31 = (a(idx) - b(idx)) .* fa(idx) ./ (fb(idx) - fa(idx)); d21 = (b(idx) - d(idx)) .* fd(idx) ./ (fd(idx) - fb(idx)); d31 = (a(idx) - b(idx)) .* fb(idx) ./ (fb(idx) - fa(idx)); q22 = (d21 - q11) .* fb(idx) ./ (fe(idx) - fb(idx)); q32 = (d31 - q21) .* fa(idx) ./ (fd(idx) - fa(idx)); d32 = (d31 - q21) .* fd(idx) ./ (fd(idx) - fa(idx)); q33 = (d32 - q22) .* fa(idx) ./ (fe(idx) - fa(idx)); c(idx) = a(idx) + q31 + q32 + q33; ## Quadratic interpolation + newton. idx = type23idx & (oidx | ! 
uidx); a0 = fa(idx); a1 = (fb(idx) - fa(idx))./(b(idx) - a(idx)); a2 = ((fd(idx) - fb(idx))./(d(idx) - b(idx)) - a1) ./ (d(idx) - a(idx)); ## Modification 1: this is simpler and does not seem to be worse. c(idx) = a(idx) - a0./a1; taidx = a2 != 0; tidx = idx; tidx(tidx) = taidx; c(tidx) = a(tidx)(:) - (a0(taidx)./a1(taidx))(:); for i = 1:3 tidx &= i <= itype; taidx = tidx(idx); pc = a0(taidx)(:) + (a1(taidx)(:) + ... a2(taidx)(:).*(c(tidx) - b(tidx))(:)) ... .*(c(tidx) - a(tidx))(:); pdc = a1(taidx)(:) + a2(taidx)(:).*(2*c(tidx) - a(tidx) - b(tidx))(:); tidx0 = tidx; tidx0(tidx0, 1) &= (p0idx = pdc == 0); taidx0 = tidx0(idx); tidx(tidx, 1) &= ! p0idx; c(tidx0) = a(tidx0)(:) - (a0(taidx0)./a1(taidx0))(:); c(tidx) = c(tidx)(:) - (pc(! p0idx)./pdc(! p0idx))(:); endfor itype(type23idx) += 1; ## itype == 4 type4idx = not_ready & ! exclidx & itype == 4; exclidx |= type4idx; ## Double secant step. idx = type4idx; c(idx) = u(idx) - 2*(b(idx) - a(idx))./(fb(idx) - fa(idx)).*fu(idx); ## Bisect if too far. idx = type4idx & abs (c - u) > 0.5*(b - a); c(idx) = 0.5 * (b(idx) + a(idx)); itype(type4idx) = 5; ## itype == 5 type5idx = not_ready & ! exclidx & itype == 5; ## Bisection step. idx = type5idx; c(idx) = 0.5 * (b(idx) + a(idx)); itype(type5idx) = 2; ## Don't let c come too close to a or b. delta = 2*0.7*(2 * eps * abs (u) + tolx); nidx = not_ready & ! (idx = b - a <= 2*delta); idx &= not_ready; c(idx) = (a(idx) + b(idx))/2; c(nidx) = max (a(nidx) + delta(nidx), ... min (b(nidx) - delta(nidx), c(nidx))); ## Calculate new point. idx = not_ready; x(idx, 1) = c(idx, 1); if (any (idx)) c(! idx) = u(! idx); # to have some working place-holders since # fun() might expect full-length # argument fval(idx, 1) = fc(idx, 1) = fun (c)(:)(idx, 1); niter ++; nfev ++; endif ## Modification 2: skip inverse cubic interpolation if ## nonmonotonicity is detected. nidx = not_ready & ! (idx = sign (fc - fa) .* sign (fc - fb) >= 0); idx &= not_ready; ## The new point broke monotonicity. ## Disable inverse cubic. fe(idx) = fc(idx); ## e(nidx) = d(nidx); fe(nidx) = fd(nidx); ## Bracketing. idx1 = not_ready & sign (fa) .* sign (fc) < 0; idx2 = not_ready & ! idx1 & sign (fb) .* sign (fc) < 0; idx3 = not_ready & ! (idx1 | idx2) & fc == 0; d(idx1) = b(idx1); fd(idx1) = fb(idx1); b(idx1) = c(idx1); fb(idx1) = fc(idx1); d(idx2) = a(idx2); fd(idx2) = fa(idx2); a(idx2) = c(idx2); fa(idx2) = fc(idx2); a(idx3) = b(idx3) = c(idx3); fa(idx3) = fb(idx3) = fc(idx3); info(idx3) = 1; not_ready(idx3) = false; if (any (not_ready & ! (idx1 | idx2 | idx3))) ## This should never happen. error ("fzero:bracket", "vfzero: zero point is not bracketed"); endif ## If there's an output function, use it now. if (! isempty (outfcn)) optv.funccount = nfev; optv.fval = fval; optv.iteration = niter; idx = not_ready & outfcn (x, optv, "iter"); info(idx) = -1; not_ready(idx) = false; endif nidx = not_ready & ! (idx = abs (fa) < abs (fb)); idx &= not_ready; u(idx) = a(idx); fu(idx) = fa(idx); u(nidx) = b(nidx); fu(nidx) = fb(nidx); idx = not_ready & b - a <= 2*(2 * eps * abs (u) + tolx); info(idx) = 1; not_ready(idx) = false; ## Skip bisection step if successful reduction. itype(not_ready & itype == 5 & (b - a) <= mba) = 2; idx = not_ready & itype == 2; mba(idx) = mu * (b(idx) - a(idx)); endwhile ## Check solution for a singularity by examining slope idx = not_ready & info == 1 & (b - a) != 0; idx(idx, 1) &= ... abs ((fb(idx, 1) - fa(idx, 1))./(b(idx, 1) - a(idx, 1)) ... 
./ slope0(idx, 1)) > max (1e6, 0.5/(eps+tolx)); info(idx) = - 5; output.iterations = niter; output.funcCount = nfev; output.bracketx = [a, b]; output.brackety = [fa, fb]; endfunction ## An assistant function that evaluates a function handle and checks for ## bad results. function fx = guarded_eval (fun, x) fx = fun (x); if (! isreal (fx)) error ("fzero:notreal", "vfzero: non-real value encountered"); elseif (any (isnan (fx))) error ("fzero:isnan", "vfzero: NaN value encountered"); endif endfunction %!shared opt0 %! opt0 = optimset ("tolx", 0); %!assert(vfzero(@cos, [0, 3], opt0), pi/2, 10*eps) %!assert(vfzero(@(x) x^(1/3) - 1e-8, [0,1], opt0), 1e-24, 1e-22*eps) optim-1.4.1/inst/dcdp.m0000644000175000017500000000246312477523726013362 0ustar olafolaf## Copyright (C) 2010-2013 Olaf Till ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## function prt = dcdp (f, p, dp, func[, bounds]) ## ## This is an interface to __dfdp__.m, similar to dfdp.m, but for ## functions only of parameters 'p', not of independents 'x'. See ## dfdp.m. ## ## dfpdp is more general and is meant to be used instead of dcdp in ## optimization. function prt = dcdp (f, p, dp, func, bounds) if (ischar (func)) func = str2func (func); endif hook.f = f; if (nargin > 4) hook.lbounds = bounds(:, 1); hook.ubounds = bounds(:, 2); endif hook.diffp = abs (dp); hook.fixed = dp == 0; hook.diff_onesided = dp < 0; prt = __dfdp__ (p, func, hook); endfunction optim-1.4.1/inst/private/0002755000175000017500000000000012477523726013741 5ustar olafolafoptim-1.4.1/inst/private/__siman__.m0000644000175000017500000003761612477523726016035 0ustar olafolaf## Copyright (C) 2012 Olaf Till ## ## This program is free software; you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation; either version 3 of the License, or ## (at your option) any later version. ## ## This program is distributed in the hope that it will be useful, ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## GNU General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with this program; If not, see . ## The simulated annealing code is translated and adapted from siman.c, ## written by Mark Galassi, of the GNU Scientific Library. 
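## Note: __siman__ is a private backend, not meant to be called
## directly; simulated annealing is normally run through the frontend
## nonlin_min. A minimal sketch (objective and settings made up for
## illustration, assuming the backend is selected with the "siman"
## value of the "Algorithm" option):
##
##   f = @ (p) sumsq (p - [1; 2]);
##   [p, objf] = nonlin_min (f, [0; 0], optimset ("Algorithm", "siman"));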
function [p_res, objf, cvg, outp] = __siman__ (f, pin, hook)

  ## passed constraints:
  ##
  ## hook.mc: matrix of linear constraints
  ##
  ## hook.vc: vector of linear constraints
  ##
  ## hook.f_cstr: function of all constraints
  ##
  ## hook.df_cstr: function of derivatives of all constraints
  ##
  ## hook.n_gencstr: number of non-linear constraints
  ##
  ## hook.eq_idx: logical index of equality constraints in all
  ## constraints
  ##
  ## hook.lbound, hook.ubound: bounds, subset of linear inequality
  ## constraints in mc and vc

  ## passed values of constraints for initial parameters
  pin_cstr = hook.pin_cstr;

  ## passed function for complementary pivoting, currently sqp is used
  ## instead
  ##
  ## cpiv = hook.cpiv;

  ## passed simulated annealing parameters
  T_init = hook.siman.T_init;
  T_min = hook.siman.T_min;
  mu_T = hook.siman.mu_T;
  iters_fixed_T = hook.siman.iters_fixed_T;
  max_rand_step = hook.max_rand_step;

  ## passed options
  fixed = hook.fixed;
  verbose = strcmp (hook.Display, "iter");
  user_interaction = hook.user_interaction;
  siman_log = hook.siman_log;
  trace_steps = hook.trace_steps;
  save_state = ! isempty (hook.save_state);
  recover_state = ! isempty (hook.recover_state);

  ## Parallelization, if any, will be done within the iterations at a
  ## fixed temperature. Some parameter combinations will be tested in
  ## parallel, but an order will be defined for them and all results
  ## after the first accepted will be discarded. The time savings will
  ## depend on the frequency of accepted results. To limit time losses
  ## even in cases where the first of the parallel results is accepted,
  ## the number of parallel tests will not exceed the number of
  ## available processor cores.
  if ((parallel_local = hook.parallel_local))
    np = int32 (nproc ("current"));
    np = ifelse (iters_fixed_T < np, iters_fixed_T, np);
    if (np < 2)
      parallel_local = false;
    endif
  endif

  ## some useful variables derived from passed variables
  n_lconstr = length (hook.vc);
  n_bounds = sum (hook.lbound != -Inf) + sum (hook.ubound != Inf);
  hook.ac_idx = true (n_lconstr + hook.n_gencstr, 1);
  hook.ineq_idx = ! hook.eq_idx;
  hook.leq_idx = hook.eq_idx(1:n_lconstr);
  hook.lineq_idx = hook.ineq_idx(1:n_lconstr);
  hook.lfalse_idx = false(n_lconstr, 1);

  nz = 20 * eps; # This is arbitrary. Accuracy of equality constraints.

  ## backend-specific checking of options and constraints
  ##
  ## equality constraints cannot be met by chance
  if ((any (hook.eq_idx) || any (hook.lbound == hook.ubound)) && ! hook.stoch_regain_constr)
    error ("If 'stoch_regain_constr' is not set, equality constraints or identical lower and upper bounds are not allowed by simulated annealing backend.");
  endif
  ##
  if (any (pin < hook.lbound | pin > hook.ubound) || any (pin_cstr.inequ.lin_except_bounds < 0) || any (pin_cstr.inequ.gen < 0) || any (abs (pin_cstr.equ.lin) >= nz) || any (abs (pin_cstr.equ.gen) >= nz))
    error ("Initial parameters violate constraints.");
  endif
  ##
  if (all (fixed))
    error ("no free parameters");
  endif
  ##
  idx = isna (max_rand_step);
  max_rand_step(idx) = 0.005 * pin(idx);

  ## fill constant fields of hook for derivative-functions; some fields
  ## may be backend-specific
  dfdp_hook.fixed = fixed; # this may be handled by the frontend, but
                           # the backend still may add to it

  ## set up for iterations
  done = false;
  if (recover_state)
    state = load (hook.recover_state);
    p = state.p;
    best_p = state.best_p;
    E = state.E;
    best_E = state.best_E;
    T = state.T;
    n_evals = state.n_evals;
    n_iter = state.n_iter;
    rand ("state", state.rstate);
    if (isfield (state, "log"))
      log = state.log;
    endif
    if (isfield (state, "trace"))
      trace = state.trace;
    endif
  else
    p = best_p = pin;
    E = best_E = f (pin);
    T = T_init;
    n_evals = 1;
    n_iter = 0;
    if (siman_log)
      log = zeros (0, 5);
    endif
    if (trace_steps)
      trace = [0, 0, E, pin.'];
    endif
    if (([stop, outp.user_interaction] = ...
         __do_user_interaction__ (user_interaction, p, struct ("iteration", 0, "fval", E), "init")))
      p_res = p;
      outp.niter = 0;
      objf = E;
      cvg = -1;
      return;
    endif
  endif

  cvg = 1;

  unwind_protect

    if (parallel_local)

      parallel_ready = false;

      lerrm = lasterr ();
      lasterr ("");

      child_data = zeros (np, 4); # pipe descriptor for reading, pipe
                                  # descriptor for writing, pid, line
                                  # number
      child_data(:, 4) = 1 : np;

      ## create subprocesses
      for id = 1 : np

        ## parameter pipe
        [pdp_r, pdp_w, err, msg] = pipe ();
        if (err)
          error ("could not create pipe: %s", msg);
        endif
        ## result pipe
        [pdr_r, pdr_w, err, msg] = pipe ();
        if (err)
          error ("could not create pipe: %s", msg);
        endif
        child_data(id, 1) = pdr_r;
        child_data(id, 2) = pdp_w;

        if ((pid = fork ()) == 0)
          ## child
          pclose (pdp_w);
          pclose (pdr_r);
          try
            while (true)
              p = __bw_prcv__ (pdp_r);
              if (isnumeric (p))
                error ("parent closed without sending");
              endif
              if (ischar (p.psend_var))
                pclose (pdp_r);
                pclose (pdr_w);
                __internal_exit__ ();
              endif
              new_E = f (p.psend_var);
              __bw_psend__ (pdr_w, new_E);
              fflush (pdr_w);
            endwhile
          catch
            pclose (pdp_r);
            pclose (pdr_w);
            __internal_exit__ ();
          end_try_catch
          ## end child
        elseif (pid > 0)
          ## parent
          child_data(id, 3) = pid;
          pclose (pdp_r);
          pclose (pdr_w);
        else
          ## fork error
          error ("could not fork");
        endif

      endfor ## create subprocesses

    endif # parallel_local

    ## simulated annealing
    while (! done)

      n_iter++;

      n_accepts = n_rejects = n_eless = 0;

      ## rand() for potential decisions on accepting a step with an
      ## increase is called here for all possibly parallelized tests, to
      ## make the course of optimization potentially reproducible
      ## between parallelized and non-parallelized runs
      rand_store = rand (iters_fixed_T, 1);

      if (parallel_local)

        n_left = int32 (iters_fixed_T);

        while (n_left)

          ## number of currently used processes
          cnp = ifelse (np <= n_left, np, n_left);

          ## for restoration
          rand_states = cell (cnp - 1, 1);

          busy_children = true (cnp, 1);

          tp_E = zeros (cnp, 1); # results
          tp_p = cell (cnp, 1); # tested parameters

          for id = 1 : cnp

            ## all rand() calls are done in the parent process
            new_p = p + max_rand_step .* (2 * rand (size (p)) - 1);
            new_p = apply_constraints (p, new_p, hook, nz, verbose);
            ##
            tp_p{id} = new_p;

            if (id < cnp)
              rand_states{id} = rand ("state");
            endif

            __bw_psend__ (child_data(id, 2), new_p);
            fflush (child_data(id, 2));

          endfor

          while (any (busy_children))

            [~, act] = ...
            select (child_data(busy_children, 1), [], [], -1);
            act_idx = child_data(busy_children, 4)(act);

            for id = act_idx.'
              res = __bw_prcv__ (child_data(id, 1));
              if (isnumeric (res))
                error ("child closed pipe without sending");
              endif
              tp_E(id) = res.psend_var;
              busy_children(id) = false;
            endfor

          endwhile

          for (id = 1 : cnp)

            id_iters = double (iters_fixed_T - n_left + id);

            if (tp_E(id) < best_E)
              best_p = tp_p{id};
              best_E = tp_E(id);
            endif

            if (tp_E(id) < E)
              ## take a step
              p = tp_p{id};
              E = tp_E(id);
              n_eless++;
              if (trace_steps)
                trace(end + 1, :) = [n_iter, id_iters, E, p.'];
              endif
              break;
            elseif (rand_store(id_iters) < ...
                    exp (- (tp_E(id) - E) / T))
              ## take a step
              p = tp_p{id};
              E = tp_E(id);
              n_accepts++;
              if (trace_steps)
                trace(end + 1, :) = [n_iter, id_iters, E, p.'];
              endif
              break;
            else
              n_rejects++;
            endif

          endfor

          ## 'id' is now the number of (ordered) parallel tests up to
          ## the accepted one; we discard all other tests as invalid
          n_left -= id;

          if (int32 (id) < cnp)
            ## restore random generator
            rand ("state", rand_states{id})
          endif

          n_evals += id;

        endwhile # n_left

      else # ! parallel_local

        for id = 1 : iters_fixed_T

          new_p = p + max_rand_step .* (2 * rand (size (p)) - 1);
          new_p = apply_constraints (p, new_p, hook, nz, verbose);
          new_E = f (new_p);
          n_evals++;

          if (new_E < best_E)
            best_p = new_p;
            best_E = new_E;
          endif

          if (new_E < E)
            ## take a step
            p = new_p;
            E = new_E;
            n_eless++;
            if (trace_steps)
              trace(end + 1, :) = [n_iter, id, E, p.'];
            endif
          elseif (rand_store(id) < exp (- (new_E - E) / T))
            ## take a step
            p = new_p;
            E = new_E;
            n_accepts++;
            if (trace_steps)
              trace(end + 1, :) = [n_iter, id, E, p.'];
            endif
          else
            n_rejects++;
          endif

        endfor # iters_fixed_T

      endif # ! parallel_local

      if (verbose)
        printf ("temperature no. %i: %e, energy %e,\n", n_iter, T, E);
        printf ("tries with energy less / not less but accepted / rejected:\n");
        printf ("%i / %i / %i\n", n_eless, n_accepts, n_rejects);
      endif

      if (([stop, outp.user_interaction] = ...
           __do_user_interaction__ (user_interaction, p, struct ("iteration", n_iter, "fval", E), "iter")))
        p_res = p;
        outp.niter = n_iter;
        objf = E;
        cvg = -1;
        return;
      endif

      if (siman_log)
        log(end + 1, :) = [T, E, n_eless, n_accepts, n_rejects];
      endif

      ## cooling
      T /= mu_T;

      if (T < T_min)
        done = true;
      endif

      if (save_state)
        rstate = rand ("state");
        unwind_protect
        unwind_protect_cleanup
          save ("-binary", hook.save_state, "p", "best_p", "E", ...
                "best_E", "T", "n_evals", "n_iter", "rstate", ...
{"log"}(siman_log){:}, {"trace"}(trace_steps){:}); end_unwind_protect endif endwhile ## 'regular' cleanup if (parallel_local) for (id = 1 : np) __bw_psend__ (child_data(id, 2), "exit"); pclose (child_data(id, 2)); child_data(id, 2) = 0; pclose (child_data(id, 1)); child_data(id, 1) = 0; waitpid (child_data(id, 3)); child_data(id, 3) = 0; endfor parallel_ready = true; # try/catch would not handle ctrl-c endif unwind_protect_cleanup if (parallel_local) if (! parallel_ready) for (id = 1 : np) if (child_data(id, 1)) pclose (child_data(id, 1)); endif if (child_data(id, 2)) pclose (child_data(id, 2)); endif if (child_data(id, 3)) kill (child_data(id, 3), 9); waitpid (child_data(id, 3)); endif endfor nerrm = lasterr (); error ("no success, last error message: %s", nerrm); endif lasterr (lerrm); endif end_unwind_protect ## return result p_res = best_p; objf = best_E; outp.niter = n_iter; if (trace_steps) outp.trace = trace; endif if (siman_log) outp.log = log; endif if (([stop, outp.user_interaction] = ... __do_user_interaction__ (user_interaction, p_res, struct ("iteration", n_iter, "fval", objf), "done"))) cvg = -1; endif endfunction function new_p = apply_constraints (p, new_p, hook, nz, verbose) if (hook.stoch_regain_constr) evidx = (abs ((ac = hook.f_cstr (new_p, hook.ac_idx))(hook.eq_idx)) >= nz); ividx = (ac(hook.ineq_idx) < 0); if (any (evidx) || any (ividx)) nv = sum (evidx) + sum (ividx); if (sum (lbvidx = (new_p < hook.lbound)) + ... sum (ubvidx = (new_p > hook.ubound)) == ... nv) ## special case only bounds violated, set back to bound new_p(lbvidx) = hook.lbound(lbvidx); new_p(ubvidx) = hook.ubound(ubvidx); elseif (nv == 1 && ... sum (t_eq = (abs (ac(hook.leq_idx)) >= nz)) + ... sum (t_inequ = (ac(hook.lineq_idx) < 0)) == 1) ## special case only one linear constraint violated, set back ## perpendicularly to constraint tidx = hook.lfalse_idx; tidx(hook.leq_idx) = t_eq; tidx(hook.lineq_idx) = t_inequ; c = hook.mc(:, tidx); d = ac(tidx); new_p -= c * (d / (c.' * c)); else ## other cases, set back keeping the distance to original ## 'new_p' minimal, using quadratic programming, or sequential ## quadratic programming for nonlinear constraints [new_p, discarded, sqp_info] = ... sqp (new_p, ... {@(x)sumsq(x-new_p), ... @(x)2*(x-new_p), ... @(x)2*eye(numel(p))}, ... {@(x)hook.f_cstr(x,hook.eq_idx), ... @(x)hook.df_cstr(x,hook.eq_idx, ... setfield(hook,"f", ... hook.f_cstr(x,hook.ac_idx)))}, ... {@(x)hook.f_cstr(x,hook.ineq_idx), ... @(x)hook.df_cstr(x,hook.ineq_idx, ... setfield(hook,"f", ... hook.f_cstr(x,hook.ac_idx)))}); if (sqp_info != 101) cvg = 0; done = true; break; endif endif endif else n_retry_constr = 0; while (any (abs ((ac = hook.f_cstr (new_p, hook.ac_idx))(hook.eq_idx)) >= nz) ... || any (ac(hook.ineq_idx) < 0)) new_p = p + hook.max_rand_step .* (2 * rand (size (p)) - 1); n_retry_constr++; endwhile if (verbose && n_retry_constr) printf ("%i additional tries of random step to meet constraints\n", n_retry_constr); endif endif endfunction optim-1.4.1/inst/private/__do_user_interaction__.m0000644000175000017500000000317212477523726020753 0ustar olafolaf## Copyright (C) 2014 Olaf Till ## ## This program is free software; you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation; either version 3 of the License, or ## (at your option) any later version. 
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; If not, see <http://www.gnu.org/licenses/>.

## -*- texinfo -*-
## @deftypefn{Function File} {[@var{stop}, @var{info}] =} __do_user_interaction__ (@var{user_funcs}, @var{p}, @var{hook}, @var{state})
## Undocumented internal function.
## @end deftypefn

function [stop, ret_info] = __do_user_interaction__ ...
           (user_funcs, p, hook, state)

  n = numel (user_funcs);

  ### current cellfun chokes on this for anonymous functions with two
  ### outputs:
  ###
  ### idx = num2cell ((1:n).');
  ###
  ### [stop_cell, info_cell] = ...
  ###     cellfun (@ (id, p, hook, state) user_funcs{id} (p, hook, state),
  ###              idx, {p}, {hook}, {state}, "UniformOutput", false);
  ###
  ### stop_vec = cell2mat (stop_cell);
  ###
  ### so use this loop instead:
  stop_vec = false (n, 1);
  info_cell = cell (n, 1);
  for id = 1 : n
    [stop_vec(id), info_cell{id}] = user_funcs{id} (p, hook, state);
  endfor
  ###

  stop = any (stop_vec);
  ret_info.stop = stop_vec;
  ret_info.info = info_cell;

endfunction
optim-1.4.1/inst/private/__bracket_min.m0000644000175000017500000000251012477523726016687 0ustar olafolaf## Copyright (C) 2002 Etienne Grossmann
## Copyright (C) 2009 Levente Torok
##
## This program is free software; you can redistribute it and/or modify it under
## the terms of the GNU General Public License as published by the Free Software
## Foundation; either version 3 of the License, or (at your option) any later
## version.
##
## This program is distributed in the hope that it will be useful, but WITHOUT
## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
## details.
##
## You should have received a copy of the GNU General Public License along with
## this program; if not, see <http://www.gnu.org/licenses/>.

## [a, b, ga, gb, nev] = __bracket_min (f, dx, narg, args)
##
## Find an interval containing a local minimum of the function
## g : h in reals ---> f (x+h*dx) where x = args{narg}
##
## a < b.
## nev is the number of function evaluations
function [a, b, ga, gb, n] = __bracket_min (f, dx, narg, args)

[a,b, ga,gb, n] = __semi_bracket (f, dx, 0, narg, args);

if a != 0, return; end

[a2,b2, ga2,gb2, n2] = __semi_bracket (f, -dx, 0, narg, args);
n += n2;

if a2 == 0,
  a = -b2;
  ga = gb2;
else
  a = -b2; b = -a2; ga = gb2; gb = ga2;
end
optim-1.4.1/inst/private/__covp_corp_wls__.m0000644000175000017500000000642412477523726017576 0ustar olafolaf## Copyright (C) 2011-2013 Olaf Till
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 3 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; If not, see <http://www.gnu.org/licenses/>.

## This is based on Bard, Nonlinear Parameter Estimation, Academic
## Press, 1974, section 7-5, but also on own re-calculations for the
## specific case of weighted least squares. The part with only certain
## elements of covp or corp being defined is self-made; I don't know a
## reference.

function hook = __covp_corp_wls__ (hook)

  ## compute jacobian, if not already done
  if (! isfield (hook, "jac"))
    hook.jac = hook.dfdp (hook.pfin, hook);
  endif
  jact = (jac = hook.jac).';

  ## compute guessed covariance matrix of data, if not already done
  if (! isfield (hook, "covd"))
    hook = hook.funs.covd (hook);
  endif
  covd_inv = inv (covd = hook.covd);

  if (rcond (A = jact * covd_inv * jac) > eps)
    covp = hook.covp = inv (A);
    d = sqrt (diag (covp));
    hook.corp = covp ./ (d * d.');
  else
    n = hook.np;
    covp = NA (n);
    ## Now we have the equation "A * covp * A.' == A".
    ## Find a particular solution for "covp * A.'".
    part_covp_At = A \ A;
    ## Find a particular solution for "covp". Only uniquely defined
    ## elements (identified later) will be further used.
    part_covp = A \ part_covp_At.';
    ## Find a basis for the nullspace of A.
    if (true) # test for Octave version once submitted patch is applied
              # to Octave (bug #33503)
      null = @ __null_optim__;
    endif
    if (isempty (basis = null (A)))
      error ("internal error, singularity assumed, but null-space computed to be zero-dimensional");
    endif
    ## Find an index (applied to both row and column) of uniquely
    ## defined elements of covp.
    idun = all (basis == 0, 2);
    ## Fill in these elements.
    covp(idun, idun) = part_covp(idun, idun);
    ## Compute corp as far as possible at the moment.
    d = sqrt (diag (covp));
    corp = covp ./ (d * d.');
    ## All diagonal elements of corp should be one, even those as yet
    ## NA.
    corp(1 : n + 1 : n * n) = 1;
    ## If there are indices, applied to both row and column, so that
    ## indexed elements within one row or one column are determined up
    ## to a multiple of a vector, find both these vectors and the
    ## respective indices. In the same run, use them to further fill in
    ## corp as described below.
    for id = 1 : (cb = columns (basis))
      if (any (idx = ...
               all (basis(:, (1 : cb) != id) == 0, 2) & ...
               basis(:, id) != 0))
        vec = sign (basis(idx, id));
        ## Depending on "vec", single coefficients of correlation
        ## indexed by "idx" are either +1 or -1.
        corp(idx, idx) = vec * vec.';
      endif
    endfor
    hook.covp = covp;
    hook.corp = corp;
  endif

endfunction
optim-1.4.1/inst/private/__read_options__.m0000644000175000017500000001131012477523726017373 0ustar olafolaf## Copyright (C) 2002 Etienne Grossmann
##
## This program is free software; you can redistribute it and/or modify it under
## the terms of the GNU General Public License as published by the Free Software
## Foundation; either version 3 of the License, or (at your option) any later
## version.
##
## This program is distributed in the hope that it will be useful, but WITHOUT
## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
## details.
##
## You should have received a copy of the GNU General Public License along with
## this program; if not, see <http://www.gnu.org/licenses/>.

## -*- texinfo -*-
## @deftypefn{Function File} {[@var{op}, @var{nread}] =} __read_options__ (@var{args}, @var{varargin})
## Undocumented internal function.
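##
## A hedged usage sketch (reconstructed from the code below, so treat
## the calling convention as an assumption rather than documentation):
## "op0" seems to name boolean options and "op1" options that consume
## the next argument, e.g.
##
## @example
## op = __read_options__ (@{"verbose", "tol", 1e-6@}, ...
##                        "op0", "verbose", "op1", "tol");
## @end example
##
## which would be expected to yield op.verbose == 1 and op.tol == 1e-6.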
## @end deftypefn

function [op,nread] = __read_options__ (args, varargin) ## pos 2.1.39

verbose = 0;

op = struct (); # Empty struct
op0 = op1 = " ";
skipnan = prefix = quiet = nocase = quiet = 0;
extra = "";

nargs = nargin-1; # nargin is now a function
if rem (nargs, 2), error ("odd number of optional args"); endif

i=1;
while i 1 # Ambiguous option name
  fullen = zeros (1,length (ii)); # Full length of each option
  tmp = correct = "";
  j = 0;
  for i = ii
    fullen(++j) = spi(find (spi > i,1))-i ;
    tmp = [tmp,"', '",opts(i:i+fullen(j)-1)];
  endfor
  tmp = tmp(5:length(tmp));
  if sum (fullen == min (fullen)) > 1 || ...
     ((min (fullen) != length(name)) && ! prefix) ,
    error ("ambiguous option '%s'. Could be '%s'",oname,tmp);
  endif
  j = find (fullen == min (fullen), 1);
  ii = ii(j);
endif # Full name of option (w/ correct case)
fullname = opts_orig(ii:spi(find (spi > ii, 1))-1);
if ii < iend
  if verbose, printf ("read_options : found boolean '%s'\n",fullname); endif
  op.(fullname) = 1;
else
  if verbose, printf ("read_options : found '%s'\n",fullname); endif
  if nread < length (args)
    tmp = args{++nread};
    if verbose, printf ("read_options : size is %i x %i\n",size(tmp)); endif
    if !isnumeric (tmp) || !all (isnan (tmp(:))) || ...
       !isfield (op, fullname)
      op.(fullname) = tmp;
    else
      if verbose, printf ("read_options : ignoring nan\n"); endif
    endif
  else
    error ("options end before I can read value of option '%s'",oname);
  endif
endif
endwhile
endfunction
optim-1.4.1/inst/private/__collect_constraints__.m0000644000175000017500000000501612477523726020767 0ustar olafolaf## Copyright (C) 2010-2013 Olaf Till
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 3 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; If not, see <http://www.gnu.org/licenses/>.

function [mc, vc, f_gencstr, df_gencstr, user_df] = ...
      __collect_constraints__ (cstr, do_cstep, context)

  mc = vc = f_gencstr = df_gencstr = [];
  user_df = false;

  if (isempty (cstr)) return; endif

  for id = 1 : length (cstr)
    if (ischar (cstr{id})) cstr{id} = str2func (cstr{id}); endif
  endfor

  if (isnumeric (tp = cstr{1}) || isstruct (tp))
    mc = tp;
    vc = cstr{2};
    if ((tp = length (cstr)) > 2)
      f_gencstr = cstr{3};
      if (tp > 3)
        df_gencstr = cstr{4};
        user_df = true;
      endif
    endif
  else
    lid = 0; # no linear constraints
    f_gencstr = cstr{1};
    if ((len = length (cstr)) > 1)
      if (isnumeric (c = cstr{2}) || isstruct (c))
        lid = 2;
      else
        df_gencstr = c;
        user_df = true;
        if (len > 2)
          lid = 3;
        endif
      endif
    endif
    if (lid)
      mc = cstr{lid};
      vc = cstr{lid + 1};
    endif
  endif

  if (! isempty (f_gencstr))
    if (ischar (f_gencstr))
      f_gencstr = str2func (f_gencstr);
    endif
    f_gencstr = @ (varargin) ...
        tf_gencstr (f_gencstr, varargin{:});

    if (user_df)
      if (do_cstep)
        error ("both complex step derivative chosen and user Jacobian function specified for %s", context);
      endif
      if (ischar (df_gencstr))
        df_gencstr = str2func (df_gencstr);
      endif
      df_gencstr = @ (p, func, idx, hook) ...
df_gencstr (p, idx, hook); else if (do_cstep) df_gencstr = @ (p, func, idx, hook) jacobs (p, func, hook); else df_gencstr = @ (p, func, idx, hook) __dfdp__ (p, func, hook); endif endif endif endfunction function ret = tf_gencstr (f, varargin) # varargin: p[, idx[, info]] ## necessary since user function f_gencstr might return [] or a row ## vector if (isempty (ret = f (varargin{:}))) ret = zeros (0, 1); elseif (columns (ret) > 1) ret = ret(:); endif endfunction optim-1.4.1/inst/private/__covd_wls__.m0000644000175000017500000000206212477523726016531 0ustar olafolaf## Copyright (C) 2013 Olaf Till ## ## This program is free software; you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation; either version 3 of the License, or ## (at your option) any later version. ## ## This program is distributed in the hope that it will be useful, ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## GNU General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with this program; If not, see . function hook = __covd_wls__ (hook) m = hook.nm; n = hook.np; if (m <= n) error ("guessing covariance-matrix of residuals for weighted least squares requires at least one more residual than free parameters"); endif w = hook.weights(:); res = hook.residuals(:); w2 = w .^ 2; hook.covd = diag (res.' * diag (w2) * res / (m - n) ./ w2); endfunction optim-1.4.1/inst/private/__optimget_parallel_local__.m0000644000175000017500000000304612477523726021572 0ustar olafolaf## Copyright (C) 2014 Olaf Till ## ## This program is free software; you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation; either version 3 of the License, or ## (at your option) any later version. ## ## This program is distributed in the hope that it will be useful, ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## GNU General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with this program; If not, see . ## undocumented internal function function parallel_local = __optimget_parallel_local__ (settings, default) min_version = "2.0.5"; if ((parallel_local = optimget (settings, "parallel_local", default))) [desc, state] = pkg ("describe", "parallel"); if (! strcmpi (state{1}, "loaded")) parallel_local = false; warning ("optim:parallel_local", ... "option 'parallel_local=true' ignored, since state of package 'parallel' is '%s' instead of 'Loaded'", ... state{1}); elseif (compare_versions (desc{1}.version, min_version, "<")) parallel_local = false; warning ("optim:parallel_local", ... "option 'parallel_local=true' ignored, since loaded version of package 'parallel' is %s, but must be >= %s", ... desc{1}.version, min_version); endif endif endfunctionoptim-1.4.1/inst/private/__semi_bracket.m0000644000175000017500000000270512477523726017047 0ustar olafolaf## Copyright (C) 2002 Etienne Grossmann ## Copyright (C) 2009 Levente Torok ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. 
## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## [a, b, ga, gb, nev] = semi_bracket (f, dx, a, narg, args) ## ## Find an interval containing a local minimum of the function ## g : h in [a, inf[ ---> f (x+h*dx) where x = args{narg} ## ## The local minimum may be in a. ## a < b. ## nev is the number of function evaluations. function [a,b,ga,gb,n] = __semi_bracket (f, dx, a, narg, args) step = 1; x = args{narg}; args{narg} = x+a*dx; ga = feval (f, args ); b = a + step; args{narg} = x+b*dx; gb = feval (f, args ); n = 2; if gb >= ga, return ; end while 1, c = b + step; args{narg} = x+c*dx; gc = feval( f, args ); n++; if gc >= gb, # ga >= gb <= gc gb = gc; b = c; return; end step *= 2; a = b; b = c; ga = gb; gb = gc; end optim-1.4.1/inst/private/__lm_feasible__.m0000644000175000017500000004351212477523726017160 0ustar olafolaf## Copyright (C) 2012 Olaf Till ## ## This program is free software; you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation; either version 3 of the License, or ## (at your option) any later version. ## ## This program is distributed in the hope that it will be useful, ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## GNU General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with this program; If not, see . 
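## A hedged usage sketch (an illustrative addition, not original
## code): this backend is normally selected through the frontend
## nonlin_min rather than being called directly, e.g.
##
##   objf = @ (p) (p(1) - 2)^2 + (p(2) + 1)^2;
##   settings = optimset ("Algorithm", "lm_feasible");
##   [p, objf_val, cvg] = nonlin_min (objf, [0; 0], settings);
##
## which should converge to p == [2; -1]; the frontend fills in the
## hook fields used below.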
function [p_res, objf, cvg, outp] = __lm_feasible__ (f, pin, hook) ## some backend specific defaults fract_prec_default = 0; max_fract_step_default = Inf; n = length (pin); ## passed constraints mc = hook.mc; # matrix of linear constraints vc = hook.vc; # vector of linear constraints f_cstr = hook.f_cstr; # function of all constraints df_cstr = hook.df_cstr; # function of derivatives of all constraints n_gencstr = hook.n_gencstr; # number of non-linear constraints eq_idx = hook.eq_idx; # logical index of equality constraints in all # constraints lbound = hook.lbound; # bounds, subset of linear inequality ubound = hook.ubound; # constraints in mc and vc ## passed values of constraints for initial parameters pin_cstr = hook.pin_cstr; ## passed function for gradient of objective function grad_f = hook.dfdp; ## passed function for hessian of objective function if (isempty (hessian = hook.hessian)) user_hessian = false; A = eye (n); else user_hessian = true; if (hook.inverse_hessian) error ("this backend can't handle inverse hessians"); endif endif ## passed function for complementary pivoting cpiv = hook.cpiv; ## passed options ftol = hook.TolFun; if (isempty (niter = hook.MaxIter)) niter = 20; endif fixed = hook.fixed; maxstep = hook.max_fract_change; maxstep(isna (maxstep)) = max_fract_step_default; pprec = hook.fract_prec; pprec(isna (pprec)) = fract_prec_default; ## keep absolute precision positive for non-null relative precision; ## arbitrary value, added to parameters before multiplying with ## relative precision add_pprec = zeros (n, 1); add_pprec(pprec > 0) = sqrt (eps); ## verbose = strcmp (hook.Display, "iter"); user_interaction = hook.user_interaction; ## some useful variables derived from passed variables n_lcstr = size (vc, 1); have_constraints_except_bounds = ... n_lcstr + n_gencstr > ... sum (lbound != -Inf) + sum (ubound != Inf); ac_idx = true (n_lcstr + n_gencstr, 1); # index of all constraints nc_idx = false (n_lcstr + n_gencstr, 1); # none of all constraints gc_idx = cat (1, false (n_lcstr, 1), true (n_gencstr, 1)); # gen. constr. nz = 20 * eps; # This is arbitrary. Accuracy of equality constraints. ## backend-specific checking of options and constraints ## if (any (pin < lbound | pin > ubound) || any (pin_cstr.inequ.lin_except_bounds < 0) || any (pin_cstr.inequ.gen < 0) || any (abs (pin_cstr.equ.lin) >= nz) || any (abs (pin_cstr.equ.gen) >= nz)) error ("Initial parameters violate constraints."); endif ## idx = lbound == ubound; if (any (idx)) warning ("lower and upper bounds identical for some parameters, fixing the respective parameters"); fixed(idx) = true; endif if (all (fixed)) error ("no free parameters"); endif if (n_gencstr > 0 && any (! isinf (maxstep))) warning ("setting both a maximum fractional step change of parameters and general constraints may result in inefficiency and failure"); endif ## fill constant fields of hook for derivative-functions; some fields ## may be backend-specific dfdp_hook.fixed = fixed; # this may be handled by the frontend, but # the backend still may add to it ## set up for iterations p = pbest = pin; vf = fbest = f (pin); iter = 0; nobjf = 1; done = false; ll = 1; ltab = [.1, 1, 1e2, 1e4, 1e6]; chgprev = Inf (n, 1); df = []; c_act = false (n, 1); dca = zeros (n, 0); if (([stop, outp.user_interaction] = ... __do_user_interaction__ (user_interaction, p, struct ("iteration", 0, "fval", vf), "init"))) p_res = p; outp.niter = 0; objf = vf; cvg = -1; return; endif while (! 
done)

    iter++;

    ## gradient of objective function
    old_df = df;
    df = grad_f (p, setfield (dfdp_hook, "f", vf))(:);

    ## constraints, preparation of some constants
    v_cstr = f_cstr (p, ac_idx);
    old_c_act = c_act;
    old_dca = dca;
    c_act = v_cstr < nz | eq_idx; # index of active constraints
    if (any (c_act))

      if (n_gencstr)
        ## full gradient is needed later
        dct = df_cstr (p, ac_idx, setfield (dfdp_hook, "f", v_cstr));
        dct(:, fixed) = 0; # for user supplied dfdp; necessary?
        dcat = dct(c_act, :);
      else
        dcat = df_cstr (p, c_act, setfield (dfdp_hook, "f", v_cstr));
        dcat(:, fixed) = 0; # for user supplied dfdp; necessary?
      endif

      dca = dcat.';

      a_eq_idx = eq_idx(c_act);

    else

      dca = zeros (n, 0);

    endif

    ## hessian of objective function
    if (user_hessian)

      A = hessian (p);
      idx = isnan (A);
      A(idx) = A.'(idx);
      if (any (isnan (A(:))))
        error ("some second derivatives undefined by user function");
      endif
      if (! isreal (A))
        error ("second derivatives given by user function not real");
      endif
      if (! issymmetric (A))
        error ("Hessian returned by user function not symmetric");
      endif

    elseif (iter > 1)

      if (any (chg))

        ## approximate Hessian of Lagrangian

        ## I wonder if this hassle here and above with accounting for
        ## changing active sets is indeed better than just approximating
        ## the Hessian only of the objective function.

        ## index, over all constraints, of constraints active both
        ## previously and currently
        s_c_act = old_c_act & c_act;
        ## index, over currently active constraints, of constraints
        ## active both previously and currently
        id_new = s_c_act(c_act);
        ## index, over previously active constraints, of constraints
        ## active both previously and currently
        id_old = s_c_act(old_c_act);
        ## gradients of currently active constraints which were also
        ## active previously
        dca_new_id = dca(:, id_new);
        ## gradients of previously active constraints which are also
        ## active currently
        dca_old_id = old_dca(:, id_old);
        ## index, over constraints active both previously and currently,
        ## of (old) non-zero multipliers (bidx set below previously)
        bidx_old_id = bidx(id_old);
        ## index, over (old) non-zero multipliers, of constraints active
        ## both previously and currently (bidx set below previously)
        old_l_idx = id_old(bidx);
        ## difference of derivatives of new and old active constraints,
        ## multiplied by multipliers, as used for BFGS update (lb set
        ## below previously)
        dch = (dca_new_id(:, bidx_old_id) - ...
               dca_old_id(:, bidx_old_id)) * ...
              lb(old_l_idx);
        y = df - old_df - dch;

        ## Damped BFGS according to Nocedal & Wright, 2nd edition,
        ## procedure 18.2.
        chgt = chg.';
        sAs = chgt * A * chg;
        cy = chgt * y;
        if (cy >= .2 * sAs)
          th = 1;
        else
          if ((den1 = sAs - cy) == 0)
            cvg = -4;
            break;
          endif
          th = .8 * sAs / den1;
        endif
        Ac = A * chg;
        r = th * y + (1 - th) * Ac;
        if ((den2 = chgt * r) == 0 || sAs == 0)
          cvg = -4;
          break;
        endif
        A += r * r.' / den2 - Ac * Ac.' / sAs;

      endif

    endif

    ## Inverse scaled decomposition A = G * (1 ./ L) * G.'
    ##
    ## make matrix Binv for scaling
    Binv = diag (A);
    nidx = !
(idx = Binv == 0); Binv(nidx) = 1 ./ sqrt (abs (Binv(nidx))); Binv(idx) = 1; Binv = diag (Binv); ## eigendecomposition of scaled A [V, L] = eig (Binv * A * Binv); L = diag (L); ## A is symmetric, so V and L are real, delete any imaginary parts, ## which might occur due to inaccuracy V = real (V); L = real (L); ## nminL = - min (L) * 1.1 / ltab(1); G = Binv * V; ## Levenberg/Marquardt fgoal = vf - (abs (vf) + sqrt (eps)) * ftol; for l = ltab ll = max (ll, nminL); l = max (1e-7, ll * l); R = G * diag (1 ./ (L + l)) * G.'; ## step computation if (any (c_act)) ## some constraints are active, quadratic programming tp = dcat * R; [lb, bidx, ridx, tbl] = cpiv (- tp * df, tp * dca, a_eq_idx); chg = R * (dca(:, bidx) * lb - df); # step direction ## indices for different types of constraints c_inact = ! c_act; # inactive constraints c_binding = c_unbinding = nc_idx; c_binding(c_act) = bidx; # constraints selected binding c_unbinding(c_act) = ridx; # constraints unselected binding c_nonbinding = c_act & ! (c_binding | c_unbinding); # #constraints selected non-binding else ## no constraints are active, chg is the Levenberg/Marquardt step chg = - R * df; # step direction lb = zeros (0, 1); bidx = false (0, 1); ## indices for different types of constraints (meaning see above) c_inact = ac_idx; c_binding = nc_idx; c_unbinding = nc_idx; c_nonbinding = nc_idx; endif ## apply inactive and non-binding constraints to step width ## ## linear constraints k = 1; c_tp = c_inact(1:n_lcstr); mcit = mc(:, c_tp).'; vci = vc(c_tp); hstep = mcit * chg; idx = hstep < 0; if (any (idx)) k = min (1, min (- (vci(idx) + mcit(idx, :) * p) ./ ... hstep(idx))); endif ## ## general constraints if (n_gencstr) c_tp = gc_idx & (c_nonbinding | c_inact); if (any (c_tp) && any (f_cstr (p + k * chg, c_tp) < 0)) [k, fval, info] = ... fzero (@ (x) min (cat (1, ... f_cstr (p + x * chg, c_tp), ... k - x, ... ifelse (x < 0, -Inf, Inf))), ... 0); if (info != 1 || abs (fval) >= nz) error ("could not find stepwidth to satisfy inactive and non-binding general inequality constraints"); endif endif endif ## chg = k * chg; ## if necessary, regain binding constraints and one of the ## possibly active previously inactive or non-binding constraints if (any (gc_idx & c_binding)) # none selected binding => none # unselected binding ptp1 = p + chg; tp = true; nt_nosuc = true; lim = 20; while (nt_nosuc && lim >= 0) ## we keep d_p.' * inv (R) * d_p minimal in each step of the ## inner loop c_tp0 = c_inact | c_nonbinding; c_tp1 = c_inact | (gc_idx & c_nonbinding); btbl = tbl(bidx, bidx); c_tp2 = c_binding; ## once (any(tp)==false), it would not get true again even ## with the following assignment if (any (tp) && ... any (tp = f_cstr (ptp1, c_tp1) < nz)) ## keep only the first true entry in tp tp(tp) = logical (cat (1, 1, zeros (sum (tp) - 1, 1))); ## supplement binding index with one (the first) getting ## binding in c_tp1 c_tp2(c_tp1) = tp; ## gradient of this added constraint caddt = dct(c_tp2 & ! c_binding, :); cadd = caddt.'; C = dct(c_binding, :) * R * cadd; Ct = C.'; T = [btbl, btbl * C; ... -Ct * btbl, caddt * R * cadd - Ct * btbl * C]; btbl = gjp (T, size (T, 1)); endif dcbt = dct(c_tp2, :); mfc = - R * dcbt.' * btbl; ptp2 = ptp1; nt_niter = nt_niter_start = 100; while (nt_nosuc && nt_niter >= 0) hv = f_cstr (ptp2, c_tp2); if (all (abs (hv) < nz)) nt_nosuc = false; chg = ptp2 - p; else ptp2 = ptp2 + mfc * hv; # step should be zero for each # component for which the parameter is # "fixed" endif nt_niter--; endwhile if (nt_nosuc || ... 
              any (abs (chg) > abs (p .* maxstep)) || ...
              any (f_cstr (ptp2, c_tp0) < -nz))
            ## if (nt_nosuc), regaining did not converge, else,
            ## regaining violated type 3 and 4.
            nt_nosuc = true;
            ptp1 = (p + ptp1) / 2;
          endif

          if (! nt_nosuc && ...
              any ((tp = f_cstr (ptp2, c_unbinding)) < 0))
            [discarded, id] = min(tp);
            tid = find (ridx);
            id = tid(id); # index within active constraints
            unsuccessful_exchange = false;
            if (abs (tbl(id, id)) < nz) # Bard: not absolute value
              ## exchange this unselected binding constraint against a
              ## binding constraint, but not against an equality
              ## constraint
              tbidx = bidx & ! a_eq_idx;
              if (! any (tbidx))
                unsuccessful_exchange = true;
              else
                [discarded, idm] = max (abs (tbl(tbidx, id)));
                tid = find (tbidx);
                idm = tid(idm); # -> index within active constraints
                tbl = gjp (tbl, idm);
                bidx(idm) = false;
                ridx(idm) = true;
              endif
            endif
            if (unsuccessful_exchange)
              ## It probably doesn't look good now; this desperate last
              ## attempt is not in the original algorithm, since that
              ## didn't account for equality constraints.
              ptp1 = (p + ptp1) / 2;
            else
              tbl = gjp (tbl, id);
              bidx(id) = true;
              ridx(id) = false;
              c_binding = nc_idx;
              c_binding(c_act) = bidx;
              c_unbinding = nc_idx;
              c_unbinding(c_act) = ridx;
            endif
            ## regaining violated type 2 constraints
            nt_nosuc = true;
          endif

          lim--;

        endwhile

        if (nt_nosuc)
          error ("could not regain binding constraints");
        endif

      else

        ## check the maximal stepwidth and apply as necessary
        ochg = chg;
        idx = ! isinf (maxstep);
        limit = abs (maxstep(idx) .* p(idx));
        chg(idx) = min (max (chg(idx), - limit), limit);
        if (verbose && any (ochg != chg))
          printf ("Change in parameter(s): %s:maximal fractional stepwidth enforced", ...
                  sprintf ("%d ", find (ochg != chg)));
        endif

      endif # regaining

      aprec = pprec .* (abs (pbest) + add_pprec);

      if (any (abs (chg) > 0.1 * aprec)) # only worth evaluating
                                         # function if there is some
                                         # non-minuscule change
        skipped = false;

        p_chg = p + chg;

        ## since the projection method may have slightly violated
        ## constraints due to inaccuracy, correct parameters to bounds
        ## --- but only if no further constraints are given, otherwise
        ## the inaccuracy in honoring them might increase by this
        if (! have_constraints_except_bounds)
          lidx = p_chg < lbound;
          uidx = p_chg > ubound;
          p_chg(lidx, 1) = lbound(lidx, 1);
          p_chg(uidx, 1) = ubound(uidx, 1);
          chg(lidx, 1) = p_chg(lidx, 1) - p(lidx, 1);
          chg(uidx, 1) = p_chg(uidx, 1) - p(uidx, 1);
        endif
        ##
        if (! isreal (vf_chg = f (p_chg)))
          error ("objective function not real");
        endif

        nobjf++;

        if (vf_chg < fbest)
          pbest = p_chg;
          fbest = vf_chg;
        endif

        if (vf_chg < fgoal) # <, not <=, since fgoal can be equal to vf
                            # if TolFun <= eps
          p = p_chg;
          vf = vf_chg;
          break;
        endif

      else

        skipped = true;
        break;

      endif

    endfor

    ll = l;

    if (([stop, outp.user_interaction] = ...
         __do_user_interaction__ (user_interaction, p, struct ("iteration", iter, "fval", vf), "iter")))
      p_res = p;
      outp.niter = iter;
      objf = vf;
      cvg = -1;
      return;
    endif

    aprec = pprec .* (abs (pbest) + add_pprec);

    if (skipped)
      cvg = 2;
      done = true;
    elseif (vf_chg >= fgoal) # >=, not >, since fgoal can be equal to vf
                             # if TolFun <= eps
      cvg = 3;
      done = true;
    elseif (all (abs (chg) <= aprec) && all (abs (chgprev) <= aprec))
      cvg = 2;
      done = true;
    elseif (iter == niter)
      cvg = 0;
      done = true;
    else
      chgprev = chg;
    endif

  endwhile

  ## return result
  p_res = pbest;
  objf = fbest;
  outp.niter = iter;
  outp.nobjf = nobjf;

  if (([stop, outp.user_interaction] = ...
       __do_user_interaction__ (user_interaction, p_res, struct ("iteration", iter, "fval", objf), "done")))
    cvg = -1;
  endif

endfunction
optim-1.4.1/inst/private/__sqp__.m0000644000175000017500000001060612477523726015517 0ustar olafolaf## Copyright (C) 2012 Olaf Till
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 3 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; If not, see <http://www.gnu.org/licenses/>.

function [p_res, objf, cvg, outp] = __sqp__ (f, pin, hook)

  n = length (pin);

  ## passed constraints
  mc = hook.mc; # matrix of linear constraints
  vc = hook.vc; # vector of linear constraints
  f_cstr = hook.f_cstr; # function of all constraints
  df_cstr = hook.df_cstr; # function of derivatives of all constraints
  n_gencstr = hook.n_gencstr; # number of non-linear constraints
  eq_idx = hook.eq_idx; # logical index of equality constraints in all
                        # constraints
  lbound = hook.lbound; # bounds, subset of linear inequality
  ubound = hook.ubound; # constraints in mc and vc

  ## passed values of constraints for initial parameters
  pin_cstr = hook.pin_cstr;

  ## passed return value of f for initial parameters
  f_pin = hook.f_pin;

  ## passed function for gradient of objective function
  grad_f = hook.dfdp;

  ## passed function for hessian of objective function
  if (isempty (hessian = hook.hessian))
    user_hessian = false;
    R = eye (n);
  else
    user_hessian = true;
  endif

  ## passed function for complementary pivoting
  cpiv = hook.cpiv;

  ## passed options
  ftol = hook.TolFun;
  niter = hook.MaxIter;
  if (isempty (niter)) niter = 20; endif
  fixed = hook.fixed;

  ## some useful variables derived from passed variables
  ##
  ## ...

  nz = 20 * eps; # This is arbitrary. Accuracy of equality constraints.

  ## backend-specific checking of options and constraints
  ##
  if (any (pin < lbound | pin > ubound) || any (pin_cstr.inequ.lin_except_bounds < 0) || any (pin_cstr.inequ.gen < 0) || any (abs (pin_cstr.equ.lin) >= nz) || any (abs (pin_cstr.equ.gen) >= nz))
    error ("Initial parameters violate constraints.");
  endif
  ##
  if (all (fixed))
    error ("no free parameters");
  endif

  ## fill constant fields of hook for derivative-functions; some fields
  ## may be backend-specific
  dfdp_hook.fixed = fixed; # this may be handled by the frontend, but
                           # the backend still may add to it

  ## set up for iterations
  p = pin;
  f = f_pin;
  n_iter = 0;
  done = false;

  while (! done)

    n_iter++;

    if (user_hessian)

      H = hessian (p);
      idx = isnan (H);
      H(idx) = H.'(idx);
      if (any (isnan (H(:))))
        error ("some second derivatives undefined by user function");
      endif
      if (! isreal (H))
        error ("second derivatives given by user function not real");
      endif
      if (! issymmetric (H))
        error ("Hessian returned by user function not symmetric");
      endif

      R = directional_discrimination (H);

    endif

  endwhile

  ## return result

endfunction

function R = directional_discrimination (A)

  ## A is expected to be real and symmetric without checking
  ##
  ## "Directional discrimination" (Bard, Nonlinear Parameter Estimation,
  ## Academic Press, 1974).
Compute R, which is "similar" to computing ## inv(H), but succeeds even if H is singular; R is positive definite ## and is tuned to avoid large steps of one parameter with respect to ## the steps of the others. ## make matrix Binv for scaling Binv = diag (A); nidx = ! (idx = Binv == 0); Binv(nidx) = 1 ./ sqrt (abs (Binv(nidx))); Binv(idx) = 1; Binv = diag (Binv); ## eigendecomposition of scaled hessian [V, L] = eig (Binv * A * Binv); ## A is symmetric, so V and L are real, delete any imaginary parts, ## which might occur due to inaccuracy V = real (V); L = real (L); ## actual directional discrimination, does not exactly follow Bard L = abs (diag (L)); # R should get positive definite L = max (L, .001 * max (L)); # avoids relatively large steps of # parameters G = Binv * V; R = G * diag (1 ./ L) * G.'; endfunction optim-1.4.1/inst/private/__null_optim__.m0000644000175000017500000001177312477523726017104 0ustar olafolaf## Copyright (C) 1994-2011 John W. Eaton ## ## This file is part of Octave. ## ## Octave is free software; you can redistribute it and/or modify it ## under the terms of the GNU General Public License as published by ## the Free Software Foundation; either version 3 of the License, or (at ## your option) any later version. ## ## Octave is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with Octave; see the file COPYING. If not, see ## . ## -*- texinfo -*- ## @deftypefn {Function File} {} null (@var{A}) ## @deftypefnx {Function File} {} null (@var{A}, @var{tol}) ## Return an orthonormal basis of the null space of @var{A}. ## ## The dimension of the null space is taken as the number of singular ## values of @var{A} not greater than @var{tol}. If the argument @var{tol} ## is missing, it is computed as ## ## @example ## max (size (@var{A})) * max (svd (@var{A})) * eps ## @end example ## @seealso{orth} ## @end deftypefn ## Author: KH ## Created: 24 December 1993. ## Adapted-By: jwe ## Adapted-By: Olaf Till ## This function has also been submitted to Octave (bug #33503). ## TODO: This changeset is only for the record, it corrects the mistake ## which triggered bug #43699. But it turned out that for a different ## reason the result is still wrong in such cases. The fix for the ## latter requires some rearrangement of code and will be done in a ## later changeset. function retval = __null_optim__ (A, tol) if (isempty (A)) retval = []; else [U, S, V] = svd (A); [rows, cols] = size (A); [S_nr, S_nc] = size (S); if (S_nr == 1 || S_nc == 1) s = S(1); else s = diag (S); endif if (nargin == 1) if (isa (A, "single")) tol = max (size (A)) * s (1) * (meps = eps ("single")); else tol = max (size (A)) * s (1) * (meps = eps); endif elseif (nargin != 2) print_usage (); endif rank = sum (s > tol); if (rank < cols) retval = V (:, rank+1:cols); if (rows >= cols) cb = columns (retval); if (cb > 1) ## For multidimensional null spaces LAPACK seems to return ## very large error angles (> pi/2) for the basis vectors, so ## we cannot use these angles to determine which elements of ## the basis vectors could be zero. In some of such cases ## LAPACK seems to set elements "meant" to be zero exactly to ## zero in the basis vectors, so we don't need to do anything. ## In the other cases, we can't do anything. 
else ## Set those elements of each vector to zero whose absolute ## values are smallest and which together could be zero ## without making the angle to the originally computed vector ## larger than given by the error bound. Do this in an ## approximative but numerically feasible way. ## The following code still treats the multidimensional case ## though it currently doesn't arrive here. ## error bounds of basis vectors in radians, see LAPACK user ## guide, http://www.netlib.org/lapack/lug/node96.html if (true) # test for Octave version once submitted patch is # applied to Octave (bug #33503) __disna__ = @ __disna_optim__; endif ## This deviates from the LAPACK reference by the factor "2 * ## max(size(A))". This deviation is chosen because the results ## in setting elements to zero are better so. ("tol" used ## above for the rank test also seems to deviate from LAPACK ## reference, by factor "max(size(A))". ebnd = 2 * tol ./ (__disna__ ("R", s, rows, cols)(rank+1:cols)); ## sort elements by magnitude sb = conj (retval) .* retval; [sb, idx] = sort (sb); idx += repmat (0:cols:cols*(cb-1), cols, 1); # for un-sorting ## norms of vectors made by all elements up to this sb = sqrt (cumsum (sb)); ## The norm of the vectors made up by elements settable to ## zero is small enough to be approximately equal to the angle ## between the full vectors before and after setting these ## elements to zero (considering the norms of the full vectors ## being 1). Index of approximated angles not exceeding error ## bound. zidx = sb <= repmat (ebnd.', cols, 1); ## set indexed elements to zero in original basis retval(idx(zidx)) = 0; endif else ## no error bounds computable with LAPACK ## this is from original null.m retval(abs (retval) < meps) = 0; endif else retval = zeros (cols, 0); endif endif endfunction optim-1.4.1/inst/private/__lm_svd__.m0000644000175000017500000004740712477523726016211 0ustar olafolaf## Copyright (C) 1992-1994 Richard Shrager ## Copyright (C) 1992-1994 Arthur Jutan ## Copyright (C) 1992-1994 Ray Muzic ## Copyright (C) 2010-2013 Olaf Till ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . function [p, resid, cvg, outp] = __lm_svd__ (F, pin, hook) ## This is a backend for optimization. This code was originally ## contained in leasqr.m, which is now a frontend. 
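  ## A hedged usage sketch (an illustrative addition, not original
  ## code): this backend is normally reached through frontends like
  ## leasqr or nonlin_residmin, e.g.
  ##
  ##   x = (0:0.1:5).';
  ##   y = 2 * exp (-0.6 * x) + 0.001 * randn (size (x));
  ##   F = @ (x, p) p(1) * exp (p(2) * x);
  ##   [fy, p] = leasqr (x, y, [1; -1], F); # p should approach [2; -0.6]
  ##
  ## the argument order (x, y, pin, F) follows leasqr's documentation;
  ## the hook fields used below are filled in by the frontend.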
## some backend specific defaults fract_prec_default = 0; max_fract_step_default = Inf; n = length (pin); ## passed constraints mc = hook.mc; # matrix of linear constraints vc = hook.vc; # vector of linear constraints f_cstr = hook.f_cstr; # function of all constraints df_cstr = hook.df_cstr; # function of derivatives of all constraints n_gencstr = hook.n_gencstr; # number of non-linear constraints eq_idx = hook.eq_idx; # logical index of equality constraints in all # constraints lbound = hook.lbound; # bounds, subset of linear inequality ubound = hook.ubound; # constraints in mc and vc ## passed values of constraints for initial parameters pin_cstr = hook.pin_cstr; ## passed return value of F for initial parameters f_pin = hook.f_pin; ## passed derivative of residual function dfdp = hook.dfdp; ## passed function for complementary pivoting cpiv = hook.cpiv; ## passed options maxstep = hook.max_fract_change; maxstep(isna (maxstep)) = max_fract_step_default; pprec = hook.fract_prec; pprec(isna (pprec)) = fract_prec_default; ## keep absolute precision positive for non-null relative precision; ## arbitrary value, added to parameters before multiplying with ## relative precision add_pprec = zeros (n, 1); add_pprec(pprec > 0) = sqrt (eps); ## stol = hook.TolFun; niter = hook.MaxIter; if (isempty (niter)) niter = 20; endif wt = hook.weights; fixed = hook.fixed; verbose = strcmp (hook.Display, "iter"); user_interaction = hook.user_interaction; ## only preliminary, for testing if (isfield (hook, "testing")) testing = hook.testing; else testing = false; endif if (isfield (hook, "new_s")) new_s = hook.new_s; else new_s = false; endif ## some useful variables derived from passed variables n_lcstr = size (vc, 1); have_constraints_except_bounds = ... n_lcstr + n_gencstr > ... sum (lbound ~= -Inf) + sum (ubound ~= Inf); wtl = wt(:); nz = 20 * eps; # This is arbitrary. Constraint function will be # regarded as <= zero if less than nz. ## backend-specific checking of options and constraints if (have_constraints_except_bounds) if (any (pin_cstr.inequ.lin_except_bounds < 0) || ... (n_gencstr > 0 && any (pin_cstr.inequ.gen < 0))) warning ("initial parameters violate inequality constraints"); endif if (any (abs (pin_cstr.equ.lin) >= nz) || ... 
(n_gencstr > 0 && any (abs (pin_cstr.equ.gen) >= nz))) warning ("initial parameters violate equality constraints"); endif endif idx = lbound == ubound; if (any (idx)) warning ("lower and upper bounds identical for some parameters, fixing the respective parameters"); fixed(idx) = true; endif if (all (fixed)) error ("no free parameters"); endif lidx = pin < lbound; uidx = pin > ubound; if (any (lidx | uidx) && have_constraints_except_bounds) warning ("initial parameters outside bounds, not corrected since other constraints are given"); else if (any (lidx)) warning ("some initial parameters set to lower bound"); pin(lidx, 1) = lbound(lidx, 1); endif if (any (uidx)) warning ("some initial parameters set to upper bound"); pin(uidx, 1) = ubound(uidx, 1); endif endif if (n_gencstr > 0 && any (~isinf (maxstep))) warning ("setting both a maximum fractional step change of parameters and general constraints may result in inefficiency and failure"); endif ## fill constant fields of hook for derivative-functions; some fields ## may be backend-specific dfdp_hook.fixed = fixed; # this may be handled by the frontend, but # the backend still may add to it ## set up for iterations ## p = pin; f = f_pin; fbest=f; pbest=p; m = prod (size (f)); r = wt .* f; r = r(:); if (~isreal (r)) error ("weighted residuals are not real"); endif ss = r.' * r; sbest=ss; chgprev=Inf*ones(n,1); cvg=0; epsLlast=1; epstab=[.1, 1, 1e2, 1e4, 1e6]; ac_idx = true (n_lcstr + n_gencstr, 1); # all constraints nc_idx = false (n_lcstr + n_gencstr, 1); # none of all constraints gc_idx = cat (1, false (n_lcstr, 1), true (n_gencstr, 1)); # gen. constr. lc_idx = ~gc_idx; if (([stop, outp.user_interaction] = ... __do_user_interaction__ (user_interaction, p, struct ("iteration", 0, "residual", f), "init"))) outp.niter = 0; resid = f; cvg = -1; return; endif ## do iterations ## for iter = 1:niter deb_printf (testing, "\nstart outer iteration\n"); v_cstr = f_cstr (p, ac_idx); ## index of active constraints c_act = v_cstr < nz | eq_idx; # equality constraints might be # violated at start if (any (c_act)) if (n_gencstr > 0) ## full gradient is needed later dct = df_cstr (p, ac_idx, ... setfield (dfdp_hook, "f", v_cstr)); dct(:, fixed) = 0; # for user supplied dfdp; necessary? dcat = dct(c_act, :); else dcat = df_cstr (p, c_act, ... setfield (dfdp_hook, "f", v_cstr)); dcat(:, fixed) = 0; # for user supplied dfdp; necessary? endif dca = dcat.'; endif nrm = zeros (1, n); pprev=pbest; prt = dfdp (p, setfield (dfdp_hook, "f", fbest(:))); prt(:, fixed) = 0; # for user supplied dfdp; necessary? r = wt .* -fbest; r = r(:); if (~isreal (r)) error ("weighted residuals are not real"); endif sprev=sbest; sgoal=(1-stol)*sprev; msk = ~fixed; prt(:, msk) = prt(:, msk) .* wtl(:, ones (1, sum (msk))); nrm(msk) = sumsq (prt(:, msk), 1); msk = nrm > 0; nrm(msk) = 1 ./ sqrt (nrm(msk)); prt = prt .* nrm(ones (1, m), :); nrm = nrm.'; [prt,s,v]=svd(prt,0); s=diag(s); g = prt.' * r; for jjj=1:length(epstab) deb_printf (testing, "\nstart inner iteration\n"); epsL = max(epsLlast*epstab(jjj),1e-7); ## printf ("epsL: %e\n", epsL); # for testing ## Usage of this "ser" later is equivalent to pre-multiplying the ## gradient with a positive-definit matrix, but not with a ## diagonal matrix, at epsL -> Inf; so there is a fallback to ## gradient descent, but not in general to descent for each ## gradient component. 
Using the commented-out "ser" ((1 / (1 + ## epsL^2)) * (1 ./ se + epsL * s)) would be equivalent to using ## Marquardts diagonal of the Hessian-approximation for epsL -> ## Inf, but currently this gives no advantages in tests, even with ## constraints. ### ser = 1 ./ sqrt((s.*s)+epsL); se = sqrt ((s.*s) + epsL); if (new_s) ## for testing ser = (1 / (1 + epsL^2)) * (1 ./ se + epsL * s); else ser = 1 ./ se; endif tp1 = (v * (g .* ser)) .* nrm; if (any (c_act)) deb_printf (testing, "constraints are active:\n"); deb_printf (testing, "%i\n", c_act); ## calculate chg by "quadratic programming" nrme= diag (nrm); ser2 = diag (ser .* ser); mfc1 = nrme * v * ser2 * v.' * nrme; tp2 = mfc1 * dca; a_eq_idx = eq_idx(c_act); [lb, bidx, ridx, tbl] = cpiv (dcat * tp1, dcat * tp2, a_eq_idx); chg = tp1 + tp2(:, bidx) * lb; # if a parameter is "fixed", # the respective component of chg should # be zero too, even here (with active # constraints) deb_printf (testing, "change:\n"); deb_printf (testing, "%e\n", chg); deb_printf (testing, "\n"); ## indices for different types of constraints c_inact = ~c_act; # inactive constraints c_binding = nc_idx; c_binding(c_act) = bidx; # constraints selected binding c_unbinding = nc_idx; c_unbinding(c_act) = ridx; # constraints unselected binding c_nonbinding = c_act & ~(c_binding | c_unbinding); # constraints # selected non-binding else ## chg is the Levenberg/Marquardt step chg = tp1; ## indices for different types of constraints c_inact = ac_idx; # inactive constraints consist of all # constraints c_binding = nc_idx; c_unbinding = nc_idx; c_nonbinding = nc_idx; endif ## apply constraints to step width (since this is a ## Levenberg/Marquardt algorithm, no line-search is performed ## here) k = 1; c_tp = c_inact(1:n_lcstr); mcit = mc(:, c_tp).'; vci = vc(c_tp); hstep = mcit * chg; idx = hstep < 0; if (any (idx)) k = min (1, min (- (vci(idx) + mcit(idx, :) * pprev) ./ ... hstep(idx))); endif if (k < 1) deb_printf (testing, "stepwidth: linear constraints\n"); endif if (n_gencstr > 0) c_tp = gc_idx & (c_nonbinding | c_inact); if (any (c_tp) && any (f_cstr (pprev + k * chg, c_tp) < 0)) [k, fval, info] = ... fzero (@ (x) min (cat (1, ... f_cstr (pprev + x * chg, c_tp), ... k - x, ... ifelse (x < 0, -Inf, Inf))), ... 0); if (info ~= 1 || abs (fval) >= nz) error ("could not find stepwidth to satisfy inactive and non-binding general inequality constraints"); endif deb_printf (testing, "general constraints limit stepwidth\n"); endif endif chg = k * chg; if (any (gc_idx & c_binding)) # none selected binding => # none unselected binding deb_printf (testing, "general binding constraints must be regained:\n"); ## regain binding constraints and one of the possibly active ## previously inactive or non-binding constraints ptp1 = pprev + chg; tp = true; nt_nosuc = true; lim = 20; while (nt_nosuc && lim >= 0) deb_printf (testing, "starting from new value of p in regaining:\n"); deb_printf (testing, "%e\n", ptp1); ## we keep d_p.' * inv (mfc1) * d_p minimal in each step of ## the inner loop; this is both sensible (this metric ## considers a guess of curvature of sum of squared residuals) ## and convenient (we have useful matrices available for it) c_tp0 = c_inact | c_nonbinding; c_tp1 = c_inact | (gc_idx & c_nonbinding); btbl = tbl(bidx, bidx); c_tp2 = c_binding; if (any (tp)) # if none before, does not get true again tp = f_cstr (ptp1, c_tp1) < nz; if (any (tp)) # could be less clumsy, but ml-compatibility.. 
                ## keep only the first true entry in tp
                tp(tp) = logical (cat (1, 1, zeros (sum (tp) - 1, 1)));
                ## supplement binding index with one (the first) getting
                ## binding in c_tp1
                c_tp2(c_tp1) = tp;
                ## gradient of this added constraint
                caddt = dct(c_tp2 & ~c_binding, :);
                cadd = caddt.';
                C = dct(c_binding, :) * mfc1 * cadd;
                Ct = C.';
                G = [btbl, btbl * C; ...
                     -Ct * btbl, caddt * mfc1 * cadd - Ct * btbl * C];
                btbl = gjp (G, size (G, 1));
              endif
            endif
            dcbt = dct(c_tp2, :);
            mfc = - mfc1 * dcbt.' * btbl;

            deb_printf (testing, "constraints to regain:\n");
            deb_printf (testing, "%i\n", c_tp2);

            ptp2 = ptp1;
            nt_niter_start = 100;
            nt_niter = nt_niter_start;
            while (nt_nosuc && nt_niter >= 0)
              hv = f_cstr (ptp2, c_tp2);
              if (all (abs (hv) < nz))
                nt_nosuc = false;
                chg = ptp2 - pprev;
              else
                ptp2 = ptp2 + mfc * hv; # step should be zero for each
                                        # component for which the parameter is
                                        # "fixed"
              endif
              nt_niter = nt_niter - 1;
            endwhile
            deb_printf (testing, "constraints after regaining:\n");
            deb_printf (testing, "%e\n", hv);
            if (nt_nosuc || ...
                any (abs (chg) > abs (pprev .* maxstep)) || ...
                any (f_cstr (ptp2, c_tp0) < -nz))
              if (nt_nosuc)
                deb_printf (testing, "regaining did not converge\n");
              else
                deb_printf (testing, "regaining violated type 3 and 4\n");
              endif
              nt_nosuc = true;
              ptp1 = (pprev + ptp1) / 2;
            endif
            if (~nt_nosuc)
              tp = f_cstr (ptp2, c_unbinding);
              if (any (tp < 0)) # again ml-compatibility clumsiness..
                [discarded, id] = min(tp);
                tid = find (ridx);
                id = tid(id); # index within active constraints
                unsuccessful_exchange = false;
                if (abs (tbl(id, id)) < nz) # Bard: not absolute value
                  ## exchange this unselected binding constraint against a
                  ## binding constraint, but not against an equality
                  ## constraint
                  tbidx = bidx & ~a_eq_idx;
                  if (~any (tbidx))
                    unsuccessful_exchange = true;
                  else
                    [discarded, idm] = max (abs (tbl(tbidx, id)));
                    tid = find (tbidx);
                    idm = tid(idm); # -> index within active constraints
                    tbl = gjp (tbl, idm);
                    bidx(idm) = false;
                    ridx(idm) = true;
                  endif
                endif
                if (unsuccessful_exchange)
                  ## It probably doesn't look good now; this desperate
                  ## last attempt is not in the original algorithm, since
                  ## that didn't account for equality constraints.
                  ptp1 = (pprev + ptp1) / 2;
                else
                  tbl = gjp (tbl, id);
                  bidx(id) = true;
                  ridx(id) = false;
                  c_binding = nc_idx;
                  c_binding(c_act) = bidx;
                  c_unbinding = nc_idx;
                  c_unbinding(c_act) = ridx;
                endif
                nt_nosuc = true;
                deb_printf (testing, "regaining violated type 2\n");
              endif
            endif
            if (~nt_nosuc)
              deb_printf (testing, "regaining successful, converged with %i iterations:\n", ...
                          nt_niter_start - nt_niter);
              deb_printf (testing, "%e\n", ptp2);
            endif
            lim = lim - 1;
          endwhile
          if (nt_nosuc)
            error ("could not regain binding constraints");
          endif
        else
          ## check the maximal stepwidth and apply as necessary
          ochg=chg;
          idx = ~isinf(maxstep);
          limit = abs(maxstep(idx).*pprev(idx));
          chg(idx) = min(max(chg(idx),-limit),limit);
          if (verbose && any(ochg ~= chg))
            disp(['Change in parameter(s): ', ...
                  sprintf("%d ",find(ochg ~= chg)), "maximal fractional stepwidth enforced"]);
          endif
        endif
        aprec = pprec .* (abs (pbest) + add_pprec);
        ## ss=scalar sum of squares=sum((wt.*f)^2).
        if (any(abs(chg) > 0.1*aprec))#--- # only worth evaluating
                                           # function if there is some non-minuscule
                                           # change
          ## In the code of the outer loop before the inner loop pbest is
          ## actually identical to p, since once they deviate, the outer
          ## loop will not be repeated. Though the inner loop can still be
          ## repeated in this case, pbest is not used in it.
        aprec = pprec .* (abs (pbest) + add_pprec);
        ## ss = scalar sum of squares = sum ((wt.*f)^2).
        if (any (abs (chg) > 0.1 * aprec)) #--- # only worth evaluating
                                # function if there is some non-minuscule
                                # change
          ## In the code of the outer loop before the inner loop, pbest
          ## is actually identical to p, since once they deviate, the
          ## outer loop will not be repeated. Though the inner loop can
          ## still be repeated in this case, pbest is not used in it.
          ## Since pprev is set from pbest in the outer loop before the
          ## inner loop, it is also identical to p up to here.
          p = chg + pprev;
          ## since the projection method may have slightly violated
          ## constraints due to inaccuracy, correct parameters to bounds
          ## --- but only if no further constraints are given, otherwise
          ## the inaccuracy in honoring them might increase by this
          skipped = false;
          if (~have_constraints_except_bounds)
            lidx = p < lbound;
            uidx = p > ubound;
            p(lidx, 1) = lbound(lidx, 1);
            p(uidx, 1) = ubound(uidx, 1);
            chg(lidx, 1) = p(lidx, 1) - pprev(lidx, 1);
            chg(uidx, 1) = p(uidx, 1) - pprev(uidx, 1);
          endif
          ##
          f = F (p);
          r = wt .* f;
          r = r(:);
          if (~isreal (r))
            error ("weighted residuals are not real");
          endif
          ss = r.' * r;
          deb_printf (testing, "tried parameters:\n");
          deb_printf (testing, "%.16e\n", p);
          deb_printf (testing, "sbest: %.16e\n", sbest);
          deb_printf (testing, "sgoal: %.16e\n", sgoal);
          deb_printf (testing, "   ss: %.16e\n", ss);
          if (ss >= sgoal) # >=, not >, since sgoal can be equal to
                           # sprev if TolFun <= eps
            cvg = 3;
            break;
          endif
        aprec = pprec .* (abs (pbest) + add_pprec);
        ## [aprec, chg, chgprev]
        if (all (abs (chg) <= aprec) && all (abs (chgprev) <= aprec))
          cvg = 2;
          if (verbose)
            fprintf ("Parameter changes converged to specified precision\n");
          endif
          break;
        else
          chgprev = chg;
        endif
  endfor

  ## set further return values
  ##
  p = pbest;
  resid = fbest;
  outp.niter = iter;
  if (([stop, outp.user_interaction] = ...
       __do_user_interaction__ (user_interaction, pbest,
                                struct ("iteration", iter,
                                        "residual", fbest),
                                "done")))
    cvg = -1;
  endif

endfunction

function deb_printf (do_printf, varargin)

  ## for testing

  if (do_printf)
    printf (varargin{:})
  endif

endfunction
optim-1.4.1/inst/private/__residmin_stat__.m0000644000175000017500000004322412477523726017563 0ustar olafolaf## Copyright (C) 2011-2014 Olaf Till
##
## This program is free software; you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation; either version 3 of the License, or
## (at your option) any later version.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
## GNU General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with this program; If not, see <http://www.gnu.org/licenses/>.

## Internal function, called by residmin_stat --- see there --- and
## others. Calling __residmin_stat__ indirectly hides the argument
## "hook", usable by wrappers, from users. Currently, hook can contain
## the field "observations". Since much of the interface code is taken
## from __nonlin_residmin__, it may be that not everything is ideal for
## the present case; but I think it's all right to leave it so.
##
## Some general considerations while making this function:
##
## Different functions for optimization statistics should be made for
## mere objective function optimization (to be made yet) and
## residual-derived minimization (this function), since there are
## different computing aspects. Don't put the contained functionality
## (statistics) into the respective optimization functions (or
## backends), since different optimization algorithms can share a way to
## compute statistics (e.g. even stochastic optimizers can minimize
## (weighted) squares of residuals).
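## (An illustrative sketch by the editor, not from the original text,
## with hypothetical names "some_optimizer" and "model_f": after
##
##   p = some_optimizer (@ (p) sumsq (wt(:) .* (model_f (p) - y)(:)), pin);
##
## statistics for p can be obtained afterwards through the frontend with
##
##   settings = optimset ("objf_type", "wls", "ret_covp", true, ...
##                        "weights", wt);
##   info = residmin_stat (@ (p) model_f (p) - y, p, settings);
##
## without involving the optimizer again.)
##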
Also, don't use the same frontend ## for optimization and statistics, since the differences in the ## interface for both uses may be confusing otherwise, also the optimset ## options only partially overlap. function ret = __residmin_stat__ (f, pfin, settings, hook) ## scalar defaults diffp_default = .001; cstep_default = 1e-20; if (nargin == 1 && ischar (f) && strcmp (f, "defaults")) ret = optimset ("param_config", [], ... "param_order", [], ... "param_dims", [], ... "f_pstruct", false, ... "df_pstruct", false, ... "dfdp", [], ... "diffp", [], ... "diff_onesided", [], ... "complex_step_derivative_f", false, ... "cstep", cstep_default, ... "fixed", [], ... "weights", [], ... "residuals", [], ... "covd", [], ... "objf_type", [], ... # no default, e.g. "wls" "ret_dfdp", false, ... "ret_covd", false, ... "ret_covp", false, ... "ret_corp", false); return; endif assign = @ assign; # Is this faster in repeated calls? if (nargin != 4) error ("incorrect number of arguments"); endif if (ischar (f)) f = str2func (f); endif if (! (p_struct = isstruct (pfin))) if (! isempty (pfin) && (! isvector (pfin) || columns (pfin) > 1)) error ("parameters must be either a structure or a column vector"); endif endif #### processing of settings and consistency checks pconf = optimget (settings, "param_config"); pord = optimget (settings, "param_order"); pdims = optimget (settings, "param_dims"); f_pstruct = optimget (settings, "f_pstruct", false); dfdp_pstruct = optimget (settings, "df_pstruct", f_pstruct); dfdp = optimget (settings, "dfdp"); if (ischar (dfdp)) dfdp = str2func (dfdp); endif if (isstruct (dfdp)) dfdp_pstruct = true; endif diffp = optimget (settings, "diffp"); diff_onesided = optimget (settings, "diff_onesided"); fixed = optimget (settings, "fixed"); residuals = optimget (settings, "residuals"); do_cstep = optimget (settings, "complex_step_derivative_f", false); cstep = optimget (settings, "cstep", cstep_default); if (do_cstep && ! isempty (dfdp)) error ("both 'complex_step_derivative_f' and 'dfdp' are set"); endif any_vector_conf = ! (isempty (diffp) && isempty (diff_onesided) && ... isempty (fixed)); ## correct "_pstruct" settings if functions are not supplied if (isempty (dfdp)) dfdp_pstruct = false; endif if (isempty (f)) f_pstruct = false; endif ## some settings require a parameter order if (p_struct || ! isempty (pconf) || f_pstruct || dfdp_pstruct) if (isempty (pord)) if (p_struct) if (any_vector_conf || ... ! ((f_pstruct || isempty (f)) && ... (dfdp_pstruct || isempty (dfdp)))) error ("no parameter order specified and constructing a parameter order from the structure of parameters can not be done since not all configuration or given functions are structure based"); else pord = fieldnames (pfin); endif else error ("given settings require specification of parameter order or parameters in the form of a structure"); endif endif pord = pord(:); if (p_struct && ! all (isfield (pfin, pord))) error ("some parameters lacking"); endif if ((nnames = rows (unique (pord))) < rows (pord)) error ("duplicate parameter names in 'param_order'"); endif if (isempty (pdims)) if (p_struct) pdims = cellfun ... (@ size, fields2cell (pfin, pord), "UniformOutput", false); else pdims = num2cell (ones (nnames, 2), 2); endif else pdims = pdims(:); if (p_struct && ... ! all (cellfun (@ (x, y) prod (size (x)) == prod (y), ... 
struct2cell (pfin), pdims))) error ("given param_dims and dimensions of parameters do not match"); endif endif if (nnames != rows (pdims)) error ("lengths of 'param_order' and 'param_dims' not equal"); endif pnel = cellfun (@ prod, pdims); ppartidx = pnel; if (any (pnel > 1)) pnonscalar = true; cpnel = num2cell (pnel); prepidx = cat (1, cellfun ... (@ (x, n) x(ones (1, n), 1), ... num2cell ((1:nnames).'), cpnel, ... "UniformOutput", false){:}); epord = pord(prepidx, 1); psubidx = cat (1, cellfun ... (@ (n) (1:n).', cpnel, ... "UniformOutput", false){:}); else pnonscalar = false; # some less expensive interfaces later prepidx = (1:nnames).'; epord = pord; psubidx = ones (nnames, 1); endif else pord = []; # spares checks for given but not needed endif if (p_struct) np = sum (pnel); else np = length (pfin); if (! isempty (pord) && np != sum (pnel)) error ("number of initial parameters not correct"); endif endif if (isnumeric (dfdp) && ! isempty (dfdp) && np == 0) np = columns (dfdp); endif plabels = num2cell (num2cell ((1:np).')); if (! isempty (pord)) plabels = cat (2, plabels, num2cell (epord), ... num2cell (num2cell (psubidx))); endif ## some useful vectors zerosvec = zeros (np, 1); falsevec = false (np, 1); sizevec = [np, 1]; ## collect parameter-related configuration if (! isempty (pconf)) ## use supplied configuration structure ## parameter-related configuration is either allowed by a structure ## or by vectors if (any_vector_conf) error ("if param_config is given, its potential items must not \ be configured in another way"); endif ## supplement parameter names lacking in param_config nidx = ! isfield (pconf, pord); pconf = cell2fields ({struct()}(ones (1, sum (nidx))), ... pord(nidx), 2, pconf); pconf = structcat (1, fields2cell (pconf, pord){:}); ## in the following, use reshape with explicit dimensions (instead ## of x(:)) so that errors are thrown if a configuration item has ## incorrect number of elements diffp = zerosvec; diffp(:) = diffp_default; if (isfield (pconf, "diffp")) idx = ! fieldempty (pconf, "diffp"); if (pnonscalar) diffp(idx(prepidx)) = ... cat (1, cellfun (@ (x, n) reshape (x, n, 1), ... {pconf(idx).diffp}.', cpnel(idx), ... "UniformOutput", false){:}); else diffp(idx) = [pconf.diffp]; endif endif diff_onesided = fixed = falsevec; if (isfield (pconf, "diff_onesided")) idx = ! fieldempty (pconf, "diff_onesided"); if (pnonscalar) diff_onesided(idx(prepidx)) = ... logical ... (cat (1, cellfun (@ (x, n) reshape (x, n, 1), ... {pconf(idx).diff_onesided}.', cpnel(idx), ... "UniformOutput", false){:})); else diff_onesided(idx) = logical ([pconf.diff_onesided]); endif endif if (isfield (pconf, "fixed")) idx = ! fieldempty (pconf, "fixed"); if (pnonscalar) fixed(idx(prepidx)) = ... logical ... (cat (1, cellfun (@ (x, n) reshape (x, n, 1), ... {pconf(idx).fixed}.', cpnel(idx), ... 
"UniformOutput", false){:})); else fixed(idx) = logical ([pconf.fixed]); endif endif else ## use supplied configuration vectors if (isempty (diffp)) diffp = zerosvec; diffp(:) = diffp_default; else if (any (size (diffp) != sizevec)) error ("diffp: wrong dimensions"); endif diffp(isna (diffp)) = diffp_default; endif if (isempty (diff_onesided)) diff_onesided = falsevec; else if (any (size (diff_onesided) != sizevec)) error ("diff_onesided: wrong dimensions") endif diff_onesided(isna (diff_onesided)) = false; diff_onesided = logical (diff_onesided); endif if (isempty (fixed)) fixed = falsevec; else if (any (size (fixed) != sizevec)) error ("fixed: wrong dimensions"); endif fixed(isna (fixed)) = false; fixed = logical (fixed); endif endif #### consider whether parameters and functions are based on parameter #### structures or parameter vectors; wrappers for call to default #### function for jacobians ## parameters if (p_struct) if (pnonscalar) pfin = cat (1, cellfun (@ (x, n) reshape (x, n, 1), ... fields2cell (pfin, pord), cpnel, ... "UniformOutput", false){:}); else pfin = cat (1, fields2cell (pfin, pord){:}); endif endif ## model function if (f_pstruct) if (pnonscalar) f = @ (p, varargin) ... f (cell2struct ... (cellfun (@ reshape, mat2cell (p, ppartidx), ... pdims, "UniformOutput", false), ... pord, 1), varargin{:}); else f = @ (p, varargin) ... f (cell2struct (num2cell (p), pord, 1), varargin{:}); endif endif if (isempty (residuals)) if (isempty (f)) error ("neither model function nor residuals given"); endif residuals = f (pfin); endif if (isfield (hook, "observations")) if (any (size (residuals) != size (obs = hook.observations))) error ("dimensions of observations and values of model function must match"); endif f = @ (varargin) f (varargin{:}) - obs; residuals -= obs; endif ## jacobian of model function if (isempty (dfdp)) if (! isempty (f)) if (do_cstep) dfdp = @ (p, hook) jacobs (p, f, hook); else dfdp = @ (p, hook) __dfdp__ (p, f, hook); endif endif elseif (! isa (dfdp, "function_handle")) if (isnumeric (dfdp)) if (numel (size_dfdp = size (dfdp)) > 2 || ... any (size_dfdp != [prod(size(residuals)), np])) error ("jacobian has wrong size"); endif elseif (! dfdp_pstruct) error ("jacobian has wrong type"); endif dfdp = @ (varargin) dfdp; # simply make a function returning it endif if (dfdp_pstruct) if (pnonscalar) dfdp = @ (p, hook) ... cat (2, ... fields2cell ... (dfdp (cell2struct ... (cellfun (@ reshape, mat2cell (p, ppartidx), ... pdims, "UniformOutput", false), ... pord, 1), hook), ... pord){:}); else dfdp = @ (p, hook) ... cat (2, ... fields2cell ... (dfdp (cell2struct (num2cell (p), pord, 1), hook), ... pord){:}); endif endif ## parameter-related configuration for jacobian function if (dfdp_pstruct) if(pnonscalar) s_diffp = cell2struct ... (cellfun (@ reshape, mat2cell (diffp, ppartidx), ... pdims, "UniformOutput", false), pord, 1); s_diff_onesided = cell2struct ... (cellfun (@ reshape, mat2cell (diff_onesided, ppartidx), ... pdims, "UniformOutput", false), pord, 1); s_plabels = cell2struct ... (num2cell ... (cat (2, cellfun ... (@ (x) cellfun ... (@ reshape, mat2cell (cat (1, x{:}), ppartidx), ... pdims, "UniformOutput", false), ... num2cell (plabels, 1), "UniformOutput", false){:}), ... 2), ... pord, 1); s_orig_fixed = cell2struct ... (cellfun (@ reshape, mat2cell (fixed, ppartidx), ... 
pdims, "UniformOutput", false), pord, 1);
    else
      s_diffp = cell2struct (num2cell (diffp), pord, 1);
      s_diff_onesided = cell2struct (num2cell (diff_onesided), pord, 1);
      s_plabels = cell2struct (num2cell (plabels, 2), pord, 1);
      s_orig_fixed = cell2struct (num2cell (fixed), pord, 1);
    endif
  endif

  #### further values and checks

  ## check weights dimensions
  weights = optimget (settings, "weights", ones (size (residuals)));
  if (any (size (weights) != size (residuals)))
    error ("dimension of weights and residuals must match");
  endif

  #### collect remaining settings
  need_dfdp = false;
  covd = optimget (settings, "covd");
  need_objf_label = false;
  if ((ret_dfdp = optimget (settings, "ret_dfdp", false)))
    need_dfdp = true;
  endif
  if ((ret_covd = optimget (settings, "ret_covd", false)))
    need_objf_label = true;
    if (np == 0)
      error ("number of parameters must be known for 'covd', specify either parameters or a jacobian matrix");
    endif
  endif
  if ((ret_covp = optimget (settings, "ret_covp", false)))
    need_objf_label = true;
    need_dfdp = true;
  endif
  if ((ret_corp = optimget (settings, "ret_corp", false)))
    need_objf_label = true;
    need_dfdp = true;
  endif
  if (need_objf_label)
    if (isempty (objf = optimget (settings, "objf_type")))
      error ("label of objective function must be specified");
    else
      funs = map_objf (objf);
    endif
  else
    funs = struct ();
  endif
  if (isempty (dfdp) && need_dfdp)
    error ("jacobian required and default function for jacobian requires a model function");
  endif

  ####
  ## Everything which is computed is stored in a hook structure which is
  ## passed to and returned by every backend function. This hook is not
  ## identical to the returned structure, since some more results could
  ## be computed by the way.

  #### handle fixing of parameters
  orig_p = pfin;

  if (all (fixed) && ! isempty (fixed))
    error ("no free parameters");
  endif

  ## The policy should be that everything which is computed is left as
  ## it is up to the end --- since other computations might need it in
  ## this form --- and supplemented with values corresponding to fixed
  ## parameters (mostly NA, probably) not until then.
  ##
  ## The jacobian backend is the only backend which has the whole
  ## parameter vector available (including fixed elements), possibly
  ## handling fixing internally (e.g. by omitting computation).

  nonfixed = ! fixed;
  np_after_fixing = sum (nonfixed);
  if (any (fixed))
    if (! isempty (pfin))
      pfin = pfin(nonfixed);
    endif
    ## model function
    f = @ (p, varargin) f (assign (orig_p, nonfixed, p), varargin{:});
    ## jacobian of model function
    if (! isempty (dfdp))
      dfdp = @ (p, hook) ...
          dfdp (assign (orig_p, nonfixed, p), hook)(:, nonfixed);
    endif
  endif

  #### supplement constants to jacobian function
  if (dfdp_pstruct)
    dfdp = @ (p, hook) ...
        dfdp (p, cell2fields ...
              ({s_diffp, s_diff_onesided, s_plabels, s_orig_fixed, cstep}, ...
               {"diffp", "diff_onesided", "plabels", "fixed", "h"}, ...
               2, hook));
  else
    if (! isempty (dfdp))
      dfdp = @ (p, hook) ...
          dfdp (p, cell2fields ...
                ({diffp, diff_onesided, plabels, fixed, cstep}, ...
                 {"diffp", "diff_onesided", "plabels", "fixed", "h"}, ...
2, hook)); endif endif #### prepare interface hook ## passed final parameters of an optimization hook.pfin = pfin; ## passed function for derivative of model function hook.dfdp = dfdp; ## passed function for complementary pivoting ## hook.cpiv = cpiv; # set before ## passed value of residual function for initial parameters hook.residuals = residuals; ## passed weights hook.weights = weights; ## passed dimensions hook.np = np_after_fixing; hook.nm = prod (size (residuals)); ## passed statistics functions hook.funs = funs; ## passed covariance matrix of data (if given by user) if (! isempty (covd)) covd_dims = size (covd); if (length (covd_dims) != 2 || any (covd_dims != hook.nm)) error ("wrong dimensions of covariance matrix of data"); endif hook.covd = covd; endif #### do the actual work if (ret_dfdp) hook.jac = hook.dfdp (hook.pfin, hook); endif if (ret_covd) hook = funs.covd (hook); endif if (ret_covp || ret_corp) hook = funs.covp_corp (hook); endif #### convert (consider fixing ...) and return results ret = struct (); if (ret_dfdp) ret.dfdp = zeros (hook.nm, np); ret.dfdp(:, nonfixed) = hook.jac; endif if (ret_covd) ret.covd = hook.covd; endif if (ret_covp) if (any (fixed)) ret.covp = NA (np); ret.covp(nonfixed, nonfixed) = hook.covp; else ret.covp = hook.covp; endif endif if (ret_corp) if (any (fixed)) ret.corp = NA (np); ret.corp(nonfixed, nonfixed) = hook.corp; else ret.corp = hook.corp; endif endif endfunction function funs = map_objf (objf) switch (objf) case "wls" # weighted least squares funs.covd = str2func ("__covd_wls__"); funs.covp_corp = str2func ("__covp_corp_wls__"); otherwise error ("no statistics implemented for objective function '%s'", ... objf); endswitch endfunction function lval = assign (lval, lidx, rval) lval(lidx) = rval; endfunction optim-1.4.1/inst/private/__s2mat__.m0000644000175000017500000000343312477523726015742 0ustar olafolaf## Copyright (C) 2010-2013 Olaf Till ## ## This program is free software; you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation; either version 3 of the License, or ## (at your option) any later version. ## ## This program is distributed in the hope that it will be useful, ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## GNU General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with this program; If not, see . ## __s2mat__ (s, ord) ## ## Returns a matrix of second derivatives with respect to some ## parameters from the structure-based representation of such a matrix ## in s, using the order of parameter names ord. s has to contain all ## fields named in ord. Each field has some subfields named in ord so ## that each second derivative is represented at least in one of its two ## possible orders. If it is represented differently in both orders, no ## error is returned, but both entries might get into the final matrix ## at symmetric positions. ## ## Should be included as a subfunction of a wrapper for optimization ## functions possibly needing a Hessian. function ret = __s2mat__ (s, ord) if (any (size (s) != [1, 1])) error ("structure must be scalar"); endif if (! (iscell (ord) && isvector (ord))) error ("ord must be a one-dimensional cell-array"); endif c = fields2cell (structcat (1, fields2cell (s, ord){:}), ord); neidx = ! 
(eidx = cellfun ("isempty", c)); ret = zeros (length (ord)); ret(neidx) = [c{neidx}]; # faster than [c{:}] ? ret(eidx) = ret.'(eidx); endfunctionoptim-1.4.1/inst/private/__nonlin_residmin__.m0000644000175000017500000010073012477523726020101 0ustar olafolaf## Copyright (C) 2010-2014 Olaf Till ## ## This program is free software; you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation; either version 3 of the License, or ## (at your option) any later version. ## ## This program is distributed in the hope that it will be useful, ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## GNU General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with this program; If not, see . ## Internal function, called by nonlin_residmin --- see there --- and ## others. Calling __nonlin_residmin__ indirectly hides the argument ## "hook", usable by wrappers, from users. Currently, hook can contain ## the field "observations", so that dimensions of observations and ## returned values of unchanged model function can be checked against ## each other exactly one time. function [p, resid, cvg, outp] = ... __nonlin_residmin__ (f, pin, settings, hook) ## some scalar defaults; some defaults are backend specific, so ## lacking elements in respective constructed vectors will be set to ## NA here in the frontend diffp_default = .001; stol_default = .0001; cstep_default = 1e-20; if (nargin == 1 && ischar (f) && strcmp (f, "defaults")) p = optimset ("param_config", [], ... "param_order", [], ... "param_dims", [], ... "f_inequc_pstruct", false, ... "f_equc_pstruct", false, ... "f_pstruct", false, ... "df_inequc_pstruct", false, ... "df_equc_pstruct", false, ... "df_pstruct", false, ... "lbound", [], ... "ubound", [], ... "dfdp", [], ... "cpiv", @ cpiv_bard, ... "max_fract_change", [], ... "fract_prec", [], ... "diffp", [], ... "diff_onesided", [], ... "complex_step_derivative_f", false, ... "complex_step_derivative_inequc", false, ... "complex_step_derivative_equc", false, ... "cstep", cstep_default, ... "fixed", [], ... "inequc", [], ... "equc", [], ... "f_inequc_idx", false, ... "df_inequc_idx", false, ... "f_equc_idx", false, ... "df_equc_idx", false, ... "weights", [], ... "TolFun", stol_default, ... "MaxIter", [], ... "Display", "off", ... "Algorithm", "lm_svd_feasible", ... "parallel_local", false, ... # Matlabs UseParallel # works differently "plot_cmd", @ (f) 0, ... "user_interaction", {}, ... "debug", false, ... "FunValCheck", "off", ... "lm_svd_feasible_alt_s", false); return; endif assign = @ assign; # Is this faster in repeated calls? if (nargin != 4) error ("incorrect number of arguments"); endif if (ischar (f)) f = str2func (f); endif if (! (pin_struct = isstruct (pin))) if (! isvector (pin) || columns (pin) > 1) error ("initial parameters must be either a structure or a column vector"); endif endif #### processing of settings and consistency checks pconf = optimget (settings, "param_config"); pord = optimget (settings, "param_order"); pdims = optimget (settings, "param_dims"); f_inequc_pstruct = optimget (settings, "f_inequc_pstruct", false); f_equc_pstruct = optimget (settings, "f_equc_pstruct", false); f_pstruct = optimget (settings, "f_pstruct", false); dfdp_pstruct = optimget (settings, "df_pstruct", f_pstruct); df_inequc_pstruct = optimget (settings, "df_inequc_pstruct", ... 
f_inequc_pstruct); df_equc_pstruct = optimget (settings, "df_equc_pstruct", ... f_equc_pstruct); lbound = optimget (settings, "lbound"); ubound = optimget (settings, "ubound"); dfdp = optimget (settings, "dfdp"); if (ischar (dfdp)) dfdp = str2func (dfdp); endif max_fract_change = optimget (settings, "max_fract_change"); fract_prec = optimget (settings, "fract_prec"); diffp = optimget (settings, "diffp"); diff_onesided = optimget (settings, "diff_onesided"); fixed = optimget (settings, "fixed"); do_cstep = optimget (settings, "complex_step_derivative_f", false); cstep = optimget (settings, "cstep", cstep_default); if (do_cstep && ! isempty (dfdp)) error ("both 'complex_step_derivative_f' and 'dfdp' are set"); endif do_cstep_inequc = ... optimget (settings, "complex_step_derivative_inequc", false); do_cstep_equc = optimget (settings, "complex_step_derivative_equc", false); if (! iscell (user_interaction = ... optimget (settings, "user_interaction", {}))) user_interaction = {user_interaction}; endif any_vector_conf = ! (isempty (lbound) && isempty (ubound) && ... isempty (max_fract_change) && ... isempty (fract_prec) && isempty (diffp) && ... isempty (diff_onesided) && isempty (fixed)); ## collect constraints [mc, vc, f_genicstr, df_gencstr, user_df_gencstr] = ... __collect_constraints__ (optimget (settings, "inequc"), ... do_cstep_inequc, "inequality constraints"); [emc, evc, f_genecstr, df_genecstr, user_df_genecstr] = ... __collect_constraints__ (optimget (settings, "equc"), ... do_cstep_equc, "equality constraints"); mc_struct = isstruct (mc); emc_struct = isstruct (emc); ## correct "_pstruct" settings if functions are not supplied, handle ## constraint functions not honoring indices if (isempty (dfdp)) dfdp_pstruct = false; endif if (isempty (f_genicstr)) f_inequc_pstruct = false; elseif (! optimget (settings, "f_inequc_idx", false)) f_genicstr = @ (p, varargin) apply_idx_if_given ... (f_genicstr (p, varargin{:}), varargin{:}); endif if (isempty (f_genecstr)) f_equc_pstruct = false; elseif (! optimget (settings, "f_equc_idx", false)) f_genecstr = @ (p, varargin) apply_idx_if_given ... (f_genecstr (p, varargin{:}), varargin{:}); endif if (user_df_gencstr) if (! optimget (settings, "df_inequc_idx", false)) df_gencstr = @ (varargin) df_gencstr (varargin{:})(varargin{2}, :); endif else df_inequc_pstruct = false; endif if (user_df_genecstr) if (! optimget (settings, "df_equc_idx", false)) df_genecstr = @ (varargin) df_genecstr (varargin{:})(varargin{2}, :); endif else df_equc_pstruct = false; endif ## some settings require a parameter order if (pin_struct || ! isempty (pconf) || f_inequc_pstruct || ... f_equc_pstruct || f_pstruct || dfdp_pstruct || ... df_inequc_pstruct || df_equc_pstruct || mc_struct || ... emc_struct) if (isempty (pord)) if (pin_struct) if (any_vector_conf || ... ! (f_pstruct && ... (f_inequc_pstruct || isempty (f_genicstr)) && ... (f_equc_pstruct || isempty (f_genecstr)) && ... (dfdp_pstruct || isempty (dfdp)) && ... (df_inequc_pstruct || ! user_df_gencstr) && ... (df_equc_pstruct || ! user_df_genecstr) && ... (mc_struct || isempty (mc)) && ... (emc_struct || isempty (emc)))) error ("no parameter order specified and constructing a parameter order from the structure of initial parameters can not be done since not all configuration or given functions are structure based"); else pord = fieldnames (pin); endif else error ("given settings require specification of parameter order or initial parameters in the form of a structure"); endif endif pord = pord(:); if (pin_struct && ! 
all (isfield (pin, pord))) error ("some initial parameters lacking"); endif if ((nnames = rows (unique (pord))) < rows (pord)) error ("duplicate parameter names in 'param_order'"); endif if (isempty (pdims)) if (pin_struct) pdims = cellfun ... (@ size, fields2cell (pin, pord), "UniformOutput", false); else pdims = num2cell (ones (nnames, 2), 2); endif else pdims = pdims(:); if (pin_struct && ... ! all (cellfun (@ (x, y) prod (size (x)) == prod (y), ... struct2cell (pin), pdims))) error ("given param_dims and dimensions of initial parameters do not match"); endif endif if (nnames != rows (pdims)) error ("lengths of 'param_order' and 'param_dims' not equal"); endif pnel = cellfun (@ prod, pdims); ppartidx = pnel; if (any (pnel > 1)) pnonscalar = true; cpnel = num2cell (pnel); prepidx = cat (1, cellfun ... (@ (x, n) x(ones (1, n), 1), ... num2cell ((1:nnames).'), cpnel, ... "UniformOutput", false){:}); epord = pord(prepidx, 1); psubidx = cat (1, cellfun ... (@ (n) (1:n).', cpnel, ... "UniformOutput", false){:}); else pnonscalar = false; # some less expensive interfaces later prepidx = (1:nnames).'; epord = pord; psubidx = ones (nnames, 1); endif else pord = []; # spares checks for given but not needed endif if (pin_struct) np = sum (pnel); else np = length (pin); if (! isempty (pord) && np != sum (pnel)) error ("number of initial parameters not correct"); endif endif plabels = num2cell (num2cell ((1:np).')); if (! isempty (pord)) plabels = cat (2, plabels, num2cell (epord), ... num2cell (num2cell (psubidx))); endif ## some useful vectors zerosvec = zeros (np, 1); NAvec = NA (np, 1); Infvec = Inf (np, 1); falsevec = false (np, 1); sizevec = [np, 1]; ## collect parameter-related configuration if (! isempty (pconf)) ## use supplied configuration structure ## parameter-related configuration is either allowed by a structure ## or by vectors if (any_vector_conf) error ("if param_config is given, its potential items must not \ be configured in another way"); endif ## supplement parameter names lacking in param_config nidx = ! isfield (pconf, pord); pconf = cell2fields ({struct()}(ones (1, sum (nidx))), ... pord(nidx), 2, pconf); pconf = structcat (1, fields2cell (pconf, pord){:}); ## in the following, use reshape with explicit dimensions (instead ## of x(:)) so that errors are thrown if a configuration item has ## incorrect number of elements lbound = - Infvec; if (isfield (pconf, "lbound")) idx = ! fieldempty (pconf, "lbound"); if (pnonscalar) lbound (idx(prepidx), 1) = ... cat (1, cellfun (@ (x, n) reshape (x, n, 1), ... {pconf(idx).lbound}.', ... cpnel(idx), "UniformOutput", false){:}); else lbound(idx, 1) = cat (1, pconf.lbound); endif endif ubound = Infvec; if (isfield (pconf, "ubound")) idx = ! fieldempty (pconf, "ubound"); if (pnonscalar) ubound (idx(prepidx), 1) = ... cat (1, cellfun (@ (x, n) reshape (x, n, 1), ... {pconf(idx).ubound}.', ... cpnel(idx), "UniformOutput", false){:}); else ubound(idx, 1) = cat (1, pconf.ubound); endif endif max_fract_change = fract_prec = NAvec; if (isfield (pconf, "max_fract_change")) idx = ! fieldempty (pconf, "max_fract_change"); if (pnonscalar) max_fract_change(idx(prepidx)) = ... cat (1, cellfun (@ (x, n) reshape (x, n, 1), ... {pconf(idx).max_fract_change}.', ... cpnel(idx), ... "UniformOutput", false){:}); else max_fract_change(idx) = [pconf.max_fract_change]; endif endif if (isfield (pconf, "fract_prec")) idx = ! fieldempty (pconf, "fract_prec"); if (pnonscalar) fract_prec(idx(prepidx)) = ... cat (1, cellfun (@ (x, n) reshape (x, n, 1), ... 
{pconf(idx).fract_prec}.', cpnel(idx), ... "UniformOutput", false){:}); else fract_prec(idx) = [pconf.fract_prec]; endif endif diffp = zerosvec; diffp(:) = diffp_default; if (isfield (pconf, "diffp")) idx = ! fieldempty (pconf, "diffp"); if (pnonscalar) diffp(idx(prepidx)) = ... cat (1, cellfun (@ (x, n) reshape (x, n, 1), ... {pconf(idx).diffp}.', cpnel(idx), ... "UniformOutput", false){:}); else diffp(idx) = [pconf.diffp]; endif endif diff_onesided = fixed = falsevec; if (isfield (pconf, "diff_onesided")) idx = ! fieldempty (pconf, "diff_onesided"); if (pnonscalar) diff_onesided(idx(prepidx)) = ... logical ... (cat (1, cellfun (@ (x, n) reshape (x, n, 1), ... {pconf(idx).diff_onesided}.', cpnel(idx), ... "UniformOutput", false){:})); else diff_onesided(idx) = logical ([pconf.diff_onesided]); endif endif if (isfield (pconf, "fixed")) idx = ! fieldempty (pconf, "fixed"); if (pnonscalar) fixed(idx(prepidx)) = ... logical ... (cat (1, cellfun (@ (x, n) reshape (x, n, 1), ... {pconf(idx).fixed}.', cpnel(idx), ... "UniformOutput", false){:})); else fixed(idx) = logical ([pconf.fixed]); endif endif else ## use supplied configuration vectors if (isempty (lbound)) lbound = - Infvec; elseif (any (size (lbound) != sizevec)) error ("bounds: wrong dimensions"); endif if (isempty (ubound)) ubound = Infvec; elseif (any (size (ubound) != sizevec)) error ("bounds: wrong dimensions"); endif if (isempty (max_fract_change)) max_fract_change = NAvec; elseif (any (size (max_fract_change) != sizevec)) error ("max_fract_change: wrong dimensions"); endif if (isempty (fract_prec)) fract_prec = NAvec; elseif (any (size (fract_prec) != sizevec)) error ("fract_prec: wrong dimensions"); endif if (isempty (diffp)) diffp = zerosvec; diffp(:) = diffp_default; else if (any (size (diffp) != sizevec)) error ("diffp: wrong dimensions"); endif diffp(isna (diffp)) = diffp_default; endif if (isempty (diff_onesided)) diff_onesided = falsevec; else if (any (size (diff_onesided) != sizevec)) error ("diff_onesided: wrong dimensions") endif diff_onesided(isna (diff_onesided)) = false; diff_onesided = logical (diff_onesided); endif if (isempty (fixed)) fixed = falsevec; else if (any (size (fixed) != sizevec)) error ("fixed: wrong dimensions"); endif fixed(isna (fixed)) = false; fixed = logical (fixed); endif endif ## guaranty all (lbound <= ubound) if (any (lbound > ubound)) error ("some lower bounds larger than upper bounds"); endif #### consider whether initial parameters and functions are based on #### parameter structures or parameter vectors; wrappers for call to #### default function for jacobians ## initial parameters if (pin_struct) if (pnonscalar) pin = cat (1, cellfun (@ (x, n) reshape (x, n, 1), ... fields2cell (pin, pord), cpnel, ... "UniformOutput", false){:}); else pin = cat (1, fields2cell (pin, pord){:}); endif endif ## model function if (f_pstruct) if (pnonscalar) f = @ (p, varargin) ... f (cell2struct ... (cellfun (@ reshape, mat2cell (p, ppartidx), ... pdims, "UniformOutput", false), ... pord, 1), varargin{:}); else f = @ (p, varargin) ... f (cell2struct (num2cell (p), pord, 1), varargin{:}); endif endif f_pin = f (pin); # Doing this in the frontend is useful for # residual-based minimization (but not # fro scalar objective functions) if (isfield (hook, "observations")) if (any (size (f_pin) != size (obs = hook.observations))) error ("dimensions of observations and values of model function must match"); endif f = @ (varargin) f (varargin{:}) - obs; f_pin -= obs; user_interaction = ... 
cellfun (@ (f_handle) @ (p, v, s) ... f_handle (p, setfield (v, "model_y", v.residual + obs), s), user_interaction(:), "UniformOutput", false); endif ## jacobian of model function if (isempty (dfdp)) if (do_cstep) dfdp = @ (p, hook) jacobs (p, f, hook); else dfdp = @ (p, hook) __dfdp__ (p, f, hook); endif endif if (dfdp_pstruct) if (pnonscalar) dfdp = @ (p, hook) ... cat (2, ... fields2cell ... (dfdp (cell2struct ... (cellfun (@ reshape, mat2cell (p, ppartidx), ... pdims, "UniformOutput", false), ... pord, 1), hook), ... pord){:}); else dfdp = @ (p, hook) ... cat (2, ... fields2cell ... (dfdp (cell2struct (num2cell (p), pord, 1), hook), ... pord){:}); endif endif ## function for general inequality constraints if (f_inequc_pstruct) if (pnonscalar) f_genicstr = @ (p, varargin) ... f_genicstr (cell2struct ... (cellfun (@ reshape, mat2cell (p, ppartidx), ... pdims, "UniformOutput", false), ... pord, 1), varargin{:}); else f_genicstr = @ (p, varargin) ... f_genicstr ... (cell2struct (num2cell (p), pord, 1), varargin{:}); endif endif ## note this stage possibly_pstruct_f_genicstr = f_genicstr; ## jacobian of general inequality constraints if (df_inequc_pstruct) if (pnonscalar) df_gencstr = @ (p, func, idx, hook) ... cat (2, ... fields2cell ... (df_gencstr ... (cell2struct ... (cellfun (@ reshape, mat2cell (p, ppartidx), ... pdims, "UniformOutput", false), pord, 1), ... func, idx, hook), ... pord){:}); else df_gencstr = @ (p, func, idx, hook) ... cat (2, ... fields2cell ... (df_gencstr (cell2struct (num2cell (p), pord, 1), ... func, idx, hook), ... pord){:}); endif endif ## function for general equality constraints if (f_equc_pstruct) if (pnonscalar) f_genecstr = @ (p, varargin) ... f_genecstr (cell2struct ... (cellfun (@ reshape, mat2cell (p, ppartidx), ... pdims, "UniformOutput", false), ... pord, 1), varargin{:}); else f_genecstr = @ (p, varargin) ... f_genecstr ... (cell2struct (num2cell (p), pord, 1), varargin{:}); endif endif ## note this stage possibly_pstruct_f_genecstr = f_genecstr; ## jacobian of general equality constraints if (df_equc_pstruct) if (pnonscalar) df_genecstr = @ (p, func, idx, hook) ... cat (2, ... fields2cell ... (df_genecstr ... (cell2struct ... (cellfun (@ reshape, mat2cell (p, ppartidx), ... pdims, "UniformOutput", false), pord, 1), ... func, idx, hook), ... pord){:}); else df_genecstr = @ (p, func, idx, hook) ... cat (2, ... fields2cell ... (df_genecstr (cell2struct (num2cell (p), pord, 1), ... func, idx, hook), ... pord){:}); endif endif ## linear inequality constraints if (mc_struct) idx = isfield (mc, pord); if (rows (fieldnames (mc)) > sum (idx)) error ("unknown fields in structure of linear inequality constraints"); endif smc = mc; mc = zeros (np, rows (vc)); mc(idx(prepidx), :) = cat (1, fields2cell (smc, pord(idx)){:}); endif ## linear equality constraints if (emc_struct) idx = isfield (emc, pord); if (rows (fieldnames (emc)) > sum (idx)) error ("unknown fields in structure of linear equality constraints"); endif semc = emc; emc = zeros (np, rows (evc)); emc(idx(prepidx), :) = cat (1, fields2cell (semc, pord(idx)){:}); endif ## parameter-related configuration for jacobian functions if (dfdp_pstruct || df_inequc_pstruct || df_equc_pstruct) if(pnonscalar) s_diffp = cell2struct ... (cellfun (@ reshape, mat2cell (diffp, ppartidx), ... pdims, "UniformOutput", false), pord, 1); s_diff_onesided = cell2struct ... (cellfun (@ reshape, mat2cell (diff_onesided, ppartidx), ... pdims, "UniformOutput", false), pord, 1); s_orig_lbound = cell2struct ... 
(cellfun (@ reshape, mat2cell (lbound, ppartidx), ... pdims, "UniformOutput", false), pord, 1); s_orig_ubound = cell2struct ... (cellfun (@ reshape, mat2cell (ubound, ppartidx), ... pdims, "UniformOutput", false), pord, 1); s_plabels = cell2struct ... (num2cell ... (cat (2, cellfun ... (@ (x) cellfun ... (@ reshape, mat2cell (cat (1, x{:}), ppartidx), ... pdims, "UniformOutput", false), ... num2cell (plabels, 1), "UniformOutput", false){:}), ... 2), ... pord, 1); s_orig_fixed = cell2struct ... (cellfun (@ reshape, mat2cell (fixed, ppartidx), ... pdims, "UniformOutput", false), pord, 1); else s_diffp = cell2struct (num2cell (diffp), pord, 1); s_diff_onesided = cell2struct (num2cell (diff_onesided), pord, 1); s_orig_lbound = cell2struct (num2cell (lbound), pord, 1); s_orig_ubound = cell2struct (num2cell (ubound), pord, 1); s_plabels = cell2struct (num2cell (plabels, 2), pord, 1); s_orig_fixed = cell2struct (num2cell (fixed), pord, 1); endif endif #### some further values and checks if (any (fixed & (pin < lbound | pin > ubound))) warning ("some fixed parameters outside bounds"); endif if (any (diffp <= 0)) error ("some elements of 'diffp' non-positive"); endif if (cstep <= 0) error ("'cstep' non-positive"); endif if ((hook.TolFun = optimget (settings, "TolFun", stol_default)) < 0) error ("'TolFun' negative"); endif if (any (fract_prec < 0)) error ("some elements of 'fract_prec' negative"); endif if (any (max_fract_change < 0)) error ("some elements of 'max_fract_change' negative"); endif ## dimensions of linear constraints if (isempty (mc)) mc = zeros (np, 0); vc = zeros (0, 1); endif if (isempty (emc)) emc = zeros (np, 0); evc = zeros (0, 1); endif [rm, cm] = size (mc); [rv, cv] = size (vc); if (rm != np || cm != rv || cv != 1) error ("linear inequality constraints: wrong dimensions"); endif [erm, ecm] = size (emc); [erv, ecv] = size (evc); if (erm != np || ecm != erv || ecv != 1) error ("linear equality constraints: wrong dimensions"); endif ## check weights dimensions weights = optimget (settings, "weights", ones (size (f_pin))); if (any (size (weights) != size (f_pin))) error ("dimension of weights and residuals must match"); endif if (any (weights(:) < 0)) error ("some weights negative") endif ## note initial values of linear constraits pin_cstr.inequ.lin_except_bounds = mc.' * pin + vc; pin_cstr.equ.lin = emc.' * pin + evc; ## note number and initial values of general constraints if (isempty (f_genicstr)) pin_cstr.inequ.gen = []; n_genicstr = 0; else n_genicstr = length (pin_cstr.inequ.gen = f_genicstr (pin)); endif if (isempty (f_genecstr)) pin_cstr.equ.gen = []; n_genecstr = 0; else n_genecstr = length (pin_cstr.equ.gen = f_genecstr (pin)); endif #### collect remaining settings parallel_local = hook.parallel_local = ... 
__optimget_parallel_local__ (settings, false); hook.MaxIter = optimget (settings, "MaxIter"); if (ischar (hook.cpiv = optimget (settings, "cpiv", @ cpiv_bard))) hook.cpiv = str2func (hook.cpiv); endif hook.Display = optimget (settings, "Display", "off"); if (isempty (plot_cmd = optimget (settings, "plot_cmd", []))) hook.plot_cmd = @ (f) 0; else warning ("setting 'plot_cmd' is deprecated, please use 'user_interaction'"); hook.plot_cmd = plot_cmd; endif hook.testing = optimget (settings, "debug", false); hook.new_s = optimget (settings, "lm_svd_feasible_alt_s", false); hook.FunValCheck = optimget (settings, "FunValCheck", "off"); backend = optimget (settings, "Algorithm", "lm_svd_feasible"); backend = map_matlab_algorithm_names (backend); backend = map_backend (backend); #### handle fixing of parameters orig_lbound = lbound; orig_ubound = ubound; orig_fixed = fixed; if (all (fixed)) error ("no free parameters"); endif nonfixed = ! fixed; if (any (fixed)) ## backend (returned values and initial parameters) backend = @ (f, pin, hook) ... backend_wrapper (backend, fixed, f, pin, hook); ## model function f = @ (p, varargin) f (assign (pin, nonfixed, p), varargin{:}); ## jacobian of model function dfdp = @ (p, hook) ... dfdp (assign (pin, nonfixed, p), hook)(:, nonfixed); ## function for general inequality constraints f_genicstr = @ (p, varargin) ... f_genicstr (assign (pin, nonfixed, p), varargin{:}); ## jacobian of general inequality constraints df_gencstr = @ (p, func, idx, hook) ... df_gencstr (assign (pin, nonfixed, p), func, idx, hook) ... (:, nonfixed); ## function for general equality constraints f_genecstr = @ (p, varargin) ... f_genecstr (assign (pin, nonfixed, p), varargin{:}); ## jacobian of general equality constraints df_genecstr = @ (p, func, idx, hook) ... df_genecstr (assign (pin, nonfixed, p), func, idx, hook) ... (:, nonfixed); ## linear inequality constraints vc += mc(fixed, :).' * (tp = pin(fixed)); mc = mc(nonfixed, :); ## linear equality constraints evc += emc(fixed, :).' * tp; emc = emc(nonfixed, :); ## _last_ of all, vectors of parameter-related configuration, ## including "fixed" itself lbound = lbound(nonfixed, :); ubound = ubound(nonfixed, :); max_fract_change = max_fract_change(nonfixed); fract_prec = fract_prec(nonfixed); fixed = fixed(nonfixed); endif #### supplement constants to jacobian functions ## jacobian of model function if (dfdp_pstruct) dfdp = @ (p, hook) ... dfdp (p, cell2fields ... ({s_diffp, s_diff_onesided, s_orig_lbound, ... s_orig_ubound, s_plabels, ... cell2fields(num2cell(hook.fixed), pord(nonfixed), ... 1, s_orig_fixed), ... cstep, parallel_local}, ... {"diffp", "diff_onesided", "lbound", "ubound", ... "plabels", "fixed", "h", "parallel_local"}, ... 2, hook)); else dfdp = @ (p, hook) ... dfdp (p, cell2fields ... ({diffp, diff_onesided, orig_lbound, orig_ubound, ... plabels, assign(orig_fixed, nonfixed, hook.fixed), ... cstep, parallel_local}, ... {"diffp", "diff_onesided", "lbound", "ubound", ... "plabels", "fixed", "h", "parallel_local"}, ... 2, hook)); endif ## jacobian of general inequality constraints if (df_inequc_pstruct) df_gencstr = @ (p, func, idx, hook) ... df_gencstr (p, func, idx, cell2fields ... ({s_diffp, s_diff_onesided, s_orig_lbound, ... s_orig_ubound, s_plabels, ... cell2fields(num2cell(hook.fixed), pord(nonfixed), ... 1, s_orig_fixed), ... cstep, parallel_local}, ... {"diffp", "diff_onesided", "lbound", "ubound", ... "plabels", "fixed", "h", "parallel_local"}, ... 2, hook)); else df_gencstr = @ (p, func, idx, hook) ... 
df_gencstr (p, func, idx, cell2fields ... ({diffp, diff_onesided, orig_lbound, ... orig_ubound, plabels, ... assign(orig_fixed, nonfixed, hook.fixed), ... cstep, parallel_local}, ... {"diffp", "diff_onesided", "lbound", "ubound", ... "plabels", "fixed", "h", "parallel_local"}, ... 2, hook)); endif ## jacobian of general equality constraints if (df_equc_pstruct) df_genecstr = @ (p, func, idx, hook) ... df_genecstr (p, func, idx, cell2fields ... ({s_diffp, s_diff_onesided, s_orig_lbound, ... s_orig_ubound, s_plabels, ... cell2fields(num2cell(hook.fixed), pord(nonfixed), ... 1, s_orig_fixed), ... cstep, parallel_local}, ... {"diffp", "diff_onesided", "lbound", "ubound", ... "plabels", "fixed", "h", "parallel_local"}, ... 2, hook)); else df_genecstr = @ (p, func, idx, hook) ... df_genecstr (p, func, idx, cell2fields ... ({diffp, diff_onesided, orig_lbound, ... orig_ubound, plabels, ... assign(orig_fixed, nonfixed, hook.fixed), ... cstep, parallel_local}, ... {"diffp", "diff_onesided", "lbound", "ubound", ... "plabels", "fixed", "h", "parallel_local"}, ... 2, hook)); endif #### interfaces to constraints ## include bounds into linear inequality constraints tp = eye (sum (nonfixed)); lidx = lbound != - Inf; uidx = ubound != Inf; mc = cat (2, tp(:, lidx), - tp(:, uidx), mc); vc = cat (1, - lbound(lidx, 1), ubound(uidx, 1), vc); ## concatenate linear inequality and equality constraints mc = cat (2, mc, emc); vc = cat (1, vc, evc); n_lincstr = rows (vc); ## concatenate general inequality and equality constraints if (n_genecstr > 0) if (n_genicstr > 0) nidxi = 1 : n_genicstr; nidxe = n_genicstr + 1 : n_genicstr + n_genecstr; f_gencstr = @ (p, idx, varargin) ... cat (1, ... f_genicstr (p, idx(nidxi), varargin{:}), ... f_genecstr (p, idx(nidxe), varargin{:})); df_gencstr = @ (p, idx, hook) ... cat (1, ... df_gencstr (p, @ (p, varargin) ... possibly_pstruct_f_genicstr ... (p, idx(nidxi), varargin{:}), ... idx(nidxi), ... setfield (hook, "f", ... hook.f(nidxi(idx(nidxi))))), ... df_genecstr (p, @ (p, varargin) ... possibly_pstruct_f_genecstr ... (p, idx(nidxe), varargin{:}), ... idx(nidxe), ... setfield (hook, "f", ... hook.f(nidxe(idx(nidxe)))))); else f_gencstr = f_genecstr; df_gencstr = @ (p, idx, hook) ... df_genecstr (p, ... @ (p, varargin) ... possibly_pstruct_f_genecstr ... (p, idx, varargin{:}), ... idx, ... setfield (hook, "f", hook.f(idx))); endif else f_gencstr = f_genicstr; df_gencstr = @ (p, idx, hook) ... df_gencstr (p, ... @ (p, varargin) ... possibly_pstruct_f_genicstr (p, idx, varargin{:}), ... idx, ... setfield (hook, "f", hook.f(idx))); endif n_gencstr = n_genicstr + n_genecstr; ## concatenate linear and general constraints, defining the final ## function interfaces if (n_gencstr > 0) nidxl = 1:n_lincstr; nidxh = n_lincstr + 1 : n_lincstr + n_gencstr; f_cstr = @ (p, idx, varargin) ... cat (1, ... mc(:, idx(nidxl)).' * p + vc(idx(nidxl), 1), ... f_gencstr (p, idx(nidxh), varargin{:})); df_cstr = @ (p, idx, hook) ... cat (1, ... mc(:, idx(nidxl)).', ... df_gencstr (p, idx(nidxh), ... setfield (hook, "f", ... hook.f(nidxh)))); else f_cstr = @ (p, idx, varargin) mc(:, idx).' 
* p + vc(idx, 1); df_cstr = @ (p, idx, hook) mc(:, idx).'; endif ## define eq_idx (logical index of equality constraints within all ## concatenated constraints eq_idx = false (n_lincstr + n_gencstr, 1); eq_idx(n_lincstr + 1 - rows (evc) : n_lincstr) = true; n_cstr = n_lincstr + n_gencstr; eq_idx(n_cstr + 1 - n_genecstr : n_cstr) = true; #### prepare interface hook ## passed constraints hook.mc = mc; hook.vc = vc; hook.f_cstr = f_cstr; hook.df_cstr = df_cstr; hook.n_gencstr = n_gencstr; hook.eq_idx = eq_idx; hook.lbound = lbound; hook.ubound = ubound; ## passed values of constraints for initial parameters hook.pin_cstr = pin_cstr; ## passed function for derivative of model function hook.dfdp = dfdp; ## passed function for complementary pivoting ## hook.cpiv = cpiv; # set before ## passed value of residual function for initial parameters hook.f_pin = f_pin; ## passed options hook.max_fract_change = max_fract_change; hook.fract_prec = fract_prec; ## hook.TolFun = ; # set before ## hook.MaxIter = ; # set before hook.weights = weights; hook.fixed = fixed; hook.user_interaction = user_interaction; #### call backend [p, resid, cvg, outp] = backend (f, pin, hook); if (pin_struct) if (pnonscalar) p = cell2struct ... (cellfun (@ reshape, mat2cell (p, ppartidx), ... pdims, "UniformOutput", false), ... pord, 1); else p = cell2struct (num2cell (p), pord, 1); endif endif endfunction function backend = map_matlab_algorithm_names (backend) switch (backend) case "levenberg-marquardt" backend = "lm_svd_feasible"; warning ("algorithm 'levenberg-marquardt' mapped to 'lm_svd_feasible'"); endswitch endfunction function backend = map_backend (backend) switch (backend) case "lm_svd_feasible" backend = "__lm_svd__"; otherwise error ("no backend implemented for algorithm '%s'", backend); endswitch backend = str2func (backend); endfunction function [p, resid, cvg, outp] = backend_wrapper (backend, fixed, f, p, hook) [tp, resid, cvg, outp] = backend (f, p(! fixed), hook); p(! fixed) = tp; endfunction function lval = assign (lval, lidx, rval) lval(lidx) = rval; endfunction function ret = apply_idx_if_given (ret, varargin) if (nargin > 1) ret = ret(varargin{1}); endif endfunction optim-1.4.1/inst/private/__dfdp__.m0000644000175000017500000002343512477523726015635 0ustar olafolaf## Copyright (C) 1992-1994 Richard Shrager ## Copyright (C) 1992-1994 Arthur Jutan ## Copyright (C) 1992-1994 Ray Muzic ## Copyright (C) 2010-2013 Olaf Till ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . function prt = __dfdp__ (p, func, hook) ## Meant to be called by interfaces "dfxpdp.m" and "dfpdp.m", see there. 
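  ## Editor's illustration (not from the original sources) of the basic
  ## scheme implemented below, before bounds handling: for each free
  ## parameter j, with del_j = diffp(j) * p(j) (or diffp(j) itself if
  ## p(j) == 0) and e_j the j-th unit vector,
  ##
  ##   one-sided:  prt(:, j) = (func (p + e_j * del_j) - func (p)) / del_j
  ##   two-sided:  prt(:, j) = (func (p + e_j * del_j)
  ##                            - func (p - e_j * del_j)) / (2 * del_j)
  ##
  ## The code additionally shifts or shrinks these intervals so that all
  ## sampled points respect lbound and ubound, dividing by the interval
  ## length actually used.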
  if (nargin > 2 && isfield (hook, "f"))
    f = hook.f;
  else
    f = func (p);
    f = f(:);
  endif

  m = length (f);
  n = length (p);

  if (nargin > 2)
    if (isfield (hook, "fixed"))
      fixed = hook.fixed;
    else
      fixed = false (n, 1);
    endif
    if (isfield (hook, "diffp"))
      diffp = hook.diffp;
    else
      diffp = .001 * ones (n, 1);
    endif
    if (isfield (hook, "diff_onesided"))
      diff_onesided = hook.diff_onesided;
    else
      diff_onesided = false (n, 1);
    endif
    if (isfield (hook, "lbound"))
      lbound = hook.lbound;
    else
      lbound = - Inf (n, 1);
    endif
    if (isfield (hook, "ubound"))
      ubound = hook.ubound;
    else
      ubound = Inf (n, 1);
    endif
    if (isfield (hook, "plabels"))
      plabels = hook.plabels;
    else
      plabels = num2cell (num2cell ((1:n).'));
    endif
    if (isfield (hook, "parallel_local"))
      parallel_local = hook.parallel_local;
    else
      parallel_local = false;
    endif
  else
    fixed = false (n, 1);
    diff_onesided = fixed;
    diffp = .001 * ones (n, 1);
    lbound = - Inf (n, 1);
    ubound = Inf (n, 1);
    plabels = num2cell (num2cell ((1:n).'));
    parallel_local = false;
  endif

  prt = zeros (m, n); # initialise Jacobian to zero

  del = diffp .* p;
  idxa = p == 0;
  del(idxa) = diffp(idxa);
  del(diff_onesided) = - del(diff_onesided); # keep course of
                                # optimization of previous versions
  absdel = abs (del);

  idxd = ~(diff_onesided | fixed); # double sided interval
  p1 = zeros (n, 1);
  p2 = p1;

  idxvs = false (n, 1);
  idx1g2w = idxvs;
  idx1le2w = idxvs;

  ## p may be slightly out of bounds due to inaccuracy, or exactly at
  ## the bound -> single sided interval
  idxvl = p <= lbound;
  idxvg = p >= ubound;
  p1(idxvl) = min (p(idxvl, 1) + absdel(idxvl, 1), ubound(idxvl, 1));
  idxd(idxvl) = false;
  p1(idxvg) = max (p(idxvg, 1) - absdel(idxvg, 1), lbound(idxvg, 1));
  idxd(idxvg) = false;
  idxs = ~(fixed | idxd); # single sided interval

  idxnv = ~(idxvl | idxvg); # current parameters within bounds
  idxnvs = idxs & idxnv; # within bounds, single sided interval
  idxnvd = idxd & idxnv; # within bounds, double sided interval

  ## remaining single sided intervals
  p1(idxnvs) = p(idxnvs) + del(idxnvs); # don't take absdel, this could
                                # change course of optimization without
                                # bounds with respect to previous
                                # versions

  ## remaining single sided intervals, violating a bound -> take largest
  ## possible direction of single sided interval
  idxvs(idxnvs) = p1(idxnvs, 1) < lbound(idxnvs, 1) | ...
                  p1(idxnvs, 1) > ubound(idxnvs, 1);
  del1 = p(idxvs, 1) - lbound(idxvs, 1);
  del2 = ubound(idxvs, 1) - p(idxvs, 1);
  idx1g2 = del1 > del2;
  idx1g2w(idxvs) = idx1g2;
  idx1le2w(idxvs) = ~idx1g2;
  p1(idx1g2w) = max (p(idx1g2w, 1) - absdel(idx1g2w, 1), ...
                     lbound(idx1g2w, 1));
  p1(idx1le2w) = min (p(idx1le2w, 1) + absdel(idx1le2w, 1), ...
                      ubound(idx1le2w, 1));

  ## double sided interval
  p1(idxnvd) = min (p(idxnvd, 1) + absdel(idxnvd, 1), ...
                    ubound(idxnvd, 1));
  p2(idxnvd) = max (p(idxnvd, 1) - absdel(idxnvd, 1), ...
lbound(idxnvd, 1)); del(idxs) = p1(idxs) - p(idxs); del(idxd) = p1(idxd) - p2(idxd); info.f = f; info.parallel = parallel_local; if (parallel_local) ## symplicistic approach, fork for each computation and leave all ## scheduling to kernel; otherwise arguments would have to be passed ## over pipes, not sure whether this would be faster n_ss = sum (idxs); n_ds = sum (idxd); n_childs = n_ss + 2 * n_ds; child_data = zeros (n_childs, 5); # pipe desriptor for reading, pid, # side (zero for one-sided), line # number, parameter number child_data(:, 4) = 1 : n_childs; active_childs = true (n_childs, 1); tp_ss = zeros (m, n); # results for single sided tp_ds = zeros (m, n, 2); # results for double sided unwind_protect ready = false; lerrm = lasterr (); lasterr (""); cid = 0; for j = 1:n if (! fixed(j)) cid++; child_data(cid, 5) = j; info.plabels = plabels(j, :); ps = p; ps(j) = p1(j); [pd1, pd2, err, msg] = pipe (); if (err) error ("could not create pipe: %s", msg); endif child_data(cid, 1) = pd1; if (idxs(j)) info.side = 0; # onesided interval if ((pid = fork ()) == 0) ## child pclose (pd1); unwind_protect tp = func (ps, info); __bw_psend__ (pd2, tp); unwind_protect_cleanup pclose (pd2); __internal_exit__ (); end_unwind_protect ## end child elseif (pid > 0) child_data(cid, 2) = pid; ## child_data(cid, 3) is already 0 pclose (pd2); else ## fork error error ("could not fork"); endif else info.side = 1; # centered interval, side 1 if ((pid = fork ()) == 0) ## child pclose (pd1); unwind_protect tp = func (ps, info); __bw_psend__ (pd2, tp); unwind_protect_cleanup pclose (pd2); __internal_exit__ (); end_unwind_protect ## end child elseif (pid > 0) child_data(cid, 2) = pid; child_data(cid, 3) = 1; pclose (pd2); else ## fork error error ("could not fork"); endif cid++; child_data(cid, 5) = j; ps(j) = p2(j); info.side = 2; # centered interval, side 2 [pd1, pd2, err, msg] = pipe (); if (err) error ("could not create pipe: %s", msg); endif child_data(cid, 1) = pd1; if ((pid = fork ()) == 0) ## child pclose (pd1); unwind_protect tp = func (ps, info); __bw_psend__ (pd2, tp); unwind_protect_cleanup pclose (pd2); __internal_exit__ (); end_unwind_protect ## end child elseif (pid > 0) child_data(cid, 2) = pid; child_data(cid, 3) = 2; pclose (pd2); else ## fork error error ("could not fork"); endif endif endif # (! fixed(j)) endfor while (any (active_childs)) [~, act] = select (child_data(active_childs, 1), [], [], -1); act_idx = child_data(active_childs, 4)(act); for id = act_idx.' res = __bw_prcv__ (child_data(id, 1)); if (isnumeric (res)) error ("child closed pipe without sending"); endif res = res.psend_var; pclose (child_data(id, 1)); child_data(id, 1) = 0; waitpid (child_data(id, 2)); child_data(id, 2) = 0; active_childs(id) = false; if (child_data(id, 3)) # double sided tp_ds(:, child_data(id, 5), child_data(id, 3)) = res; else # single sided tp_ss(:, child_data(id, 5)) = res; endif endfor endwhile ready = true; # try/catch would not handle ctrl-c unwind_protect_cleanup if (! ready) for (id = 1 : n_childs) if (child_data(id, 1)) pclose (child_data(id, 1)); endif if (child_data(id, 2)) kill (child_data(id, 2), 9); waitpid (child_data(id, 2)); endif endfor nerrm = lasterr (); error ("no success, last error message: %s", nerrm); endif lasterr (lerrm); end_unwind_protect prt(:, idxs) = (tp_ss(:, idxs) - f(:, ones (1, n_ss))) ./ ... del(idxs).'(ones (1, m), :); prt(:, idxd) = (tp_ds(:, idxd, 1) - tp_ds(:, idxd, 2)) ./ ... 
del(idxd).'(ones (1, m), :); else # not parallel for j = 1:n if (~fixed(j)) info.plabels = plabels(j, :); ps = p; ps(j) = p1(j); if (idxs(j)) info.side = 0; # onesided interval tp1 = func (ps, info); prt(:, j) = (tp1(:) - f) / del(j); else info.side = 1; # centered interval, side 1 tp1 = func (ps, info); ps(j) = p2(j); info.side = 2; # centered interval, side 2 tp2 = func (ps, info); prt(:, j) = (tp1(:) - tp2(:)) / del(j); endif endif endfor endif endfunction optim-1.4.1/inst/private/__plot_cmds__.m0000644000175000017500000000265012477523726016700 0ustar olafolaf## Copyright (C) 2010-2013 Olaf Till ## ## This program is free software; you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation; either version 3 of the License, or ## (at your option) any later version. ## ## This program is distributed in the hope that it will be useful, ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## GNU General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with this program; If not, see . function __plot_cmds__ (x, y, f) persistent lgnd; persistent use_x; if (nargin == 0) ## reset function lgnd = []; return; endif if (length (size (f)) > 2) return; endif if (isempty (lgnd)); n = size (y, 2); if (n == 1) lgnd = {"data", "fit"}; else id = num2str ((1:n).'); lgnd1 = cat (2, repmat ("data ", n, 1), id); lgnd2 = cat (2, repmat ("fit ", n, 1), id); lgnd = cat (1, cellstr (lgnd1), cellstr (lgnd2)); endif use_x = size (x, 1) == size (y, 1); endif x = x(:, 1); if (use_x) plot (x, y, "marker", "+", "linestyle", "none", x, f); else plot (y, "marker", "+", "linestyle", "none", f); endif legend (lgnd); drawnow; endfunction optim-1.4.1/inst/private/optim_problems_p_r_y.data0000644000175000017500000003074112477523726021022 0ustar olafolaf 0.74474 0.704907943019318 0.6783600455187353 0.6638710199213019 0.6493922050536766 0.644562529854532 0.6373129116909114 0.6409377207727217 0.6312783703744327 0.6312783703744327 0.6300735042570986 0.6252438290579542 0.6228340968232858 0.6204141538588096 0.6192092877414755 0.6143796125423311 0.6095601480729944 0.6107650141903287 0.6047304728738501 0.603515396026708 0.5999007976747056 0.5950711224755612 0.5950711224755612 0.5902414472764167 0.5914463133937508 0.5793772307607936 0.5769674985261253 0.5733426894443149 0.5673081481278364 0.5600687406940237 0.5552390654948792 0.5516142564130688 0.5455797150965902 0.5407500398974457 0.533510632463633 0.5274760911471544 0.5214415498306758 0.5117821994323869 0.5057476581159083 0.4973033845647614 0.4912688432482828 0.4852343019318041 0.4743700854161811 0.4695404102170366 0.4586761937014135 0.4514367862676008 0.442982301986646 0.432118085471023 0.4248786780372102 0.4152193276389213 0.4055701879704403 0.3959108375721513 0.3862514871738624 0.3765921367755734 0.3669429971070924 0.3560787805914694 0.3452145640758463 0.3355552136775573 0.3234861310446001 0.3126219145289771 0.301757698013354 0.2908934814977309 0.2788243988647737 0.2703801253136268 0.2595159087980037 0.2486516922823807 0.2389923418840917 0.2281281253684687 0.2136493105008432 0.2063996923372225 0.1943306097042653 0.1858863361531184 0.1738172535201612 0.1641579031218723 0.1557034188409175 0.1448494130551023 0.1351900626568134 0.1255307122585245 0.1158713618602355 0.1062222221917545 0.1001876808752759 0.08811859824231871 0.08207384619603218 0.07242470652755117 
0.74474 0.71282 0.65963 0.61589 0.57451 0.54614 0.52723 0.51186 0.5024 0.49058 0.47994 0.4764 0.47403 0.46931 0.46458 0.46103 0.45867 0.45394 0.45039 0.45039 0.44921 0.4433 0.44093 0.43739 0.43266 0.43029 0.43148 0.42675 0.42202 0.41966 0.41611 0.41138 0.40665 0.40192 0.39838 0.39483 0.38892 0.38419 0.38183 0.37473 0.37001 0.3641 0.35819 0.35346 0.34873 0.33927 0.33218 0.32509 0.31799 0.31326 0.30617 0.29908 0.28962 0.28135 0.27189 0.26598 0.25889 0.24943 0.24234 0.2317 0.22106 0.21278 0.20333 0.19505 0.18796 0.1785 0.17023 0.15841 0.14895 0.14422 0.13476 0.12412 0.11703 0.10994 0.09693 0.08984 0.08037999999999999 0.07328999999999999 0.06738 0.05674 0.0532 0.04492 0.04137 0.03428 0.74474 0.711827416514371 0.6594168081630392 0.6106665905189123 0.5643497593730962 0.5314371758874674 0.5046183696073544 0.4863370379908068 0.474149483579775 0.4619619291687433 0.4534244545051779 0.4448972908013512 0.4424535933433017 0.4351431228886304 0.4314827321814254 0.4290493456831145 0.4253889549759095 0.4205118710195492 0.4168617912720828 0.4180784845212382 0.4132014005648778 0.4107680140665669 0.4095513208174115 0.407107623359362 0.4022305394030015 0.403447232652157 0.3973534554466411 0.3961367621974857 0.3924866824500193 0.3900429849919698 0.3888262917428144 0.3839492077864539 0.3778554305809381 0.3766387373317826 0.3741950398737331 0.3681012626682172 0.3668845694190618 0.3620074854627014 0.357130401506341 0.3522636285097192 0.3473865445533587 0.3437261538461538 0.3376323766406379 0.3327552926842775 0.3303219061859666 0.3254448222296062 0.3181343517749349 0.3108135703605249 0.3059364864041645 0.2974093227003378 0.2925322387439774 0.285221768289306 0.2791279910837902 0.2718072096693803 0.2608466594672426 0.258402962009193 0.2486487940964723 0.2437820210998504 0.235244546436285 0.2291507692307692 0.2181799080688929 0.2096527443650661 0.2011152697015008 0.192588105997674 0.1828339380849532 0.1779568541285927 0.1682026862158719 0.1608922157612006 0.1511380478484798 0.1450442706429639 0.1365171069391372 0.1267629390264163 0.1170087711136955 0.1084816074098687 0.1011711369551974 0.09141696904247659 0.08410649858780528 0.07556902392423989 0.06582516697125766 0.06216477626405272 0.05485430580938139 0.04753352439497147 0.04265644043861105 0.03169589023647339 0.02559180207121891 0.74474 0.724082753053198 0.6706308419236873 0.6171686535568895 0.5637167424273788 0.5175513925343269 0.482321023114607 0.4543772149313461 0.4325072539846822 0.4154984262747533 0.4021277405644104 0.3924157513282274 0.3802680568550335 0.3754069236182985 0.3693330763817015 0.3644719431449665 0.3583980959083695 0.3571853819085076 0.3535369626716345 0.3498885434347616 0.3462504014351755 0.3438146961981646 0.3413892681984406 0.3401765541985786 0.3365281349617056 0.3328797157248327 0.3304542877251087 0.3292415737252467 0.3243804404885117 0.3219447352515007 0.3207320212516387 0.3207320212516387 0.3170938792520527 0.3134454600151797 0.3110097547781688 0.3025104795418478 0.3025104795418478 0.3012977655419857 0.2988620603049748 0.2927882130683778 0.2915754990685158 0.2830762238321949 0.2782150905954598 0.2794278045953219 0.2733539573587249 0.2697055381218519 0.263631690885255 0.25877055764852 0.253909424411785 0.249058568412337 0.244197435175602 0.239336301938867 0.2332624547022701 0.2247529022286621 0.2186790549920651 0.2186790549920651 0.2113924937556061 0.2041059325191472 0.1955963800455392 0.1907352468088042 0.1834486855723453 0.1761621243358863 0.1700882770992893 0.1640144298626923 0.1555048773890844 0.1494310301524874 0.1457928881529014 
0.1336349164424205 0.1287840604429725 0.1202745079693645 0.1142006607327675 0.1044783942592976 0.09840454702270061 0.09111798578624163 0.08383142454978265 0.07532187207617469 0.07046073883943972 0.06438689160284275 0.05588761636652177 0.04860105513006278 0.04373992189332781 0.03523036941971986 0.03159222742013386 0.02673109418339888 0.0218699609466639 0.01822154170979093 0.74474 0.7339284058287399 0.6894829989630519 0.6306289073841619 0.5693665941712601 0.5069052507777111 0.4528472799214102 0.407202842875075 0.3663646878240462 0.3387361873055722 0.3099086566064509 0.2918927200785897 0.272667592097364 0.258258907384162 0.2486463433935491 0.2390337794029362 0.2306304068656879 0.2246250946897342 0.2198188126944278 0.2150125306991213 0.2090072185231676 0.2053999667085084 0.2042009365278611 0.1993946545325547 0.1957975639906129 0.1921903121759537 0.1909912819953064 0.1885830603612945 0.1849859698193527 0.1825777481853408 0.1801796878240463 0.1717661540140807 0.1777816274627517 0.1741743756480926 0.1705671238334334 0.1681690634721388 0.1669700332914915 0.1657608418381269 0.1621637512961851 0.1609647211155378 0.1573574693008787 0.1525511873055722 0.1525511873055722 0.148943935490913 0.1477449053102658 0.1429386233149593 0.1405405629536648 0.1393415327730175 0.1345352507777111 0.1297289687824046 0.1261217169677454 0.1249226867870982 0.1201164047917917 0.1153101227964853 0.1117130322545435 0.106906750259237 0.1032994984445779 0.09729418626862413 0.09489612590732958 0.09008984391202314 0.0852835619167167 0.08047727992141025 0.07688018937946843 0.07327293756480925 0.06966568575015007 0.06486956502756099 0.06006328303225454 0.05405797085630082 0.05164974922228893 0.04444540686568793 0.04083815505102876 0.03724106450908694 0.03243478251378049 0.02882753069912132 0.02642947033782677 0.02042415816187305 0.01561787616656661 0.01081159417126016 0.00961256399061289 0.007204342356600993 0 -0.004806281995306445 -0.003607251814659171 -0.01320965453255471 0.74474 0.7375021906482698 0.7061757284822601 0.6519431288326764 0.5928989150788435 0.5278197954993429 0.4675726782194481 0.4073153668418747 0.3591074789202803 0.3157315932982917 0.2819891299824792 0.2518553772448532 0.2289594338589575 0.2096823951489269 0.1964300681668857 0.183167547087166 0.1735341247809899 0.1626876048510731 0.1542468919732808 0.1482221802452913 0.143400372043364 0.1397916614651774 0.1361727567893123 0.131350948587385 0.1289451415352606 0.1253262368593955 0.1217175262812089 0.1168957180792816 0.1168957180792816 0.1144797169294788 0.1120739098773544 0.1120739098773544 0.1096579087275515 0.1072521016754271 0.1060491981493649 0.1024302934734998 0.1024302934734998 0.09881138879763469 0.0976084852715725 0.0976084852715725 0.0964055817455103 0.0927866770696452 0.09038087001752081 0.09038087001752081 0.08555906181559352 0.08676196534165571 0.08435615828953132 0.07953435008760404 0.07712854303547963 0.07591544541173893 0.07350963835961453 0.07110383130749015 0.06868783015768724 0.06386602195575997 0.06507911957950065 0.06146021490363557 0.05904421375383268 0.05423259964958388 0.05301950202584318 0.04941079144765659 0.0457918867717915 0.04097007856986421 0.03736136799167761 0.03494536684187472 0.03133665626368813 0.03253955978975032 0.02771775158782304 0.02651484806176084 0.02409884691195795 0.01927703871003066 0.01566832813184407 0.01325232698204117 0.01084651992991678 0.009643616403854577 0.007227615254051686 0.001202903526062199 0.002405807052124398 -0.002405807052124398 -0.001202903526062199 -0.00361890467586509 -0.00361890467586509 
-0.004821808201927289 -0.006024711727989488 -0.006024711727989488 0.74474 0.7374888614235295 0.7229763569947404 0.6842967997363325 0.6347251936994465 0.5754785847100344 0.5089910644200003 0.4473308183303808 0.3844637536906576 0.3300546461775086 0.2768625844903116 0.2357591628558481 0.1982761968716956 0.1680545967398618 0.1438670893585465 0.1257341292794463 0.1100148063005534 0.09430571059750889 0.08342388909487909 0.07374888614235295 0.07011820321619357 0.06407388318982683 0.05923638171356377 0.05440910751314904 0.04957160603688598 0.04473410456062291 0.0459409231107266 0.04352728601051923 0.04231024018456721 0.03989660308435986 0.03748296598415249 0.03748296598415249 0.03626592015820047 0.03264546450788942 0.0338522830579931 0.03264546450788942 0.03264546450788942 0.03143864595778574 0.03022160013183373 0.02780796303162636 0.02660114448152268 0.02660114448152268 0.02660114448152268 0.02538409865557066 0.0229704615553633 0.02417728010546698 0.02055682445515594 0.02055682445515594 0.01933977862920392 0.02176364300525961 0.01692614152899655 0.01692614152899655 0.01571932297889287 0.01451250442878919 0.01208864005273349 0.01208864005273349 0.008457957126574107 0.009675002952526126 0.009675002952526126 0.008457957126574107 0.007251138576470427 0.004837501476263063 0.002413637100207363 0 0.001206818550103682 0 0 -0.001206818550103682 -0.003630682926159381 -0.008457957126574107 -0.002413637100207363 -0.004837501476263063 -0.004837501476263063 -0.006044320026366745 -0.004837501476263063 -0.007251138576470427 -0.007251138576470427 -0.007251138576470427 -0.008457957126574107 -0.008457957126574107 -0.008457957126574107 -0.008457957126574107 -0.009675002952526126 -0.007251138576470427 0.74474 0.745952713999862 0.73501773352653 0.7143604865797281 0.6730562699234113 0.6244552147933484 0.5685778756641137 0.5054036980611329 0.4397938152211412 0.3814807708548955 0.3170938792520527 0.2672801101221279 0.2186790549920651 0.1773748383357483 0.1409317549161664 0.1117752327330435 0.08990527178637962 0.07046073883943972 0.05467490236665976 0.04373992189332781 0.03280494141999585 0.02794380818326088 0.0218699609466639 0.01822154170979093 0.01458339971020492 0.01093498047333195 0.009722266473469951 0.008499275236320982 0.006073847236596978 0.004861133236734975 0.003648419236872973 0.002425427999724004 0.002425427999724004 0.002425427999724004 0.003648419236872973 0 0.001212713999862002 -0.001212713999862002 0.002425427999724004 -0.001212713999862002 -0.001212713999862002 -0.001212713999862002 -0.001212713999862002 -0.002425427999724004 -0.002425427999724004 -0.002425427999724004 -0.004861133236734975 -0.004861133236734975 -0.002425427999724004 -0.004861133236734975 -0.003648419236872973 -0.003648419236872973 -0.006073847236596978 -0.004861133236734975 -0.004861133236734975 -0.00728656123645898 -0.004861133236734975 -0.006073847236596978 -0.006073847236596978 -0.00728656123645898 -0.00728656123645898 -0.006073847236596978 -0.00728656123645898 -0.009722266473469951 -0.008499275236320982 -0.00728656123645898 -0.00728656123645898 -0.008499275236320982 -0.008499275236320982 -0.008499275236320982 -0.00728656123645898 -0.01093498047333195 -0.00728656123645898 -0.009722266473469951 -0.00728656123645898 -0.00728656123645898 -0.008499275236320982 -0.008499275236320982 -0.01093498047333195 -0.008499275236320982 -0.009722266473469951 -0.008499275236320982 -0.009722266473469951 -0.00728656123645898 optim-1.4.1/inst/private/__d2_min__.m0000644000175000017500000001610212477523726016061 0ustar olafolaf## Copyright (C) 2002 
Etienne Grossmann ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see <http://www.gnu.org/licenses/>. ## -*- texinfo -*- ## @deftypefn{Function File} {[@var{p_res}, @var{objf}, @var{cvg}, @var{outp}] =} __d2_min__ (@var{f}, @var{pin}, @var{hook}) ## Undocumented internal function. ## @end deftypefn function [p_res, objf, cvg, outp] = __d2_min__ (f, pin, hook) ### modified by Olaf Till n = length (pin); ## constants maxinner = 30; tcoeff = 0.5; # Discount on total weight ncoeff = 0.5; # Discount on weight of newton ocoeff = 1.5; # Factor for outwards searching ## passed function for gradient of objective function grad_f = hook.dfdp; ## passed function for hessian of objective function if (isempty (hess_f = hook.hessian)) error ("this backend requires a supplied Hessian function"); endif ## is it the inverse of the hessian? inverse_hessian = hook.inverse_hessian; ## passed options ftol = hook.TolFun; if (isempty (utol = hook.TolX)) utol = 10 * sqrt (eps); endif if (isempty (maxout = hook.MaxIter)) maxout = 1000; endif fixed = hook.fixed; verbose = strcmp (hook.Display, "iter"); prudent = strcmp (hook.FunValCheck, "on"); user_interaction = hook.user_interaction; ## some useful variables derived from passed variables n = numel (pin); ## backend-specific checking of options and constraints if (all (fixed)) error ("no free parameters"); endif ## fill constant fields of hook for derivative-functions; some fields ## may be backend-specific dfdp_hook.fixed = fixed; # this may be handled by the frontend, but # the backend still may add to it ## set up for iterations p = pbest = pin; vf = fbest = eval_objf (f, pin, prudent); nobjf = 1; if (([stop, outp.user_interaction] = ... __do_user_interaction__ (user_interaction, p, struct ("iteration", 0, "fval", vf), "init"))) p_res = p; outp.niter = 0; outp.nobjf = nobjf; objf = vf; cvg = -1; return; endif for (niter = 1 : maxout) [grad, hessian] = ... eval_grad_hessian (grad_f, hess_f, p, n, prudent, setfield (dfdp_hook, "f", vf)); grad = grad(:); if (inverse_hessian) inv_hessian = hessian; else inv_hessian = pinv (hessian); endif fold = vf; if (verbose) printf ("d2_min: niter=%d, objf=%8.3g\n", niter, vf); endif dnewton = - inv_hessian * grad; # Newton step if (dnewton' * grad > 0) ## Heuristic for negative hessian dnewton = -100 * grad; endif wn = 1; # Weight of Newton step wt = 1; # Total weight done_inner = false; # false = not found, true = ready to quit inner loop for (ninner = 1 : maxinner) # inner loop dp = wt * (wn * dnewton - (1 - wn) * grad); pnew = p + dp; if (verbose) printf ("total weight: %8.3g, newtons weight: %8.3g, objf=%8.3g, newton norm: %8.3g, deriv norm: %8.3g\n",... 
wt, wn, fbest, norm (wt * wn * dnewton), norm (wt * (1 - wn) * grad)); endif fnew = eval_objf (f, pnew, prudent); nobjf++; if (fnew < fbest) dbest = dp; fbest = fnew; pbest = pnew; done_inner = true; # will go out at next increase if (verbose) printf ("d2_min: found better value\n"); endif elseif (done_inner) if (verbose) printf ("d2_min: quitting %d th inner loop\n", ninner); endif break; endif wt *= tcoeff; # reduce norm of proposed step wn *= ncoeff; # and bring it closer to derivative endfor # end of inner loop if (ninner == maxinner) printf ("d2_min: too many inner loops (objf: %8.3g)\n", fnew); wbest = 0; else ## look for improvement along dbest wbest = 1; wn = ocoeff; pnew = p + wn * dbest; fnew = eval_objf (f, pnew, prudent); nobjf++; while (fnew < fbest) fbest = fnew; wbest = wn; pbest = pnew; wn *= ocoeff; pnew = p + wn * dbest; fnew = eval_objf (f, pnew, prudent); nobjf++; if (verbose) printf ("d2_min: looking further: objf: %8.3g\n", fnew); endif endwhile endif if (verbose) printf ("d2_min: inner loop: fbest: %8.5g, fold: %8.5g\n", fbest, fold); endif if (fbest < fold) ## improvement found vf = fbest; p = pbest; endif if (([stop, outp.user_interaction] = ... __do_user_interaction__ (user_interaction, p, struct ("iteration", niter, "fval", vf), "iter"))) p_res = p; outp.niter = niter; outp.nobjf = nobjf; objf = vf; cvg = -1; return; endif if (fold - fbest < ((abs (fold) + sqrt (eps)) * abs (ftol))) if (verbose) printf ("d2_min: quitting, niter: %-3d, objf: %8.3g, fold: %8.3g\n", niter, vf, fold); endif cvg = 3; break; elseif (max (abs (wbest * dbest)) < ... (max (abs (pbest)) + sqrt (eps)) * abs (utol)) cvg = 2; break; elseif (niter == maxout) cvg = 0; endif pbest = p; endfor ## return result p_res = pbest; objf = fbest; outp.niter = niter; outp.nobjf = nobjf; if (([stop, outp.user_interaction] = ... __do_user_interaction__ (user_interaction, p_res, struct ("iteration", niter, "fval", objf), "done"))) cvg = -1; endif endfunction function ret = eval_objf (f, p, prudent) ret = f (p); if (prudent && (! isnumeric (ret) || isnan (ret) || numel (ret) > 1)) error ("objective function returns inadequate output"); endif endfunction function [grad, hessian] = ... eval_grad_hessian (grad_f, hess_f, p, n, prudent, hook) persistent first_call = true; grad = grad_f (p, hook); hessian = hess_f (p); if (first_call) first_call = false; if (prudent && (! isnumeric (grad) || numel (grad) != n)) error ("gradient function returns inadequate output"); endif if (prudent && (! isnumeric (hessian) || any (size (hessian) != n))) error ("hessian function returns inadequate output"); endif endif endfunction optim-1.4.1/inst/private/__octave_sqp_wrapper__.m0000644000175000017500000000721712477523726020624 0ustar olafolaf## Copyright (C) 2014 Olaf Till ## ## This program is free software; you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation; either version 3 of the License, or ## (at your option) any later version. ## ## This program is distributed in the hope that it will be useful, ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## GNU General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with this program; If not, see <http://www.gnu.org/licenses/>. 
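## -*- texinfo -*- ## @deftypefn{Function File} {[@var{p_res}, @var{objf}, @var{cvg}, @var{outp}] =} __octave_sqp_wrapper__ (@var{f}, @var{pin}, @var{hook}) ## Undocumented internal function (backend wrapping core Octave's @code{sqp}); this doc comment follows the pattern of the other private backends. ## @end deftypefn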
function [p_res, objf, cvg, outp] = __octave_sqp_wrapper__ (f, pin, hook) ## clear persistent variables select_constr (); select_d_constr (); n = length (pin); ## passed constraints mc = hook.mc; # matrix of linear constraints vc = hook.vc; # vector of linear constraints f_cstr = hook.f_cstr; # function of all constraints df_cstr = hook.df_cstr; # function of derivatives of all constraints n_gencstr = hook.n_gencstr; # number of non-linear constraints eq_idx = hook.eq_idx; # logical index of equality constraints in all # constraints lbound = hook.lbound; # bounds, subset of linear inequality ubound = hook.ubound; # constraints in mc and vc ## passed function for gradient of objective function grad_f = hook.dfdp; ## passed function for hessian of objective function hessian = hook.hessian; ## passed options tolerance = hook.octave_sqp_tolerance; niter = hook.MaxIter; fixed = hook.fixed; ## some useful variables derived from passed variables ## n_cstr = size (vc, 1) + n_gencstr; # number of all constraints ac_idx = true (n_cstr, 1); # index of all constraints ## backend-specific checking of options and constraints ## ## ... ## fill constant fields of hook for derivative-functions; some fields ## may be backend-specific dfdp_hook.fixed = fixed; # this may be handled by the frontend, but # the backend still may add to it ## process arguments for calling sqp grad_f = @ (p) grad_f (p, dfdp_hook)(:); # sqp expects column vector f_cstr = @ (p) f_cstr (p, ac_idx); df_cstr = @ (p) df_cstr (p, ac_idx, setfield (dfdp_hook, "f", f_cstr (p))); if (isempty (hessian)) passed_f = {f, grad_f}; else passed_f = {f, grad_f, hessian}; endif inequc = @ (p) select_constr (f_cstr, p, ! eq_idx); dinequc = @ (p) select_d_constr (df_cstr, p, ! eq_idx); equc = @ (p) select_constr (f_cstr, p, eq_idx); dequc = @ (p) select_d_constr (df_cstr, p, eq_idx); ## call sqp [p_res, objf, info, outp.niter, outp.nobjf, outp.lambda] = ... sqp (pin, passed_f, {equc, dequc}, {inequc, dinequc}, -Inf, Inf, niter, tolerance); ## map return code switch (info) case 101 cvg = 1; case 102 cvg = -4; case 103 cvg = 0; otherwise warning ("return code %i of sqp not recognized", info); endswitch endfunction function ret = select_constr (cf, p, idx) persistent storep = struct (); persistent storeret = []; if (! nargin ()) storep = struct (); return; endif if (! isequal (storep, p)) storep = p; storeret = cf (p); endif ret = storeret(idx); endfunction function ret = select_d_constr (dcf, p, idx) persistent storep = struct (); persistent storeret = []; if (! nargin ()) storep = struct (); return; endif if (! isequal (storep, p)) storep = p; storeret = dcf (p); endif ret = storeret(idx, :); endfunction optim-1.4.1/inst/test_nelder_mead_min_1.m0000644000175000017500000001102512477523726017023 0ustar olafolaf## Copyright (C) 2002 Etienne Grossmann ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see <http://www.gnu.org/licenses/>. ## Checks whether the function 'nelder_mead_min' works, by making it minimize a ## quadratic function. 
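## Sets 'ok' to 1 if all tests pass, 0 otherwise; set 'verbose' to 1 before running for progress output, and 'inspect' to 1 to enter the debugger ('keyboard') on failures. (This summary follows the convention of the other test scripts and is implied by the code below.)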
ok = 1; cnt = 1; if ! exist ("verbose"), verbose = 0; end if verbose, printf (" test_nelder_mead : \n"); end if ! exist ("inspect"), inspect = 0; end tol = 100*sqrt (eps); R = 3 ; C = 2; if verbose, printf (" optimization problem has dimension %i\n",R*C); end function c = my_quad_func (x,y,z) c = 1 + sum (vec(x-y)'*z*(vec(x-y))); end function c = non_quad_func_1 (x,y,z) tmp = sum (vec(x-y)'*z*(vec(x-y))); c = 1 + 1.1*tmp + sin (sqrt(tmp)); end function c = non_quad_func_2 (x,y,z) tmp1 = sum (vec(x-y)'*z*(vec(x-y))); tmp2 = max (abs (vec(x-y)))^2; c = 1 + 1.1*tmp1 + tmp2 ; end ## dt = mytic() ## ## Returns the cputime since last call to 'mytic'. function dt = mytic() persistent last_mytic = 0 ; [t,u,s] = cputime() ; dt = t - last_mytic ; last_mytic = t ; endfunction fnames = { "my_quad_func", "non_quad_func_1", "non_quad_func_2"}; x0 = randn(R,C) ; x1 = x0 + randn(R,C) ; z = randn (R*C); z = z*z'; for i = 1:length (fnames) fname = fnames{i}; if verbose, printf ("trying to minimize '%s'\n", fname); end ctl = nan*zeros (1,6); mytic (); [x2,v,nf] = nelder_mead_min (fname, {x1,x0,z}, ctl) ; t0 = mytic (); if any (abs (x2-x0)(:) > 100*tol), if verbose || inspect, printf ("not ok %i\n",cnt); end [max(abs (x2-x0)(:)), 100*tol] if inspect, keyboard; end ok = 0 ; else if verbose, printf ("ok %i\n function evaluations = %i\n",cnt,nf); end end cnt++; # Use vanilla nelder_mead_min mytic (); [x2,v,nf] = nelder_mead_min (fname, {x1,x0,z}) ; t1 = mytic (); if any (abs (x2-x0)(:) > 100*tol), if verbose || inspect, printf ("not ok %i\n",cnt); end [max(abs (x2-x0)(:)), 100*tol] if inspect, keyboard; end ok = 0 ; else if verbose, printf ("ok %i\n function evaluations = %i\n",cnt,nf); end end cnt++; # Optimize wrt 2nd arg. ctl = nan * zeros (1,6); ctl(6) = 0; ctl(3) = 2; mytic (); [x2,v,nf] = nelder_mead_min (fname, {x1,x0,z}, ctl) ; t0 = mytic (); if any (abs (x2-x1)(:) > 100*tol), if verbose || inspect, printf ("not ok %i\n",cnt); end [max(abs (x2-x1)(:)), 100*tol] if inspect, keyboard; end ok = 0 ; else if verbose, printf ("ok %i\n function evaluations = %i\n",cnt,nf); end end cnt++; # Optimize wrt 2nd arg. 
ctl = nan * zeros (1,6); ctl(3) = 2; mytic (); [x2,v,nf] = nelder_mead_min (fname, {x1,x0,z}, ctl) ; t1 = mytic (); if any (abs (x2-x1)(:) > tol), if verbose || inspect, printf ("not ok %i\n",cnt); end [max(abs (x2-x1)(:)), tol] if inspect, keyboard; end ok = 0 ; else if verbose, printf ("ok %i\n function evaluations = %i\n",cnt,nf); end end cnt++; if 0 # Check with struct control variable ctls = struct ("narg", 2); [x2bis,vbis,nfbis] = nelder_mead_min (fname, {x1,x0,z}, ctls) ; t1 = mytic (); ## [nf,nfbis] if any ((x2-x2bis)(:)) if verbose || inspect, printf ("not ok %i\n",cnt); end printf (" struct ctl : x2 - x2bis -> %g\n", max(abs (x2-x2bis)(:))); if inspect, keyboard; end ok = 0 ; else if verbose, printf ("ok %i\n function evaluations = %i\n",cnt,nfbis); end end cnt++; # Check with named args [x2bis,vbis,nfbis] = nelder_mead_min (fname, {x1,x0,z}, "narg", 2) ; t1 = mytic (); ## [nf,nfbis] if any ((x2-x2bis)(:)) if verbose || inspect, printf ("not ok %i\n",cnt); end printf (" named arg : x2 - x2bis -> %g\n", max(abs (x2-x2bis)(:))); if inspect, keyboard; end ok = 0 ; else if verbose, printf ("ok %i\n function evaluations = %i\n",cnt,nfbis); end end cnt++; end end if verbose && ok printf ("All tests ok\n"); end optim-1.4.1/inst/nonlin_curvefit.m0000644000175000017500000000754312477523726015660 0ustar olafolaf## Copyright (C) 2010-2014 Olaf Till ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see <http://www.gnu.org/licenses/>. ## -*- texinfo -*- ## @deftypefn {Function File} {[@var{p}, @var{fy}, @var{cvg}, @var{outp}] =} nonlin_curvefit (@var{f}, @var{pin}, @var{x}, @var{y}) ## @deftypefnx {Function File} {[@var{p}, @var{fy}, @var{cvg}, @var{outp}] =} nonlin_curvefit (@var{f}, @var{pin}, @var{x}, @var{y}, @var{settings}) ## Frontend for nonlinear fitting of values, computed by a model ## function, to observed values. ## ## Please refer to the description of @code{nonlin_residmin}. The ## differences to @code{nonlin_residmin} are the additional arguments ## @var{x} (independent values, mostly, but not necessarily, an array of ## the same dimensions or the same number of rows as @var{y}) and ## @var{y} (array of observations), the returned value @var{fy} (final ## guess for observed values) instead of @var{resid}, that the model ## function has a second obligatory argument which will be set to ## @var{x} and is supposed to return guesses for the observations (with ## the same dimensions), and that the possibly user-supplied function ## for the jacobian of the model function has also a second obligatory ## argument which will be set to @var{x}. ## ## @c The following block will be cut out in the package info file. ## @c BEGIN_CUT_TEXINFO ## ## Also, if the setting @code{user_interaction} is given, additional ## information is passed to these functions. Type @code{optim_doc ## ("Common optimization options")} for this setting. 
## ## @c END_CUT_TEXINFO ## ## @seealso {nonlin_residmin} ## @end deftypefn function [p, fy, cvg, outp] = nonlin_curvefit (f, pin, x, y, settings) if (nargin == 1) p = __nonlin_residmin__ (f); return; endif if (nargin < 4 || nargin > 5) print_usage (); endif if (nargin == 4) settings = struct (); endif if (! isempty (dfdp = optimget (settings, "dfdp"))) if (ischar (dfdp)) dfdp = str2func (dfdp); endif ## settings = optimset \ ## (settings, "dfdp", @ (p, varargin) dfdp (p, x, varargin{:})); settings.dfdp = @ (p, varargin) dfdp (p, x, varargin{:}); endif if (! isempty (uia = optimget (settings, "user_interaction", {}))) if (! iscell (uia)) uia = {uia}; endif uia = cellfun (@ (f_handle) @ (p, v, s) ... f_handle (p, cell2fields ({x, y}, {"model_x", "observations"}, 2, v), s), uia(:), "UniformOutput", false); settings.user_interaction = uia; endif [p, fy, cvg, outp] = __nonlin_residmin__ ... (@ (p) f (p, x), pin, settings, struct ("observations", y)); fy += y; endfunction %!demo %! ## Example for linear inequality constraints %! ## (see also the same example in 'demo nonlin_residmin') %! %! ## independents and observations %! indep = 1:5; %! obs = [1, 2, 4, 7, 14]; %! ## model function: %! f = @ (p, x) p(1) * exp (p(2) * x); %! ## initial values: %! init = [.25; .25]; %! ## linear constraints, A.' * parametervector + B >= 0 %! A = [1; -1]; B = 0; # p(1) >= p(2); %! settings = optimset ("inequc", {A, B}); %! %! ## start optimization %! [p, model_values, cvg, outp] = nonlin_curvefit (f, init, indep, obs, settings) optim-1.4.1/inst/dfxpdp.m0000644000175000017500000000416512477523726013736 0ustar olafolaf## Copyright (C) 2010-2013 Olaf Till ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## function jac = dfxpdp (x, p, func[, hook]) ## ## Returns Jacobian of func (p, x) with respect to p with finite ## differencing. The optional argument hook is a structure which can ## contain the following fields at the moment: ## ## hook.f: value of func(p, x) for p and x as given in the arguments ## ## hook.diffp: positive vector of fractional steps from given p in ## finite differencing (actual steps may be smaller if bounds are ## given). The default is .001 * ones (size (p)); ## ## hook.diff_onesided: logical vector, indexing elements of p for ## which only one-sided differences should be computed (faster); even ## if not one-sided, differences might not be exactly central if ## bounds are given. The default is false (size (p)). ## ## hook.fixed: logical vector, indexing elements of p for which zero ## should be returned instead of the guessed partial derivatives ## (useful in optimization if some parameters are not optimized, but ## are 'fixed'). ## ## hook.lbound, hook.ubound: vectors of lower and upper parameter ## bounds (or -Inf or +Inf, respectively) to be respected in finite ## differencing. The consistency of bounds is not checked. function ret = dfxpdp (varargin) ## This is an interface to __dfdp__.m. 
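## The independent variable x (first argument) is bound into the model ## function below, so that __dfdp__ sees a function of the parameters p ## alone; p, the wrapped function, and the optional hook are passed on ## unchanged. A minimal usage sketch (names assumed, not from this file): ## for a model f = @ (p, x) p(1) * exp (p(2) * x), jac = dfxpdp (x, p, f) ## approximates the Jacobian with one column per element of p.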
if (ischar (varargin{3})) varargin{3} = @ (p) str2func (varargin{3}) ... (p, varargin{1}); else varargin{3} = @ (p) varargin{3} (p, varargin{1}); endif ret = __dfdp__ (varargin{2:end}); endfunction optim-1.4.1/inst/fmins.m0000644000175000017500000000620512477523726013562 0ustar olafolaf## Copyright (C) 2003 Andy Adler ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see <http://www.gnu.org/licenses/>. ## -*- texinfo -*- ## @deftypefn {Function File} {[@var{x}] =} fmins (@var{f},@var{X0},@var{options},@var{grad},@var{P1},@var{P2}, ...) ## ## Find the minimum of a function of several variables. ## By default the method used is the Nelder&Mead Simplex algorithm. ## ## Example usage: ## fmins(inline('(x(1)-5).^2+(x(2)-8).^4'),[0;0]) ## ## @strong{Inputs} ## @c use @asis and explicit @var in @table to avoid makeinfo warning ## @c `unlikely character , in @var' for `P1, P2, ...'. ## @table @asis ## @item @var{f} ## A string containing the name of the function to minimize ## @item @var{X0} ## A vector of initial parameters of the function @var{f}. ## @item @var{options} ## Vector with control parameters (not all parameters are used) ## @verbatim ## options(1) - Show progress (if 1, default is 0, no progress) ## options(2) - Relative size of simplex (default 1e-3) ## options(6) - Optimization algorithm ## if options(6)==0 - Nelder & Mead simplex (default) ## if options(6)==1 - Multidirectional search Method ## if options(6)==2 - Alternating Directions search ## options(5) ## if options(6)==0 && options(5)==0 - regular simplex ## if options(6)==0 && options(5)==1 - right-angled simplex ## Comment: the default is set to "right-angled simplex". ## this works better for me on a broad range of problems, ## although the default in nmsmax is "regular simplex" ## options(10) - Maximum number of function evaluations ## @end verbatim ## @item @var{grad} ## Unused (For compatibility with Matlab) ## @item @var{P1}, @var{P2}, ... 
## Optional parameters for function @var{f} ## ## @end table ## @end deftypefn function ret=fmins(funfun, X0, options, grad, varargin) stopit = [1e-3, inf, inf, 1, 0, -1]; minfun = 'nmsmax'; if nargin < 3; options=[]; end if length(options)>=1; stopit(5)= options(1); end if length(options)>=2; stopit(1)= options(2); end if length(options)>=6; if options(6)==0; minfun= 'nmsmax'; if options(5)==0; stopit(4)= 0; elseif options(5)==1; stopit(4)= 1; else error('options(5): no associated simplex strategy'); end elseif options(6)==1; minfun= 'mdsmax'; elseif options(6)==2; minfun= 'adsmax'; else error('options(6) does not correspond to known algorithm'); end end if length(options)>=10; stopit(2)= options(10); end ret = feval(minfun, funfun, X0, stopit, [], varargin{:}); endfunction optim-1.4.1/inst/test_min_3.m0000644000175000017500000000566012477523726014516 0ustar olafolaf## Copyright (C) 2002 Etienne Grossmann ## Copyright (C) 2004 Michael Creel ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see <http://www.gnu.org/licenses/>. ## ok - Test that bfgs works with extra ## arguments ## ## Defines some simple functions and verifies that calling ## bfgs on them returns the correct minimum. ## ## Sets 'ok' to 1 if success, 0 otherwise if ! exist ("optim_func"), optim_func = "bfgsmin"; end ok = 1; if ! exist ("verbose"), verbose = 0; end P = 2; R = 3; ## Make tests reproducible ## obsmat = randn(R,P) ; obsmat = zeros (R,P); obsmat(sub2ind([R,P],1:R,1+rem(0:R-1,P))) = 1:R; ## Make test_min_3 repeatable by using fixed starting point ## truep = randn(P,1) ; ## xinit = randn(P,1) ; truep = rem (1:P, P/4)'; xinit = truep + 2*(1:P)'/(P); ## global obses ; obses = obsmat*truep ; extra = {obsmat, obses}; function v = ff(x, obsmat, obses) v = mean ( (obses - obsmat*x)(:).^2 ) + 1 ; endfunction function dv = dff(x, obsmat, obses) er = -obses + obsmat*x ; dv = 2*er'*obsmat / rows(obses) ; ## dv = 2*er'*obsmat ; endfunction if verbose printf (" Checking that extra arguments are accepted\n\n"); printf ([" Set 'optim_func' to the name of the optimization\n",... " function you want to test (must have same synopsis\n",... " as 'bfgs')\n\n"]); printf (" Tested function : %s\n",optim_func); printf (" Nparams = P = %i, Nobses = R = %i\n",P,R); fflush (stdout); end function dt = mytic() persistent last_mytic = 0 ; [t,u,s] = cputime() ; dt = t - last_mytic ; last_mytic = t ; endfunction ctl.df = "dff"; mytic() ; ## [xlev,vlev,nlev] = feval (optim_func, "ff", "dff", xinit, "extra", extra) ; ## [xlev,vlev,nlev] = feval \ ## (optim_func, "ff", "dff", list (xinit, obsmat, obses)); if strcmp(optim_func,"bfgsmin") ctl = {-1,2,1,1}; endif [xlev,vlev,nlev] = feval ... (optim_func, "ff", {xinit, obsmat, obses}, ctl); tlev = mytic() ; if max (abs(xlev-truep)) > 1e-4, if verbose, printf ("Error is too big : %8.3g\n", max (abs (xlev-truep))); end ok = 0; end if verbose, printf (" Costs : init=%8.3g, final=%8.3g, best=%8.3g\n",... 
ff(xinit,obsmat,obses), vlev, ff(truep,obsmat,obses)); end if verbose printf ( " time : %8.3g\n",tlev); end if verbose && ok printf ( "All tests ok\n"); end optim-1.4.1/inst/powell.m0000644000175000017500000001407412477523726013753 0ustar olafolaf## Copyright (C) 2011 Nir Krakauer ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## -*- texinfo -*- ## @deftypefn {Function File} {[@var{p}, @var{obj_value}, @var{convergence}, @var{iters}, @var{nevs}] =} powell (@var{f}, @var{p0}, @var{control}) ## Multidimensional minimization (direction-set method). Implements a direction-set (Powell's) method for multidimensional minimization of a function without calculation of the gradient [1, 2] ## ## @subheading Arguments ## ## @itemize @bullet ## @item ## @var{f}: name of function to minimize (string or handle), which should accept one input variable (see example for how to pass on additional input arguments) ## ## @item ## @var{p0}: An initial value of the function argument to minimize ## ## @item ## @var{options}: an optional structure, which can be generated by optimset, with some or all of the following fields: ## @itemize @minus ## @item ## MaxIter: maximum iterations (positive integer, or -1 or Inf for unlimited (default)) ## @item ## TolFun: minimum amount by which function value must decrease in each iteration to continue (default is 1E-8) ## @item ## MaxFunEvals: maximum function evaluations (positive integer, or -1 or Inf for unlimited (default)) ## @item ## SearchDirections: an n*n matrix whose columns contain the initial set of (presumably orthogonal) directions to minimize along, where n is the number of elements in the argument to be minimized for; or an n*1 vector of magnitudes for the initial directions (defaults to the set of unit direction vectors) ## @end itemize ## @end itemize ## ## @subheading Examples ## ## @example ## @group ## y = @@(x, s) x(1) ^ 2 + x(2) ^ 2 + s; ## o = optimset('MaxIter', 100, 'TolFun', 1E-10); ## s = 1; ## [x_optim, y_min, conv, iters, nevs] = powell(@@(x) y(x, s), [1 0.5], o); %pass y wrapped in an anonymous function so that all other arguments to y, which are held constant, are set ## %should return something like x_optim = [4E-14 3E-14], y_min = 1, conv = 1, iters = 2, nevs = 24 ## @end group ## ## @end example ## ## @subheading Returns: ## ## @itemize @bullet ## @item ## @var{p}: the minimizing value of the function argument ## @item ## @var{obj_value}: the value of @var{f}() at @var{p} ## @item ## @var{convergence}: 1 if normal convergence, 0 if not ## @item ## @var{iters}: number of iterations performed ## @item ## @var{nevs}: number of function evaluations ## @end itemize ## ## @subheading References ## ## @enumerate ## @item ## Powell MJD (1964), An efficient method for finding the minimum of a function of several variables without calculating derivatives, @cite{Computer Journal}, 7 :155-162 ## ## @item ## Press, WH; Teukolsky, SA; Vetterling, WT; Flannery, BP (1992). 
@cite{Numerical Recipes in Fortran: The Art of Scientific Computing} (2nd Ed.). New York: Cambridge University Press (Section 10.5) ## @end enumerate ## @end deftypefn ## PKG_ADD: __all_opts__ ("powell"); function [p, obj_value, convergence, iters, nevs] = powell (f, p0, options = struct ()) if (nargin == 1 && ischar (f) && strcmpi (f, "defaults")) p = optimset ("MaxIter", Inf, "TolFun", 1e-8, "MaxFunEvals", Inf, "SearchDirections", []); return elseif (nargin < 2 || nargin > 3) print_usage (); endif xi_set = 0; xi = optimget (options, 'SearchDirections'); if (! isempty (xi)) if (isvector (xi)) # assume that xi is n*1 or 1*n xi = diag (xi); endif xi_set = 1; endif MaxIter = optimget (options, 'MaxIter', Inf); if (MaxIter < 0) MaxIter = Inf; endif MaxFunEvals = optimget (options, 'MaxFunEvals', Inf); TolFun = optimget (options, 'TolFun', 1E-8); nevs = 0; iters = 0; convergence = 0; p = p0; # initial value of the argument being minimized try obj_value = f (p); catch error ("powell: F does not exist or cannot be evaluated"); end_try_catch nevs++; n = numel (p); # number of dimensions to minimize over xit = zeros (n, 1); if (! xi_set) xi = eye (n); endif ## do an iteration while (iters <= MaxIter && nevs <= MaxFunEvals && ! convergence) iters++; pt = p; # best point as iteration begins fp = obj_value; # value of the objective function as iteration begins ibig = 0; # will hold direction along which the objective function decreased the most in this iteration dlt = 0; # will hold decrease in objective function value in this iteration for i = 1:n xit = reshape (xi(:, i), size (p)); fptt = obj_value; [a, obj_value, nev] = line_min (f, xit, {p}); nevs = nevs + nev; p = p + a*xit; change = fptt - obj_value; if (change > dlt) dlt = change; ibig = i; endif endfor if (2 * abs (fp - obj_value) <= TolFun * (abs (fp) + abs (obj_value))) convergence = 1; return endif if (iters == MaxIter) disp ("iteration maximum exceeded"); return endif ## attempt parabolic extrapolation ptt = 2*p - pt; xit = p - pt; fptt = f(ptt); nevs++; if (fptt < fp) # check whether the extrapolation actually makes the objective function smaller t = 2 * (fp - 2*obj_value + fptt) * (fp-obj_value-dlt)^2 - dlt * (fp-fptt)^2; if (t < 0) p = ptt; [a, obj_value, nev] = line_min (f, xit, {p}); nevs = nevs + nev; p = p + a*xit; ## add the net direction from this iteration to the direction set xi(:, ibig) = xi(:, n); xi(:, n) = xit(:); endif endif endwhile endfunction optim-1.4.1/inst/samin_example.m0000644000175000017500000000433212477523726015267 0ustar olafolaf## Copyright (C) 2004 Michael Creel ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see <http://www.gnu.org/licenses/>. # samin_example: example script that contains examples of how to call # samin for minimization using simulated annealing. # Edit the script to see how samin may be used. 
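# As a worked check of the comments on the objective defined below: at theta = zeros(k,1), a = sum(exp(-cos(0))) = k*exp(-1), b = -k*exp(-1), and c = 0, so the global minimum value is exactly 0.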
# # usage: samin_example 1; # this is a script file # Example objective function # remember that cos(0)=1, so # "a" has a local minimum at 0 (each dimension) # "b" makes the function value 0 at min # "c" adds some curvature to make the min # at (0,0,...,0) global. # the closer "curvature" is to zero, the more alike # the local mins are, so the harder the global min is to find function f = obj(theta, curvature); dim = rows(theta); a = sum(exp(-cos(theta))); b = - dim*exp(-1); c = sum(curvature*theta .^ 2); f = a + b + c; endfunction k = 5; # dimensionality theta = rand(k,1)*10 - 5; # random start value # if you set "curvature" very small, # you will need to increase nt, ns, and rt # to minimize successfully curvature = 0.01; # SA controls ub = 10*ones(rows(theta),1); lb = -ub; # setting ub and lb to same value restricts that parameter, and the algorithm does not search ub(1,:) = 0; lb(1,:) = 0; theta(1,:) = 0; # must satisfy restriction nt = 20; ns = 5; rt = 0.5; # careful - this is too low for many problems maxevals = 1e10; neps = 5; functol = 1e-10; paramtol = 1e-3; verbosity = 1; # only final results. Inc minarg = 1; control = { lb, ub, nt, ns, rt, maxevals, neps, functol, paramtol, verbosity, 1}; # do sa t=cputime(); [theta, obj_value, convergence] = samin("obj", {theta, curvature}, control); t = cputime() - t; printf("Elapsed time = %f\n\n\n",t); optim-1.4.1/inst/poly_2_ex.m0000644000175000017500000000376212477523726014351 0ustar olafolaf## Copyright (C) 2002 Etienne Grossmann. All rights reserved. ## ## This program is free software; you can redistribute it and/or modify it ## under the terms of the GNU General Public License as published by the ## Free Software Foundation; either version 2, or (at your option) any ## later version. ## ## This is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License ## for more details. ## ex = poly_2_ex (l, f) - Extremum of a 1-var deg-2 polynomial ## ## l : 3 : Values of variable at which polynomial is known. ## f : 3 : f(i) = Value of the degree-2 polynomial at l(i). ## ## ex : 1 : Value for which f reaches its extremum ## ## Assuming that f(i) = a*l(i)^2 + b*l(i) + c = P(l(i)) for some a, b, c, ## ex is the extremum of the polynomial P. ## ## This function will be removed from future versions of the optim ## package since it is not related to optimization. function ex = poly_2_ex (l, f) persistent warned = false; if (! warned) warned = true; warning ("Octave:deprecated-function", "The function `poly_2_ex' will be removed from future versions of the optim package since it is not related to optimization."); endif ### This somewhat helps if solution is very close to one of the points. [f,i] = sort (f); l = l(i); m = (l(2) - l(1))/(l(3) - l(1)); d = (2*(f(1)*(m-1)+f(2)-f(3)*m)); if abs (d) < eps, printf ("poly_2_ex : divisor is small (solution at infinity)\n"); printf ("%8.3e %8.3e %8.3e, %8.3e %8.3e\n",... 
f(1), diff (f), diff (sort (l))); ex = (2*(l(1)>l(2))-1)*inf; ## keyboard else ex = ((l(3) - l(1))*((f(1)*(m^2-1) + f(2) - f(3)*m^2))) / d ; ## Not an improvement # n = ((l(2)+l(3))*(l(2)-l(3)) + 2*(l(3)-l(2))*l(1)) / (l(3)-l(1))^2 ; # ex = ((l(3) - l(1))*((f(1)*n + f(2) - f(3)*m^2))) / \ # (2*(f(1)*(m-1)+f(2)-f(3)*m)); # if ex != ex0, # ex - ex0 # end ex = l(1) + ex; endoptim-1.4.1/inst/gjp.m0000644000175000017500000000415312477523726013226 0ustar olafolaf## Copyright (C) 2010-2013 Olaf Till ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see <http://www.gnu.org/licenses/>. ## m = gjp (m, k[, l]) ## ## m: matrix; k, l: row- and column-index of pivot, l defaults to k. ## ## Gauss-Jordan pivot as defined in Bard, Y.: Nonlinear Parameter ## Estimation, p. 296, Academic Press, New York and London 1974. In ## the pivot column, this seems not quite the same as the usual ## Gauss-Jordan(-Clasen) pivot. Bard gives Beaton, A. E., 'The use of ## special matrix operators in statistical calculus' Research Bulletin ## RB-64-51 (1964), Educational Testing Service, Princeton, New Jersey ## as a reference, but this article is not easily accessible. Another ## reference, whose definition of gjp differs from Bard's by some ## signs, is Clarke, R. B., 'Algorithm AS 178: The Gauss-Jordan sweep ## operator with detection of collinearity', Journal of the Royal ## Statistical Society, Series C (Applied Statistics) (1982), 31(2), ## 166--168. function m = gjp (m, k, l) if (nargin < 3) l = k; endif p = m(k, l); if (p == 0) error ("pivot is zero"); endif ## This is a case where I really hate to remain Matlab compatible, ## giving so many indices twice. m(k, [1:l-1, l+1:end]) = m(k, [1:l-1, l+1:end]) / p; # pivot row m([1:k-1, k+1:end], [1:l-1, l+1:end]) = ... # except pivot row and col m([1:k-1, k+1:end], [1:l-1, l+1:end]) - ... m([1:k-1, k+1:end], l) * m(k, [1:l-1, l+1:end]); m([1:k-1, k+1:end], l) = - m([1:k-1, k+1:end], l) / p; # pivot column m(k, l) = 1 / p; endfunction optim-1.4.1/inst/nonlin_min.m0000644000175000017500000011436612477523726014614 0ustar olafolaf## Copyright (C) 2012-2014 Olaf Till ## ## This program is free software; you can redistribute it and/or modify ## it under the terms of the GNU General Public License as published by ## the Free Software Foundation; either version 3 of the License, or ## (at your option) any later version. ## ## This program is distributed in the hope that it will be useful, ## but WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the ## GNU General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with this program; If not, see <http://www.gnu.org/licenses/>. 
## -*- texinfo -*- ## @deftypefn {Function File} {[@var{p}, @var{objf}, @var{cvg}, @var{outp}] =} nonlin_min (@var{f}, @var{pin}) ## @deftypefnx {Function File} {[@var{p}, @var{objf}, @var{cvg}, @var{outp}] =} nonlin_min (@var{f}, @var{pin}, @var{settings}) ## Frontend for nonlinear minimization of a scalar objective function. ## ## The functions supplied by the user have a minimal interface; any ## additionally needed constants can be supplied by wrapping the user ## functions into anonymous functions. ## ## The following description applies to usage with vector-based ## parameter handling. Differences in usage for structure-based ## parameter handling will be explained separately. ## ## @var{f}: objective function. It gets a column vector of real ## parameters as argument. In gradient determination, this function may ## be called with an informational second argument, whose content ## depends on the function for gradient determination. ## ## @var{pin}: real column vector of initial parameters. ## ## @var{settings}: structure whose fields stand for optional settings ## referred to below. The fields can be set by @code{optimset()}. ## ## The returned values are the column vector of final parameters ## @var{p}, the final value of the objective function @var{objf}, an ## integer @var{cvg} indicating if and how optimization succeeded or ## failed, and a structure @var{outp} with additional information, ## currently with possible fields: @code{niter}, the number of ## iterations, @code{nobjf}, the number of objective function calls ## (indirect calls by gradient function not counted), @code{lambda}, the ## lambda of constraints at the result, and @code{user_interaction}, ## information on user stops (see settings). The backend may define ## additional fields. @var{cvg} is greater than zero for success and ## less than or equal to zero for failure; its possible values depend on ## the used backend and currently can be @code{0} (maximum number of ## iterations exceeded), @code{1} (success without further specification ## of criteria), @code{2} (parameter change less than specified ## precision in two consecutive iterations), @code{3} (improvement in ## objective function less than specified), @code{-1} (algorithm aborted ## by a user function), or @code{-4} (algorithm got stuck). ## ## @c The following block will be cut out in the package info file. ## @c BEGIN_CUT_TEXINFO ## ## For settings, type @code{optim_doc ("nonlin_min")}. ## ## For description of structure-based parameter handling, type ## @code{optim_doc ("parameter structures")}. ## ## For description of individual backends (currently only one), type ## @code{optim_doc ("scalar optimization")} and choose the backend in ## the menu. ## ## @c END_CUT_TEXINFO ## ## @end deftypefn ## PKG_ADD: __all_opts__ ("nonlin_min"); function [p, objf, cvg, outp] = nonlin_min (f, pin, settings) ## some scalar defaults; some defaults are backend specific, so ## lacking elements in respective constructed vectors will be set to ## NA here in the frontend diffp_default = .001; stol_default = .0001; cstep_default = 1e-20; if (nargin == 1 && ischar (f) && strcmp (f, "defaults")) p = optimset ("param_config", [], ... "param_order", [], ... "param_dims", [], ... "f_inequc_pstruct", false, ... "f_equc_pstruct", false, ... "objf_pstruct", false, ... "df_inequc_pstruct", false, ... "df_equc_pstruct", false, ... "grad_objf_pstruct", false, ... "hessian_objf_pstruct", false, ... "lbound", [], ... "ubound", [], ... "objf_grad", [], ... "objf_hessian", [], ...
"inverse_hessian", false, ... "cpiv", @ cpiv_bard, ... "max_fract_change", [], ... "fract_prec", [], ... # vector, TolX is a scalar "diffp", [], ... "diff_onesided", [], ... "complex_step_derivative_objf", false, ... "complex_step_derivative_inequc", false, ... "complex_step_derivative_equc", false, ... "cstep", cstep_default, ... "fixed", [], ... "inequc", [], ... "equc", [], ... "f_inequc_idx", false, ... "df_inequc_idx", false, ... "f_equc_idx", false, ... "df_equc_idx", false, ... "TolFun", stol_default, ... "TolX", [], ... "MaxIter", [], ... "Display", "off", ... "Algorithm", "lm_feasible", ... "parallel_local", false, ... # Matlabs UseParallel # works differently "user_interaction", {}, ... "T_init", .01, ... "T_min", 1.0e-5, ... "mu_T", 1.005, ... "iters_fixed_T", 10, ... "max_rand_step", [], ... "stoch_regain_constr", false, ... "trace_steps", false, ... "siman_log", false, ... "debug", false, ... "FunValCheck", "off", ... "save_state", "", ... "recover_state", "", ... "octave_sqp_tolerance", []); return; endif if (nargin < 2 || nargin > 3) print_usage (); endif if (nargin == 2) settings = struct (); endif if (ischar (f)) f = str2func (f); endif if (! (pin_struct = isstruct (pin))) if (! isvector (pin) || columns (pin) > 1) error ("initial parameters must be either a structure or a column vector"); endif endif #### processing of settings and consistency checks backend = optimget (settings, "Algorithm", "lm_feasible"); backend = map_matlab_algorithm_names (backend); [backend, path_bounds] = map_backend (backend); pconf = optimget (settings, "param_config"); pord = optimget (settings, "param_order"); pdims = optimget (settings, "param_dims"); f_inequc_pstruct = optimget (settings, "f_inequc_pstruct", false); f_equc_pstruct = optimget (settings, "f_equc_pstruct", false); f_pstruct = optimget (settings, "objf_pstruct", false); dfdp_pstruct = optimget (settings, "grad_objf_pstruct", f_pstruct); hessian_pstruct = optimget (settings, "hessian_objf_pstruct", f_pstruct); df_inequc_pstruct = optimget (settings, "df_inequc_pstruct", ... f_inequc_pstruct); df_equc_pstruct = optimget (settings, "df_equc_pstruct", ... f_equc_pstruct); lbound = optimget (settings, "lbound"); ubound = optimget (settings, "ubound"); dfdp = optimget (settings, "objf_grad"); if (ischar (dfdp)) dfdp = str2func (dfdp); endif hessian = optimget (settings, "objf_hessian"); max_fract_change = optimget (settings, "max_fract_change"); fract_prec = optimget (settings, "fract_prec"); diffp = optimget (settings, "diffp"); diff_onesided = optimget (settings, "diff_onesided"); fixed = optimget (settings, "fixed"); do_cstep = optimget (settings, "complex_step_derivative_objf", false); cstep = optimget (settings, "cstep", cstep_default); if (do_cstep && ! isempty (dfdp)) error ("both 'complex_step_derivative_objf' and 'objf_grad' are set"); endif do_cstep_inequc = ... optimget (settings, "complex_step_derivative_inequc", false); do_cstep_equc = optimget (settings, "complex_step_derivative_equc", ... false); if (! iscell (user_interaction = ... optimget (settings, "user_interaction", {}))) user_interaction = {user_interaction}; endif max_rand_step = optimget (settings, "max_rand_step"); any_vector_conf = ! (isempty (lbound) && isempty (ubound) && ... isempty (max_fract_change) && ... isempty (fract_prec) && isempty (diffp) && ... isempty (diff_onesided) && isempty (fixed) && ... isempty (max_rand_step)); ## collect constraints [mc, vc, f_genicstr, df_gencstr, user_df_gencstr] = ... 
__collect_constraints__ (optimget (settings, "inequc"), ... do_cstep_inequc, "inequality constraints"); [emc, evc, f_genecstr, df_genecstr, user_df_genecstr] = ... __collect_constraints__ (optimget (settings, "equc"), ... do_cstep_equc, "equality constraints"); mc_struct = isstruct (mc); emc_struct = isstruct (emc); ## correct "_pstruct" settings if functions are not supplied, handle ## constraint functions not honoring indices if (isempty (dfdp)) dfdp_pstruct = false; endif if (isempty (hessian)) hessian_pstruct = false; endif if (isempty (f_genicstr)) f_inequc_pstruct = false; elseif (! optimget (settings, "f_inequc_idx", false)) f_genicstr = @ (p, varargin) apply_idx_if_given ... (f_genicstr (p, varargin{:}), varargin{:}); endif if (isempty (f_genecstr)) f_equc_pstruct = false; elseif (! optimget (settings, "f_equc_idx", false)) f_genecstr = @ (p, varargin) apply_idx_if_given ... (f_genecstr (p, varargin{:}), varargin{:}); endif if (user_df_gencstr) if (! optimget (settings, "df_inequc_idx", false)) df_gencstr = @ (varargin) df_gencstr (varargin{:})(varargin{2}, :); endif else df_inequc_pstruct = false; endif if (user_df_genecstr) if (! optimget (settings, "df_equc_idx", false)) df_genecstr = @ (varargin) df_genecstr (varargin{:})(varargin{2}, :); endif else df_equc_pstruct = false; endif ## some settings require a parameter order if (pin_struct || ! isempty (pconf) || f_inequc_pstruct || ... f_equc_pstruct || f_pstruct || dfdp_pstruct || ... hessian_pstruct || df_inequc_pstruct || df_equc_pstruct || ... mc_struct || emc_struct) if (isempty (pord)) if (pin_struct) if (any_vector_conf || ... ! (f_pstruct && ... (f_inequc_pstruct || isempty (f_genicstr)) && ... (f_equc_pstruct || isempty (f_genecstr)) && ... (dfdp_pstruct || isempty (dfdp)) && ... (hessian_pstruct || isempty (hessian)) && ... (df_inequc_pstruct || ! user_df_gencstr) && ... (df_equc_pstruct || ! user_df_genecstr) && ... (mc_struct || isempty (mc)) && ... (emc_struct || isempty (emc)))) error ("no parameter order specified and constructing a parameter order from the structure of initial parameters can not be done since not all configuration or given functions are structure based"); else pord = fieldnames (pin); endif else error ("given settings require specification of parameter order or initial parameters in the form of a structure"); endif endif pord = pord(:); if (pin_struct && ! all (isfield (pin, pord))) error ("some initial parameters lacking"); endif if ((nnames = rows (unique (pord))) < rows (pord)) error ("duplicate parameter names in 'param_order'"); endif if (isempty (pdims)) if (pin_struct) pdims = cellfun ... (@ size, fields2cell (pin, pord), "UniformOutput", false); else pdims = num2cell (ones (nnames, 2), 2); endif else pdims = pdims(:); if (pin_struct && ... ! all (cellfun (@ (x, y) prod (size (x)) == prod (y), ... struct2cell (pin), pdims))) error ("given param_dims and dimensions of initial parameters do not match"); endif endif if (nnames != rows (pdims)) error ("lengths of 'param_order' and 'param_dims' not equal"); endif pnel = cellfun (@ prod, pdims); ppartidx = pnel; if (any (pnel > 1)) pnonscalar = true; cpnel = num2cell (pnel); prepidx = cat (1, cellfun ... (@ (x, n) x(ones (1, n), 1), ... num2cell ((1:nnames).'), cpnel, ... "UniformOutput", false){:}); epord = pord(prepidx, 1); psubidx = cat (1, cellfun ... (@ (n) (1:n).', cpnel, ... 
"UniformOutput", false){:}); else pnonscalar = false; # some less expensive interfaces later prepidx = (1:nnames).'; epord = pord; psubidx = ones (nnames, 1); endif else pord = []; # spares checks for given but not needed endif if (pin_struct) np = sum (pnel); else np = length (pin); if (! isempty (pord) && np != sum (pnel)) error ("number of initial parameters not correct"); endif endif plabels = num2cell (num2cell ((1:np).')); if (! isempty (pord)) plabels = cat (2, plabels, num2cell (epord), ... num2cell (num2cell (psubidx))); endif ## some useful vectors zerosvec = zeros (np, 1); NAvec = NA (np, 1); Infvec = Inf (np, 1); falsevec = false (np, 1); sizevec = [np, 1]; ## collect parameter-related configuration if (! isempty (pconf)) ## use supplied configuration structure ## parameter-related configuration is either allowed by a structure ## or by vectors if (any_vector_conf) error ("if param_config is given, its potential items must not \ be configured in another way"); endif ## supplement parameter names lacking in param_config nidx = ! isfield (pconf, pord); pconf = cell2fields ({struct()}(ones (1, sum (nidx))), ... pord(nidx), 2, pconf); pconf = structcat (1, fields2cell (pconf, pord){:}); ## in the following, use reshape with explicit dimensions (instead ## of x(:)) so that errors are thrown if a configuration item has ## incorrect number of elements lbound = - Infvec; if (isfield (pconf, "lbound")) idx = ! fieldempty (pconf, "lbound"); if (pnonscalar) lbound (idx(prepidx), 1) = ... cat (1, cellfun (@ (x, n) reshape (x, n, 1), ... {pconf(idx).lbound}.', ... cpnel(idx), "UniformOutput", false){:}); else lbound(idx, 1) = cat (1, pconf.lbound); endif endif ubound = Infvec; if (isfield (pconf, "ubound")) idx = ! fieldempty (pconf, "ubound"); if (pnonscalar) ubound (idx(prepidx), 1) = ... cat (1, cellfun (@ (x, n) reshape (x, n, 1), ... {pconf(idx).ubound}.', ... cpnel(idx), "UniformOutput", false){:}); else ubound(idx, 1) = cat (1, pconf.ubound); endif endif max_fract_change = fract_prec = NAvec; if (isfield (pconf, "max_fract_change")) idx = ! fieldempty (pconf, "max_fract_change"); if (pnonscalar) max_fract_change(idx(prepidx)) = ... cat (1, cellfun (@ (x, n) reshape (x, n, 1), ... {pconf(idx).max_fract_change}.', ... cpnel(idx), ... "UniformOutput", false){:}); else max_fract_change(idx) = [pconf.max_fract_change]; endif endif if (isfield (pconf, "fract_prec")) idx = ! fieldempty (pconf, "fract_prec"); if (pnonscalar) fract_prec(idx(prepidx)) = ... cat (1, cellfun (@ (x, n) reshape (x, n, 1), ... {pconf(idx).fract_prec}.', cpnel(idx), ... "UniformOutput", false){:}); else fract_prec(idx) = [pconf.fract_prec]; endif endif diffp = zerosvec; diffp(:) = diffp_default; if (isfield (pconf, "diffp")) idx = ! fieldempty (pconf, "diffp"); if (pnonscalar) diffp(idx(prepidx)) = ... cat (1, cellfun (@ (x, n) reshape (x, n, 1), ... {pconf(idx).diffp}.', cpnel(idx), ... "UniformOutput", false){:}); else diffp(idx) = [pconf.diffp]; endif endif diff_onesided = fixed = falsevec; if (isfield (pconf, "diff_onesided")) idx = ! fieldempty (pconf, "diff_onesided"); if (pnonscalar) diff_onesided(idx(prepidx)) = ... logical ... (cat (1, cellfun (@ (x, n) reshape (x, n, 1), ... {pconf(idx).diff_onesided}.', cpnel(idx), ... "UniformOutput", false){:})); else diff_onesided(idx) = logical ([pconf.diff_onesided]); endif endif if (isfield (pconf, "fixed")) idx = ! fieldempty (pconf, "fixed"); if (pnonscalar) fixed(idx(prepidx)) = ... logical ... (cat (1, cellfun (@ (x, n) reshape (x, n, 1), ... 
{pconf(idx).fixed}.', cpnel(idx), ... "UniformOutput", false){:})); else fixed(idx) = logical ([pconf.fixed]); endif endif max_rand_step = NAvec; if (isfield (pconf, "max_rand_step")) idx = ! fieldempty (pconf, "max_rand_step"); # numeric step lengths, not flags: no conversion to logical if (pnonscalar) max_rand_step(idx(prepidx)) = ... cat (1, cellfun (@ (x, n) reshape (x, n, 1), ... {pconf(idx).max_rand_step}.', cpnel(idx), ... "UniformOutput", false){:}); else max_rand_step(idx) = [pconf.max_rand_step]; endif endif else ## use supplied configuration vectors if (isempty (lbound)) lbound = - Infvec; elseif (any (size (lbound) != sizevec)) error ("bounds: wrong dimensions"); endif if (isempty (ubound)) ubound = Infvec; elseif (any (size (ubound) != sizevec)) error ("bounds: wrong dimensions"); endif if (isempty (max_fract_change)) max_fract_change = NAvec; elseif (any (size (max_fract_change) != sizevec)) error ("max_fract_change: wrong dimensions"); endif if (isempty (fract_prec)) fract_prec = NAvec; elseif (any (size (fract_prec) != sizevec)) error ("fract_prec: wrong dimensions"); endif if (isempty (diffp)) diffp = zerosvec; diffp(:) = diffp_default; else if (any (size (diffp) != sizevec)) error ("diffp: wrong dimensions"); endif diffp(isna (diffp)) = diffp_default; endif if (isempty (diff_onesided)) diff_onesided = falsevec; else if (any (size (diff_onesided) != sizevec)) error ("diff_onesided: wrong dimensions") endif diff_onesided(isna (diff_onesided)) = false; diff_onesided = logical (diff_onesided); endif if (isempty (fixed)) fixed = falsevec; else if (any (size (fixed) != sizevec)) error ("fixed: wrong dimensions"); endif fixed(isna (fixed)) = false; fixed = logical (fixed); endif if (isempty (max_rand_step)) max_rand_step = NAvec; elseif (any (size (max_rand_step) != sizevec)) error ("max_rand_step: wrong dimensions"); endif endif ## guarantee all (lbound <= ubound) if (any (lbound > ubound)) error ("some lower bounds larger than upper bounds"); endif ## pass bounds only if the backend respects bounds even during the ## course of optimization if (path_bounds) jac_lbound = lbound; jac_ubound = ubound; else jac_lbound = - Infvec; jac_ubound = Infvec; endif #### consider whether initial parameters and functions are based on #### parameter structures or parameter vectors; wrappers for call to #### default function for jacobians ## initial parameters if (pin_struct) if (pnonscalar) pin = cat (1, cellfun (@ (x, n) reshape (x, n, 1), ... fields2cell (pin, pord), cpnel, ... "UniformOutput", false){:}); else pin = cat (1, fields2cell (pin, pord){:}); endif endif ## objective function if (f_pstruct) if (pnonscalar) f = @ (p, varargin) ... f (cell2struct ... (cellfun (@ reshape, mat2cell (p, ppartidx), ... pdims, "UniformOutput", false), ... pord, 1), varargin{:}); else f = @ (p, varargin) ... f (cell2struct (num2cell (p), pord, 1), varargin{:}); endif endif ## gradient of objective function if (isempty (dfdp)) if (do_cstep) dfdp = @ (p, hook) jacobs (p, f, hook); else dfdp = @ (p, hook) __dfdp__ (p, f, hook); endif endif if (dfdp_pstruct) if (pnonscalar) dfdp = @ (p, hook) ... cat (2, ... fields2cell ... (dfdp (cell2struct ... (cellfun (@ reshape, mat2cell (p, ppartidx), ... pdims, "UniformOutput", false), ... pord, 1), hook), ... pord){:}); else dfdp = @ (p, hook) ... cat (2, ... fields2cell ... (dfdp (cell2struct (num2cell (p), pord, 1), hook), ... pord){:}); endif endif ## hessian of objective function if (hessian_pstruct) if (pnonscalar) hessian = @ (p) ... hessian_struct2mat ... (hessian (cell2struct ...
(cellfun (@ reshape, mat2cell (p, ppartidx), ... pdims, "UniformOutput", false), ... pord, 1)), pord); else hessian = @ (p) ... hessian_struct2mat ... (hessian (cell2struct (num2cell (p), pord, 1)), pord); endif endif ## function for general inequality constraints if (f_inequc_pstruct) if (pnonscalar) f_genicstr = @ (p, varargin) ... f_genicstr (cell2struct ... (cellfun (@ reshape, mat2cell (p, ppartidx), ... pdims, "UniformOutput", false), ... pord, 1), varargin{:}); else f_genicstr = @ (p, varargin) ... f_genicstr ... (cell2struct (num2cell (p), pord, 1), varargin{:}); endif endif ## note this stage possibly_pstruct_f_genicstr = f_genicstr; ## jacobian of general inequality constraints if (df_inequc_pstruct) if (pnonscalar) df_gencstr = @ (p, func, idx, hook) ... cat (2, ... fields2cell ... (df_gencstr ... (cell2struct ... (cellfun (@ reshape, mat2cell (p, ppartidx), ... pdims, "UniformOutput", false), pord, 1), ... func, idx, hook), ... pord){:}); else df_gencstr = @ (p, func, idx, hook) ... cat (2, ... fields2cell ... (df_gencstr (cell2struct (num2cell (p), pord, 1), ... func, idx, hook), ... pord){:}); endif endif ## function for general equality constraints if (f_equc_pstruct) if (pnonscalar) f_genecstr = @ (p, varargin) ... f_genecstr (cell2struct ... (cellfun (@ reshape, mat2cell (p, ppartidx), ... pdims, "UniformOutput", false), ... pord, 1), varargin{:}); else f_genecstr = @ (p, varargin) ... f_genecstr ... (cell2struct (num2cell (p), pord, 1), varargin{:}); endif endif ## note this stage possibly_pstruct_f_genecstr = f_genecstr; ## jacobian of general equality constraints if (df_equc_pstruct) if (pnonscalar) df_genecstr = @ (p, func, idx, hook) ... cat (2, ... fields2cell ... (df_genecstr ... (cell2struct ... (cellfun (@ reshape, mat2cell (p, ppartidx), ... pdims, "UniformOutput", false), pord, 1), ... func, idx, hook), ... pord){:}); else df_genecstr = @ (p, func, idx, hook) ... cat (2, ... fields2cell ... (df_genecstr (cell2struct (num2cell (p), pord, 1), ... func, idx, hook), ... pord){:}); endif endif ## linear inequality constraints if (mc_struct) idx = isfield (mc, pord); if (rows (fieldnames (mc)) > sum (idx)) error ("unknown fields in structure of linear inequality constraints"); endif smc = mc; mc = zeros (np, rows (vc)); mc(idx(prepidx), :) = cat (1, fields2cell (smc, pord(idx)){:}); endif ## linear equality constraints if (emc_struct) idx = isfield (emc, pord); if (rows (fieldnames (emc)) > sum (idx)) error ("unknown fields in structure of linear equality constraints"); endif semc = emc; emc = zeros (np, rows (evc)); emc(idx(prepidx), :) = cat (1, fields2cell (semc, pord(idx)){:}); endif ## parameter-related configuration for jacobian functions if (dfdp_pstruct || df_inequc_pstruct || df_equc_pstruct) if(pnonscalar) s_diffp = cell2struct ... (cellfun (@ reshape, mat2cell (diffp, ppartidx), ... pdims, "UniformOutput", false), pord, 1); s_diff_onesided = cell2struct ... (cellfun (@ reshape, mat2cell (diff_onesided, ppartidx), ... pdims, "UniformOutput", false), pord, 1); s_jac_lbound = cell2struct ... (cellfun (@ reshape, mat2cell (jac_lbound, ppartidx), ... pdims, "UniformOutput", false), pord, 1); s_jac_ubound = cell2struct ... (cellfun (@ reshape, mat2cell (jac_ubound, ppartidx), ... pdims, "UniformOutput", false), pord, 1); s_plabels = cell2struct ... (num2cell ... (cat (2, cellfun ... (@ (x) cellfun ... (@ reshape, mat2cell (cat (1, x{:}), ppartidx), ... pdims, "UniformOutput", false), ... num2cell (plabels, 1), "UniformOutput", false){:}), ... 2), ... 
pord, 1); s_orig_fixed = cell2struct ... (cellfun (@ reshape, mat2cell (fixed, ppartidx), ... pdims, "UniformOutput", false), pord, 1); else s_diffp = cell2struct (num2cell (diffp), pord, 1); s_diff_onesided = cell2struct (num2cell (diff_onesided), pord, 1); s_jac_lbound = cell2struct (num2cell (jac_lbound), pord, 1); s_jac_ubound = cell2struct (num2cell (jac_ubound), pord, 1); s_plabels = cell2struct (num2cell (plabels, 2), pord, 1); s_orig_fixed = cell2struct (num2cell (fixed), pord, 1); endif endif #### some further values and checks if (any (fixed & (pin < lbound | pin > ubound))) warning ("some fixed parameters outside bounds"); endif if (any (diffp <= 0)) error ("some elements of 'diffp' non-positive"); endif if (cstep <= 0) error ("'cstep' non-positive"); endif if ((hook.TolFun = optimget (settings, "TolFun", stol_default)) < 0) error ("'TolFun' negative"); endif if (any (fract_prec < 0)) error ("some elements of 'fract_prec' negative"); endif if (any (max_fract_change < 0)) error ("some elements of 'max_fract_change' negative"); endif ## dimensions of linear constraints if (isempty (mc)) mc = zeros (np, 0); vc = zeros (0, 1); endif if (isempty (emc)) emc = zeros (np, 0); evc = zeros (0, 1); endif [rm, cm] = size (mc); [rv, cv] = size (vc); if (rm != np || cm != rv || cv != 1) error ("linear inequality constraints: wrong dimensions"); endif [erm, ecm] = size (emc); [erv, ecv] = size (evc); if (erm != np || ecm != erv || ecv != 1) error ("linear equality constraints: wrong dimensions"); endif ## note initial values of linear constraits pin_cstr.inequ.lin_except_bounds = mc.' * pin + vc; pin_cstr.equ.lin = emc.' * pin + evc; ## note number and initial values of general constraints if (isempty (f_genicstr)) pin_cstr.inequ.gen = []; n_genicstr = 0; else n_genicstr = length (pin_cstr.inequ.gen = f_genicstr (pin)); endif if (isempty (f_genecstr)) pin_cstr.equ.gen = []; n_genecstr = 0; else n_genecstr = length (pin_cstr.equ.gen = f_genecstr (pin)); endif #### collect remaining settings parallel_local = hook.parallel_local = ... __optimget_parallel_local__ (settings, false); hook.MaxIter = optimget (settings, "MaxIter"); if (ischar (hook.cpiv = optimget (settings, "cpiv", @ cpiv_bard))) hook.cpiv = str2func (hook.cpiv); endif hook.Display = optimget (settings, "Display", "off"); hook.testing = optimget (settings, "debug", false); hook.siman.T_init = optimget (settings, "T_init", .01); hook.siman.T_min = optimget (settings, "T_min", 1.0e-5); hook.siman.mu_T = optimget (settings, "mu_T", 1.005); hook.siman.iters_fixed_T = optimget (settings, "iters_fixed_T", 10); hook.stoch_regain_constr = ... optimget (settings, "stoch_regain_constr", false); hook.trace_steps = ... optimget (settings, "trace_steps", false); hook.siman_log = ... optimget (settings, "siman_log", false); hook.save_state = optimget (settings, "save_state", ""); hook.recover_state = optimget (settings, "recover_state", ""); hook.octave_sqp_tolerance = ... optimget (settings, "octave_sqp_tolerance", []); hook.inverse_hessian = optimget (settings, "inverse_hessian", false); hook.TolX = optimget (settings, "TolX", []); hook.FunValCheck = optimget (settings, "FunValCheck", "off"); #### handle fixing of parameters orig_fixed = fixed; if (all (fixed)) error ("no free parameters"); endif nonfixed = ! fixed; if (any (fixed)) ## backend (returned values and initial parameters) backend = @ (f, pin, hook) ... 
backend_wrapper (backend, fixed, f, pin, hook); ## objective function f = @ (p, varargin) f (assign (pin, nonfixed, p), varargin{:}); ## gradient of objective function dfdp = @ (p, hook) ... dfdp (assign (pin, nonfixed, p), hook)(nonfixed); ## hessian of objective function if (! isempty (hessian)) hessian = @ (p) ... hessian (assign (pin, nonfixed, p))(nonfixed, nonfixed); endif ## function for general inequality constraints f_genicstr = @ (p, varargin) ... f_genicstr (assign (pin, nonfixed, p), varargin{:}); ## jacobian of general inequality constraints df_gencstr = @ (p, func, idx, hook) ... df_gencstr (assign (pin, nonfixed, p), func, idx, hook) ... (:, nonfixed); ## function for general equality constraints f_genecstr = @ (p, varargin) ... f_genecstr (assign (pin, nonfixed, p), varargin{:}); ## jacobian of general equality constraints df_genecstr = @ (p, func, idx, hook) ... df_genecstr (assign (pin, nonfixed, p), func, idx, hook) ... (:, nonfixed); ## linear inequality constraints vc += mc(fixed, :).' * (tp = pin(fixed)); mc = mc(nonfixed, :); ## linear equality constraints evc += emc(fixed, :).' * tp; emc = emc(nonfixed, :); ## _last_ of all, vectors of parameter-related configuration, ## including "fixed" itself lbound = lbound(nonfixed, :); ubound = ubound(nonfixed, :); max_fract_change = max_fract_change(nonfixed); fract_prec = fract_prec(nonfixed); max_rand_step = max_rand_step(nonfixed); fixed = fixed(nonfixed); endif #### supplement constants to jacobian functions ## gradient of objective function if (dfdp_pstruct) dfdp = @ (p, hook) ... dfdp (p, cell2fields ... ({s_diffp, s_diff_onesided, s_jac_lbound, ... s_jac_ubound, s_plabels, ... cell2fields(num2cell(hook.fixed), pord(nonfixed), ... 1, s_orig_fixed), ... cstep, parallel_local}, ... {"diffp", "diff_onesided", "lbound", "ubound", ... "plabels", "fixed", "h", "parallel_local"}, ... 2, hook)); else dfdp = @ (p, hook) ... dfdp (p, cell2fields ... ({diffp, diff_onesided, jac_lbound, jac_ubound, ... plabels, assign(orig_fixed, nonfixed, hook.fixed), ... cstep, parallel_local}, ... {"diffp", "diff_onesided", "lbound", "ubound", ... "plabels", "fixed", "h", "parallel_local"}, ... 2, hook)); endif ## jacobian of general inequality constraints if (df_inequc_pstruct) df_gencstr = @ (p, func, idx, hook) ... df_gencstr (p, func, idx, cell2fields ... ({s_diffp, s_diff_onesided, s_jac_lbound, ... s_jac_ubound, s_plabels, ... cell2fields(num2cell(hook.fixed), pord(nonfixed), ... 1, s_orig_fixed), ... cstep, parallel_local}, ... {"diffp", "diff_onesided", "lbound", "ubound", ... "plabels", "fixed", "h", "parallel_local"}, ... 2, hook)); else df_gencstr = @ (p, func, idx, hook) ... df_gencstr (p, func, idx, cell2fields ... ({diffp, diff_onesided, jac_lbound, ... jac_ubound, plabels, ... assign(orig_fixed, nonfixed, hook.fixed), ... cstep, parallel_local}, ... {"diffp", "diff_onesided", "lbound", "ubound", ... "plabels", "fixed", "h", "parallel_local"}, ... 2, hook)); endif ## jacobian of general equality constraints if (df_equc_pstruct) df_genecstr = @ (p, func, idx, hook) ... df_genecstr (p, func, idx, cell2fields ... ({s_diffp, s_diff_onesided, s_jac_lbound, ... s_jac_ubound, s_plabels, ... cell2fields(num2cell(hook.fixed), pord(nonfixed), ... 1, s_orig_fixed), ... cstep, parallel_local}, ... {"diffp", "diff_onesided", "lbound", "ubound", ... "plabels", "fixed", "h", "parallel_local"}, ... 2, hook)); else df_genecstr = @ (p, func, idx, hook) ... df_genecstr (p, func, idx, cell2fields ... ({diffp, diff_onesided, jac_lbound, ... 
jac_ubound, plabels, ... assign(orig_fixed, nonfixed, hook.fixed), ... cstep, parallel_local}, ... {"diffp", "diff_onesided", "lbound", "ubound", ... "plabels", "fixed", "h", "parallel_local"}, ... 2, hook)); endif #### interfaces to constraints ## include bounds into linear inequality constraints tp = eye (sum (nonfixed)); lidx = lbound != - Inf; uidx = ubound != Inf; mc = cat (2, tp(:, lidx), - tp(:, uidx), mc); vc = cat (1, - lbound(lidx, 1), ubound(uidx, 1), vc); ## concatenate linear inequality and equality constraints mc = cat (2, mc, emc); vc = cat (1, vc, evc); n_lincstr = rows (vc); ## concatenate general inequality and equality constraints if (n_genecstr > 0) if (n_genicstr > 0) nidxi = 1 : n_genicstr; nidxe = n_genicstr + 1 : n_genicstr + n_genecstr; f_gencstr = @ (p, idx, varargin) ... cat (1, ... f_genicstr (p, idx(nidxi), varargin{:}), ... f_genecstr (p, idx(nidxe), varargin{:})); df_gencstr = @ (p, idx, hook) ... cat (1, ... df_gencstr (p, @ (p, varargin) ... possibly_pstruct_f_genicstr ... (p, idx(nidxi), varargin{:}), ... idx(nidxi), ... setfield (hook, "f", ... hook.f(nidxi(idx(nidxi))))), ... df_genecstr (p, @ (p, varargin) ... possibly_pstruct_f_genecstr ... (p, idx(nidxe), varargin{:}), ... idx(nidxe), ... setfield (hook, "f", ... hook.f(nidxe(idx(nidxe)))))); else f_gencstr = f_genecstr; df_gencstr = @ (p, idx, hook) ... df_genecstr (p, ... @ (p, varargin) ... possibly_pstruct_f_genecstr ... (p, idx, varargin{:}), ... idx, ... setfield (hook, "f", hook.f(idx))); endif else f_gencstr = f_genicstr; df_gencstr = @ (p, idx, hook) ... df_gencstr (p, ... @ (p, varargin) ... possibly_pstruct_f_genicstr (p, idx, varargin{:}), ... idx, ... setfield (hook, "f", hook.f(idx))); endif n_gencstr = n_genicstr + n_genecstr; ## concatenate linear and general constraints, defining the final ## function interfaces if (n_gencstr > 0) nidxl = 1:n_lincstr; nidxh = n_lincstr + 1 : n_lincstr + n_gencstr; f_cstr = @ (p, idx, varargin) ... cat (1, ... mc(:, idx(nidxl)).' * p + vc(idx(nidxl), 1), ... f_gencstr (p, idx(nidxh), varargin{:})); df_cstr = @ (p, idx, hook) ... cat (1, ... mc(:, idx(nidxl)).', ... df_gencstr (p, idx(nidxh), ... setfield (hook, "f", ... hook.f(nidxh)))); else f_cstr = @ (p, idx, varargin) mc(:, idx).' * p + vc(idx, 1); df_cstr = @ (p, idx, hook) mc(:, idx).'; endif ## define eq_idx (logical index of equality constraints within all ## concatenated constraints eq_idx = false (n_lincstr + n_gencstr, 1); eq_idx(n_lincstr + 1 - rows (evc) : n_lincstr) = true; n_cstr = n_lincstr + n_gencstr; eq_idx(n_cstr + 1 - n_genecstr : n_cstr) = true; #### prepare interface hook ## passed constraints hook.mc = mc; hook.vc = vc; hook.f_cstr = f_cstr; hook.df_cstr = df_cstr; hook.n_gencstr = n_gencstr; hook.eq_idx = eq_idx; hook.lbound = lbound; hook.ubound = ubound; ## passed values of constraints for initial parameters hook.pin_cstr = pin_cstr; ## passed function for gradient of objective function hook.dfdp = dfdp; ## passed function for hessian of objective function hook.hessian = hessian; ## passed function for complementary pivoting ## hook.cpiv = cpiv; # set before ## passed options hook.max_fract_change = max_fract_change; hook.fract_prec = fract_prec; ## hook.TolFun = ; # set before ## hook.MaxIter = ; # set before hook.fixed = fixed; hook.user_interaction = user_interaction; hook.max_rand_step = max_rand_step; #### call backend [p, objf, cvg, outp] = backend (f, pin, hook); if (pin_struct) if (pnonscalar) p = cell2struct ... (cellfun (@ reshape, mat2cell (p, ppartidx), ... 
pdims, "UniformOutput", false), ... pord, 1); else p = cell2struct (num2cell (p), pord, 1); endif endif endfunction function backend = map_matlab_algorithm_names (backend) ## nothing done here at the moment endfunction function [backend, path_bounds] = map_backend (backend) switch (backend) ## case "sqp_infeasible" ## backend = "__sqp__"; ## case "sqp" ## backend = "__sqp__"; case "lm_feasible" backend = "__lm_feasible__"; path_bounds = true; case "octave_sqp" backend = "__octave_sqp_wrapper__"; path_bounds = false; case "siman" backend = "__siman__"; path_bounds = true; case "d2_min" backend = "__d2_min__"; path_bounds = false; otherwise error ("no backend implemented for algorithm '%s'", backend); endswitch backend = str2func (backend); endfunction function [p, resid, cvg, outp] = backend_wrapper (backend, fixed, f, p, hook) [tp, resid, cvg, outp] = backend (f, p(! fixed), hook); p(! fixed) = tp; endfunction function lval = assign (lval, lidx, rval) lval(lidx) = rval; endfunction function m = hessian_struct2mat (s, pord) m = cell2mat (fields2cell ... (structcat (1, NA, fields2cell (s, pord){:}), pord)); idx = isna (m); m(idx) = (m.')(idx); endfunction function ret = apply_idx_if_given (ret, varargin) if (nargin > 1) ret = ret(varargin{1}); endif endfunction %!demo %! ## Example for default optimization (Levenberg/Marquardt with %! ## BFGS), one non-linear equality constraint. Constrained optimum is %! ## at p = [0; 1]. %! objective_function = @ (p) p(1)^2 + p(2)^2; %! pin = [-2; 5]; %! constraint_function = @ (p) p(1)^2 + 1 - p(2); %! [p, objf, cvg, outp] = nonlin_min (objective_function, pin, optimset ("equc", {constraint_function})) %!demo %! ## Example for simulated annealing, two parameters, "trace_steps" %! ## is true; %! t_init = .2; %! t_min = .002; %! mu_t = 1.002; %! iters_fixed_t = 10; %! init_p = [2; 2]; %! max_rand_step = [.2; .2]; %! [p, objf, cvg, outp] = nonlin_min (@ (p) (p(1)/10)^2 + (p(2)/10)^2 + .1 * (-cos(4*p(1)) - cos(4*p(2))), init_p, optimset ("algorithm", "siman", "max_rand_step", max_rand_step, "t_init", t_init, "T_min", t_min, "mu_t", mu_t, "iters_fixed_T", iters_fixed_t, "trace_steps", true)); %! p %! objf %! x = (outp.trace(:, 1) - 1) * iters_fixed_t + outp.trace(:, 2); %! x(1) = 0; %! plot (x, cat (2, outp.trace(:, 3:end), t_init ./ (mu_t .^ outp.trace(:, 1)))) %! legend ({"objective function value", "p(1)", "p(2)", "Temperature"}) %! xlabel ("subiteration") optim-1.4.1/inst/minimize.m0000644000175000017500000002662012477523726014272 0ustar olafolaf## Copyright (C) 2002 Etienne Grossmann ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## [x,v,nev,...] = minimize (f,args,...) - Minimize f ## ## ARGUMENTS ## f : string : Name of function. Must return a real value ## args : list or : List of arguments to f (by default, minimize the first) ## matrix : f's only argument ## ## RETURNED VALUES ## x : matrix : Local minimum of f. Let's suppose x is M-by-N. 
## v : real : Value of f in x0 ## nev : integer : Number of function evaluations ## or 1 x 2 : Number of function and derivative evaluations (if ## derivatives are used) ## ## ## Extra arguments are either a succession of option-value pairs or a single ## list or struct of option-value pairs (for unary options, the value in the ## struct is ignored). ## ## OPTIONS : DERIVATIVES Derivatives may be used if one of these options ## --------------------- is used. Otherwise, the Nelder-Mead (see ## nelder_mead_min) method is used. ## ## 'd2f', d2f : Name of a function that returns the value of f, of its ## 1st and 2nd derivatives : [fx,dfx,d2fx] = feval (d2f, x) ## where fx is a real number, dfx is 1x(M*N) and d2fx is ## (M*N)x(M*N). A Newton-like method (d2_min) will be used. ## ## 'hess' : Use [fx,dfx,d2fx] = leval (f, args) to compute 1st and ## 2nd derivatives, and use a Newton-like method (d2_min). ## ## 'd2i', d2i : Name of a function that returns the value of f, of its ## 1st and pseudo-inverse of second derivatives : ## [fx,dfx,id2fx] = feval (d2i, x) where fx is a real ## number, dfx is 1x(M*N) and d2ix is (M*N)x(M*N). ## A Newton-like method will be used (see d2_min). ## ## 'ihess' : Use [fx,dfx,id2fx] = leval (f, args) to compute 1st ## derivative and the pseudo-inverse of 2nd derivatives, ## and use a Newton-like method (d2_min). ## ## NOTE : df, d2f or d2i take the same arguments as f. ## ## 'order', n : Use derivatives of order n. If the n'th order derivative ## is not specified by 'df', 'd2f' or 'd2i', it will be ## computed numerically. Currently, only order 1 works. ## ## 'ndiff' : Use a variable metric method (bfgs) using numerical ## differentiation. ## ## OPTIONS : STOPPING CRITERIA Default is to use 'tol' ## --------------------------- ## 'ftol', ftol : Stop search when value doesn't improve, as tested by ## ## ftol > Deltaf/max(|f(x)|,1) ## ## where Deltaf is the decrease in f observed in the last ## iteration. Default=10*eps ## ## 'utol', utol : Stop search when updates are small, as tested by ## ## tol > max { dx(i)/max(|x(i)|,1) | i in 1..N } ## ## where dx is the change in the x that occurred in the last ## iteration. ## ## 'dtol',dtol : Stop search when derivatives are small, as tested by ## ## dtol > max { df(i)*max(|x(i)|,1)/max(v,1) | i in 1..N } ## ## where x is the current minimum, v is func(x) and df is ## the derivative of f in x. This option is ignored if ## derivatives are not used in optimization. ## ## MISC. OPTIONS ## ------------- ## 'maxev', m : Maximum number of function evaluations ## ## 'narg' , narg : Position of the minimized argument in args <1> ## 'isz' , step : Initial step size (only for 0 and 1st order method) <1> ## Should correspond to expected distance to minimum ## 'verbose' : Display messages during execution ## ## 'backend' : Instead of performing the minimization itself, return ## [backend, control], the name and control argument of the ## backend used by minimize(). Minimization can then be ## obtained without the overhead of minimize by calling, if ## a 0 or 1st order method is used : ## ## [x,v,nev] = feval (backend, args, control) ## ## or, if a 2nd order method is used : ## ## [x,v,nev] = feval (backend, control.d2f, args, control) function [x,v,nev,varargout] = minimize (f,args,varargin) ## Reasons for deprecation have been outlined in ## ## http://lists.gnu.org/archive/html/octave-maintainers/2014-06/msg00130.html ## persistent warned = false; if (!
warned) warned = true; warning ("Octave:deprecated-function", "The frontend `minimize' has been deprecated, and will be removed in the future. An alternative frontend is `nonlin_min'. The backends `nelder_mead_min' and `bfgsmin' of `minimize' can also be called directly. The backend `d2_min' is also deprecated and its algorithm is now available as a backend to `nonlin_min'."); endif ## Oldies ## ## 'df' , df : Name of a function that returns the derivatives of f ## in x : dfx = feval (df, x) where dfx is 1x(M*N). A ## variable metric method (see bfgs) will be used. ## ## 'jac' : Use [fx, dfx] = leval(f, args) to compute derivatives ## and use a variable metric method (bfgs). ## # #################################################################### # Read the options ################################################### # #################################################################### # Options with a value op1 = "ftol utol dtol df d2f d2i order narg maxev isz"; # Boolean options op0 = "verbose backend jac hess ihess ndiff" ; default = struct ("backend",0,"verbose",0,... "df","", "df", "","d2f","","d2i","", ... "hess", 0, "ihess", 0, "jac", 0,"ndiff", 0, ... "ftol" ,nan, "utol",nan, "dtol", nan,... "order",nan, "narg",nan, "maxev",nan,... "isz", nan); if nargin == 3 # Accomodation to struct and list optional tmp = varargin{1}; if isstruct (tmp) opls = {}; for [v,k] = tmp # Treat separately unary and binary opts if findstr ([" ",k," "],op0) opls (end+1) = {k}; # append k else opls (end+[1:2]) = {k, v}; # append k and v end end elseif iscell (tmp) opls = tmp; else opls = {tmp}; end else opls = varargin; end ops = __read_options__ (opls, ... "op0", op0, "op1", op1, "default", default); backend=ops.backend; verbose=ops.verbose; df=ops.df; d2f=ops.d2f; d2i=ops.d2i; hess=ops.hess; ihess=ops.ihess; jac=ops.jac; ftol=ops.ftol; utol=ops.utol; dtol=ops.dtol; order=ops.order; narg=ops.narg; maxev=ops.maxev; isz=ops.isz; ndiff=ops.ndiff; if length (df), error ("Option 'df' doesn't exist any more. Sorry.\n");end if jac, error ("Option 'jac' doesn't exist any more. Sorry.\n");end # Basic coherence checks ############# ws = ""; # Warning string es = ""; # Error string # Warn if more than 1 differential is given if !!length (df) + !!length (d2f) + !!length (d2i) + jac + hess + ihess + ... ndiff > 1 # Order of preference of if length (d2i), ws = [ws,"d2i='",d2i,"', "]; end if length (d2f), ws = [ws,"d2f='",d2f,"', "]; end if length (df), ws = [ws,"df='",df,"', "]; end if hess , ws = [ws,"hess, "]; end if ihess , ws = [ws,"ihess, "]; end if jac , ws = [ws,"jac, "]; end if ndiff , ws = [ws,"ndiff, "]; end ws = ws(1:length(ws)-2); ws = ["Options ",ws," were passed. Only one will be used\n"] end # Check that enough args are passed to call # f(), unless backend is specified, in which # case I don't need to call f() if ! isnan (narg) && ! backend if narg > 1 es = [es,sprintf("narg=%i, but a single argument was passed\n",narg)]; end end if length (ws), warn (ws); end if length (es), error (es); end # EOF Basic coherence checks ######### op = 0; # Set if any option is passed and should be # passed to backend if ! isnan (ftol) , ctls.ftol = ftol; op = 1; end if ! isnan (utol) , ctls.utol = utol; op = 1; end if ! isnan (dtol) , ctls.dtol = dtol; op = 1; end if ! isnan (maxev) , ctls.maxev = maxev; op = 1; end if ! isnan (narg) , ctls.narg = narg; op = 1; end if ! 
isnan (isz) , ctls.isz = isz; op = 1; end if verbose , ctls.verbose = 1; op = 1; end # defaults That are used in this function : if isnan (narg), narg = 1; end # ########################################## # Choose one optimization method ########### # Choose according to available derivatives if ihess, d2f = f; ctls.id2f = 1; op = 1; elseif hess, d2f = f; end if length (d2i), method = "d2_min"; ctls.id2f = 1; op = 1; d2f = d2i; elseif length (d2f), method = "d2_min"; ### elseif length (df) , method = "bfgsmin"; ctls.df = df; op = 1; ### elseif jac , method = "bfgsmin"; ctls.jac = 1 ; op = 1; ## else method = "nelder_mead_min"; ## end # Choose method because ndiff is passed #### elseif ndiff , method = "bfgsmin"; # Choose method by specifying order ######## elseif ! isnan (order) if order == 0, method = "nelder_mead_min"; elseif order == 1 method = "bfgsmin"; elseif order == 2 if ! (length (d2f) || length (d2i)) error ("minimize(): 'order' is 2, but 2nd differential is missing\n"); end else error ("minimize(): 'order' option only implemented for order<=2\n"); end else # Default is nelder_mead_min method = "nelder_mead_min"; end # EOF choose method ######################## if verbose printf ("minimize(): Using '%s' as back-end\n",method); end # More checks ############################## ws = ""; if !isnan (isz) && strcmp (method,"d2_min") ws = [ws,"option 'isz' is passed to method that doesn't use it"]; end if length (ws), warn (ws); end # EOF More checks ########################## if strcmp (method, "d2_min"), all_args = {f, d2f, args}; elseif strcmp (method, "bfgsmin"),all_args = {f, args}; else all_args = {f, args}; end # Eventually add ctls to argument list if op, all_args{end+1} = ctls; end if ! backend # Call the backend ################### if strcmp (method, "d2_min"), [x,v,nev,h] = d2_min(all_args{:}); # Eventually return inverse of Hessian if nargout > 3, varargout{1} = h; vr_val_cnt=2; end elseif strcmp (method, "bfgsmin") nev = nan; if !iscell(args), args = {args}; end if isnan (ftol), ftol = 1e-12; end # Use bfgsmin's defaults if isnan (utol), utol = 1e-6; end if isnan (dtol), dtol = 1e-5; end if isnan (maxev), maxev = inf; end [x, v, okcv] = bfgsmin (f, args, {maxev,verbose,1,narg,0,ftol,utol,dtol}); else [x,v,nev] = feval (method, all_args{:}); end else # Don't call backend, just return its name # and arguments. x = method; if op, v = ctls; else v = []; end end optim-1.4.1/inst/cpiv_bard.m0000644000175000017500000000603212477523726014375 0ustar olafolaf## Copyright (C) 2010-2013 Olaf Till ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## [lb, idx, ridx, mv] = cpiv_bard (v, m[, incl]) ## ## v: column vector; m: matrix; incl (optional): index. length (v) ## must equal rows (m). Finds column vectors w and l with w == v + m * ## l, w >= 0, l >= 0, l.' * w == 0. Chooses idx, w, and l so that ## l(~idx) == 0, l(idx) == -inv (m(idx, idx)) * v(idx), w(idx) roughly ## == 0, and w(~idx) == v(~idx) + m(idx, ~idx).' * l(idx). 
idx indexes ## at least everything indexed by incl, but l(incl) may be < 0. lb: ## l(idx) (column vector); idx: logical index, defined above; ridx: ## ~idx & w roughly == 0; mv: [m, v] after performing a Gauss-Jordan ## 'sweep' (with gjp.m) on each diagonal element indexed by idx. ## Except the handling of incl (which enables handling of equality ## constraints in the calling code), this is called solving the ## 'complementary pivot problem' (Cottle, R. W. and Dantzig, G. B., ## 'Complementary pivot theory of mathematical programming', Linear ## Algebra and Appl. 1, 102--125. References for the current ## algorithm: Bard, Y.: Nonlinear Parameter Estimation, p. 147--149, ## Academic Press, New York and London 1974; Bard, Y., 'An eclectic ## approach to nonlinear programming', Proc. ANU Sem. Optimization, ## Canberra, Austral. Nat. Univ.). function [lb, idx, ridx, m] = cpiv_bard (v, m, incl) n = length (v); if (n > size (v, 1)) error ("first argument is no column vector"); # the most typical mistake endif if (nargin < 3) incl = []; elseif (islogical (incl)) incl = find (incl); endif nincl = 1:n; nincl(incl) = []; sgn = ones (n, 1); if (length (incl) == n) sgn = - sgn; m = inv (m); m = cat (2, m, m * v); else m = cat (2, m, v); for id = incl(:).' sgn(id) = -sgn(id); m = gjp (m, id); endfor endif nz = eps; # This is arbitrary; components of w and -l are regarded as # non-negative if >= -nz. nl = 100 * n; # maximum number of loop repeats, after that give up if (isempty (nincl)) ready = true; else ready = false; while (~ready && nl > 0) [vm, idm] = min (sgn(nincl) .* m(nincl, end)); if (vm >= -nz) ready = true; else idm = nincl(idm); sgn(idm) = -sgn(idm); m = gjp (m, idm); nl = nl - 1; endif endwhile endif if (~ready) error ("not successful"); endif idx = sgn < 0; lb = -m(idx, end); ridx = ~idx & abs (m(:, end)) <= nz; endfunction optim-1.4.1/inst/curvefit_stat.m0000644000175000017500000000410112477523726015321 0ustar olafolaf## Copyright (C) 2011-2014 Olaf Till ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## -*- texinfo -*- ## @deftypefn {Function File} {@var{info} =} curvefit_stat (@var{f}, @var{p}, @var{x}, @var{y}, @var{settings}) ## ## Frontend for computation of statistics for fitting of values, ## computed by a model function, to observed values. ## ## Please refer to the description of @code{residmin_stat}. The only ## differences to @code{residmin_stat} are the additional arguments ## @var{x} (independent values) and @var{y} (observations), that the ## model function @var{f}, if provided, has a second obligatory argument ## which will be set to @var{x} and is supposed to return guesses for ## the observations (with the same dimensions), and that the possibly ## user-supplied function for the jacobian of the model function has ## also a second obligatory argument which will be set to @var{x}. 
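##
## As an illustration, a sketch (the model @code{f} and all values below
## are invented here; @code{objf_type} and @code{ret_covp} are settings
## described under @code{residmin_stat}):
##
## @example
## f = @@ (p, x) p(1) * exp (p(2) * x);
## [p, fy] = nonlin_curvefit (f, pin, x, y);
## settings = optimset ("objf_type", "wls", "ret_covp", true);
## info = curvefit_stat (f, p, x, y, settings);
## covp = info.covp; # covariance matrix of the fitted parameters
## @end example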
## ## @seealso {residmin_stat} ## @end deftypefn function ret = curvefit_stat (f, pfin, x, y, settings) if (nargin == 1) ret = __residmin_stat__ (f); return; endif if (nargin != 5) print_usage () endif if (! isempty (dfdp = optimget (settings, "dfdp")) && ... ! isnumeric (dfdp)) if (ischar (dfdp)) dfdp = str2func (dfdp); endif settings.dfdp = @ (p, varargin) dfdp (p, x, varargin{:}); endif if (! isempty (f)) f = @ (p) f (p, x); endif ret = __residmin_stat__ ... (f, pfin, settings, struct ("observations", y)); endfunction optim-1.4.1/inst/test_min_2.m0000644000175000017500000000573712477523726014522 0ustar olafolaf## Copyright (C) 2002 Etienne Grossmann ## Copyright (C) 2004 Michael Creel ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## test_min_2 - Test that bfgs works ## ## Defines some simple functions and verifies that calling ## ## bfgs on them returns the correct minimum. ## ## Sets 'ok' to 1 if success, 0 otherwise if ! exist ("optim_func"), optim_func = "bfgsmin"; end ok = 1; if ! exist ("verbose"), verbose = 0; end P = 15; R = 20; # must have R >= P global obsmat ; ## Make test_min_2 reproducible by using fixed obsmat ## obsmat = randn(R,P) ; obsmat = zeros (R,P); obsmat(sub2ind([R,P],1:R,1+rem(0:R-1,P))) = 1:R; global truep ; ## Make test_min_2 reproducible by using fixed starting point ## truep = randn(P,1) ; ## xinit = randn(P,1) ; truep = rem (1:P, P/4)'; xinit = truep + 2*(1:P)'/(P); global obses ; obses = obsmat*truep ; function v = ff(x) global obsmat; global obses; v = mean ((obses - obsmat*x).^2) + 1 ; endfunction function dv = dff(x) global obsmat; global obses; er = -obses + obsmat*x ; dv = 2*er'*obsmat / rows(obses) ; ## dv = 2*er'*obsmat ; endfunction ## dt = mytic() ## ## Returns the cputime since last call to 'mytic'. function dt = mytic() persistent last_mytic = 0 ; [t,u,s] = cputime() ; dt = t - last_mytic ; last_mytic = t ; endfunction if verbose printf ("\n Testing %s on a quadratic problem\n\n", optim_func); printf ([" Set 'optim_func' to the name of the optimization\n",... " function you want to test (must have same synopsis\n",... " as 'bfgs')\n\n"]); printf (" Nparams = P = %i, Nobses = R = %i\n",P,R); fflush (stdout); end ctl.df = "dff"; ctl.ftol = eps; ctl.dtol = 1e-7; mytic() ; if strcmp(optim_func,"bfgsmin") ctl = {-1,2,1,1}; xinit2 = {xinit}; else xinit2 = xinit; endif ## [xlev,vlev,nlev] = feval(optim_func, "ff", "dff", xinit) ; [xlev,vlev,nlev] = feval(optim_func, "ff", xinit2, ctl) ; tlev = mytic() ; if max (abs(xlev-truep)) > 1e-4, if verbose printf ("Error is too big : %8.3g\n", max (abs (xlev-truep))); end ok = 0; elseif verbose, printf ("ok 1\n"); end if verbose, printf (" Costs : init=%8.3g, final=%8.3g, best=%8.3g\n",... 
ff(xinit), vlev, ff(truep)); end if verbose printf ( " time : %8.3g\n",tlev); end if verbose && ok printf ( "All tests ok (there's just one test)\n"); end optim-1.4.1/inst/brent_line_min.m0000644000175000017500000001404612477523726015434 0ustar olafolaf## Copyright (C) 2009 Levente Torok ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## -*- texinfo -*- ## @deftypefn {Function File} {[@var{s},@var{v},@var{n}]} brent_line_min ( @var{f},@var{df},@var{args},@var{ctl} ) ## Line minimization of f along df ## ## Finds minimum of f on line @math{ x0 + dx*w | a < w < b } by ## bracketing. a and b are passed through argument ctl. ## ## @subheading Arguments ## @itemize @bullet ## @item @var{f} : string : Name of function. Must return a real value ## @item @var{args} : cell : Arguments passed to f or RxC : f's only argument. x0 must be at @var{args}@{ @var{ctl}(2) @} ## @item @var{ctl} : 5 : (optional) Control variables, described below. ## @end itemize ## ## @subheading Returned values ## @itemize @bullet ## @item @var{s} : 1 : Minimum is at x0 + s*dx ## @item @var{v} : 1 : Value of f at x0 + s*dx ## @item @var{nev} : 1 : Number of function evaluations ## @end itemize ## ## @subheading Control Variables ## @itemize @bullet ## @item @var{ctl}(1) : Upper bound for error on s Default=sqrt(eps) ## @item @var{ctl}(2) : Position of minimized argument in args Default= 1 ## @item @var{ctl}(3) : Maximum number of function evaluations Default= inf ## @item @var{ctl}(4) : a Default=-inf ## @item @var{ctl}(5) : b Default= inf ## @end itemize ## ## Default values will be used if ctl is not passed or if nan values are ## given. ## @end deftypefn function [s,gs,nev] = brent_line_min( f,dx,args,ctl ) verbose = 0; seps = sqrt (eps); # Default control variables tol = 10*eps; # sqrt (eps); narg = 1; maxev = inf; a = -inf; b = inf; if nargin >= 4, # Read arguments if !isnan (ctl (1)), tol = ctl(1); end if length (ctl)>=2 && !isnan (ctl(2)), narg = ctl(2); end if length (ctl)>=3 && !isnan (ctl(3)), maxev = ctl(3); end if length (ctl)>=4 && !isnan (ctl(4)), a = ctl(4); end if length (ctl)>=5 && !isnan (ctl(5)), b = ctl(5); end end # Otherwise, use defaults, def'd above if a>b, tmp=a; a=b; b=tmp; end if narg > length (args), printf ("brent_line_min : narg==%i > length (args)==%i",... narg, length (args)); keyboard end if ! 
iscell (args), args = {args}; endif x = args{ narg }; [R,C] = size (x); N = R*C; # Size of argument gs0 = gs = feval (f, args); nev = 1; # Initial value s = 0; if all (dx==0), return; end # Trivial case # If any of the bounds is infinite, find # finite bounds that bracket minimum if !isfinite (a) || !isfinite (b), if !isfinite (a) && !isfinite (b), [a,b, ga,gb, n] = __bracket_min (f, dx, narg, args); elseif !isfinite (a), [a,b, ga,gb, n] = __semi_bracket (f, -dx, -b, narg, args); tmp = a; a = -b; b = -tmp; tmp = ga; ga = gb; gb = tmp; else [a,b, ga,gb, n] = __semi_bracket (f, dx, a, narg, args); end nev += n; else args{narg} = x+a*dx; ga = feval( f, args ); args{narg} = x+b*dx; gb = feval( f, args ); nev += 2; end # End of finding bracket for minimum if a > b, # Check assumptions printf ("brent_line_min : a > b\n"); keyboard end s = 0.5*(a+b); args{narg} = x+ s*dx; gs = feval( f, args ); nev++; if verbose, printf ("[a,s,b]=[%.3e,%.3e,%.3e], [ga,gs,gb]=[%.3e,%.3e,%.3e]\n",... a,s,b,ga,gs,gb); end maxerr = 2*tol; while ( b-a > maxerr ) && nev < maxev, if gs > ga && gs > gb, # Check assumptions printf ("brent_line_min : gs > ga && gs > gb\n"); keyboard end if ga == gb && gb == gs, break end # Don't trust poly_2_ex for glued points # (see test_poly_2_ex). ## if min (b-s, s-a) > 10*seps, # If s is not glued to a or b and does not # look linear ## mydet = sum (l([2 3 1]).*f([3 1 2])-l([3 1 2]).*f([2 3 1])) mydet = sum ([s b a].*[gb ga gs] - [b a s].*[gs gb ga]); if min (b-s, s-a) > 10*seps && abs (mydet) > 10*seps && ... (t = poly_2_ex ([a,s,b], [ga, gs, gb])) < b && t > a, # t has already been set ## if t>=b || t<=a, ## printf ("brent_line_min : t is not in ]a,b[\n"); ## keyboard ## end # Otherwise, reduce the biggest of the # intervals, but not too much elseif s-a > b-s, t = max (0.5*(a+s), s-100*seps); else t = min (0.5*(s+b), s+100*seps); end if abs (t-s) < 0.51*maxerr, #sayif (verbose, "ungluing t and s\n"); t = s + (1 - 2*(s-a > b-s))*0.49*maxerr ; end if a > s || s > b, # Check assumptions printf ("brent_line_min : a > s || s > b\n"); keyboard end xt = args; args{narg} = x+t*dx; gt = feval( f, args ); nev++; if verbose, printf ("t = %.3e, gt = %.3e\n",t,gt); end if t < s, # New point is in ]a,s[ if gt > ga + seps, # Check assumptions printf ("brent_line_min : gt > ga\n"); keyboard end if gt < gs, b = s; gb = gs; s = t; gs = gt; else a = t; ga = gt; end else # New point is in ]s,b[ if gt > gb + seps, # Check assumptions printf ("brent_line_min : gt > gb\n"); keyboard end if gt < gs, a = s; ga = gs; s = t; gs = gt; else b = t; gb = gt; end end if verbose, printf ("[a,s,b]=[%.3e,%.3e,%.3e], [ga,gs,gb]=[%.3e,%.3e,%.3e]\n",... a,s,b,ga,gs,gb); end ## keyboard ## [b-a, maxerr] end s2 = 0.5*(a+b); args{narg} = x + s2*dx; gs2 = feval (f, args ); nev++; if gs2 < gs, s = s2; gs = gs2; end if gs > gs0, printf ("brent_line_min : goes uphill by %8.3e\n",gs-gs0); keyboard end optim-1.4.1/inst/test_minimize_1.m0000644000175000017500000001541012477523726015544 0ustar olafolaf## Copyright (C) 2002 Etienne Grossmann ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details.
## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## ok = test_minimize - Test that minimize works ok = 1; # Remains set if all ok. Set to 0 otherwise cnt = 0; # Test counter page_screen_output (0); page_output_immediately (1); if ! exist ("verbose"), verbose = 0; end N = 2; x0 = randn(N,1) ; y0 = randn(N,1) ; ## Return value function v = ff(x,y,t) A = [1 -1;1 1]; M = A'*diag([100,1])*A; v = (x(1:2) - y(1:2))'*M*(x(1:2)-y(1:2)) + 1; endfunction ## Return differential function dv = dff(x,y,t) if nargin < 3, t = 1; end if t == 1, N = length (x); else N = length (y); end A = [1 -1;1 1]; M = A'*diag([100,1])*A; dv = 2*(x(1:2)-y(1:2))'*M; if N>2, dv = [dv, zeros(1,N-2)]; end if t == 2, dv = -dv; end endfunction ## Return value, diff and 2nd diff function [v,dv,d2v] = d2ff(x,y,t) if nargin < 3, t = 1; end if t == 1, N = length (x); else N = length (y); end A = [1 -1;1 1]; M = A'*diag([100,1])*A; v = (x(1:2) - y(1:2))'*M*(x(1:2)-y(1:2)) + 1; dv = 2*(x(1:2)-y(1:2))'*M; d2v = zeros (N); d2v(1:2,1:2) = 2*M; if N>2, dv = [dv, zeros(1,N-2)]; end if t == 2, dv = -dv; end endfunction ## Return value, diff and inv of 2nd diff function [v,dv,d2v] = d2iff(x,y,t) if nargin < 3, t = 1; end if t == 1, N = length (x); else N = length (y); end A = [1 -1;1 1]; M = A'*diag([100,1])*A; v = (x(1:2) - y(1:2))'*M*(x(1:2)-y(1:2)) + 1; dv = 2*(x(1:2)-y(1:2))'*M; d2v = zeros (N); d2v(1:2,1:2) = inv (2*M); if N>2, dv = [dv, zeros(1,N-2)]; end if t == 2, dv = -dv; end endfunction ## PRint Now function prn (varargin), printf (varargin{:}); fflush (stdout); end if verbose prn ("\n Testing that minimize() works as it should\n\n"); prn (" Nparams = N = %i\n",N); fflush (stdout); end ## Plain run, just to make sure ###################################### ## Minimum wrt 'x' is y0 [xlev,vlev,nlev] = minimize ("ff",{x0,y0,1}); cnt++; if max (abs (xlev-y0)) > 100*sqrt (eps) if verbose prn ("Error is too big : %8.3g\n", max (abs (xlev-y0))); end ok = 0; elseif verbose, prn ("ok %i\n",cnt); end ## See what 'backend' gives in that last case ######################## [method,ctl] = minimize ("ff",{x0,y0,1},"order",0,"backend"); cnt++; if ! ischar (method) || ! 
strcmp (method,"nelder_mead_min") if verbose if ischar (method) prn ("Wrong method '%s' != 'nelder_mead_min' was chosen\n", method); else prn ("minimize pretends to use a method that isn't a string\n"); end return end ok = 0; elseif verbose, prn ("ok %i\n",cnt); end [xle2,vle2,nle2] = feval (method, "ff", {x0,y0,1}, ctl); cnt++; # nelder_mead_min is not very repeatable # because of restarts from random positions if max (abs (xlev-xle2)) > 100*sqrt (eps) if verbose prn ("Error is too big : %8.3g\n", max (abs (xlev-xle2))); end ok = 0; elseif verbose, prn ("ok %i\n",cnt); end ## Run, w/ differential, just to make sure ########################### ## Minimum wrt 'x' is y0 # [xlev,vlev,nlev] = minimize ("ff",{x0,y0,1},"df","dff"); # cnt++; # if max (abs (xlev-y0)) > 100*sqrt (eps) # if verbose # prn ("Error is too big : %8.3g\n", max (abs (xlev-y0))); # end # ok = 0; # elseif verbose, prn ("ok %i\n",cnt); # en ## Run, w/ differential returned by function ('jac' option) ########## ## Minimum wrt 'x' is y0 # [xlev,vlev,nlev] = minimize ("d2ff",{x0,y0,1},"jac"); # cnt++; # if max (abs (xlev-y0)) > 100*sqrt (eps) # if verbose # prn ("Error is too big : %8.3g\n", max (abs (xlev-y0))); # end # ok = 0; # elseif verbose, prn ("ok %i\n",cnt); # end ## Run, w/ 2nd differential, just to make sure ####################### ## Minimum wrt 'x' is y0 [xlev,vlev,nlev] = minimize ("ff",{x0,y0,1},"d2f","d2ff"); cnt++; if max (abs (xlev-y0)) > 100*sqrt (eps) if verbose prn ("Error is too big : %8.3g\n", max (abs (xlev-y0))); end ok = 0; elseif verbose, prn ("ok %i\n",cnt); end ## Use the 'hess' option, when f can return 2nd differential ######### ## Minimum wrt 'x' is y0 [xlev,vlev,nlev] = minimize ("d2ff", {x0,y0,1},"hess"); cnt++; if max (abs (xlev-y0)) > 100*sqrt (eps) if verbose prn ("Error is too big : %8.3g\n", max (abs (xlev-y0))); end ok = 0; elseif verbose, prn ("ok %i\n",cnt); end ## Run, w/ inverse of 2nd differential, just to make sure ############ ## Minimum wrt 'x' is y0 [xlev,vlev,nlev] = minimize ("ff", {x0,y0,1},"d2i","d2iff"); cnt++; if max (abs (xlev-y0)) > 100*sqrt (eps) if verbose prn ("Error is too big : %8.3g\n", max (abs (xlev-y0))); end ok = 0; elseif verbose, prn ("ok %i\n",cnt); end ## Use the 'ihess' option, when f can return pinv of 2nd differential ## Minimum wrt 'x' is y0 [xlev,vlev,nlev] = minimize ("d2iff", {x0,y0,1},"ihess"); cnt++; if max (abs (xlev-y0)) > 100*sqrt (eps) if verbose prn ("Error is too big : %8.3g\n", max (abs (xlev-y0))); end ok = 0; elseif verbose, prn ("ok %i\n",cnt); end ## Run, w/ numerical differential #################################### ## Minimum wrt 'x' is y0 [xlev,vlev,nlev] = minimize ("ff",{x0,y0,1},"ndiff"); cnt++; if max (abs (xlev-y0)) > 100*sqrt (eps) if verbose prn ("Error is too big : %8.3g\n", max (abs (xlev-y0))); end ok = 0; elseif verbose, prn ("ok %i\n",cnt); end ## Run, w/ numerical differential, specified by "order" ############## ## Minimum wrt 'x' is y0 [xlev,vlev,nlev] = minimize ("ff",{x0,y0,1},"order",1); cnt++; if max (abs (xlev-y0)) > 100*sqrt (eps) if verbose prn ("Error is too big : %8.3g\n", max (abs (xlev-y0))); end ok = 0; elseif verbose, prn ("ok %i\n",cnt); end # ## See what 'backend' gives in that last case ######################## # [method,ctl] = minimize ("ff",{x0,y0,1},"order",1,"backend"); # cnt++; # if ! 
strcmp (method,"bfgsmin") # if verbose # prn ("Wrong method '%s' != 'bfgsmin' was chosen\n", method); # end # ok = 0; # elseif verbose, prn ("ok %i\n",cnt); # end ## [xle2,vle2,nle2] = feval (method, "ff",{x0,y0,1}, ctl); [xle2,vle2,nle2] = minimize ("ff",{x0,y0,1},"order",1); cnt++; if max (abs (xlev-xle2)) > 100*eps if verbose prn ("Error is too big : %8.3g\n", max (abs (xlev-y0))); end ok = 0; elseif verbose, prn ("ok %i\n",cnt); end if verbose && ok prn ( "All tests ok\n"); end optim-1.4.1/inst/expfit.m0000644000175000017500000001013312477523726013740 0ustar olafolaf## Copyright (C) 2000 Gert Van den Eynde ## Copyright (C) 2002 Rolf Fabian ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . ## USAGE [alpha,c,rms] = expfit( deg, x1, h, y ) ## ## Prony's method for non-linear exponential fitting ## ## Fit function: \sum_1^{deg} c(i)*exp(alpha(i)*x) ## ## Elements of data vector y must correspond to ## equidistant x-values starting at x1 with stepsize h ## ## The method is fully compatible with complex linear ## coefficients c, complex nonlinear coefficients alpha ## and complex input arguments y, x1, non-zero h . ## Fit-order deg must be a real positive integer. ## ## Returns linear coefficients c, nonlinear coefficients ## alpha and root mean square error rms. This method is ## known to be more stable than 'brute-force' non-linear ## least squares fitting. ## ## Example ## x0 = 0; step = 0.05; xend = 5; x = x0:step:xend; ## y = 2*exp(1.3*x)-0.5*exp(2*x); ## error = (rand(1,length(y))-0.5)*1e-4; ## [alpha,c,rms] = expfit(2,x0,step,y+error) ## ## alpha = ## 2.0000 ## 1.3000 ## c = ## -0.50000 ## 2.00000 ## rms = 0.00028461 ## ## The fit is very sensitive to the number of data points. ## It doesn't perform very well for small data sets. ## Theoretically, you need at least 2*deg data points, but ## if there are errors on the data, you certainly need more. ## ## Be aware that this is a very (very,very) ill-posed problem. ## By the way, this algorithm relies heavily on computing the ## roots of a polynomial. I used 'roots.m', if there is ## something better please use that code. ## ## Demo for a complex fit-function: ## deg= 2; N= 20; x1= -(1+i), x= linspace(x1,1+i/2,N).'; ## h = x(2) - x(1) ## y= (2+i)*exp( (-1-2i)*x ) + (-1+3i)*exp( (2+3i)*x ); ## A= 5e-2; y+= A*(randn(N,1)+randn(N,1)*i); % add complex noise ## [alpha,c,rms]= expfit( deg, x1, h, y ) function [a,c,rms] = expfit(deg,x1,h,y) % Check input if deg<1, error('expfit: deg must be >= 1'); end if ~h, error('expfit: vanishing stepsize h'); end if ( N=length( y=y(:) ) ) < 2*deg error('expfit: less than %d samples',2*deg); end % Solve for polynomial coefficients A = hankel( y(1:N-deg),y(N-deg:N) ); s = - A(:,1:deg) \ A(:,deg+1); % Compose polynomial p = flipud([s;1]); % Compute roots u = roots(p); % nonlinear coefficients a = log(u)/h; % Compose second matrix A(i,j) = u(j)^(i-1) A = ( ones(N,1) * u(:).' 
) .^ ( [0:N-1]' * ones(1,deg) ); % Solve linear system f = A\y; % linear coefficients c = f./exp( a*x1 ); % Compute rms of y - approx % where approx(i) = sum_k ( c(k) * exp(x(i)*a(k)) ) if nargout > 2 x = x1+h*[0:N-1]; approx = exp( x(:) * a(:).' ) * c(:); rms = sqrt( sumsq(approx - y) ); end endfunction %!demo % same as in help - part %! deg= 2; N= 20; x1= -(1+i), x= linspace(x1,1+i/2,N).'; %! h = x(2) - x(1) %! y= (2+i)*exp( (-1-2i)*x ) + (-1+3i)*exp( (2+3i)*x ); %! A= 5e-2; y+= A*(randn(N,1)+randn(N,1)*i); % add complex noise %! [alpha,c,rms]= expfit( deg, x1, h, y ) %!demo % demo for stepsize with negative real part %! deg= 2; N= 20; x1= +3*(1+i), x= linspace(x1,1+i/3,N).'; %! h = x(2) - x(1) %! y= (2+i)*exp( (-1-2i)*x ) + (-1+3i)*exp( (2+3i)*x ); %! A= 5e-2; y+= A*(randn(N,1)+randn(N,1)*i); % add complex noise %! [alpha,c,rms]= expfit( deg, x1, h, y ) %!demo %! x0 = 1.5; step = 0.05; xend = 5; %! a = [1.3, 2]'; %! c = [2, -0.5]'; %! v = 1e-4; %! %! x = x0:step:xend; %! y = exp (x(:) * a(:).') * c(:); %! err = randn (size (y)) * v; %! plot (x, y + err); %! %! [a_out, c_out, rms] = expfit (2, x0, step, y+err) optim-1.4.1/inst/wsolve.m0000644000175000017500000000745612477523726013766 0ustar olafolaf## Author: Paul Kienzle ## This program is granted to the public domain. ## [x,s] = wsolve(A,y,dy) ## ## Solve a potentially over-determined system with uncertainty in ## the values. ## ## A x = y +/- dy ## ## Use QR decomposition for increased accuracy. Estimate the ## uncertainty for the solution from the scatter in the data. ## ## The returned structure s contains ## ## normr = norm (A x - y), weighted by dy ## R such that R'R = A'A ## df = n-p, n = rows of A, p = columns of A ## ## See polyconf for details on how to use s to compute dy. ## The covariance matrix is inv(R'*R). If you know that the ## parameters are independent, then uncertainty is given by ## the diagonal of the covariance matrix, or ## ## dx = sqrt(N*sumsq(inv(s.R'))') ## ## where N = normr^2/df, or N = 1 if df = 0. ## ## Example 1: weighted system ## ## A=[1,2,3;2,1,3;1,1,1]; xin=[1;2;3]; ## dy=[0.2;0.01;0.1]; y=A*xin+randn(size(dy)).*dy; ## [x,s] = wsolve(A,y,dy); ## dx = sqrt(sumsq(inv(s.R'))'); ## res = [xin, x, dx] ## ## Example 2: weighted overdetermined system y = x1 + 2*x2 + 3*x3 + e ## ## A = fullfact([3,3,3]); xin=[1;2;3]; ## y = A*xin; dy = rand(size(y))/50; y+=dy.*randn(size(y)); ## [x,s] = wsolve(A,y,dy); ## dx = s.normr*sqrt(sumsq(inv(s.R'))'/s.df); ## res = [xin, x, dx] ## ## Note there is a counter-intuitive result that scaling the ## uncertainty in the data does not affect the uncertainty in ## the fit. Indeed, if you perform a monte carlo simulation ## with x,y datasets selected from a normal distribution centered ## on y with width 10*dy instead of dy you will see that the ## variance in the parameters indeed increases by a factor of 100. ## However, if the error bars really do increase by a factor of 10 ## you should expect a corresponding increase in the scatter of ## the data, which will increase the variance computed by the fit.
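##
## The dx formulas above use the identity diag (inv (A'*A)) ==
## sumsq (inv (R'))' (with R from a QR decomposition of A, so that
## R'*R == A'*A); it is derived in the comments at the end of this
## file. As an editorial sketch (not part of the original sources,
## using only core Octave functions), the identity can be checked
## numerically:
##
##    A = rand (6, 3);
##    [Q, R] = qr (A, 0);     # economy QR without pivoting: R'*R == A'*A
##    norm (diag (inv (A'*A)) - sumsq (inv (R'))', Inf)   # approx. 0
##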
function [x_out,s]=wsolve(A,y,dy) if nargin < 2, usage("[x dx] = wsolve(A,y[,dy])"); end if nargin < 3, dy = []; end [nr,nc] = size(A); if nc > nr, error("underdetermined system"); end ## apply weighting term, if it was given if prod(size(dy))==1 A = A ./ dy; y = y ./ dy; elseif ~isempty(dy) A = A ./ (dy * ones (1, columns(A))); y = y ./ dy; endif ## system solution: A x = y => x = inv(A) y ## QR decomposition has good numerical properties: ## AP = QR, with P'P = Q'Q = I, and R upper triangular ## so ## inv(A) y = P inv(R) inv(Q) y = P inv(R) Q' y = P (R \ (Q' y)) ## Note that b is usually a vector and Q is matrix, so it will ## be faster to compute (y' Q)' than (Q' y). [Q,R,p] = qr(A,0); x = R\(y'*Q)'; x(p) = x; s.R = R; s.R(:,p) = R; s.df = nr-nc; s.normr = norm(y - A*x); if nargout == 0, cov = s.R'*s.R if s.df, normalized_chisq = s.normr^2/s.df, end x = x' else x_out = x; endif ## We can show that uncertainty dx = sumsq(inv(R'))' = sqrt(diag(inv(A'A))). ## ## Rather than calculate inv(A'A) directly, we are going to use the QR ## decomposition we have already computed: ## ## AP = QR, with P'P = Q'Q = I, and R upper triangular ## ## so ## ## A'A = PR'Q'QRP' = PR'RP' ## ## and ## ## inv(A'A) = inv(PR'RP') = inv(P')inv(R'R)inv(P) = P inv(R'R) P' ## ## For a permutation matrix P, ## ## diag(PXP') = P diag(X) ## ## so ## diag(inv(A'A)) = diag(P inv(R'R) P') = P diag(inv(R'R)) ## ## For R upper triangular, inv(R') = inv(R)' so inv(R'R) = inv(R)inv(R)'. ## Conveniently, for X upper triangular, diag(XX') = sumsq(X')', so ## ## diag(inv(A'A)) = P sumsq(inv(R)')' ## ## This is both faster and more accurate than computing inv(A'A) ## directly. ## ## One small problem: if R is not square then inv(R) does not exist. ## This happens when the system is underdetermined, but in that case ## you shouldn't be using wsolve. optim-1.4.1/inst/leasqr.m0000644000175000017500000007443312477523726013745 0ustar olafolaf## Copyright (C) 1992-1994 Richard Shrager ## Copyright (C) 1992-1994 Arthur Jutan ## Copyright (C) 1992-1994 Ray Muzic ## Copyright (C) 2010-2013 Olaf Till ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . 
## -*- texinfo -*- ## @deftypefn {Function File} {} leasqr (@var{x}, @var{y}, @var{pin}, @var{F}) ## @deftypefnx {Function File} {} leasqr (@var{x}, @var{y}, @var{pin}, @var{F}, @var{stol}) ## @deftypefnx {Function File} {} leasqr (@var{x}, @var{y}, @var{pin}, @var{F}, @var{stol}, @var{niter}) ## @deftypefnx {Function File} {} leasqr (@var{x}, @var{y}, @var{pin}, @var{F}, @var{stol}, @var{niter}, @var{wt}) ## @deftypefnx {Function File} {} leasqr (@var{x}, @var{y}, @var{pin}, @var{F}, @var{stol}, @var{niter}, @var{wt}, @var{dp}) ## @deftypefnx {Function File} {} leasqr (@var{x}, @var{y}, @var{pin}, @var{F}, @var{stol}, @var{niter}, @var{wt}, @var{dp}, @var{dFdp}) ## @deftypefnx {Function File} {} leasqr (@var{x}, @var{y}, @var{pin}, @var{F}, @var{stol}, @var{niter}, @var{wt}, @var{dp}, @var{dFdp}, @var{options}) ## @deftypefnx {Function File} {[@var{f}, @var{p}, @var{cvg}, @var{iter}, @var{corp}, @var{covp}, @var{covr}, @var{stdresid}, @var{Z}, @var{r2}] =} leasqr (@dots{}) ## Levenberg-Marquardt nonlinear regression. ## ## Input arguments: ## ## @table @var ## @item x ## Vector or matrix of independent variables. ## ## @item y ## Vector or matrix of observed values. ## ## @item pin ## Vector of initial parameters to be adjusted by leasqr. ## ## @item F ## Name of function or function handle. The function must be of the form ## @code{y = f(x, p)}, with y, x, p of the form @var{y}, @var{x}, @var{pin}. ## ## @item stol ## Scalar tolerance on fractional improvement in scalar sum of squares, i.e., ## @code{sum ((@var{wt} .* (@var{y}-@var{f}))^2)}. Set to 0.0001 if ## empty or not given; ## ## @item niter ## Maximum number of iterations. Set to 20 if empty or not given. ## ## @item wt ## Statistical weights (same dimensions as @var{y}). These should be ## set to be proportional to @code{sqrt (@var{y}) ^-1}, i.e., the ## covariance matrix of the data is assumed to be proportional to ## diagonal with diagonal equal to @code{(@var{wt}.^2)^-1}. The constant of ## proportionality will be estimated. Set to @code{ones (size ## (@var{y}))} if empty or not given. ## ## @item dp ## Fractional increment of @var{p} for numerical partial derivatives. Set ## to @code{0.001 * ones (size (@var{pin}))} if empty or not given. ## ## @itemize @bullet ## @item dp(j) > 0 means central differences on j-th parameter p(j). ## @item dp(j) < 0 means one-sided differences on j-th parameter p(j). ## @item dp(j) = 0 holds p(j) fixed, i.e., leasqr won't change initial guess: pin(j) ## @end itemize ## ## @item dFdp ## Name of partial derivative function in quotes or function handle. If ## not given or empty, set to @code{dfdp}, a slow but general partial ## derivatives function. The function must be of the form @code{prt = ## dfdp (x, f, p, dp, F [,bounds])}. For backwards compatibility, the ## function will only be called with an extra 'bounds' argument if the ## 'bounds' option is explicitly specified to leasqr (see dfdp.m). ## ## @item options ## Structure with multiple options. The following fields are recognized: ## ## @table @asis ## @item @qcode{fract_prec} ## Column vector (same length as @var{pin}) ## of desired fractional precisions in parameter estimates. ## Iterations are terminated if change in parameter vector (chg) ## relative to current parameter estimate is less than their ## corresponding elements in 'fract_prec', i.e., ## @code{all (abs (chg) < abs (options.fract_prec .* current_parm_est))} on two ## consecutive iterations. Defaults to @code{zeros (size (@var{pin}))}. 
## ## @item @qcode{max_fract_change} ## Column vector (same length as @var{pin}) of maximum fractional step ## changes in parameter vector. ## Fractional change in elements of parameter vector is constrained to ## be at most 'max_fract_change' between successive iterations, i.e., ## @code{abs (chg(i)) = abs (min([chg(i), options.max_fract_change(i) * current param estimate]))}. ## Defaults to @code{Inf * ones (size (@var{pin}))}. ## ## @item @qcode{inequc} ## Cell-array containing up to four entries, ## two entries for linear inequality constraints and/or one or two ## entries for general inequality constraints. Initial parameters ## must satisfy these constraints. Either linear or general ## constraints may be the first entries, but the two entries for ## linear constraints must be adjacent and, if two entries are given ## for general constraints, they also must be adjacent. The two ## entries for linear constraints are a matrix (say m) and a vector ## (say v), specifying linear inequality constraints of the form ## `m.' * parameters + v >= 0'. If the constraints are just bounds, ## it is suggested to specify them in 'options.bounds' instead, ## since then some sanity tests are performed, and since the ## function 'dfdp.m' is guaranteed not to violate constraints during ## determination of the numeric gradient only for those constraints ## specified as 'bounds' (possibly with violations due to a certain ## inaccuracy, however, except if no constraints except bounds are ## specified). The first entry for general constraints must be a ## differentiable vector valued function (say h), specifying general ## inequality constraints of the form `h (p[, idx]) >= 0'; p is the ## column vector of optimized parameters and the optional argument idx ## is a logical index. h has to return the values of all constraints ## if idx is not given, and has to return only the indexed ## constraints if idx is given (so computation of the other ## constraints can be spared). If a second entry for general ## constraints is given, it must be a function (say dh) which ## returns a matrix whose rows contain the gradients of the ## constraint function h with respect to the optimized parameters. ## It has the form jac_h = dh (vh, p, dp, h, idx[, bounds]); p is ## the column vector of optimized parameters, and idx is a logical ## index --- only the rows indexed by idx must be returned (so ## computation of the others can be spared). The other arguments of ## dh are for the case that dh computes numerical gradients: vh is ## the column vector of the current values of the constraint ## function h, with idx already applied. h is a function h (p) to ## compute the values of the constraints for parameters p, it will ## return only the values indexed by idx. dp is a suggestion for ## relative step width, having the same value as the argument 'dp' ## of leasqr above. If bounds were specified to leasqr, they are ## provided in the argument bounds of dh, to enable their ## consideration in determination of numerical gradients. If dh is ## not specified to leasqr, numerical gradients are computed in the ## same way as with 'dfdp.m' (see above). If some constraints are ## linear, they should be specified as linear constraints (or ## bounds, if applicable) for reasons of performance, even if ## general constraints are also specified. ## ## @item @qcode{bounds} ## Two-column-matrix, one row for each ## parameter in @var{pin}. Each row contains a minimal and maximal value ## for each parameter. Default: [-Inf, Inf] in each row.
If this ## field is used with an existing user-side function for 'dFdp' ## (see above) the function's interface might have to be changed. ## ## @item @qcode{equc} ## Equality constraints, specified the same ## way as inequality constraints (see field 'options.inequc'). ## Initial parameters must satisfy these constraints. ## Note that there is possibly a certain inaccuracy in honoring ## constraints, except if only bounds are specified. ## @emph{Warning}: If constraints (or bounds) are set, returned guesses ## of @var{corp}, @var{covp}, and @var{Z} are generally invalid, even if ## no constraints ## are active for the final parameters. If equality constraints are ## specified, @var{corp}, @var{covp}, and @var{Z} are not guessed at all. ## ## @item @qcode{cpiv} ## Function for complementary pivot algorithm ## for inequality constraints. Defaults to cpiv_bard. No different ## function is supplied. ## ## @end table ## ## For backwards compatibility, @var{options} can also be a matrix whose ## first and second column contains the values of @qcode{fract_prec} and ## @qcode{max_fract_change}, respectively. ## ## @end table ## ## Output: ## ## @table @var ## @item f ## Column vector of values computed: f = F(x,p). ## ## @item p ## Column vector trial or final parameters, i.e., the solution. ## ## @item cvg ## Scalar: = 1 if convergence, = 0 otherwise. ## ## @item iter ## Scalar number of iterations used. ## ## @item corp ## Correlation matrix for parameters. ## ## @item covp ## Covariance matrix of the parameters. ## ## @item covr ## Diag(covariance matrix of the residuals). ## ## @item stdresid ## Standardized residuals. ## ## @item Z ## Matrix that defines confidence region (see comments in the source). ## ## @item r2 ## Coefficient of multiple determination, intercept form. ## ## @end table ## ## Not suitable for non-real residuals. ## ## References: ## Bard, Nonlinear Parameter Estimation, Academic Press, 1974. ## Draper and Smith, Applied Regression Analysis, John Wiley and Sons, 1981. ## ## @end deftypefn function [f,p,cvg,iter,corp,covp,covr,stdresid,Z,r2]= ... leasqr(x,y,pin,F,stol,niter,wt,dp,dFdp,options) ## The following two blocks of comments are chiefly from the original ## version for Matlab. For later changes the logs of the Octave Forge ## svn repository should also be consulted. ## A modified version of Levenberg-Marquardt ## Non-Linear Regression program previously submitted by R.Schrager. ## This version corrects an error in that version and also provides ## an easier to use version with automatic numerical calculation of ## the Jacobian Matrix. In addition, this version calculates statistics ## such as correlation, etc.... ## ## Version 3 Notes ## Errors in the original version submitted by Shrager (now called ## version 1) and the improved version of Jutan (now called version 2) ## have been corrected. ## Additional features, statistical tests, and documentation have also been ## included along with an example of usage. BEWARE: Some of the input and ## output arguments were changed from the previous version. ## ## Ray Muzic ## Arthur Jutan ## Richard I. Shrager (301)-496-1122 ## Modified by A.Jutan (519)-679-2111 ## Modified by Ray Muzic 14-Jul-1992 ## 1) add maxstep feature for limiting changes in parameter estimates ## at each step. ## 2) remove forced columnization of x (x=x(:)) at beginning. x ## could be a matrix with the ith row containing values of ## the independent variables at the ith observation.
## 3) add verbose option ## 4) add optional return arguments covp, stdresid, chi2 ## 5) revise estimates of corp, stdev ## Modified by Ray Muzic 11-Oct-1992 ## 1) revise estimate of Vy. remove chi2, add Z as return values ## (later remark: the current code contains no variable Vy) ## Modified by Ray Muzic 7-Jan-1994 ## 1) Replace ones(x) with a construct that is compatible with versions ## newer and older than v 4.1. ## 2) Added global declaration of verbose (needed for newer than v4.x) ## 3) Replace return value var, the variance of the residuals ## with covr, the covariance matrix of the residuals. ## 4) Introduce options as 10th input argument. Include ## convergence criteria and maxstep in it. ## 5) Correct calculation of xtx which affects covariance estimate. ## 6) Eliminate stdev (estimate of standard deviation of ## parameter estimates) from the return values. The covp is a ## much more meaningful expression of precision because it ## specifies a confidence region in contrast to a confidence ## interval. If needed, however, stdev may be calculated as ## stdev=sqrt(diag(covp)). ## 7) Change the order of the return values to a more logical order. ## 8) Change to more efficient algorithm of Bard for selecting epsL. ## 9) Tighten up memory usage by making use of sparse matrices (if ## MATLAB version >= 4.0) in computation of covp, corp, stdresid. ## Modified by Francesco Potortì ## for use in Octave ## Added linear inequality constraints with quadratic programming to ## this file and special case bounds to this file and to dfdp.m ## (24-Feb-2010) and later also general inequality constraints ## (12-Apr-2010) (Reference: Bard, Y., 'An eclectic approach to ## nonlinear programming', Proc. ANU Sem. Optimization, Canberra, ## Austral. Nat. Univ.). Differences from the reference: adaption to ## svd-based algorithm, linesearch or stepwidth adaptions to ensure ## decrease in objective function omitted to rather start a new ## overall cycle with a new epsL, some performance gains from linear ## constraints even if general constraints are specified. Equality ## constraints also implemented. Olaf Till ## Now split into files leasqr.m and __lm_svd__.m. __plot_cmds__ (); # flag persistent variables invalid global verbose; ## argument processing ## if (nargin < 4) print_usage (); endif if (nargin > 8 && !
isempty (dFdp)) if (ischar (dFdp)) dfdp = str2func (dFdp); else dfdp = dFdp; endif endif if (nargin <= 7 || isempty (dp)) dp=.001*(pin*0+1); endif #DT if (nargin <= 6 || isempty (wt)) wt = ones (size (y)); endif #SMB modification if (nargin <= 5) niter = []; endif if (nargin == 4 || isempty (stol)) stol=.0001; endif if (ischar (F)) F = str2func (F); endif ## if (any (size (y) ~= size (wt))) error ("dimensions of observations and weights do not match"); endif wtl = wt(:); pin=pin(:); dp=dp(:); #change all vectors to columns [rows_y, cols_y] = size (y); m = rows_y * cols_y; n=length(pin); f_pin = F (x, pin); if (any (size (f_pin) ~= size (y))) error ("dimensions of returned values of model function and of observations do not match"); endif f_pin = y - f_pin; dFdp = @ (p, dfdp_hook) - dfdp (x, y(:) - dfdp_hook.f, p, dp, F); ## processing of 'options' pprec = zeros (n, 1); maxstep = Inf * ones (n, 1); have_gencstr = false; # no general constraints have_genecstr = false; # no general equality constraints n_gencstr = 0; mc = zeros (n, 0); vc = zeros (0, 1); rv = 0; emc = zeros (n, 0); evc = zeros (0, 1); erv = 0; bounds = cat (2, -Inf * ones (n, 1), Inf * ones (n, 1)); pin_cstr.inequ.lin_except_bounds = [];; pin_cstr.inequ.gen = [];; pin_cstr.equ.lin = [];; pin_cstr.equ.gen = [];; dfdp_bounds = {}; cpiv = @ cpiv_bard; eq_idx = []; # numerical index for equality constraints in all # constraints, later converted to # logical index if (nargin > 9) if (isnumeric (options)) # backwards compatibility tp = options; options = struct ("fract_prec", tp(:, 1)); if (columns (tp) > 1) options.max_fract_change = tp(:, 2); endif endif if (isfield (options, "cpiv") && ~isempty (options.cpiv)) ## As yet there is only one cpiv function distributed with leasqr, ## but this may change; the algorithm of cpiv_bard is said to be ## relatively fast, but may have disadvantages. if (ischar (options.cpiv)) cpiv = str2func (options.cpiv); else cpiv = options.cpiv; endif endif if (isfield (options, "fract_prec")) pprec = options.fract_prec; if (any (size (pprec) ~= [n, 1])) error ("fractional precisions: wrong dimensions"); endif endif if (isfield (options, "max_fract_change")) maxstep = options.max_fract_change; if (any (size (maxstep) ~= [n, 1])) error ("maximum fractional step changes: wrong dimensions"); endif endif if (isfield (options, "inequc")) inequc = options.inequc; if (isnumeric (inequc{1})) mc = inequc{1}; vc = inequc{2}; if (length (inequc) > 2) have_gencstr = true; f_gencstr = inequc{3}; if (length (inequc) > 3) df_gencstr = inequc{4}; else df_gencstr = @ dcdp; endif endif else lid = 0; # no linear constraints have_gencstr = true; f_gencstr = inequc{1}; if (length (inequc) > 1) if (isnumeric (inequc{2})) lid = 2; df_gencstr = @ dcdp; else df_gencstr = inequc{2}; if (length (inequc) > 2) lid = 3; endif endif else df_gencstr = @ dcdp; endif if (lid) mc = inequc{lid}; vc = inequc{lid + 1}; endif endif if (have_gencstr) if (ischar (f_gencstr)) f_gencstr = str2func (f_gencstr); endif tp = f_gencstr (pin); n_gencstr = length (tp); f_gencstr = @ (p, idx) tf_gencstr (p, idx, f_gencstr); if (ischar (df_gencstr)) df_gencstr = str2func (df_gencstr); endif if (strcmp (func2str (df_gencstr), "dcdp")) df_gencstr = @ (f, p, dp, idx, db) ... df_gencstr (f(idx), p, dp, ... @ (tp) f_gencstr (tp, idx), db{:}); else df_gencstr = @ (f, p, dp, idx, db) ... df_gencstr (f(idx), p, dp, ... 
@ (tp) f_gencstr (tp, idx), idx, db{:}); endif endif [rm, cm] = size (mc); [rv, cv] = size (vc); if (rm ~= n || cm ~= rv || cv ~= 1) error ("linear inequality constraints: wrong dimensions"); endif pin_cstr.inequ.lin_except_bounds = mc.' * pin + vc; if (have_gencstr) pin_cstr.inequ.gen = tp; endif endif if (isfield (options, "equc")) equc = options.equc; if (isnumeric (equc{1})) emc = equc{1}; evc = equc{2}; if (length (equc) > 2) have_genecstr = true; f_genecstr = equc{3}; if (length (equc) > 3) df_genecstr = equc{4}; else df_genecstr = @ dcdp; endif endif else lid = 0; # no linear constraints have_genecstr = true; f_genecstr = equc{1}; if (length (equc) > 1) if (isnumeric (equc{2})) lid = 2; df_genecstr = @ dcdp; else df_genecstr = equc{2}; if (length (equc) > 2) lid = 3; endif endif else df_genecstr = @ dcdp; endif if (lid) emc = equc{lid}; evc = equc{lid + 1}; endif endif if (have_genecstr) if (ischar (f_genecstr)) f_genecstr = str2func (f_genecstr); endif tp = f_genecstr (pin); n_genecstr = length (tp); f_genecstr = @ (p, idx) tf_gencstr (p, idx, f_genecstr); if (ischar (df_genecstr)) df_genecstr = str2func (df_genecstr); endif if (strcmp (func2str (df_genecstr), "dcdp")) df_genecstr = @ (f, p, dp, idx, db) ... df_genecstr (f, p, dp, ... @ (tp) f_genecstr (tp, idx), db{:}); else df_genecstr = @ (f, p, dp, idx, db) ... df_genecstr (f, p, dp, ... @ (tp) f_genecstr (tp, idx), idx, db{:}); endif endif [erm, ecm] = size (emc); [erv, ecv] = size (evc); if (erm ~= n || ecm ~= erv || ecv ~= 1) error ("linear equality constraints: wrong dimensions"); endif pin_cstr.equ.lin = emc.' * pin + evc; if (have_genecstr) pin_cstr.equ.gen = tp; endif endif if (isfield (options, "bounds")) bounds = options.bounds; if (any (size (bounds) ~= [n, 2])) error ("bounds: wrong dimensions"); endif idx = bounds(:, 1) > bounds(:, 2); tp = bounds(idx, 2); bounds(idx, 2) = bounds(idx, 1); bounds(idx, 1) = tp; ## It is possible to take this decision here, since this frontend ## is used only with one certain backend. The backend will check ## this again; but it will not reference 'dp' in its message, ## thats why the additional check here. idx = bounds(:, 1) == bounds(:, 2); if (any (idx)) warning ("leasqr:constraints", "lower and upper bounds identical for some parameters, setting the respective elements of dp to zero"); dp(idx) = 0; endif ## tp = eye (n); lidx = ~isinf (bounds(:, 1)); uidx = ~isinf (bounds(:, 2)); mc = cat (2, mc, tp(:, lidx), - tp(:, uidx)); vc = cat (1, vc, - bounds(lidx, 1), bounds(uidx, 2)); [rm, cm] = size (mc); [rv, cv] = size (vc); dfdp_bounds = {bounds}; dFdp = @ (p, dfdp_hook) - dfdp (x, y(:) - dfdp_hook.f, p, dp, ... F, bounds); endif ## concatenate inequality and equality constraint functions, mc, and ## vc; update eq_idx, rv, n_gencstr, have_gencstr if (erv > 0) mc = cat (2, mc, emc); vc = cat (1, vc, evc); eq_idx = rv + 1 : rv + erv; rv = rv + erv; endif if (have_genecstr) eq_idx = cat (2, eq_idx, ... rv + n_gencstr + 1 : rv + n_gencstr + n_genecstr); nidxi = 1 : n_gencstr; nidxe = n_gencstr + 1 : n_gencstr + n_genecstr; n_gencstr = n_gencstr + n_genecstr; if (have_gencstr) f_gencstr = @ (p, idx) cat (1, ... f_gencstr (p, idx(nidxi)), ... f_genecstr (p, idx(nidxe))); df_gencstr = @ (f, p, dp, idx, db) ... cat (1, ... df_gencstr (f(nidxi), p, dp, idx(nidxi), db), ... df_genecstr (f(nidxe), p, dp, idx(nidxe), db)); else f_gencstr = f_genecstr; df_gencstr = df_genecstr; have_gencstr = true; endif endif endif if (have_gencstr) nidxl = 1:rv; nidxh = rv+1:rv+n_gencstr; f_cstr = @ (p, idx) ... 
cat (1, mc(:, idx(nidxl)).' * p + vc(idx(nidxl), 1), ... f_gencstr (p, idx(nidxh))); ## in the case of this interface, diffp is already zero at fixed; ## also in this special case, dfdp_bounds can be filled in directly ## --- otherwise it would be a field of hook in the called function df_cstr = @ (p, idx, dfdp_hook) ... cat (1, mc(:, idx(nidxl)).', ... df_gencstr (dfdp_hook.f(nidxh), p, dp, ... idx(nidxh), ... dfdp_bounds)); else f_cstr = @ (p, idx) mc(:, idx).' * p + vc(idx, 1); df_cstr = @ (p, idx, dfdp_hook) mc(:, idx).'; endif ## in a general interface, check for all(fixed) here ## passed constraints hook.mc = mc; # matrix of linear constraints hook.vc = vc; # vector of linear constraints hook.f_cstr = f_cstr; # function of all constraints hook.df_cstr = df_cstr; # function of derivatives of all constraints hook.n_gencstr = n_gencstr; # number of non-linear constraints hook.eq_idx = false (size (vc, 1) + n_gencstr, 1); hook.eq_idx(eq_idx) = true; # logical index of equality constraints in # all constraints hook.lbound = bounds(:, 1); # bounds, subset of linear inequality # constraints in mc and vc hook.ubound = bounds(:, 2); ## passed values of constraints for initial parameters hook.pin_cstr = pin_cstr; ## passed derivative of model function hook.dfdp = dFdp; ## passed function for complementary pivoting hook.cpiv = cpiv; ## passed value of residual function for initial parameters hook.f_pin = f_pin; ## passed options hook.max_fract_change = maxstep; hook.fract_prec = pprec; hook.TolFun = stol; hook.MaxIter = niter; hook.weights = wt; hook.fixed = dp == 0; if (verbose) hook.Display = "iter"; hook.plot_cmd = @ (f) 0; # `plot_cmd' is deprecated hook.user_interaction = ... {@ (p, v, s) ... {ifelse(strcmp(s, "iter"), false(__plot_cmds__(x, y, y - v.residual)), false), []}{:}}; else hook.Display = "off"; hook.plot_cmd = @ (f) 0; # `plot_cmd' is deprecated hook.user_interaction = {}; endif ## only preliminary, for testing hook.testing = false; hook.new_s = false; if (nargin > 9) if (isfield (options, "testing")) hook.testing = options.testing; endif if (isfield (options, "new_s")) hook.new_s = options.new_s; endif endif [p, resid, cvg, outp] = __lm_svd__ (@ (p) y - F (x, p), pin, hook); f = y - resid; iter = outp.niter; cvg = cvg > 0; if (~cvg) disp(' CONVERGENCE NOT ACHIEVED! '); endif if (~(verbose || nargout > 4)) return; endif yl = y(:); f = f(:); ## CALC VARIANCE COV MATRIX AND CORRELATION MATRIX OF PARAMETERS ## re-evaluate the Jacobian at optimal values jac = dFdp (p, struct ("f", f)); msk = ~hook.fixed; n = sum(msk); # reduce n to equal number of estimated parameters jac = jac(:, msk); # use only fitted parameters ## following section is Ray Muzic's estimate for covariance and correlation ## assuming covariance of data is a diagonal matrix proportional to ## diag(1/wt.^2). ## cov matrix of data est. from Bard Eq. 7-5-13, and Row 1 Table 5.1 tp = wtl.^2; if (exist('sparse')) # save memory Q = sparse (1:m, 1:m, 1 ./ tp); Qinv = sparse (1:m, 1:m, tp); else Q = diag (ones (m, 1) ./ tp); Qinv = diag (tp); endif resid = resid(:); # un-weighted residuals if (~isreal (resid)) error ("residuals are not real"); endif tp = resid.' * Qinv * resid; covr = (tp / m) * Q; #covariance of residuals ## Matlab compatibility and avoiding recomputation make the following ## logic clumsy. compute = 1; if (m <= n || any (eq_idx)) compute = 0; else Qinv = ((m - n) / tp) * Qinv; ## simplified Eq. 
7-5-13, Bard; cov of parm est, inverse; outer ## parentheses contain inverse of guessed covariance matrix of data covpinv = jac.' * Qinv * jac; if (exist ('rcond')) if (rcond (covpinv) <= eps) compute = 0; endif elseif (rank (covpinv) < n) ## above test is not equivalent to 'rcond' and may unnecessarily ## reject some matrices compute = 0; endif endif if (compute) covp = inv (covpinv); d=sqrt(diag(covp)); corp = covp ./ (d * d.'); else covp = NA * ones (n); corp = covp; endif if (exist('sparse')) covr=spdiags(covr,0); else covr=diag(covr); # convert returned values to # compact storage endif covr = reshape (covr, rows_y, cols_y); stdresid = resid .* abs (wtl) / sqrt (tp / m); # equivalent to resid ./ # sqrt (covr) stdresid = reshape (stdresid, rows_y, cols_y); if (~(verbose || nargout > 8)) return; endif if (m > n && ~any (eq_idx)) Z = ((m - n) / (n * resid.' * Qinv * resid)) * covpinv; else Z = NA * ones (n); endif ### alt. est. of cov. mat. of parm.:(Delforge, Circulation, 82:1494-1504, 1990 ##disp('Alternate estimate of cov. of param. est.') ##acovp=resid'*Qinv*resid/(m-n)*inv(jac'*Qinv*jac); if (~(verbose || nargout > 9)) return; endif ##Calculate R^2, intercept form ## tp = sumsq (yl - mean (yl)); if (tp > 0) r2 = 1 - sumsq (resid) / tp; else r2 = NA; endif ## if someone has asked for it, let them have it ## if (verbose) __plot_cmds__ (x, y, f); disp(' Least Squares Estimates of Parameters') disp(p.') disp(' Correlation matrix of parameters estimated') disp(corp) disp(' Covariance matrix of Residuals' ) disp(covr) disp(' Correlation Coefficient R^2') disp(r2) fprintf(" 95%% conf region: F(0.05)(%.0f,%.0f)>= delta_pvec.%s*Z*delta_pvec\n", n, m - n, char (39)); # works with " and ' Z ## runs test according to Bard. p 201. n1 = sum (resid > 0); n2 = sum (resid < 0); nrun=sum(abs(diff(resid > 0)))+1; if ((n1 > 10) && (n2 > 10)) # sufficient data for test? zed=(nrun-(2*n1*n2/(n1+n2)+1)+0.5)/(2*n1*n2*(2*n1*n2-n1-n2)... /((n1+n2)^2*(n1+n2-1))); if (zed < 0) prob = erfc(-zed/sqrt(2))/2*100; disp([num2str(prob),"% chance of fewer than ",num2str(nrun)," runs."]); else prob = erfc(zed/sqrt(2))/2*100; disp([num2str(prob),"% chance of greater than ",num2str(nrun)," runs."]); endif endif endif endfunction function ret = tf_gencstr (p, idx, f) ## necessary since user function f_gencstr might return [] or a row ## vector ret = f (p, idx); if (isempty (ret)) ret = zeros (0, 1); elseif (size (ret, 2) > 1) ret = ret(:); endif endfunction %!demo %! % Define functions %! leasqrfunc = @(x, p) p(1) * exp (-p(2) * x); %! leasqrdfdp = @(x, f, p, dp, func) [exp(-p(2)*x), -p(1)*x.*exp(-p(2)*x)]; %! %! % generate test data %! t = [1:10:100]'; %! p = [1; 0.1]; %! data = leasqrfunc (t, p); %! %! rnd = [0.352509; -0.040607; -1.867061; -1.561283; 1.473191; ... %! 0.580767; 0.841805; 1.632203; -0.179254; 0.345208]; %! %! % add noise %! % wt1 = 1 /sqrt of variances of data %! % 1 / wt1 = sqrt of var = standard deviation %! wt1 = (1 + 0 * t) ./ sqrt (data); %! data = data + 0.05 * rnd ./ wt1; %! %! % Note by Thomas Walter : %! % %! % Using a step size of 1 to calculate the derivative is WRONG !!!! %! % See numerical mathbooks why. %! % A derivative calculated from central differences needs: %! % step = 0.001...1.0e-8 %! % And onesided derivative needs: %! % step = 1.0e-5...1.0e-8 and may be still wrong %! %! F = leasqrfunc; %! dFdp = leasqrdfdp; % exact derivative %! % dFdp = dfdp; % estimated derivative %! dp = [0.001; 0.001]; %! pin = [.8; .05]; %! stol=0.001; niter=50; %! minstep = [0.01; 0.01]; %!
maxstep = [0.8; 0.8]; %! options = [minstep, maxstep]; %! %! global verbose; %! verbose = 1; %! [f1, p1, kvg1, iter1, corp1, covp1, covr1, stdresid1, Z1, r21] = ... %! leasqr (t, data, pin, F, stol, niter, wt1, dp, dFdp, options); %!demo %! %% Example for linear inequality constraints. %! %% model function: %! F = @ (x, p) p(1) * exp (p(2) * x); %! %% independents and dependents: %! x = 1:5; %! y = [1, 2, 4, 7, 14]; %! %% initial values: %! init = [.25; .25]; %! %% other configuration (default values): %! tolerance = .0001; %! max_iterations = 20; %! weights = ones (1, 5); %! dp = [.001; .001]; % bidirectional numeric gradient stepsize %! dFdp = "dfdp"; % function for gradient (numerical) %! %! %% linear constraints, A.' * parametervector + B >= 0 %! A = [1; -1]; B = 0; % p(1) >= p(2); %! options.inequc = {A, B}; %! %! %% start leasqr, be sure that 'verbose' is not set %! global verbose; verbose = false; %! [f, p, cvg, iter] = ... %! leasqr (x, y, init, F, tolerance, max_iterations, ... %! weights, dp, dFdp, options) optim-1.4.1/inst/bfgsmin_example.m0000644000175000017500000001320212477523726015601 0ustar olafolaf## Copyright (C) 2004,2005,2006 Michael Creel ## ## This program is free software; you can redistribute it and/or modify it under ## the terms of the GNU General Public License as published by the Free Software ## Foundation; either version 3 of the License, or (at your option) any later ## version. ## ## This program is distributed in the hope that it will be useful, but WITHOUT ## ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or ## FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more ## details. ## ## You should have received a copy of the GNU General Public License along with ## this program; if not, see . # usage: bfgsmin_example (to run) or edit bfgsmin_example (to examine) ## # Shows how to call bfgsmin. There are two objective functions, the first # supplies the analytic gradient, and the second does not. The true minimizer # is at "location", a 21x1 vector (0.00, 0.05, 0.10 ..., 1.00). # Note that limited memory bfgs is faster when the dimension is high. # Also note that supplying analytic derivatives gives a speedup. ## # Seven examples are given: # Example 1: regular bfgs, analytic gradient # Example 2: same as Example 1, but verbose # Example 3: limited memory bfgs, analytic gradient # Example 4: regular bfgs, numeric gradient # Example 5: limited memory bfgs, numeric gradient # Example 6: regular bfgs, numeric gradient, minimize wrt second argument # Example 7: regular bfgs, numeric gradient, non-default tolerances 1; # example obj. fn.: supplies analytic gradient function [obj_value, gradient] = objective(theta, location) x = theta - location + ones(rows(theta),1); # move minimizer to "location" [obj_value, gradient] = rosenbrock(x); endfunction # example obj.
fn.: no gradient function obj_value = objective2(theta, location) x = theta - location + ones(rows(theta),1); # move minimizer to "location" obj_value = rosenbrock(x); endfunction # initial values dim = 20; # dimension of Rosenbrock function theta0 = zeros(dim+1,1); # starting values location = (0:dim)/dim; # true values location = location'; printf("EXAMPLE 1: Ordinary BFGS, using analytic gradient\n"); t=cputime(); control = {Inf,1}; # maxiters, verbosity [theta, obj_value, convergence] = bfgsmin("objective", {theta0, location}, control); fflush(1); t1 = cputime() - t; conv = norm(theta-location, 'inf'); test1 = conv < 1e-5; printf("EXAMPLE 2: Same as Example 1, but verbose\n"); t=cputime(); control = {-1;3}; # maxiters, verbosity [theta, obj_value, convergence] = bfgsmin("objective", {theta0, location}, control); fflush(1); t2 = cputime() - t; conv = norm(theta-location, 'inf'); test2 = conv < 1e-5; printf("EXAMPLE 3: Limited memory BFGS, using analytic gradient\n"); t=cputime(); control = {-1;1;1;1;3}; # maxiters, verbosity, conv. requirement., arg_to_min, lbfgs memory [theta, obj_value, convergence] = bfgsmin("objective", {theta0, location}, control); fflush(1); t3 = cputime() - t; conv = norm(theta-location, 'inf'); test3 = conv < 1e-5; printf("EXAMPLE 4: Ordinary BFGS, using numeric gradient\n"); t=cputime(); control = {-1;1}; # maxiters, verbosity [theta, obj_value, convergence] = bfgsmin("objective2", {theta0, location}, control); fflush(1); t4 = cputime() - t; conv = norm(theta-location, 'inf'); test4 = conv < 1e-5; printf("EXAMPLE 5: Limited memory BFGS, using numeric gradient\n"); t=cputime(); control = {-1;1;1;1;3}; # maxiters, verbosity, conv. reg., arg_to_min, lbfgs memory [theta, obj_value, convergence] = bfgsmin("objective2", {theta0, location}, control); fflush(1); t5 = cputime() - t; conv = norm(theta-location, 'inf'); test5 = conv < 1e-5; printf("EXAMPLE 6: Funny case: minimize w.r.t. second argument, Ordinary BFGS, using numeric gradient\n"); t=cputime(); control = {-1;1;1;2}; # maxiters, verbosity, conv. reg., arg_to_min [theta, obj_value, convergence] = bfgsmin("objective2", {location, theta0}, control); fflush(1); t6 = cputime() - t; conv = norm(theta-location, 'inf'); test6 = conv < 1e-5; printf("EXAMPLE 7: Ordinary BFGS, using numeric gradient, using non-default tolerances\n"); t=cputime(); control = {-1;1;1;1;0;1e-6;1e-2;1e-2}; # maxiters, verbosity, conv. reg., arg_to_min, lbfgs memory, fun. tol., param. tol., gradient tol. [theta, obj_value, convergence] = bfgsmin("objective2", {theta0, location}, control); fflush(1); t7 = cputime() - t; conv = norm(theta-location, 'inf'); test7 = conv < 1e-2; printf("EXAMPLE 1: Ordinary BFGS, using analytic gradient\n"); if test1 printf("Success!! :-)\n"); else printf("Failure?! :-(\n"); endif printf("Elapsed time = %f\n\n\n\n",t1); printf("EXAMPLE 2: Same as Example 1, but verbose\n"); if test2 printf("Success!! :-)\n"); else printf("Failure?! :-(\n"); endif printf("Elapsed time = %f\n\n\n\n",t2); printf("EXAMPLE 3: Limited memory BFGS, using analytic gradient\n"); if test3 printf("Success!! :-)\n"); else printf("Failure?! :-(\n"); endif printf("Elapsed time = %f\n\n\n\n",t3); printf("EXAMPLE 4: Ordinary BFGS, using numeric gradient\n"); if test4 printf("Success!! :-)\n"); else printf("Failure?! :-(\n"); endif printf("Elapsed time = %f\n\n\n\n",t4); printf("EXAMPLE 5: Limited memory BFGS, using numeric gradient\n"); if test5 printf("Success!! :-)\n"); else printf("Failure?! 
:-(\n"); endif printf("Elapsed time = %f\n\n\n\n",t5); printf("EXAMPLE 6: Funny case: minimize w.r.t. second argument, Ordinary BFGS, using numeric gradient\n"); if test6 printf("Success!! :-)\n"); else printf("Failure?! :-(\n"); endif printf("Elapsed time = %f\n\n\n\n",t6); printf("EXAMPLE 7: Ordinary BFGS, using numeric gradient, using non-default tolerances\n"); if test7 printf("Success!! :-)\n"); else printf("Failure?! :-(\n"); endif printf("Elapsed time = %f\n\n\n\n",t7); optim-1.4.1/DESCRIPTION0000644000175000017500000000051412477523726013016 0ustar olafolafName: Optim Version: 1.4.1 Date: 2015-03-10 Author: various authors Maintainer: Octave-Forge community Title: Optimization. Description: Non-linear optimization toolkit. Depends: octave (>= 3.6.0), struct (>= 1.0.10) Autoload: no License: GFDL, GPLv3+, modified BSD, public domain Url: http://octave.sf.net optim-1.4.1/COPYING0000644000175000017500000000004212477523726012337 0ustar olafolafSee individual files for licenses optim-1.4.1/NEWS0000644000175000017500000001307412477523726012014 0ustar olafolafoptim 1.4.1: ------------ ** Made compatible with Octave 4.0. ** Fixed bug #43699. optim 1.4.0: ------------ ** There is now general package documentation in info format. To display it, type `optim_doc ()' or `optim_doc (keyword)'. ** Core Octaves `sqp' can now be used by the `nonlin_min' frontend. ** New backend `d2_min' for `nonlin_min' frontend, the standalone function `d2_min' is deprecated instead. ** Introduced option `user_interaction' into frontends, `nonlin_min', `nonlin_residmin', and `nonlin_curvefit'. ** Package `optim' does not depend on package `miscellaneous' anymore. ** The frontend `minimize' has been deprecated. An alternative frontend is `nonlin_min', released previously. The backends `nelder_mead_min' and `bfgsmin' of `minimize' can also be called directly, the backend `d2_min' is also deprecated and its algorithm is now available as a backend to `nonlin_min'. `poly_2_ex' is scheduled for removal from optim since it is not related to optimization. optim 1.3.1: ------------ ** Fixed parameter fixing issue in residmin_stat/curvefit_stat. ** Simulated annealing backend of nonlin_min can save and recover state, for interrupting and continuing long optimizations. ** Having the parallel package loaded is now optional. ** Simulated annealing backend of nonlin_min can do some parallel computation. optim 1.3.0: ------------ ** LinearRegression: Two demo codes added ad the confidence band for the dependent variable can be estimated. ** Bugfix in jacobs for functions of more than 2 variables. ** Parallel computation of gradients supported by nonlin_residmin, nonlin_curvefit, and nonlin_min for default and for complex step gradient methods. Introduces dependency on parallel package. ** In nonlin_residmin, nonlin_curvefit, nonlin_min, residmin_stat, and curvefit_stat, the following names of optimset options had to be changed to be consistent with interpretation of abbreviated option names: dfdp_pstruct -> df_pstruct, inequc_f_idx -> f_inequc_idx, inequc_df_idx -> df_inequc_idxn, equc_f_idx -> f_equc_idx, equc_df_idx -> df_equc_idx, objf_grad_pstruct -> grad_objf_pstruct, objf_hessian_pstruct -> hessian_objf_pstruct, objf -> objf_type, complex_step_derivative -> complex_step_derivative_f. ** The function fminsearch was removed from the optim package, since it is available in Octave 3.8.0. ** The depricated functions optimset_compat and fminunc_compat are now removed. 
optim 1.2.2: ------------ ** Bugfixes in nonlin_min and nonlin_residmin/nonlin_curvefit in termination criteria, argument checking, and documentation. ** Bugfix in private/__lm_feasible__.m: Wrong parentheses in checking of equality constraints for initial parameters (reported by Marcus Schmidt). optim 1.2.1: ------------ ** Bugfix in leasqr.m: errors when a function 'options' is in namespace. Summary of important user-visible changes for optim 1.2.0: ------------------------------------------------------------------- ** Together with the new backend "lm_feasible" there is now a complete suite of backends for optimization with linear and general equality and inequality constraints, for scalar valued objective functions and for array valued model function, which features, a.o., honouring of constraints throughout optimization and handling of structure-based parameters. The respective user functions (frontends) are nonlin_min nonlin_residmin nonlin_curvefit together with a user function for statistics residmin_stat ** The requirement of nonlin_min, nonlin_residmin, and nonlin_curvefit for the general constraint functions being able to honour an index of constraints has been removed, the respective feature is still available by setting some options. ** Makefile fixed to work with non-standard linker options e.g. on Apple. Summary of important user-visible changes for optim 1.1.0: ------------------------------------------------------------------- ** The following functions are new in optim 1.1.0: powell cauchy nonlin_min ** The following functions have been deprecated since they have been implemented in Octave core: fminunc_compat optimset_compat ** The function `fmin' has been deprecated in favour of using `fminbnd' directly. If one really wishes to use the short version, one can easily create an alias on an octaverc file (see `doc startup') with the following code function out=fmin(varargin) out=fminbnd(varargin{:}); endfunction ** The package Makefile has been adapted for compatibility with Octave 3.6.0. ** Bugfixes on the functions: deriv linprog ** The function `line_min' has a configurable stepsize and max evals. ** Added possibility to restrict a parameter to samin. ** Package is no longer automatically loaded. Some important changes of the last versions of optim before 1.1.0: ------------------------------------------------------------------ ** New functions: jacobs: complex step derivative approximation nonlin_residmin, nonlin_curvefit: Frontends with a general interface for constrained residual-based optimization. They allow a.o. optimization of structure-based named parameters or parameter-arrays. A backend is included, which is derived from the algorithm of leasqr, but enables feasible-path optimization with linear and general constraints. optim-1.4.1/src/0002755000175000017500000000000012477524560012076 5ustar olafolafoptim-1.4.1/src/numgradient.cc0000644000175000017500000001044612477523726014730 0ustar olafolaf// Copyright (C) 2004, 2006 Michael Creel // // This program is free software; you can redistribute it and/or modify it under // the terms of the GNU General Public License as published by the Free Software // Foundation; either version 3 of the License, or (at your option) any later // version. // // This program is distributed in the hope that it will be useful, but WITHOUT // ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or // FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more // details.
// // You should have received a copy of the GNU General Public License along with // this program; if not, see . // numgradient: numeric central difference gradient #include #include #include #include #include // argument checks static bool any_bad_argument(const octave_value_list& args) { if (!args(0).is_string()) { error("numgradient: first argument must be string holding objective function name"); return true; } if (!args(1).is_cell()) { error("numgradient: second argument must cell array of function arguments"); return true; } // minarg, if provided if (args.length() == 3) { int tmp = args(2).int_value(); if (error_state) { error("numgradient: 3rd argument, if supplied, must an integer\n\ that specifies the argument wrt which differentiation is done"); return true; } if ((tmp > args(1).length())||(tmp < 1)) { error("numgradient: 3rd argument must be a positive integer that indicates \n\ which of the elements of the second argument is the\n\ one to differentiate with respect to"); return true; } } return false; } DEFUN_DLD(numgradient, args, , "numgradient(f, {args}, minarg)\n\ \n\ Numeric central difference gradient of f with respect\n\ to argument \"minarg\".\n\ * first argument: function name (string)\n\ * second argument: all arguments of the function (cell array)\n\ * third argument: (optional) the argument to differentiate w.r.t.\n\ (scalar, default=1)\n\ \n\ \"f\" may be vector-valued. If \"f\" returns\n\ an n-vector, and the argument is a k-vector, the gradient\n\ will be an nxk matrix\n\ \n\ Example:\n\ function a = f(x);\n\ a = [x'*x; 2*x];\n\ endfunction\n\ numgradient(\"f\", {ones(2,1)})\n\ ans =\n\ \n\ 2.00000 2.00000\n\ 2.00000 0.00000\n\ 0.00000 2.00000\n\ ") { int nargin = args.length(); if (!((nargin == 2)|| (nargin == 3))) { error("numgradient: you must supply 2 or 3 arguments"); return octave_value_list(); } // check the arguments if (any_bad_argument(args)) return octave_value_list(); std::string f (args(0).string_value()); Cell f_args_cell (args(1).cell_value()); octave_value_list f_args, f_return; Matrix obj_value, obj_left, obj_right; double SQRT_EPS, p, delta, diff; int i, j, k, n, minarg, test; // Default values for controls minarg = 1; // by default, first arg is one over which we minimize // copy cell contents over to octave_value_list to use feval() k = f_args_cell.length(); f_args(k - 1); // resize only once for (i = 0; i diff; if (test) delta = (fabs(p) + SQRT_EPS) * SQRT_EPS; else delta = diff; // right side parameter(j) = p + delta; f_args(minarg - 1) = parameter; f_return = feval(f, f_args); obj_right = f_return(0).matrix_value(); // left size parameter(j) = p - delta; f_args(minarg - 1) = parameter; f_return = feval(f, f_args); obj_left = f_return(0).matrix_value(); parameter(j) = p; // restore original parameter columnj = (obj_right - obj_left) / (2*delta); for (i=0; i Deltaf/max(|f(x)|,1) } where Deltaf is the decrease in f observed in the last iteration (each iteration consists R*C line searches). @item @var{ctl}(1)==2 : Stopping criterion : Stop search when updates are small, as tested by @math{ ctl(2) > max @{ dx(i)/max(|x(i)|,1) | i in 1..N @}} where dx is the change in the x that occured in the last iteration. @item @var{ctl}(2) : Threshold used in stopping tests. 
Default=10*eps @item @var{ctl}(2)==0 : Default value @item @var{ctl}(3) : Position of the minimized argument in args Default=1 @item @var{ctl}(3)==0 : Default value @item @var{ctl}(4) : Maximum number of function evaluations Default=inf @item @var{ctl}(4)==0 : Default value @item @var{ctl}(5) : Type of optimization: @item @var{ctl}(5)==1 : "Fletcher-Reeves" method @item @var{ctl}(5)==2 : "Polak-Ribiere" (Default) @item @var{ctl}(5)==3 : "Hestenes-Stiefel" method @end itemize @var{ctl} may have length smaller than 4. Default values will be used if ctl is not passed or if nan values are given. @subheading Example: function r=df( l ) b=[1;0;-1]; r = -( 2*l@{1@} - 2*b + rand(size(l@{1@}))); endfunction @* function r=ff( l ) b=[1;0;-1]; r = (l@{1@}-b)' * (l@{1@}-b); endfunction @* ll = @{ [10; 2; 3] @}; @* ctl(5) = 3; @* [x0,v,nev]=cg_min( "ff", "df", ll, ctl ) @* Comment: In general, BFGS method seems to be better performing in many cases but requires more computation per iteration. See also http://en.wikipedia.org/wiki/Nonlinear_conjugate_gradient. @seealso{bfgsmin} @end deftypefn cpiv_bard @c cpiv_bard ../inst/cpiv_bard.m [lb, idx, ridx, mv] = cpiv_bard (v, m[, incl]) v: column vector; m: matrix; incl (optional): index. length (v) must equal rows (m). Finds column vectors w and l with w == v + m * l, w >= 0, l >= 0, l.' * w == 0. Chooses idx, w, and l so that l(~idx) == 0, l(idx) == -inv (m(idx, idx)) * v(idx), w(idx) roughly == 0, and w(~idx) == v(~idx) + m(idx, ~idx).' * l(idx). idx indexes at least everything indexed by incl, but l(incl) may be < 0. lb: l(idx) (column vector); idx: logical index, defined above; ridx: ~idx & w roughly == 0; mv: [m, v] after performing a Gauss-Jordan 'sweep' (with gjp.m) on each diagonal element indexed by idx. Except the handling of incl (which enables handling of equality constraints in the calling code), this is called solving the 'complementary pivot problem' (Cottle, R. W. and Dantzig, G. B., 'Complementary pivot theory of mathematical programming', Linear Algebra and Appl. 1, 102--125. References for the current algorithm: Bard, Y.: Nonlinear Parameter Estimation, p. 147--149, Academic Press, New York and London 1974; Bard, Y., 'An eclectic approach to nonlinear programming', Proc. ANU Sem. Optimization, Canberra, Austral. Nat. Univ.). curvefit_stat @c curvefit_stat ../inst/curvefit_stat.m -*- texinfo -*- @deftypefn {Function File} {@var{info} =} curvefit_stat (@var{f}, @var{p}, @var{x}, @var{y}, @var{settings}) Frontend for computation of statistics for fitting of values, computed by a model function, to observed values. Please refer to the description of @code{residmin_stat}. The only differences to @code{residmin_stat} are the additional arguments @var{x} (independent values) and @var{y} (observations), that the model function @var{f}, if provided, has a second obligatory argument which will be set to @var{x} and is supposed to return guesses for the observations (with the same dimensions), and that the possibly user-supplied function for the jacobian of the model function has also a second obligatory argument which will be set to @var{x}. @seealso {residmin_stat} @end deftypefn d2_min @c d2_min ../inst/d2_min.m [x,v,nev,h,args] = d2_min(f,d2f,args,ctl,code) - Newton-like minimization Minimize f(x) using 1st and 2nd derivatives. Any function w/ second derivatives can be minimized, as in Newton. f(x) decreases at each iteration, as in Levenberg-Marquardt.
This function is inspired by the Levenberg-Marquardt algorithm found in the book "Numerical Recipes". ARGUMENTS : f : string : Cost function's name d2f : string : Name of function returning the cost (1x1), its differential (1xN) and its second differential or its pseudo-inverse (NxN) (see ctl(5) below) : [v,dv,d2v] = d2f (x). args : list or matrix : f and d2f's arguments. By default, minimize the 1st argument. ctl : vector : Control arguments (see below) or struct code : string : code will be evaluated after each outer loop that produced some (any) improvement. Variables visible from "code" include "x", the best parameter found, "v" the best value and "args", the list of all arguments. All can be modified. This option can be used to re-parameterize the argument space during optimization CONTROL VARIABLE ctl : (optional). May be a struct or a vector of length 5 or less where NaNs are ignored. Default values are written . FIELD VECTOR NAME POS ftol, f N/A : Stop search when value doesn't improve, as tested by f > Deltaf/max(|f(x)|,1) where Deltaf is the decrease in f observed in the last iteration. <10*sqrt(eps)> utol, u N/A : Stop search when updates are small, as tested by u > max { dx(i)/max(|x(i)|,1) | i in 1..N } where dx is the change in the x that occurred in the last iteration. dtol, d N/A : Stop search when derivative is small, as tested by d > norm (dv) crit, c ctl(1) : Set one stopping criterion, 'ftol' (c=1), 'utol' (c=2) or 'dtol' (c=3) to the value of the 'tol' option. <1> tol, t ctl(2) : Threshold in termination test chosen by 'crit' <10*eps> narg, n ctl(3) : Position of the minimized argument in args <1> maxev,m ctl(4) : Maximum number of function evaluations maxout,m : Maximum number of outer loops id2f, i ctl(5) : 0 if d2f returns the 2nd derivatives, 1 if <0> it returns its pseudo-inverse. verbose, v N/A : Be more or less verbose (quiet=0) <0> dcdp @c dcdp ../inst/dcdp.m function prt = dcdp (f, p, dp, func[, bounds]) This is an interface to __dfdp__.m, similar to dfdp.m, but for functions only of parameters 'p', not of independents 'x'. See dfdp.m. dfpdp is more general and is meant to be used instead of dcdp in optimization. de_min @c de_min ../inst/de_min.m de_min: global optimisation using differential evolution Usage: [x, obj_value, nfeval, convergence] = de_min(fcn, control) minimization of a user-supplied function with respect to x(1:D), using the differential evolution (DE) method based on an algorithm by Rainer Storn (http://www.icsi.berkeley.edu/~storn/code.html) See: http://www.softcomputing.net/tevc2009_1.pdf Arguments: --------------- fcn string : Name of function. Must return a real value control vector : (Optional) Control variables, described below or struct Returned values: ---------------- x vector : parameter vector of best solution obj_value scalar : objective function value of best solution nfeval scalar : number of function evaluations convergence : 1 = best below value to reach (VTR) 0 = population has reached defined quality (tol) -1 = some values are close to constraints/boundaries -2 = max number of iterations reached (maxiter) -3 = max number of function evaluations reached (maxnfe) Control variable: (optional) may be named arguments (i.e. "name",value pairs), a struct, or a vector, where NaN's are ignored.
XVmin : vector of lower bounds of initial population *** note: by default these are no constraints *** XVmax : vector of upper bounds of initial population constr : 1 -> enforce the bounds not just for the initial population const : data vector (remains fixed during the minimization) NP : number of population members F : difference factor from interval [0, 2] CR : crossover probability constant from interval [0, 1] strategy : 1 --> DE/best/1/exp 7 --> DE/best/1/bin 2 --> DE/rand/1/exp 8 --> DE/rand/1/bin 3 --> DE/target-to-best/1/exp 9 --> DE/target-to-best/1/bin 4 --> DE/best/2/exp 10--> DE/best/2/bin 5 --> DE/rand/2/exp 11--> DE/rand/2/bin 6 --> DEGL/SAW/exp else DEGL/SAW/bin refresh : intermediate output will be produced after "refresh" iterations. No intermediate output will be produced if refresh is < 1 VTR : Stopping criterion: "Value To Reach" de_min will stop when obj_value <= VTR. Use this if you know which value you expect. tol : Stopping criterion: "tolerance" stops if (best-worst)/max(1,worst) < tol This stops basically if the whole population is "good". maxnfe : maximum number of function evaluations maxiter : maximum number of iterations (generations) The algorithm seems to work well only if [XVmin,XVmax] covers the region where the global minimum is expected. DE is also somewhat sensitive to the choice of the difference factor F. A good initial guess is to choose F from interval [0.5, 1], e.g. 0.8. CR, the crossover probability constant from interval [0, 1] helps to maintain the diversity of the population and is rather uncritical but affects strongly the convergence speed. If the parameters are correlated, high values of CR work better. The reverse is true for no correlation. Experiments suggest that /bin likes to have a slightly larger CR than /exp. The number of population members NP is also not very critical. A good initial guess is 10*D. Depending on the difficulty of the problem NP can be lower than 10*D or must be higher than 10*D to achieve convergence. Default Values: --------------- XVmin = [-2]; XVmax = [ 2]; constr= 0; const = []; NP = 10 *D F = 0.8; CR = 0.9; strategy = 12; refresh = 0; VTR = -Inf; tol = 1.e-3; maxnfe = 1e6; maxiter = 1000; Example to find the minimum of the Rosenbrock saddle: ---------------------------------------------------- Define f as: function result = f(x); result = 100 * (x(2) - x(1)^2)^2 + (1 - x(1))^2; end Then type: ctl.XVmin = [-2 -2]; ctl.XVmax = [ 2 2]; [x, obj_value, nfeval, convergence] = de_min (@f, ctl); Keywords: global-optimisation optimisation minimisation deriv @c deriv ../inst/deriv.m -*- texinfo -*- @deftypefn {Function File} {@var{dx} =} deriv (@var{f}, @var{x0}) @deftypefnx {Function File} {@var{dx} =} deriv (@var{f}, @var{x0}, @var{h}) @deftypefnx {Function File} {@var{dx} =} deriv (@var{f}, @var{x0}, @var{h}, @var{O}) @deftypefnx {Function File} {@var{dx} =} deriv (@var{f}, @var{x0}, @var{h}, @var{O}, @var{N}) Calculate the derivative of function @var{f}. @var{f} must be a function handle or the name of a function that takes @var{x0} and returns a variable of equal length and orientation. @var{x0} must be a numeric vector or scalar. @var{h} defines the step taken for the derivative calculation. Defaults to 1e-7. @var{O} defines the order of the calculation. Supported values are 2 (h^2 order) or 4 (h^4 order). Defaults to 2. @var{N} defines the derivative order. Defaults to the 1st derivative of the function. Can be up to the 4th derivative. Reference: Numerical Methods for Mathematics, Science, and Engineering by John H.
Mathews. @end deftypefn
dfdp @c dfdp ../inst/dfdp.m function prt = dfdp (x, f, p, dp, func[, bounds]) numerical partial derivatives (Jacobian) df/dp for use with leasqr --------INPUT VARIABLES--------- x=vec or matrix of indep var(used as arg to func) x=[x0 x1 ....] f=func(x,p) vector initialised by user before each call to dfdp p= vec of current parameter values dp= fractional increment of p for numerical derivatives dp(j)>0 central differences calculated dp(j)<0 one sided differences calculated dp(j)=0 sets corresponding partials to zero; i.e. holds p(j) fixed func=function (string or handle) to calculate the Jacobian for, e.g. to calc Jacobian for function expsum prt=dfdp(x,f,p,dp,'expsum') bounds=two-column-matrix of lower and upper bounds for parameters If no 'bounds' option is specified to leasqr, it will call dfdp without the 'bounds' argument. ----------OUTPUT VARIABLES------- prt= Jacobian Matrix prt(i,j)=df(i)/dp(j) ================================ dfxpdp is more general and is meant to be used instead of dfdp in optimization.
dfpdp @c dfpdp ../inst/dfpdp.m function jac = dfpdp (p, func[, hook]) Returns the Jacobian of func (p) with respect to p with finite differencing. The optional argument hook is a structure which can contain the following fields at the moment: hook.f: value of func(p) for p as given in the arguments hook.diffp: positive vector of fractional steps from given p in finite differencing (actual steps may be smaller if bounds are given). The default is .001 * ones (size (p)). hook.diff_onesided: logical vector, indexing elements of p for which only one-sided differences should be computed (faster); even if not one-sided, differences might not be exactly central if bounds are given. The default is false (size (p)). hook.fixed: logical vector, indexing elements of p for which zero should be returned instead of the guessed partial derivatives (useful in optimization if some parameters are not optimized, but are 'fixed'). hook.lbound, hook.ubound: vectors of lower and upper parameter bounds (or -Inf or +Inf, respectively) to be respected in finite differencing. The consistency of bounds is not checked.
dfxpdp @c dfxpdp ../inst/dfxpdp.m function jac = dfxpdp (x, p, func[, hook]) Returns the Jacobian of func (p, x) with respect to p with finite differencing. The optional argument hook is a structure which can contain the following fields at the moment: hook.f: value of func(p, x) for p and x as given in the arguments hook.diffp: positive vector of fractional steps from given p in finite differencing (actual steps may be smaller if bounds are given). The default is .001 * ones (size (p)). hook.diff_onesided: logical vector, indexing elements of p for which only one-sided differences should be computed (faster); even if not one-sided, differences might not be exactly central if bounds are given. The default is false (size (p)). hook.fixed: logical vector, indexing elements of p for which zero should be returned instead of the guessed partial derivatives (useful in optimization if some parameters are not optimized, but are 'fixed'). hook.lbound, hook.ubound: vectors of lower and upper parameter bounds (or -Inf or +Inf, respectively) to be respected in finite differencing. The consistency of bounds is not checked.
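For illustration, a minimal sketch of a dfxpdp call (the model function, data points and step sizes below are made up for this example; they are not part of the package):
  model = @(p, x) p(1) * exp (-p(2) * x); % func must have the form func (p, x)
  x = (0:0.5:5)';
  p = [2; 0.7];
  hook.diffp = 1e-4 * ones (size (p)); % smaller fractional steps than the default .001
  jac = dfxpdp (x, p, model, hook); % jac(i,j) = d model(p, x)(i) / d p(j)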
expfit @c expfit ../inst/expfit.m USAGE [alpha,c,rms] = expfit( deg, x1, h, y ) Prony's method for non-linear exponential fitting Fit function: \sum_1^{deg} c(i)*exp(alpha(i)*x) Elements of data vector y must correspond to equidistant x-values starting at x1 with stepsize h. The method is fully compatible with complex linear coefficients c, complex nonlinear coefficients alpha and complex input arguments y, x1, non-zero h. Fit-order deg must be a real positive integer. Returns linear coefficients c, nonlinear coefficients alpha and root mean square error rms. This method is known to be more stable than 'brute-force' non-linear least squares fitting. Example x0 = 0; step = 0.05; xend = 5; x = x0:step:xend; y = 2*exp(1.3*x)-0.5*exp(2*x); error = (rand(1,length(y))-0.5)*1e-4; [alpha,c,rms] = expfit(2,x0,step,y+error) alpha = 2.0000 1.3000 c = -0.50000 2.00000 rms = 0.00028461 The fit is very sensitive to the number of data points. It doesn't perform very well for small data sets. Theoretically, you need at least 2*deg data points, but if there are errors on the data, you certainly need more. Be aware that this is a very (very, very) ill-posed problem. By the way, this algorithm relies heavily on computing the roots of a polynomial. I used 'roots.m'; if there is something better, please use that code. Demo for a complex fit-function: deg= 2; N= 20; x1= -(1+i), x= linspace(x1,1+i/2,N).'; h = x(2) - x(1) y= (2+i)*exp( (-1-2i)*x ) + (-1+3i)*exp( (2+3i)*x ); A= 5e-2; y+= A*(randn(N,1)+randn(N,1)*i); % add complex noise [alpha,c,rms]= expfit( deg, x1, h, y )
fmins @c fmins ../inst/fmins.m -*- texinfo -*- @deftypefn {Function File} {[@var{x}] =} fmins (@var{f},@var{X0},@var{options},@var{grad},@var{P1},@var{P2}, ...) Find the minimum of a function of several variables. By default the method used is the Nelder&Mead Simplex algorithm. Example usage: fmins(inline('(x(1)-5).^2+(x(2)-8).^4'),[0;0]) @strong{Inputs} @c use @asis and explicit @var in @table to avoid makeinfo warning @c `unlikely character , in @var' for `P1, P2, ...'. @table @asis @item @var{f} A string containing the name of the function to minimize @item @var{X0} A vector of initial parameters of the function @var{f}. @item @var{options} Vector with control parameters (not all parameters are used) @verbatim options(1) - Show progress (if 1, default is 0, no progress) options(2) - Relative size of simplex (default 1e-3) options(6) - Optimization algorithm if options(6)==0 - Nelder & Mead simplex (default) if options(6)==1 - Multidirectional search Method if options(6)==2 - Alternating Directions search options(5) if options(6)==0 && options(5)==0 - regular simplex if options(6)==0 && options(5)==1 - right-angled simplex Comment: the default is set to "right-angled simplex". This works better for me on a broad range of problems, although the default in nmsmax is "regular simplex" options(10) - Maximum number of function evaluations @end verbatim @item @var{grad} Unused (For compatibility with Matlab) @item @var{P1}, @var{P2}, ... Optional parameters for function @var{f} @end table @end deftypefn
gjp @c gjp ../inst/gjp.m m = gjp (m, k[, l]) m: matrix; k, l: row- and column-index of pivot, l defaults to k. Gauss-Jordan pivot as defined in Bard, Y.: Nonlinear Parameter Estimation, p. 296, Academic Press, New York and London 1974. In the pivot column, this seems not quite the same as the usual Gauss-Jordan(-Clasen) pivot. Bard gives Beaton, A.
E., 'The use of special matrix operators in statistical calculus', Research Bulletin RB-64-51 (1964), Educational Testing Service, Princeton, New Jersey as a reference, but this article is not easily accessible. Another reference, whose definition of gjp differs from Bard's by some signs, is Clarke, R. B., 'Algorithm AS 178: The Gauss-Jordan sweep operator with detection of collinearity', Journal of the Royal Statistical Society, Series C (Applied Statistics) (1982), 31(2), 166--168.
jacobs @c jacobs ../inst/jacobs.m -*- texinfo -*- @deftypefn {Function File} {Df =} jacobs (@var{x}, @var{f}) @deftypefnx {Function File} {Df =} jacobs (@var{x}, @var{f}, @var{hook}) Calculate the Jacobian of a function using the complex step method. Let @var{f} be a user-supplied function. Given a point @var{x} at which we seek the Jacobian, the function @command{jacobs} returns the Jacobian matrix @code{d(f(1), @dots{}, f(end))/d(x(1), @dots{}, x(n))}. The function uses the complex step method and thus can be applied to real analytic functions. The optional argument @var{hook} is a structure with additional options. @var{hook} can have the following fields: @itemize @bullet @item @code{h} - can be used to define the magnitude of the complex step and defaults to 1e-20; steps larger than 1e-3 are not allowed. @item @code{fixed} - is a logical vector internally usable by some optimization functions; it indicates for which elements of @var{x} no gradient should be computed, but zero should be returned. @end itemize For example: @example @group f = @@(x) [x(1)^2 + x(2); x(2)*exp(x(1))]; Df = jacobs ([1, 2], f) @end group @end example @end deftypefn
leasqr @c leasqr ../inst/leasqr.m -*- texinfo -*- @deftypefn {Function File} {} leasqr (@var{x}, @var{y}, @var{pin}, @var{F}) @deftypefnx {Function File} {} leasqr (@var{x}, @var{y}, @var{pin}, @var{F}, @var{stol}) @deftypefnx {Function File} {} leasqr (@var{x}, @var{y}, @var{pin}, @var{F}, @var{stol}, @var{niter}) @deftypefnx {Function File} {} leasqr (@var{x}, @var{y}, @var{pin}, @var{F}, @var{stol}, @var{niter}, @var{wt}) @deftypefnx {Function File} {} leasqr (@var{x}, @var{y}, @var{pin}, @var{F}, @var{stol}, @var{niter}, @var{wt}, @var{dp}) @deftypefnx {Function File} {} leasqr (@var{x}, @var{y}, @var{pin}, @var{F}, @var{stol}, @var{niter}, @var{wt}, @var{dp}, @var{dFdp}) @deftypefnx {Function File} {} leasqr (@var{x}, @var{y}, @var{pin}, @var{F}, @var{stol}, @var{niter}, @var{wt}, @var{dp}, @var{dFdp}, @var{options}) @deftypefnx {Function File} {[@var{f}, @var{p}, @var{cvg}, @var{iter}, @var{corp}, @var{covp}, @var{covr}, @var{stdresid}, @var{Z}, @var{r2}] =} leasqr (@dots{}) Levenberg-Marquardt nonlinear regression. Input arguments: @table @var @item x Vector or matrix of independent variables. @item y Vector or matrix of observed values. @item pin Vector of initial parameters to be adjusted by leasqr. @item F Name of function or function handle. The function must be of the form @code{y = f(x, p)}, with y, x, p of the form @var{y}, @var{x}, @var{pin}. @item stol Scalar tolerance on fractional improvement in scalar sum of squares, i.e., @code{sum ((@var{wt} .* (@var{y}-@var{f}))^2)}. Set to 0.0001 if empty or not given. @item niter Maximum number of iterations. Set to 20 if empty or not given. @item wt Statistical weights (same dimensions as @var{y}). These should be set to be proportional to @code{sqrt (@var{y}) ^-1}, i.e., the covariance matrix of the data is assumed to be proportional to a diagonal matrix with diagonal equal to @code{(@var{wt}.^2)^-1}.
The constant of proportionality will be estimated. Set to @code{ones (size (@var{y}))} if empty or not given. @item dp Fractional increment of @var{p} for numerical partial derivatives. Set to @code{0.001 * ones (size (@var{pin}))} if empty or not given. @itemize @bullet @item dp(j) > 0 means central differences on j-th parameter p(j). @item dp(j) < 0 means one-sided differences on j-th parameter p(j). @item dp(j) = 0 holds p(j) fixed, i.e., leasqr won't change the initial guess pin(j). @end itemize @item dFdp Name of partial derivative function in quotes or function handle. If not given or empty, set to @code{dfdp}, a slow but general partial derivatives function. The function must be of the form @code{prt = dfdp (x, f, p, dp, F [,bounds])}. For backwards compatibility, the function will only be called with an extra 'bounds' argument if the 'bounds' option is explicitly specified to leasqr (see dfdp.m). @item options Structure with multiple options. The following fields are recognized: @table @asis @item @qcode{fract_prec} Column vector (same length as @var{pin}) of desired fractional precisions in parameter estimates. Iterations are terminated if change in parameter vector (chg) relative to current parameter estimate is less than their corresponding elements in 'fract_prec', i.e., @code{all (abs (chg) < abs (options.fract_prec .* current_parm_est))} on two consecutive iterations. Defaults to @code{zeros (size (@var{pin}))}. @item @qcode{max_fract_change} Column vector (same length as @var{pin}) of maximum fractional step changes in parameter vector. Fractional change in elements of parameter vector is constrained to be at most 'max_fract_change' between successive iterations, i.e., @code{abs (chg(i)) = abs (min([chg(i), options.max_fract_change(i) * current param estimate]))}. Defaults to @code{Inf * ones (size (@var{pin}))}. @item @qcode{inequc} Cell-array containing up to four entries, two entries for linear inequality constraints and/or one or two entries for general inequality constraints. Initial parameters must satisfy these constraints. Either linear or general constraints may be the first entries, but the two entries for linear constraints must be adjacent and, if two entries are given for general constraints, they also must be adjacent. The two entries for linear constraints are a matrix (say m) and a vector (say v), specifying linear inequality constraints of the form `m.' * parameters + v >= 0'. If the constraints are just bounds, it is suggested to specify them in 'options.bounds' instead, since then some sanity tests are performed, and since the function 'dfdp.m' is guaranteed not to violate constraints during determination of the numeric gradient only for those constraints specified as 'bounds' (possibly with violations due to a certain inaccuracy, however, except if no constraints except bounds are specified). The first entry for general constraints must be a differentiable vector valued function (say h), specifying general inequality constraints of the form `h (p[, idx]) >= 0'; p is the column vector of optimized parameters and the optional argument idx is a logical index. h has to return the values of all constraints if idx is not given, and has to return only the indexed constraints if idx is given (so computation of the other constraints can be spared). If a second entry for general constraints is given, it must be a function (say dh) which returns a matrix whose rows contain the gradients of the constraint function h with respect to the optimized parameters.
It has the form jac_h = dh (vh, p, dp, h, idx[, bounds]); p is the column vector of optimized parameters, and idx is a logical index --- only the rows indexed by idx must be returned (so computation of the others can be spared). The other arguments of dh are for the case that dh computes numerical gradients: vh is the column vector of the current values of the constraint function h, with idx already applied. h is a function h (p) to compute the values of the constraints for parameters p; it will return only the values indexed by idx. dp is a suggestion for relative step width, having the same value as the argument 'dp' of leasqr above. If bounds were specified to leasqr, they are provided in the argument bounds of dh, to enable their consideration in determination of numerical gradients. If dh is not specified to leasqr, numerical gradients are computed in the same way as with 'dfdp.m' (see above). If some constraints are linear, they should be specified as linear constraints (or bounds, if applicable) for reasons of performance, even if general constraints are also specified. @item @qcode{bounds} Two-column-matrix, one row for each parameter in @var{pin}. Each row contains a minimal and maximal value for each parameter. Default: [-Inf, Inf] in each row. If this field is used with an existing user-side function for 'dFdp' (see above), the function's interface might have to be changed. @item @qcode{equc} Equality constraints, specified the same way as inequality constraints (see field 'options.inequc'). Initial parameters must satisfy these constraints. Note that there is possibly a certain inaccuracy in honoring constraints, except if only bounds are specified. @emph{Warning}: If constraints (or bounds) are set, returned guesses of @var{corp}, @var{covp}, and @var{Z} are generally invalid, even if no constraints are active for the final parameters. If equality constraints are specified, @var{corp}, @var{covp}, and @var{Z} are not guessed at all. @item @qcode{cpiv} Function for complementary pivot algorithm for inequality constraints. Defaults to cpiv_bard; no alternative function is supplied. @end table For backwards compatibility, @var{options} can also be a matrix whose first and second columns contain the values of @qcode{fract_prec} and @qcode{max_fract_change}, respectively. @end table Output: @table @var @item f Column vector of values computed: f = F(x,p). @item p Column vector of trial or final parameters, i.e., the solution. @item cvg Scalar: = 1 if convergence, = 0 otherwise. @item iter Scalar number of iterations used. @item corp Correlation matrix for parameters. @item covp Covariance matrix of the parameters. @item covr Diagonal of the covariance matrix of the residuals. @item stdresid Standardized residuals. @item Z Matrix that defines confidence region (see comments in the source). @item r2 Coefficient of multiple determination, intercept form. @end table Not suitable for non-real residuals. References: Bard, Nonlinear Parameter Estimation, Academic Press, 1974. Draper and Smith, Applied Regression Analysis, John Wiley and Sons, 1981.
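As a minimal usage sketch (the model function and data below are invented for illustration and are not part of the package):
@example
@group
x = (1:10)';
y = 2 * exp (-0.3 * x) + 0.05 * randn (10, 1);
F = @@(x, p) p(1) * exp (p(2) * x); % model of the form y = F (x, p)
pin = [1; -0.1]; % initial parameter guess
[f, p, cvg, iter] = leasqr (x, y, pin, F);
@end group
@end example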
@end deftypefn
LinearRegression @c LinearRegression ../inst/LinearRegression.m -*- texinfo -*- @deftypefn {Function File} {[@var{p},@var{e_var},@var{r},@var{p_var},@var{y_var}] =} LinearRegression (@var{F},@var{y}) @deftypefnx {Function File} {[@var{p},@var{e_var},@var{r},@var{p_var},@var{y_var}] =} LinearRegression (@var{F},@var{y},@var{w}) General linear regression: determine the parameters p_j (j=1,2,...,m) such that the function f(x) = sum_(j=1,...,m) p_j*f_j(x) is the best fit to the given values y_i = f(x_i) parameters: @itemize @item @var{F} is an n*m matrix with the values of the basis functions at the support points. In column j give the values of f_j at the points x_i (i=1,2,...,n) @item @var{y} is a column vector of length n with the given values @item @var{w} is a column vector of length n with the weights of the data points @end itemize return values: @itemize @item @var{p} is the vector of length m with the estimated values of the parameters @item @var{e_var} is the estimated variance of the difference between fitted and measured values @item @var{r} is the weighted norm of the residual @item @var{p_var} is the estimated variance of the parameters p_j @item @var{y_var} is the estimated variance of the dependent variables @end itemize Caution: do NOT request @var{y_var} for large data sets, as an n by n matrix is generated @c Will be cut out in optims info file and replaced with the same @c references explicitly there, since references to core Octave @c functions are not automatically transformed from here to there. @c BEGIN_CUT_TEXINFO @seealso{regress,leasqr,nonlin_curvefit,polyfit,wpolyfit,expfit} @c END_CUT_TEXINFO @end deftypefn
line_min @c line_min ../inst/line_min.m [a,fx,nev] = line_min (f, dx, args, narg, h, nev_max) - Minimize f() along dx INPUT ---------- f : string : Name of minimized function dx : matrix : Direction along which f() is minimized args : cell : Arguments of f narg : integer : Position of minimized variable in args. Default=1 h : scalar : Step size to use for centered finite difference approximation of first and second derivatives. Default=1E-3. nev_max : integer : Maximum number of function evaluations. Default=30 OUTPUT --------- a : scalar : Value for which f(x+a*dx) is a minimum (*) fx : scalar : Value of f(x+a*dx) at minimum (*) nev : integer : Number of function evaluations (*) The notation f(x+a*dx) assumes that args == {x}. Reference: David G Luenberger's Linear and Nonlinear Programming
linprog @c linprog ../inst/linprog.m -*- texinfo -*- @deftypefn{Function File} {@var{x} =} linprog (@var{f}, @var{A}, @var{b}) @deftypefnx{Function File} {@var{x} =} linprog (@var{f}, @var{A}, @var{b}, @var{Aeq}, @var{beq}) @deftypefnx{Function File} {@var{x} =} linprog (@var{f}, @var{A}, @var{b}, @var{Aeq}, @var{beq}, @var{lb}, @var{ub}) @deftypefnx{Function File} {[@var{x}, @var{fval}] =} linprog (@dots{}) Solve a linear programming problem. Finds @example min (f' * x) @end example (both f and x are column vectors) subject to @example @group A * x <= b Aeq * x = beq lb <= x <= ub @end group @end example If not specified, @var{Aeq} and @var{beq} default to empty matrices. If not specified, the lower bound @var{lb} defaults to minus infinity and the upper bound @var{ub} defaults to plus infinity. @c Will be cut out in optims info file and replaced with the same @c references explicitly there, since references to core Octave @c functions are not automatically transformed from here to there.
@c BEGIN_CUT_TEXINFO @seealso{glpk} @c END_CUT_TEXINFO @end deftypefn
mdsmax @c mdsmax ../inst/mdsmax.m MDSMAX Multidirectional search method for direct search optimization. [x, fmax, nf] = MDSMAX(FUN, x0, STOPIT, SAVIT) attempts to maximize the function FUN, using the starting vector x0. The method of multidirectional search is used. Output arguments: x = vector yielding largest function value found, fmax = function value at x, nf = number of function evaluations. The iteration is terminated when either - the relative size of the simplex is <= STOPIT(1) (default 1e-3), - STOPIT(2) function evaluations have been performed (default inf, i.e., no limit), or - a function value equals or exceeds STOPIT(3) (default inf, i.e., no test on function values). The form of the initial simplex is determined by STOPIT(4): STOPIT(4) = 0: regular simplex (sides of equal length, the default), STOPIT(4) = 1: right-angled simplex. Progress of the iteration is not shown if STOPIT(5) = 0 (default 1). If a non-empty fourth parameter string SAVIT is present, then `SAVE SAVIT x fmax nf' is executed after each inner iteration. NB: x0 can be a matrix. In the output argument, in SAVIT saves, and in function calls, x has the same shape as x0. MDSMAX(fun, x0, STOPIT, SAVIT, P1, P2,...) allows additional arguments to be passed to fun, via feval(fun,x,P1,P2,...). This implementation uses 2n^2 elements of storage (two simplices), where x0 is an n-vector. It is based on the algorithm statement in [2, sec.3], modified so as to halve the storage (with a slight loss in readability). References: [1] V. J. Torczon, Multi-directional search: A direct search algorithm for parallel machines, Ph.D. Thesis, Rice University, Houston, Texas, 1989. [2] V. J. Torczon, On the convergence of the multidirectional search algorithm, SIAM J. Optimization, 1 (1991), pp. 123-145. [3] N. J. Higham, Optimization by direct search in matrix computations, SIAM J. Matrix Anal. Appl., 14(2): 317-333, 1993. [4] N. J. Higham, Accuracy and Stability of Numerical Algorithms, Second edition, Society for Industrial and Applied Mathematics, Philadelphia, PA, 2002; sec. 20.5.
minimize @c minimize ../inst/minimize.m [x,v,nev,...] = minimize (f,args,...) - Minimize f ARGUMENTS f : string : Name of function. Must return a real value args : list or : List of arguments to f (by default, minimize the first) matrix : f's only argument RETURNED VALUES x : matrix : Local minimum of f. Let's suppose x is M-by-N. v : real : Value of f at x nev : integer : Number of function evaluations or 1 x 2 : Number of function and derivative evaluations (if derivatives are used) Extra arguments are either a succession of option-value pairs or a single list or struct of option-value pairs (for unary options, the value in the struct is ignored). OPTIONS : DERIVATIVES Derivatives may be used if one of these options is --------------------- used. Otherwise, the Nelder-Mead method (see nelder_mead_min) is used. 'd2f', d2f : Name of a function that returns the value of f, of its 1st and 2nd derivatives : [fx,dfx,d2fx] = feval (d2f, x) where fx is a real number, dfx is 1x(M*N) and d2fx is (M*N)x(M*N). A Newton-like method (d2_min) will be used. 'hess' : Use [fx,dfx,d2fx] = leval (f, args) to compute 1st and 2nd derivatives, and use a Newton-like method (d2_min). 'd2i', d2i : Name of a function that returns the value of f, of its 1st and pseudo-inverse of second derivatives : [fx,dfx,id2fx] = feval (d2i, x) where fx is a real number, dfx is 1x(M*N) and id2fx is (M*N)x(M*N).
A Newton-like method will be used (see d2_min). 'ihess' : Use [fx,dfx,id2fx] = leval (f, args) to compute 1st derivative and the pseudo-inverse of 2nd derivatives, and use a Newton-like method (d2_min). NOTE : df, d2f or d2i take the same arguments as f. 'order', n : Use derivatives of order n. If the n'th order derivative is not specified by 'df', 'd2f' or 'd2i', it will be computed numerically. Currently, only order 1 works. 'ndiff' : Use a variable metric method (bfgs) using numerical differentiation. OPTIONS : STOPPING CRITERIA Default is to use 'tol' --------------------------- 'ftol', ftol : Stop search when value doesn't improve, as tested by ftol > Deltaf/max(|f(x)|,1) where Deltaf is the decrease in f observed in the last iteration. Default=10*eps 'utol', utol : Stop search when updates are small, as tested by tol > max { dx(i)/max(|x(i)|,1) | i in 1..N } where dx is the change in the x that occurred in the last iteration. 'dtol',dtol : Stop search when derivatives are small, as tested by dtol > max { df(i)*max(|x(i)|,1)/max(v,1) | i in 1..N } where x is the current minimum, v is func(x) and df is the derivative of f at x. This option is ignored if derivatives are not used in optimization. MISC. OPTIONS ------------- 'maxev', m : Maximum number of function evaluations 'narg' , narg : Position of the minimized argument in args <1> 'isz' , step : Initial step size (only for 0 and 1st order method) <1> Should correspond to expected distance to minimum 'verbose' : Display messages during execution 'backend' : Instead of performing the minimization itself, return [backend, control], the name and control argument of the backend used by minimize(). Minimization can then be obtained without the overhead of minimize by calling, if a 0 or 1st order method is used : [x,v,nev] = feval (backend, args, control) or, if a 2nd order method is used : [x,v,nev] = feval (backend, control.d2f, args, control)
nelder_mead_min @c nelder_mead_min ../inst/nelder_mead_min.m [x0,v,nev] = nelder_mead_min (f,args,ctl) - Nelder-Mead minimization Minimize 'f' using the Nelder-Mead algorithm. This function is inspired by that found in the book "Numerical Recipes". ARGUMENTS --------- f : string : Name of function. Must return a real value args : list : Arguments passed to f. or matrix : f's only argument ctl : vector : (Optional) Control variables, described below or struct RETURNED VALUES --------------- x0 : matrix : Local minimum of f v : real : Value of f at x0 nev : number : Number of function evaluations CONTROL VARIABLE : (optional) may be named arguments (i.e. "name",value ------------------ pairs), a struct, or a vector of length <= 6, where NaNs are ignored. Default values are written between angle brackets <>. OPT. VECTOR NAME POS ftol,f N/A : Stopping criterion : stop search when values at simplex vertices are all alike, as tested by f > (max_i (f_i) - min_i (f_i)) /max(max(|f_i|),1) where f_i are the values of f at the vertices. <10*eps> rtol,r N/A : Stop search when biggest radius of simplex, using infinity-norm, is small, as tested by : ctl(2) > Radius <10*eps> vtol,v N/A : Stop search when volume of simplex is small, tested by ctl(2) > Vol crit,c ctl(1) : Set one stopping criterion, 'ftol' (c=1), 'rtol' (c=2) or 'vtol' (c=3) to the value of the 'tol' option. <1> tol, t ctl(2) : Threshold in termination test chosen by 'crit' <10*eps> narg ctl(3) : Position of the minimized argument in args <1> maxev ctl(4) : Maximum number of function evaluations. This number may be slightly exceeded.
isz ctl(5) : Size of initial simplex, which is : <1> { x + e_i | i in 0..N } where x == args{narg} is the initial value e_0 == zeros (size (x)), e_i(j) == 0 if j != i and e_i(i) == ctl(5) e_i has same size as x Set ctl(5) to the distance you expect between the starting point and the minimum. rst ctl(6) : When a minimum is found the algorithm restarts next to it until the minimum does not improve anymore. ctl(6) is the maximum number of restarts. Set ctl(6) to zero if you know the function is well-behaved or if you don't mind not getting a true minimum. <0> verbose, v Be more or less verbose (quiet=0) <0>
nmsmax @c nmsmax ../inst/nmsmax.m NMSMAX Nelder-Mead simplex method for direct search optimization. [x, fmax, nf] = NMSMAX(FUN, x0, STOPIT, SAVIT) attempts to maximize the function FUN, using the starting vector x0. The Nelder-Mead direct search method is used. Output arguments: x = vector yielding largest function value found, fmax = function value at x, nf = number of function evaluations. The iteration is terminated when either - the relative size of the simplex is <= STOPIT(1) (default 1e-3), - STOPIT(2) function evaluations have been performed (default inf, i.e., no limit), or - a function value equals or exceeds STOPIT(3) (default inf, i.e., no test on function values). The form of the initial simplex is determined by STOPIT(4): STOPIT(4) = 0: regular simplex (sides of equal length, the default) STOPIT(4) = 1: right-angled simplex. Progress of the iteration is not shown if STOPIT(5) = 0 (default 1). STOPIT(6) indicates the direction (i.e. minimization or maximization). Default is 1, maximization; set STOPIT(6) = -1 for minimization. If a non-empty fourth parameter string SAVIT is present, then `SAVE SAVIT x fmax nf' is executed after each inner iteration. NB: x0 can be a matrix. In the output argument, in SAVIT saves, and in function calls, x has the same shape as x0. NMSMAX(fun, x0, STOPIT, SAVIT, P1, P2,...) allows additional arguments to be passed to fun, via feval(fun,x,P1,P2,...). References: N. J. Higham, Optimization by direct search in matrix computations, SIAM J. Matrix Anal. Appl., 14(2): 317-333, 1993. C. T. Kelley, Iterative Methods for Optimization, Society for Industrial and Applied Mathematics, Philadelphia, PA, 1999.
nonlin_curvefit @c nonlin_curvefit ../inst/nonlin_curvefit.m -*- texinfo -*- @deftypefn {Function File} {[@var{p}, @var{fy}, @var{cvg}, @var{outp}] =} nonlin_curvefit (@var{f}, @var{pin}, @var{x}, @var{y}) @deftypefnx {Function File} {[@var{p}, @var{fy}, @var{cvg}, @var{outp}] =} nonlin_curvefit (@var{f}, @var{pin}, @var{x}, @var{y}, @var{settings}) Frontend for nonlinear fitting of values, computed by a model function, to observed values. Please refer to the description of @code{nonlin_residmin}. The differences to @code{nonlin_residmin} are the additional arguments @var{x} (independent values, mostly, but not necessarily, an array of the same dimensions or the same number of rows as @var{y}) and @var{y} (array of observations), the returned value @var{fy} (final guess for observed values) instead of @var{resid}, that the model function has a second obligatory argument which will be set to @var{x} and is supposed to return guesses for the observations (with the same dimensions), and that the possibly user-supplied function for the Jacobian of the model function has also a second obligatory argument which will be set to @var{x}. @c The following block will be cut out in the package info file.
@c BEGIN_CUT_TEXINFO Also, if the setting @code{user_interaction} is given, additional information is passed to these functions. Type @code{optim_doc ("Common optimization options")} for this setting. @c END_CUT_TEXINFO @seealso {nonlin_residmin} @end deftypefn
nonlin_min @c nonlin_min ../inst/nonlin_min.m -*- texinfo -*- @deftypefn {Function File} {[@var{p}, @var{objf}, @var{cvg}, @var{outp}] =} nonlin_min (@var{f}, @var{pin}) @deftypefnx {Function File} {[@var{p}, @var{objf}, @var{cvg}, @var{outp}] =} nonlin_min (@var{f}, @var{pin}, @var{settings}) Frontend for nonlinear minimization of a scalar objective function. The functions supplied by the user have a minimal interface; any additionally needed constants can be supplied by wrapping the user functions into anonymous functions. The following description applies to usage with vector-based parameter handling. Differences in usage for structure-based parameter handling will be explained separately. @var{f}: objective function. It gets a column vector of real parameters as argument. In gradient determination, this function may be called with an informational second argument, whose content depends on the function for gradient determination. @var{pin}: real column vector of initial parameters. @var{settings}: structure whose fields stand for optional settings referred to below. The fields can be set by @code{optimset()}. The returned values are the column vector of final parameters @var{p}, the final value of the objective function @var{objf}, an integer @var{cvg} indicating if and how optimization succeeded or failed, and a structure @var{outp} with additional information, currently with possible fields: @code{niter}, the number of iterations, @code{nobjf}, the number of objective function calls (indirect calls by gradient function not counted), @code{lambda}, the lambda of constraints at the result, and @code{user_interaction}, information on user stops (see settings). The backend may define additional fields. @var{cvg} is greater than zero for success and less than or equal to zero for failure; its possible values depend on the used backend and currently can be @code{0} (maximum number of iterations exceeded), @code{1} (success without further specification of criteria), @code{2} (parameter change less than specified precision in two consecutive iterations), @code{3} (improvement in objective function less than specified), @code{-1} (algorithm aborted by a user function), or @code{-4} (algorithm got stuck). @c The following block will be cut out in the package info file. @c BEGIN_CUT_TEXINFO For settings, type @code{optim_doc ("nonlin_min")}. For description of structure-based parameter handling, type @code{optim_doc ("parameter structures")}. For description of individual backends (currently only one), type @code{optim_doc ("scalar optimization")} and choose the backend in the menu. @c END_CUT_TEXINFO @end deftypefn
nonlin_residmin @c nonlin_residmin ../inst/nonlin_residmin.m -*- texinfo -*- @deftypefn {Function File} {[@var{p}, @var{resid}, @var{cvg}, @var{outp}] =} nonlin_residmin (@var{f}, @var{pin}) @deftypefnx {Function File} {[@var{p}, @var{resid}, @var{cvg}, @var{outp}] =} nonlin_residmin (@var{f}, @var{pin}, @var{settings}) Frontend for nonlinear minimization of residuals returned by a model function. The functions supplied by the user have a minimal interface; any additionally needed constants (e.g. observed values) can be supplied by wrapping the user functions into anonymous functions, as in the sketch below.
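For instance (a sketch with made-up model and data; it assumes @code{x} and @code{yobs} have been defined beforehand), observed values can be bound into the residual function like this:
@example
@group
model = @@(p, x) p(1) * exp (-p(2) * x);
f = @@(p) model (p, x) - yobs; % residuals as a function of p only
[p, resid, cvg] = nonlin_residmin (f, [1; 0.1]);
@end group
@end example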
The following description applies to usage with vector-based parameter handling. Differences in usage for structure-based parameter handling will be explained separately. @var{f}: function returning the array of residuals. It gets a column vector of real parameters as argument. In gradient determination, this function may be called with an informational second argument, whose content depends on the function for gradient determination. @var{pin}: real column vector of initial parameters. @var{settings}: structure whose fields stand for optional settings referred to below. The fields can be set by @code{optimset()}. The returned values are the column vector of final parameters @var{p}, the final array of residuals @var{resid}, an integer @var{cvg} indicating if and how optimization succeeded or failed, and a structure @var{outp} with additional information, currently with the fields: @code{niter}, the number of iterations and @code{user_interaction}, information on user stops (see settings). The backend may define additional fields. @var{cvg} is greater than zero for success and less than or equal to zero for failure; its possible values depend on the used backend and currently can be @code{0} (maximum number of iterations exceeded), @code{2} (parameter change less than specified precision in two consecutive iterations), @code{3} (improvement in objective function -- e.g. sum of squares -- less than specified), or @code{-1} (algorithm aborted by a user function). @c The following block will be cut out in the package info file. @c BEGIN_CUT_TEXINFO For settings, type @code{optim_doc ("nonlin_residmin")}. For description of structure-based parameter handling, type @code{optim_doc ("parameter structures")}. For description of individual backends (currently only one), type @code{optim_doc ("residual optimization")} and choose the backend in the menu. @c END_CUT_TEXINFO @seealso {nonlin_curvefit} @end deftypefn
nrm @c nrm ../inst/nrm.m -*- texinfo -*- @deftypefn {Function File} {@var{xmin} =} nrm (@var{f},@var{x0}) Using @var{x0} as a starting point, find a minimum of the scalar function @var{f}. The Newton-Raphson method is used. @end deftypefn
optim_doc @c optim_doc ../inst/optim_doc.m -*- texinfo -*- @deftypefn {Function File} {} optim_doc () @deftypefnx {Function File} {} optim_doc (@var{keyword}) Show optim package documentation. Runs the info viewer Octave is configured with on the documentation in info format of the installed optim package. Without argument, the top node of the documentation is displayed. With an argument, the respective index entry is searched for and its node displayed. @end deftypefn
optim_problems @c optim_problems ../inst/optim_problems.m Problems for testing optimizers. Documentation is in the code.
poly_2_ex @c poly_2_ex ../inst/poly_2_ex.m ex = poly_2_ex (l, f) - Extremum of a 1-var deg-2 polynomial l : 3 : Values of variable at which polynomial is known. f : 3 : f(i) = Value of the degree-2 polynomial at l(i). ex : 1 : Value for which f reaches its extremum Assuming that f(i) = a*l(i)^2 + b*l(i) + c = P(l(i)) for some a, b, c, ex is the extremum of the polynomial P. This function will be removed from future versions of the optim package since it is not related to optimization.
polyconf @c polyconf ../inst/polyconf.m [y,dy] = polyconf(p,x,s) Produce prediction intervals for the fitted y. The vector p and structure s are returned from polyfit or wpolyfit. The x values are where you want to compute the prediction interval.
polyconf(...,['ci'|'pi']) Produce a confidence interval (range of likely values for the mean at x) or a prediction interval (range of likely values seen when measuring at x). The prediction interval tells you the width of the distribution at x. This should be the same regardless of the number of measurements you have for the value at x. The confidence interval tells you how well you know the mean at x. It should get smaller as you increase the number of measurements. Error bars in the physical sciences usually show a 1-alpha confidence value of erfc(1/sqrt(2)), representing one standard deviation of uncertainty in the mean. polyconf(...,1-alpha) Control the width of the interval. If asking for the prediction interval 'pi', the default is .05 for the 95% prediction interval. If asking for the confidence interval 'ci', the default is erfc(1/sqrt(2)) for a one standard deviation confidence interval. Example: [p,s] = polyfit(x,y,1); xf = linspace(x(1),x(end),150); [yf,dyf] = polyconf(p,xf,s,'ci'); plot(xf,yf,'g-;fit;',xf,yf+dyf,'g.;;',xf,yf-dyf,'g.;;',x,y,'xr;data;'); plot(x,y-polyval(p,x),';residuals;',xf,dyf,'g-;;',xf,-dyf,'g-;;');
polyfitinf @c polyfitinf ../inst/polyfitinf.m function [A,REF,HMAX,H,R,EQUAL] = polyfitinf(M,N,K,X,Y,EPSH,MAXIT,REF0) Best polynomial approximation in discrete uniform norm INPUT VARIABLES: M : degree of the fitting polynomial N : number of data points X(N) : x-coordinates of data points Y(N) : y-coordinates of data points K : character of the polynomial: K = 0 : mixed parity polynomial K = 1 : odd polynomial ( X(1) must be > 0 ) K = 2 : even polynomial ( X(1) must be >= 0 ) EPSH : tolerance for leveling. A useful value for 24-bit mantissa is EPSH = 2.0E-7 MAXIT : upper limit for number of exchange steps REF0(M2): initial alternating set ( N-vector ). This is an OPTIONAL argument. The length M2 is given by: M2 = M + 2 , if K = 0 M2 = integer part of (M+3)/2 , if K = 1 M2 = 2 + M/2 (M must be even) , if K = 2 OUTPUT VARIABLES: A : polynomial coefficients of the best approximation in order of increasing powers: p*(x) = A(1) + A(2)*x + A(3)*x^2 + ... REF : selected alternating set of points HMAX : maximum deviation ( uniform norm of p* - f ) H : pointwise approximation errors R : total number of iterations EQUAL : success or failure of the algorithm EQUAL=1 : successful EQUAL=0 : convergence not achieved EQUAL=-1: input error EQUAL=-2: algorithm failure Relies on function EXCH, provided below. Example: M = 5; N = 10000; K = 0; EPSH = 10^-12; MAXIT = 10; X = linspace(-1,1,N); % uniformly spaced nodes on [-1,1] k=1; Y = abs(X).^k; % the function Y to approximate [A,REF,HMAX,H,R,EQUAL] = polyfitinf(M,N,K,X,Y,EPSH,MAXIT); p = polyval(A,X); plot(X,Y,X,p) % p is the best approximation Note: using an even value of M, e.g., M=2, in the example above, makes the algorithm fail with EQUAL=-2, because of collocation, which appears because both the approximating function and the polynomial are even functions. The way around it is to approximate only the right half of the function, setting K = 2 : even polynomial. For example: N = 10000; K = 2; EPSH = 10^-12; MAXIT = 10; X = linspace(0,1,N); for i = 1:2 k = 2*i-1; Y = abs(X).^k; for j = 1:4 M = 2^j; [~,~,HMAX] = polyfitinf(M,N,K,X,Y,EPSH,MAXIT); approxerror(i,j) = HMAX; end end disp('Table 3.1 from Approximation theory and methods, M.J.D.POWELL, p.
27'); disp(' '); disp(' n K=1 K=3'); disp(' '); format short g; disp([(2.^(1:4))' approxerror']); ALGORITHM: Computation of the polynomial that best approximates the data (X,Y) in the discrete uniform norm, i.e. the polynomial with the minimum value of max{ | p(x_i) - y_i | , x_i in X } . That polynomial, also known as the minimax polynomial, is obtained by the exchange algorithm, a finite iterative process requiring, at most, C(n,p) iterations (the binomial coefficient "n choose p", where usually p = M + 2; see also function EXCH). Since this number can be very large, the routine may not converge within MAXIT iterations. The other possibility of failure occurs when there is insufficient floating point precision for the input data chosen. CREDITS: This routine was developed and modified as computer assignments in Approximation Theory courses by Prof. Andrew Knyazev, University of Colorado Denver, USA. Team Fall 98 (Revision 1.0): Chanchai Aniwathananon Crhistopher Mehl David A. Duran Saulo P. Oliveira Team Spring 11 (Revision 1.1): Manuchehr Aminian The algorithm and the comments are based on a FORTRAN code written by Joseph C. Simpson. The code is available on the Netlib repository: http://www.netlib.org/toms/501 See also: Communications of the ACM, V14, pp.355-356(1971) NOTES: 1) A may contain the collocation polynomial 2) If MAXIT is exceeded, REF contains a new reference set 3) M, EPSH and REF can be altered during the execution 4) To keep consistency with the original code, EPSH can be negative. However, the use of REF0 is *NOT* determined by EPSH < 0, but only by its inclusion as an input parameter. Some parts of the code can still take advantage of vectorization. Revision 1.0 from 1998 is a direct human translation of the FORTRAN code http://www.netlib.org/toms/501 Revision 1.1 is a clean-up and technical update. Tested on MATLAB Version 7.11.0.584 (R2010b) and GNU Octave Version 3.2.4
powell @c powell ../inst/powell.m -*- texinfo -*- @deftypefn {Function File} {[@var{p}, @var{obj_value}, @var{convergence}, @var{iters}, @var{nevs}] =} powell (@var{f}, @var{p0}, @var{options}) Multidimensional minimization (direction-set method).
Implements a direction-set (Powell's) method for multidimensional minimization of a function without calculation of the gradient [1, 2]. @subheading Arguments @itemize @bullet @item @var{f}: name of function to minimize (string or handle), which should accept one input variable (see example for how to pass on additional input arguments) @item @var{p0}: An initial value of the function argument to minimize @item @var{options}: an optional structure, which can be generated by optimset, with some or all of the following fields: @itemize @minus @item MaxIter: maximum iterations (positive integer, or -1 or Inf for unlimited (default)) @item TolFun: minimum amount by which function value must decrease in each iteration to continue (default is 1E-8) @item MaxFunEvals: maximum function evaluations (positive integer, or -1 or Inf for unlimited (default)) @item SearchDirections: an n*n matrix whose columns contain the initial set of (presumably orthogonal) directions to minimize along, where n is the number of elements in the argument to be minimized for; or an n*1 vector of magnitudes for the initial directions (defaults to the set of unit direction vectors) @end itemize @end itemize @subheading Examples @example @group y = @@(x, s) x(1) ^ 2 + x(2) ^ 2 + s; o = optimset('MaxIter', 100, 'TolFun', 1E-10); s = 1; [x_optim, y_min, conv, iters, nevs] = powell(@@(x) y(x, s), [1 0.5], o); %pass y wrapped in an anonymous function so that all other arguments to y, which are held constant, are set %should return something like x_optim = [4E-14 3E-14], y_min = 1, conv = 1, iters = 2, nevs = 24 @end group @end example @subheading Returns: @itemize @bullet @item @var{p}: the minimizing value of the function argument @item @var{obj_value}: the value of @var{f}() at @var{p} @item @var{convergence}: 1 if normal convergence, 0 if not @item @var{iters}: number of iterations performed @item @var{nevs}: number of function evaluations @end itemize @subheading References @enumerate @item Powell MJD (1964), An efficient method for finding the minimum of a function of several variables without calculating derivatives, @cite{Computer Journal}, 7:155-162 @item Press, WH; Teukolsky, SA; Vetterling, WT; Flannery, BP (1992). @cite{Numerical Recipes in Fortran: The Art of Scientific Computing} (2nd Ed.). New York: Cambridge University Press (Section 10.5) @end enumerate @end deftypefn
residmin_stat @c residmin_stat ../inst/residmin_stat.m -*- texinfo -*- @deftypefn {Function File} {@var{info} =} residmin_stat (@var{f}, @var{p}, @var{settings}) Frontend for computation of statistics for a residual-based minimization. @var{settings} is a structure whose fields can be set by @code{optimset}. With @var{settings} the computation of certain statistics is requested by setting the fields @code{ret_<statistic>} to @code{true}. The respective statistics will be returned in a structure as fields with name @code{<statistic>}. Depending on the requested statistic and on the additional information provided in @var{settings}, @var{f} and @var{p} may be empty. Otherwise, @var{f} is the model function of an optimization (the interface of @var{f} is described e.g. in @code{nonlin_residmin}, please see there), and @var{p} is a real column vector with parameters resulting from the same optimization. Currently, the following statistics (or general information) can be requested: @code{dfdp}: Jacobian of model function with respect to parameters. @code{covd}: Covariance matrix of data (typically guessed by applying a factor to the covariance matrix of the residuals).
@code{covp}: Covariance matrix of final parameters. @code{corp}: Correlation matrix of final parameters. @c The following block will be cut out in the package info file. @c BEGIN_CUT_TEXINFO For further settings, type @code{optim_doc ("residmin_stat")}. For description of structure-based parameter handling, type @code{optim_doc ("parameter structures")}. For backend information, type @code{optim_doc ("residual optimization")} and choose the backend's type in the menu. @c END_CUT_TEXINFO @seealso {curvefit_stat} @end deftypefn
rosenbrock @c rosenbrock ../inst/rosenbrock.m Rosenbrock function - used to create example obj. fns. Function value and gradient vector of the Rosenbrock function. The minimizer is at the vector (1,1,..,1), and the minimized value is 0.
samin_example @c samin_example ../inst/samin_example.m samin_example: example script that contains examples of how to call samin for minimization using simulated annealing. Edit the script to see how samin may be used. usage: samin_example
test_min_1 @c test_min_1 ../inst/test_min_1.m Test an optimization function with the same synopsis as bfgs.m
test_min_2 @c test_min_2 ../inst/test_min_2.m test_min_2 - Test that bfgs works Defines some simple functions and verifies that calling bfgs on them returns the correct minimum. Sets 'ok' to 1 if success, 0 otherwise
test_min_3 @c test_min_3 ../inst/test_min_3.m ok - Test that bfgs works with extra arguments Defines some simple functions and verifies that calling bfgs on them returns the correct minimum. Sets 'ok' to 1 if success, 0 otherwise
test_min_4 @c test_min_4 ../inst/test_min_4.m test_bfgs - Test that bfgs works Check that bfgs treats struct options correctly Sets 'ok' to 1 if success, 0 otherwise
test_minimize_1 @c test_minimize_1 ../inst/test_minimize_1.m ok = test_minimize - Test that minimize works
test_nelder_mead_min_1 @c test_nelder_mead_min_1 ../inst/test_nelder_mead_min_1.m Checks whether the function 'nelder_mead_min' works, by making it minimize a quadratic function.
test_nelder_mead_min_2 @c test_nelder_mead_min_2 ../inst/test_nelder_mead_min_2.m Checks whether the function 'nelder_mead_min' accepts options properly
test_wpolyfit @c test_wpolyfit ../inst/test_wpolyfit.m Tests for wpolyfit. Test cases are taken from the NIST Statistical Reference Datasets http://www.itl.nist.gov/div898/strd/
vfzero @c vfzero ../inst/vfzero.m -*- texinfo -*- @deftypefn {Function File} {} vfzero (@var{fun}, @var{x0}) @deftypefnx {Function File} {} vfzero (@var{fun}, @var{x0}, @var{options}) @deftypefnx {Function File} {[@var{x}, @var{fval}, @var{info}, @var{output}] =} vfzero (@dots{}) A variant of @code{fzero}. Finds a zero of a vector-valued multivariate function where each output element only depends on the input element with the same index (so the Jacobian is diagonal). @var{fun} should be a handle or name of a function returning a column vector. @var{x0} should be a two-column matrix, each row specifying two points which bracket a zero of the respective output element of @var{fun}. If @var{x0} is a single-column matrix then several nearby and distant values are probed in an attempt to obtain a valid bracketing. If this is not successful, the function fails. @var{options} is a structure specifying additional options. Currently, @code{vfzero} recognizes these options: @code{"FunValCheck"}, @code{"OutputFcn"}, @code{"TolX"}, @code{"MaxIter"}, @code{"MaxFunEvals"}. For a description of these options, see optimset.
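As a small sketch (the function and bracketing intervals below are invented for illustration), two decoupled zeros can be found in one call, since each output element depends only on the input element with the same index:
@example
@group
fun = @@(x) [x(1)^2 - 2; cos(x(2))]; % element i depends only on x(i)
x0 = [1, 2; 1, 3]; % one bracketing pair per row
[x, fval, info] = vfzero (fun, x0); % x approximates [sqrt(2); pi/2]
@end group
@end example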
On exit, the function returns @var{x}, the approximate zero, and @var{fval}, the function value thereof. @var{info} is a column vector of exit flags that can have these values: @itemize @item 1 The algorithm converged to a solution. @item 0 Maximum number of iterations or function evaluations has been reached. @item -1 The algorithm has been terminated from user output function. @item -5 The algorithm may have converged to a singular point. @end itemize @var{output} is a structure containing runtime information about the @code{vfzero} algorithm. Fields in the structure are: @itemize @item iterations Number of iterations through loop. @item nfev Number of function evaluations. @item bracketx A two-column matrix with the final bracketing of the zero along the x-axis. @item brackety A two-column matrix with the final bracketing of the zero along the y-axis. @end itemize @end deftypefn
wpolyfit @c wpolyfit ../inst/wpolyfit.m -*- texinfo -*- @deftypefn {Function File} {[@var{p}, @var{s}] =} wpolyfit (@var{x}, @var{y}, @var{dy}, @var{n}) Return the coefficients of a polynomial @var{p}(@var{x}) of degree @var{n} that minimizes @c ######################################################## @c These lines must be written without space at start to work around @c a bug in html generation. @iftex @tex $$ \sum_{i=1}^N (p(x_i) - y_i)^2 $$ @end tex @end iftex @ifnottex @code{sumsq (p(x(i)) - y(i))}, @end ifnottex @c ######################################################## to best fit the data in the least squares sense. The standard errors of the observations @var{y}, if present, are given in @var{dy}. The returned value @var{p} contains the polynomial coefficients suitable for use in the function polyval. The structure @var{s} returns information necessary to compute uncertainty in the model. To compute the predicted values of y with uncertainty use @example [y,dy] = polyconf(p,x,s,'ci'); @end example You can see the effects of different confidence intervals and prediction intervals by calling the wpolyfit internal plot function with your fit: @example feval('wpolyfit:plt',x,y,dy,p,s,0.05,'pi') @end example Use @var{dy}=[] if uncertainty is unknown. You can use a chi^2 test to reject the polynomial fit: @example p = 1-chi2cdf(s.normr^2,s.df); @end example p is the probability of seeing a chi^2 value higher than that which was observed assuming the data are normally distributed around the fit. If p < 0.01, you can reject the fit at the 1% level. You can use an F test to determine if a higher order polynomial improves the fit: @example [poly1,S1] = wpolyfit(x,y,dy,n); [poly2,S2] = wpolyfit(x,y,dy,n+1); F = (S1.normr^2 - S2.normr^2)/(S1.df-S2.df)/(S2.normr^2/S2.df); p = 1-f_cdf(F,S1.df-S2.df,S2.df); @end example p is the probability of observing the improvement in chi^2 obtained by adding the extra parameter to the fit. If p < 0.01, you can reject the lower order polynomial at the 1% level. You can estimate the uncertainty in the polynomial coefficients themselves using @example dp = sqrt(sumsq(inv(s.R'))'/s.df)*s.normr; @end example but the high degree of covariance amongst them makes this a questionable operation. @end deftypefn @deftypefn {Function File} {[@var{p}, @var{s}, @var{mu}] =} wpolyfit (...) If an additional output @code{mu = [mean(x),std(x)]} is requested then the @var{x} values are centered and normalized prior to computing the fit. This will give more stable numerical results.
To compute a predicted @var{y} from the returned model use @code{y = polyval(p, (x-mu(1))/mu(2))} @end deftypefn @deftypefn {Function File} {} wpolyfit (...) If no output arguments are requested, then wpolyfit plots the data, the fitted line and polynomials defining the standard error range. Example @example x = linspace(0,4,20); dy = (1+rand(size(x)))/2; y = polyval([2,3,1],x) + dy.*randn(size(x)); wpolyfit(x,y,dy,2); @end example @end deftypefn @deftypefn {Function File} {} wpolyfit (..., 'origin') If 'origin' is specified, then the fitted polynomial will go through the origin. This is generally ill-advised. Use with caution. Hocking, RR (2003). Methods and Applications of Linear Models. New Jersey: John Wiley and Sons, Inc. @c Will be cut out in optims info file and replaced with the same @c references explicitly there, since references to core Octave @c functions are not automatically transformed from here to there. @c BEGIN_CUT_TEXINFO @seealso{polyfit} @c END_CUT_TEXINFO @seealso{polyconf} @end deftypefn
wrap_f_dfdp @c wrap_f_dfdp ../inst/wrap_f_dfdp.m [ret1, ret2] = wrap_f_dfdp (f, dfdp, varargin) f and dfdp should be the objective function (or "model function" in curve fitting) and its Jacobian, respectively, of an optimization problem. ret1: f (varargin{:}), ret2: dfdp (varargin{:}). ret2 is only computed if more than one output argument is given. This manner of calling f and dfdp is needed by some optimization functions.
wsolve @c wsolve ../inst/wsolve.m [x,s] = wsolve(A,y,dy) Solve a potentially over-determined system with uncertainty in the values. A x = y +/- dy Use QR decomposition for increased accuracy. Estimate the uncertainty for the solution from the scatter in the data. The returned structure s contains normr = norm (A x - y), weighted by dy R such that R'R = A'A df = n-p, n = rows of A, p = columns of A See polyconf for details on how to use s to compute dy. The covariance matrix is inv(R'*R). If you know that the parameters are independent, then uncertainty is given by the diagonal of the covariance matrix, or dx = sqrt(N*sumsq(inv(s.R'))') where N = normr^2/df, or N = 1 if df = 0. Example 1: weighted system A=[1,2,3;2,1,3;1,1,1]; xin=[1;2;3]; dy=[0.2;0.01;0.1]; y=A*xin+randn(size(dy)).*dy; [x,s] = wsolve(A,y,dy); dx = sqrt(sumsq(inv(s.R'))'); res = [xin, x, dx] Example 2: weighted overdetermined system y = x1 + 2*x2 + 3*x3 + e A = fullfact([3,3,3]); xin=[1;2;3]; y = A*xin; dy = rand(size(y))/50; y+=dy.*randn(size(y)); [x,s] = wsolve(A,y,dy); dx = s.normr*sqrt(sumsq(inv(s.R'))'/s.df); res = [xin, x, dx] Note there is a counter-intuitive result that scaling the uncertainty in the data does not affect the uncertainty in the fit. Indeed, if you perform a Monte Carlo simulation with x,y datasets selected from a normal distribution centered on y with width 10*dy instead of dy you will see that the variance in the parameters indeed increases by a factor of 100. However, if the error bars really do increase by a factor of 10 you should expect a corresponding increase in the scatter of the data, which will increase the variance computed by the fit. optim-1.4.1/src/samin.cc.docstrings0000644000175000017500000000335012477524021015661 0ustar olafolaf### This file is generated automatically from the ### corresponding .cc file by a Makefile rule.
samin @c samin samin.cc samin: simulated annealing minimization of a function.
optim-1.4.1/src/samin.cc.docstrings0000644000175000017500000000335012477524021015661 0ustar olafolaf### This file is generated automatically from the
### corresponding .cc file by a Makefile rule.
samin
@c samin samin.cc
samin: simulated annealing minimization of a function. See samin_example.m

usage: [x, obj, convergence, details] = samin(\"f\", {args}, {control})

Arguments:
* \"f\": function name (string)
* {args}: a cell array that holds all arguments of the function,
* {control}: a cell array with 11 elements
  * LB - vector of lower bounds
  * UB - vector of upper bounds
  * nt - integer: # of iterations between temperature reductions
  * ns - integer: # of iterations between bounds adjustments
  * rt - (0 < rt < 1): temperature reduction factor
  * maxevals - integer: limit on function evaluations
  * neps - integer: number of values the final result is compared to
  * functol - (> 0): the required tolerance level for function value
    comparisons
  * paramtol - (> 0): the required tolerance level for parameters
  * verbosity - scalar: 0, 1, or 2.
    * 0 = no screen output
    * 1 = only final results to screen
    * 2 = summary every temperature change
  * minarg - integer: which of the function args is minimization over?

Returns:
* x: the minimizer
* obj: the value of f() at x
* convergence:
  0 if no convergence within maxevals function evaluations
  1 if normal convergence to a point interior to the parameter space
  2 if convergence to a point very near the bounds of the parameter space
    (suggest re-running with looser bounds)
* details: a px3 matrix. p is the number of times improvements were found.
  The columns record information at the time an improvement was found
  * first: cumulative number of function evaluations
  * second: temperature
  * third: function value

Example: see samin_example
optim-1.4.1/src/mkdoc.pl0000755000175000017500000000432712477523726013532 0ustar olafolaf#! /usr/bin/perl -w
#
# Copyright (C) 2012-2013 Rik Wehbring
#
# This file is part of Octave.
#
# Octave is free software; you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the
# Free Software Foundation; either version 3 of the License, or (at
# your option) any later version.
#
# Octave is distributed in the hope that it will be useful, but WITHOUT
# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
# FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
# for more details.
#
# You should have received a copy of the GNU General Public License
# along with Octave; see the file COPYING. If not, see
# <http://www.gnu.org/licenses/>.

# copied from Octave and slightly modified by Olaf Till

unless (@ARGV >= 1) { die "Usage: $0 m_filename1 ..." ; }

print <<__END_OF_MSG__;
### This file is generated automatically by the package's `mkdoc.pl'.
__END_OF_MSG__

MFILE: foreach $full_fname (@ARGV)
{
  next MFILE unless ( $full_fname =~ m{([^/]+)\.m$} );
  $fcn = $1;
  @help_txt = gethelp ($fcn, $full_fname);
  next MFILE if ($help_txt[0] eq "");
  print "\x{1d}$fcn\n";
  print "\@c $fcn $full_fname\n";
  foreach $_ (@help_txt)
  {
    s/^\s+\@/\@/ unless $in_example;
    s/^\s+\@group/\@group/;
    s/^\s+\@end\s+group/\@end group/;
    $in_example = (/\s*\@example\b/ .. /\s*\@end\s+example\b/);
    print $_;
  }
}

################################################################################
# Subroutines
################################################################################

sub gethelp
{
  ($fcn, $fname) = @_[0..1];
  open (FH, $fname) or return "";
  do
  {
    @help_txt = ();
    ## Advance to non-blank line
    while (defined ($_ = <FH>) and /^\s*$/) {;}
    if (! /^\s*(?:#|%)/ or eof (FH))
    {
      ## No comment block found. Return empty string
      close (FH);
      return "";
    }
    ## Extract help text stopping when comment block ends
    do
    {
      ## Remove comment characters at start of line
      s/^\s*(?:#|%){1,2} ?//;
      push (@help_txt, $_);
    } until (!
defined ($_ = ) or ! /^\s*(?:#|%)/); } until ($help_txt[0] !~ /^(?:Copyright|Author)/); close (FH); return @help_txt; } optim-1.4.1/src/Makefile0000644000175000017500000000732612477523726013547 0ustar olafolafMKOCTFILE ?= mkoctfile ifndef LAPACK_LIBS LAPACK_LIBS := $(shell $(MKOCTFILE) -p LAPACK_LIBS) endif OCTAVE_LAPACK_LIBS := $(shell $(MKOCTFILE) -p LAPACK_LIBS) # reported necessary for Apple's VecLib framework by Carlo de Falco # ifndef BLAS_LIBS BLAS_LIBS := $(shell $(MKOCTFILE) -p BLAS_LIBS) endif OCTAVE_BLAS_LIBS := $(shell $(MKOCTFILE) -p BLAS_LIBS) CCFILES := $(wildcard *.cc) DSFILES := $(addsuffix .docstrings,$(CCFILES)) MFILES := $(wildcard *.m) $(wildcard ../inst/*.m) # Passing LFLAGS, supplemented with LAPACK_LIBS and BLAS_LIBS, in the # environment to mkoctfile is prefered over passing LAPACK_LIBS and # BLAS_LIBS in mkoctfiles commandline due to mkoctfiles difficulties # with non-standard flags on some systems (e.g. -framework ... on # Apple) LFLAGS := $(shell $(MKOCTFILE) -p LFLAGS) OCTAVE_LFLAGS := $(LFLAGS) LFLAGS += $(LAPACK_LIBS) LFLAGS += $(BLAS_LIBS) OCTAVE_LFLAGS += $(OCTAVE_LAPACK_LIBS) OCTAVE_LFLAGS += $(OCTAVE_BLAS_LIBS) .PHONY: all clean doc release INFOFILE := ../doc/optim.info TEXIFILE := $(addsuffix .texi,$(basename $(INFOFILE))) TXIFILE := $(addsuffix .txi,$(basename $(INFOFILE))) OCTFILES := __bfgsmin.oct numgradient.oct numhessian.oct samin.oct __disna_optim__.oct all: doc $(OCTFILES) # __disna_optim__ should be linked to the same Lapack library as used by Octave __disna_optim__.oct: __disna_optim__.cc LFLAGS="$(OCTAVE_LFLAGS)" $(MKOCTFILE) -s __disna_optim__.cc %.oct: %.cc LFLAGS="$(LFLAGS)" $(MKOCTFILE) -s $< # This target should be made by the maintainer for a release. This # should be done automatically by `./bootstrap'. doc: $(INFOFILE) $(INFOFILE): $(TEXIFILE) makeinfo $(TEXIFILE) -o $(INFOFILE) $(TEXIFILE): $(TXIFILE) MFDOCSTRINGS $(DSFILES) munge-texi.pl ./munge-texi.pl MFDOCSTRINGS $(DSFILES) < $(TXIFILE) > $(TEXIFILE) MFDOCSTRINGS: $(MFILES) mkdoc.pl ./mkdoc.pl $(MFILES) > MFDOCSTRINGS # take long CPP macro names, unlikely to be present in package code # (which is checked) or to be defined by Octave (which can't be # checked) RDEFUN_DLD := DEFUN_DLD_REPLACEMENT_FOR_PACKAGE_DOCS RDEFUNX_DLD := DEFUNX_DLD_REPLACEMENT_FOR_PACKAGE_DOCS %.cc.docstrings: %.ii (echo -n "### This file is generated automatically from the\n### corresponding .cc file by a Makefile rule.\n\n"; sed -e '/.*$(RDEFUN_DLD)/!D' -e 's/.*$(RDEFUN_DLD)(\([^,]\+\),"\(.\+\)\(\\n\)*")$$/\x1D\1\n@c \1 $(addsuffix .cc,$(basename $<))\n\2/' -e 's/\\n/\n/g' $<) > $@ # Explanation of the command after the checks: Macro names are changed # by `sed' so that a different definition can be given to them on the # `cpp' commandline. After running the `cpp' command, one can be sure # that there are not more than one `$(RDEFUN_DLD)' resulting from a # DEFUN(X)_DLD invocation in the original code at one line, that every # occurance of `$(RDEFUN_DLD)' indeed corresponds to an original macro # invocation, and not to `DEFUN(X)_DLD' within a comment or a string, # and that we have all necessary information within one line of # text. The package code is expected to give the function # documentation as a string constant directly in the macro invocation, # as is normally done. 
%.ii: %.cc if grep -q $(RDEFUN_DLD) $<; then echo "The string '$(RDEFUN_DLD)' must not be present in source code, but is in $<."; exit 1; fi if grep -q $(RDEFUNX_DLD) $<; then echo "The string '$(RDEFUNX_DLD)' must not be present in source code, but is in $<."; exit 1; fi sed -e s/DEFUN_DLD/$(RDEFUN_DLD)/g -e s/DEFUNX_DLD/$(RDEFUNX_DLD)/g $< | $(CPP) `$(MKOCTFILE) -p INCFLAGS` -x c++ -iquote '.' -D'$(RDEFUN_DLD)(name,args,nargout,doc)=$(RDEFUN_DLD)(name,doc)' -D'$(RDEFUNX_DLD)(name,fname,gname,args,nargout,doc)=$(RDEFUN_DLD)(name,doc)' - > $@ clean: $(RM) *.o octave-core *.oct *.cc.docstrings MFDOCSTRINGS $(INFOFILE) $(TEXIFILE) *~ optim-1.4.1/src/bootstrap0000755000175000017500000000001112477523726014032 0ustar olafolafmake doc optim-1.4.1/src/__disna_optim__.cc0000644000175000017500000001063412477523726015514 0ustar olafolaf// Copyright (C) 2011 Olaf Till // // This program is free software; you can redistribute it and/or modify it under // the terms of the GNU General Public License as published by the Free Software // Foundation; either version 3 of the License, or (at your option) any later // version. // // This program is distributed in the hope that it will be useful, but WITHOUT // ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or // FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more // details. // // You should have received a copy of the GNU General Public License along with // this program; if not, see . // This function has also been submitted to Octave (bug #33503). #include #include "f77-fcn.h" extern "C" { F77_RET_T F77_FUNC (ddisna, DDISNA) (F77_CONST_CHAR_ARG_DECL, const octave_idx_type&, const octave_idx_type&, const double*, double*, octave_idx_type&); F77_RET_T F77_FUNC (sdisna, SDISNA) (F77_CONST_CHAR_ARG_DECL, const octave_idx_type&, const octave_idx_type&, const float*, float*, octave_idx_type&); } DEFUN_DLD (__disna_optim__, args, , "-*- texinfo -*-\n\ @deftypefn {Loadable Function} {@var{rcond} =} __disna__ (@var{job}, @var{d})\n\ @deftypefnx {Loadable Function} {@var{rcond} =} __disna__ (@var{job}, @var{d}, @var{m}, @var{n})\n\ Undocumented internal function.\n\ @end deftypefn") { /* Interface to DDISNA and SDISNA of LAPACK. If job is 'E', no third or fourth argument are given. If job is 'L' or 'R', M and N are given. */ std::string fname ("__disna__"); octave_value retval; if (args.length () != 2 && args.length () != 4) { print_usage (); return retval; } std::string job_str (args(0).string_value ()); char job; if (job_str.length () != 1) error ("%s: invalid job label", fname.c_str ()); else job = job_str[0]; octave_idx_type m, n, l; bool single; octave_value d; if (args(1).is_single_type ()) { single = true; d = args(1).float_column_vector_value (); } else { single = false; d = args(1).column_vector_value (); } if (! error_state) { l = d.length (); switch (job) { case 'E' : if (args.length () != 2) error ("%s: with job label 'E' only two arguments are allowed", fname.c_str ()); else m = l; break; case 'L' : case 'R' : if (args.length () != 4) error ("%s: with job labels 'L' or 'R', four arguments must be given", fname.c_str ()); else { m = args(2).idx_type_value (); n = args(3).idx_type_value (); if (! error_state) { octave_idx_type md = m < n ? 
m : n; if (l != md) error ("%s: given dimensions don't match length of second argument", fname.c_str ()); } } break; default : error ("%s: job label not correct", fname.c_str ()); } } if (error_state) { error ("%s: invalid arguments", fname.c_str ()); return retval; } octave_idx_type info; if (single) { FloatColumnVector srcond (l); F77_XFCN (sdisna, SDISNA, (F77_CONST_CHAR_ARG2 (&job, 1), m, n, d.float_column_vector_value ().fortran_vec (), srcond.fortran_vec (), info)); retval = srcond; } else { ColumnVector drcond (l); F77_XFCN (ddisna, DDISNA, (F77_CONST_CHAR_ARG2 (&job, 1), m, n, d.column_vector_value ().fortran_vec (), drcond.fortran_vec (), info)); retval = drcond; } if (info < 0) error ("%s: LAPACK routine says %i-th argument had an illegal value", fname.c_str (), -info); return retval; } optim-1.4.1/src/numhessian.cc.docstrings0000644000175000017500000000133012477524020016717 0ustar olafolaf### This file is generated automatically from the ### corresponding .cc file by a Makefile rule. numhessian @c numhessian numhessian.cc numhessian(f, {args}, minarg) Numeric second derivative of f with respect to argument \"minarg\". * first argument: function name (string) * second argument: all arguments of the function (cell array) * third argument: (optional) the argument to differentiate w.r.t. (scalar, default=1) If the argument is a k-vector, the Hessian will be a kxk matrix function a = f(x, y) a = x'*x + log(y); endfunction numhessian(\"f\", {ones(2,1), 1}) ans = 2.0000e+00 -7.4507e-09 -7.4507e-09 2.0000e+00 Now, w.r.t. second argument: numhessian(\"f\", {ones(2,1), 1}, 2) ans = -1.0000 optim-1.4.1/src/numgradient.cc.docstrings0000644000175000017500000000132312477524020017064 0ustar olafolaf### This file is generated automatically from the ### corresponding .cc file by a Makefile rule. numgradient @c numgradient numgradient.cc numgradient(f, {args}, minarg) Numeric central difference gradient of f with respect to argument \"minarg\". * first argument: function name (string) * second argument: all arguments of the function (cell array) * third argument: (optional) the argument to differentiate w.r.t. (scalar, default=1) \"f\" may be vector-valued. If \"f\" returns an n-vector, and the argument is a k-vector, the gradient will be an nxk matrix Example: function a = f(x); a = [x'*x; 2*x]; endfunction numgradient(\"f\", {ones(2,1)}) ans = 2.00000 2.00000 2.00000 0.00000 0.00000 2.00000 optim-1.4.1/src/__bfgsmin.cc0000644000175000017500000004226512477523726014342 0ustar olafolaf// Copyright (C) 2004,2005,2006,2007,2010 Michael Creel // // This program is free software; you can redistribute it and/or modify it under // the terms of the GNU General Public License as published by the Free Software // Foundation; either version 3 of the License, or (at your option) any later // version. // // This program is distributed in the hope that it will be useful, but WITHOUT // ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or // FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more // details. // // You should have received a copy of the GNU General Public License along with // this program; if not, see . 
// the functions defined in this file are: // __bfgsmin_obj: bulletproofed objective function that allows checking for availability of analytic gradient // __numgradient: numeric gradient, used only if analytic not supplied // __bisectionstep: fallback stepsize algorithm // __newtonstep: default stepsize algorithm // __bfgsmin: the DLD function that does the minimization, to be called from bfgsmin.m #include #include #include #include #include #include #include "error.h" int __bfgsmin_obj(double &obj, const std::string f, const octave_value_list f_args, const ColumnVector theta, const int minarg) { octave_value_list f_return, f_args_new; int success = 1; f_args_new = f_args; f_args_new(minarg - 1) = theta; f_return = feval(f, f_args_new); obj = f_return(0).double_value(); // bullet-proof the objective function if (error_state) { warning("__bfgsmin_obj: objective function could not be evaluated - setting to DBL_MAX"); obj = DBL_MAX; success = 0; } return success; } // __numgradient: numeric central difference gradient for bfgs. // This is the same as numgradient, except the derivative is known to be a vector, it's defined as a column, // and the finite difference delta is incorporated directly rather than called from a function int __numgradient(ColumnVector &derivative, const std::string f, const octave_value_list f_args, const int minarg) { double SQRT_EPS, diff, delta, obj_left, obj_right, p; int j, test, success; ColumnVector parameter = f_args(minarg - 1).column_vector_value(); int k = parameter.rows(); ColumnVector g(k); SQRT_EPS = sqrt(DBL_EPSILON); diff = exp(log(DBL_EPSILON)/3.0); // get 1st derivative by central difference for (j=0; j diff; if (test) delta = (fabs(p) + SQRT_EPS) * SQRT_EPS; else delta = diff; // right side parameter(j) = p + delta; success = __bfgsmin_obj(obj_right, f, f_args, parameter, minarg); if (!success) error("__numgradient: objective function failed, can't compute numeric gradient"); // left size parameter(j) = p - delta; success = __bfgsmin_obj(obj_left, f, f_args, parameter, minarg); if (!success) error("__numgradient: objective function failed, can't compute numeric gradient"); parameter(j) = p; // restore original parameter for next round g(j) = (obj_right - obj_left) / (2.0*delta); } derivative = g; return success; } int __bfgsmin_gradient(ColumnVector &derivative, const std::string f, octave_value_list f_args, const ColumnVector theta, const int minarg, int try_analytic_gradient, int &have_analytic_gradient) { octave_value_list f_return; int k = theta.rows(); int success; ColumnVector g(k); Matrix check_gradient(k,1); if (have_analytic_gradient) { f_args(minarg - 1) = theta; f_return = feval(f, f_args); g = f_return(1).column_vector_value(); } else if (try_analytic_gradient) { f_args(minarg - 1) = theta; f_return = feval(f, f_args); if (f_return.length() > 1) { if (f_return(1).is_real_matrix()) { if ((f_return(1).rows() == k) & (f_return(1).columns() == 1)) { g = f_return(1).column_vector_value(); have_analytic_gradient = 1; } else have_analytic_gradient = 0; } else have_analytic_gradient = 0; } else have_analytic_gradient = 0; if (!have_analytic_gradient) __numgradient(g, f, f_args, minarg); } else __numgradient(g, f, f_args, minarg); // check that gradient is ok check_gradient.column(0) = g; if (check_gradient.any_element_is_inf_or_nan()) { error("__bfgsmin_gradient: gradient contains NaNs or Inf"); success = 0; } else success = 1; derivative = g; return success; } // this is the lbfgs direction, used if control has 5 elements ColumnVector 
lbfgs_recursion(const int memory, const Matrix sigmas, const Matrix gammas, ColumnVector d) { if (memory == 0) { const int n = sigmas.columns(); ColumnVector sig = sigmas.column(n-1); ColumnVector gam = gammas.column(n-1); // do conditioning if there is any memory double cond = gam.transpose()*gam; if (cond > 0) { cond = (sig.transpose()*gam) / cond; d = cond*d; } return d; } else { const int k = d.rows(); const int n = sigmas.columns(); int i, j; ColumnVector sig = sigmas.column(memory-1); ColumnVector gam = gammas.column(memory-1); double rho; rho = 1.0 / (gam.transpose() * sig); double alpha; alpha = rho * (sig.transpose() * d); d = d - alpha*gam; d = lbfgs_recursion(memory - 1, sigmas, gammas, d); d = d + (alpha - rho * gam.transpose() * d) * sig; } return d; } // __bisectionstep: fallback stepsize method if __newtonstep fails int __bisectionstep(double &step, double &obj, const std::string f, const octave_value_list f_args, const ColumnVector x, const ColumnVector dx, const int minarg, const int verbose) { double best_obj, improvement, improvement_0; int found_improvement; ColumnVector trial; // initial values best_obj = obj; improvement_0 = 0.0; step = 1.0; trial = x + step*dx; __bfgsmin_obj(obj, f, f_args, trial, minarg); if (verbose) printf("bisectionstep: trial step: %g obj value: %g\n", step, obj); // this first loop goes until an improvement is found while (obj >= best_obj) { if (step < 2.0*DBL_EPSILON) { if (verbose) warning("bisectionstep: unable to find improvement, setting step to zero"); step = 0.0; return 0; } step = 0.5*step; trial = x + step*dx; __bfgsmin_obj(obj, f, f_args, trial, minarg); if (verbose) printf("bisectionstep: trial step: %g obj value: %g best_value: %g\n", step, obj, best_obj); } // now keep going until rate of improvement is too low, or reach max trials best_obj = obj; while (step > 2.0*DBL_EPSILON) { step = 0.5*step; trial = x + step*dx; __bfgsmin_obj(obj, f, f_args, trial, minarg); if (verbose) printf("bisectionstep: trial step: %g obj value: %g\n", step, obj); // if improved, record new best and try another step if (obj < best_obj) { improvement = best_obj - obj; best_obj = obj; if (improvement > 0.5*improvement_0) { improvement_0 = improvement; } else break; } else { step = 2.0*step; // put it back to best found obj = best_obj; break; } } return 1; } // __newtonstep: default stepsize algorithm int __newtonstep(double &step, double &obj, const std::string f, const octave_value_list f_args, const ColumnVector x, const ColumnVector dx, const int minarg, const int verbose) { double obj_0, obj_left, obj_right, delta, inv_delta_sq, gradient, hessian; int found_improvement = 0; obj_0 = obj; delta = 0.001; // experimentation shows that this is a good choice inv_delta_sq = 1.0 / (delta*delta); ColumnVector x_right = x + delta*dx; ColumnVector x_left = x - delta*dx; // right __bfgsmin_obj(obj_right, f, f_args, x_right, minarg); // left __bfgsmin_obj(obj_left, f, f_args, x_left, minarg); gradient = (obj_right - obj_left) / (2.0*delta); // take central difference hessian = inv_delta_sq*(obj_right - 2.0*obj_0 + obj_left); hessian = fabs(hessian); // ensures we're going in a decreasing direction if (hessian < 2.0*DBL_EPSILON) hessian = 1.0; // avoid div by zero step = - gradient / hessian; // hessian inverse gradient: the Newton step // step = (step < 1.0)*step + 1.0*(step >= 1.0); // maximum stepsize is 1.0 - conservative // ensure that this is improvement, and if not, fall back to bisection __bfgsmin_obj(obj, f, f_args, x + step*dx, minarg); if (verbose) 
printf("newtonstep: trial step: %g obj value: %g\n", step, obj); if (obj > obj_0) { obj = obj_0; if (verbose) warning("__stepsize: no improvement with Newton step, falling back to bisection"); found_improvement = __bisectionstep(step, obj, f, f_args, x, dx, minarg, verbose); } else found_improvement = 1; if (xisnan(obj)) { obj = obj_0; if (verbose) warning("__stepsize: objective function crash in Newton step, falling back to bisection"); found_improvement = __bisectionstep(step, obj, f, f_args, x, dx, minarg, verbose); } else found_improvement = 1; return found_improvement; } DEFUN_DLD(__bfgsmin, args, ,"__bfgsmin: backend for bfgs minimization\n\ Users should not use this directly. Use bfgsmin.m instead") { std::string f (args(0).string_value()); Cell f_args_cell (args(1).cell_value()); octave_value_list f_args, f_return; // holder for return items int max_iters, verbosity, criterion, minarg, convergence, iter, memory, \ gradient_ok, i, j, k, conv_fun, conv_param, conv_grad, have_gradient, \ try_gradient, warnings; double func_tol, param_tol, gradient_tol, stepsize, obj_value, obj_in, \ last_obj_value, obj_value2, denominator, test; Matrix H, H1, H2; ColumnVector thetain, d, g, g_new, p, q, sig, gam; // controls Cell control (args(2).cell_value()); max_iters = control(0).int_value(); if (max_iters == -1) max_iters = INT_MAX; verbosity = control(1).int_value(); criterion = control(2).int_value(); minarg = control(3).int_value(); memory = control(4).int_value(); func_tol = control(5).double_value(); param_tol = control(6).double_value(); gradient_tol = control(7).double_value(); // want to see warnings? warnings = 0; if (verbosity == 3) warnings = 1; // copy cell contents over to octave_value_list to use feval() k = f_args_cell.length(); f_args(k); // resize only once for (i = 0; i 0) { // lbfgs if (iter < memory) d = lbfgs_recursion(iter, sigmas, gammas, g); else d = lbfgs_recursion(memory, sigmas, gammas, g); d = -d; } else d = -H*g; // ordinary bfgs // convergence tests conv_fun = 0; conv_param = 0; conv_grad = 0; // function convergence p = theta+d; __bfgsmin_obj(obj_value, f, f_args, p, minarg); if (fabs(last_obj_value) > 1.0) conv_fun=(fabs((obj_value/last_obj_value-1))) 1.0) conv_param = sqrt(d.transpose() * d) / test < param_tol ; else conv_param = sqrt(d.transpose() * d) < param_tol; // Want intermediate results? // gradient convergence conv_grad = sqrt(g.transpose() * g) < gradient_tol; // Are we done? if (criterion == 1) { if (conv_fun && conv_param && conv_grad) { convergence = 1; break; } } else if (conv_fun) { convergence = 1; break; } // if not done, then take a step // stepsize: try (l)bfgs direction, then steepest descent if it fails f_args(minarg - 1) = theta; obj_value = last_obj_value; __newtonstep(stepsize, obj_value, f, f_args, theta, d, minarg, warnings); if (stepsize == 0.0) { // fall back to steepest descent if (warnings) warning("bfgsmin: BFGS direction fails, switch to steepest descent"); d = -g; // try steepest descent H = identity_matrix(k,k); // accompany with Hessian reset, for good measure obj_value = last_obj_value; __newtonstep(stepsize, obj_value, f, f_args, theta, d, minarg, warnings); if (stepsize == 0.0) { // if true, exit, we can't find a direction of descent warning("bfgsmin: failure, exiting. Try different start values?"); f_return(0) = theta; f_return(1) = obj_value; f_return(2) = -1; f_return(3) = iter; return octave_value_list(f_return); } } p = stepsize*d; // Want intermediate results? 
if (verbosity > 1) { printf("------------------------------------------------\n"); printf("bfgsmin iteration %d convergence (f g p): %d %d %d\n", iter, conv_fun, conv_grad, conv_param); if (warnings) { if (memory > 0) printf("Using LBFGS, memory is last %d iterations\n",memory); } printf("\nfunction value: %g stepsize: %g \n\n", last_obj_value, stepsize); if (have_gradient) printf("used analytic gradient\n"); else printf("used numeric gradient\n"); for (j = 0; j 0; j--) { for(i = 0; i < k; i++) { sigmas(i,j) = sigmas(i,j-1); gammas(i,j) = gammas(i,j-1); } } // insert new vectors in left-most column for(i = 0; i < k; i++) { sigmas(i, 0) = sig(i); gammas(i, 0) = gam(i); } } else { // failed gradient - loose memory and use previous theta sigmas.fill(0.0); gammas.fill(0.0); theta = theta - p; } } } // Want last iteration results? if (verbosity > 0) { printf("------------------------------------------------\n"); printf("bfgsmin final results: %d iterations\n", iter); if (warnings) { if (memory > 0) printf("Used LBFGS, memory is last %d iterations\n",memory); } printf("\nfunction value: %g\n\n", obj_value); if (convergence == -1) printf("NO CONVERGENCE: max iters exceeded\n"); if ((convergence == 1) & (criterion == 1)) printf("STRONG CONVERGENCE\n"); if ((convergence == 1) & !(criterion == 1)) printf("WEAK CONVERGENCE\n"); if (convergence == 2) printf("NO CONVERGENCE: algorithm failed\n"); printf("Function conv %d Param conv %d Gradient conv %d\n\n", conv_fun, conv_param, conv_grad); if (have_gradient) printf("used analytic gradient\n"); else printf("used numeric gradient\n"); printf(" param gradient (n) change\n"); for (j = 0; j // // This program is free software; you can redistribute it and/or modify it under // the terms of the GNU General Public License as published by the Free Software // Foundation; either version 3 of the License, or (at your option) any later // version. // // This program is distributed in the hope that it will be useful, but WITHOUT // ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or // FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more // details. // // You should have received a copy of the GNU General Public License along with // this program; if not, see . // References: // // The code follows the article // Goffe, William L. (1996) "SIMANN: A Global Optimization Algorithm // using Simulated Annealing " Studies in Nonlinear Dynamics & Econometrics // Oct96, Vol. 1 Issue 3. // // The code uses the same names for control variables, // for the most part. A notable difference is that the initial // temperature is found automatically to ensure that the active // bounds when the temperature begins to reduce cover the entire // parameter space (defined as a n-dimensional // rectangle that is the Cartesian product of the // (lb_i, ub_i), i = 1,2,..n // // Also of note: // Corana et. al., (1987) "Minimizing Multimodal Functions of Continuous // Variables with the "Simulated Annealing" Algorithm", // ACM Transactions on Mathematical Software, V. 13, N. 3. // // Goffe, et. al. (1994) "Global Optimization of Statistical Functions // with Simulated Annealing", Journal of Econometrics, // V. 60, N. 1/2. #include #include #include #include #include #include #include "error.h" // define argument checks static bool any_bad_argument(const octave_value_list& args) { // objective function name is a string? 
if (!args(0).is_string()) {
    error("samin: first argument must be a string holding the objective function name");
    return true;
  }
  // are the function arguments contained in a cell?
  if (!args(1).is_cell()) {
    error("samin: second argument must be a cell array of function arguments");
    return true;
  }
  // is control a cell?
  Cell control (args(2).cell_value());
  if (error_state) {
    error("samin: third argument must be a cell array of algorithm controls");
    return true;
  }
  // does control have the proper number of elements?
  if (!(control.length() == 11)) {
    error("samin: third argument must be a cell array with 11 elements");
    return true;
  }
  // now check the type of each element of control
  if (!(control(0).is_real_matrix()) && !(control(0).is_real_scalar())) {
    error("samin: 1st element of controls must be LB: a vector of lower bounds");
    return true;
  }
  if ((control(0).is_real_matrix()) && (control(0).columns() != 1)) {
    error("samin: 1st element of controls must be LB: a vector of lower bounds");
    return true;
  }
  if (!(control(1).is_real_matrix()) && !(control(1).is_real_scalar())) {
    error("samin: 2nd element of controls must be UB: a vector of upper bounds");
    return true;
  }
  if ((control(1).is_real_matrix()) && (control(1).columns() != 1)) {
    error("samin: 2nd element of controls must be UB: a vector of upper bounds");
    return true;
  }
  int tmp = control(2).int_value();
  if (error_state || tmp < 1) {
    error("samin: 3rd element of controls must be NT: positive integer\n\
loops per temperature reduction");
    return true;
  }
  tmp = control(3).int_value();
  if (error_state || tmp < 1) {
    error("samin: 4th element of controls must be NS: positive integer\n\
loops per stepsize adjustment");
    return true;
  }
  double tmp2 = control(4).double_value();
  if (error_state || tmp2 <= 0) {
    error("samin: 5th element of controls must be RT:\n\
temperature reduction factor, RT > 0");
    return true;
  }
  tmp = control(5).int_value();
  if (error_state || tmp < 1) {
    error("samin: 6th element of controls must be integer MAXEVALS > 0 ");
    return true;
  }
  tmp = control(6).int_value();
  if (error_state || tmp < 1) {
    error("samin: 7th element of controls must be NEPS: positive integer\n\
number of final obj. values that must be within EPS of each other ");
    return true;
  }
  tmp2 = control(7).double_value();
  if (error_state || tmp2 < 0) {
    error("samin: 8th element of controls must be FUNCTOL (> 0)\n\
used to compare the last NEPS obj values for convergence test");
    return true;
  }
  tmp2 = control(8).double_value();
  if (error_state || tmp2 < 0) {
    error("samin: 9th element of controls must be PARAMTOL (> 0)\n\
used to compare the last NEPS obj values for convergence test");
    return true;
  }
  tmp = control(9).int_value();
  if (error_state || tmp < 0 || tmp > 2) {
    error("samin: 10th element of controls must be VERBOSITY (0, 1, or 2)");
    return true;
  }
  tmp = control(10).int_value();
  if (error_state || tmp < 0) {
    error("samin: 11th element of controls must be MINARG (integer)\n\
position of argument to minimize wrt");
    return true;
  }
  // make sure that minarg points to an existing element
  if ((tmp > args(1).length())||(tmp < 1)) {
    error("samin: MINARG (11th element of controls) must be a positive integer that indicates \n\
which of the elements of the second argument is the one minimization is over");
    return true;
  }
  return false;
}
//-------------- The annealing algorithm --------------
DEFUN_DLD(samin, args, , "samin: simulated annealing minimization of a function.
See samin_example.m\n\ \n\ usage: [x, obj, convergence, details] = samin(\"f\", {args}, {control})\n\ \n\ Arguments:\n\ * \"f\": function name (string)\n\ * {args}: a cell array that holds all arguments of the function,\n\ * {control}: a cell array with 11 elements\n\ * LB - vector of lower bounds\n\ * UB - vector of upper bounds\n\ * nt - integer: # of iterations between temperature reductions\n\ * ns - integer: # of iterations between bounds adjustments\n\ * rt - (0 < rt <1): temperature reduction factor\n\ * maxevals - integer: limit on function evaluations\n\ * neps - integer: number of values final result is compared to\n\ * functol - (> 0): the required tolerance level for function value\n\ comparisons\n\ * paramtol - (> 0): the required tolerance level for parameters\n\ * verbosity - scalar: 0, 1, or 2.\n\ * 0 = no screen output\n\ * 1 = only final results to screen\n\ * 2 = summary every temperature change\n\ * minarg - integer: which of function args is minimization over?\n\ \n\ Returns:\n\ * x: the minimizer\n\ * obj: the value of f() at x\n\ * convergence:\n\ 0 if no convergence within maxevals function evaluations\n\ 1 if normal convergence to a point interior to the parameter space\n\ 2 if convergence to point very near bounds of parameter space\n\ (suggest re-running with looser bounds)\n\ * details: a px3 matrix. p is the number of times improvements were found.\n\ The columns record information at the time an improvement was found\n\ * first: cumulative number of function evaluations\n\ * second: temperature\n\ * third: function value\n\ \n\ Example: see samin_example\n\ ") { int nargin = args.length(); if (!(nargin == 3)) { error("samin: you must supply 3 arguments"); return octave_value_list(); } // check the arguments if (any_bad_argument(args)) return octave_value_list(); std::string obj_fn (args(0).string_value()); Cell f_args_cell = args(1).cell_value (); // args to obj fn come in as a cell to allow arbitrary number Cell control (args(2).cell_value()); octave_value_list f_args; octave_value_list f_return; // holder for feval returns int m, i, j, k, h, n, nacc, func_evals; int nup, nrej, nnew, ndown, lnobds; int converge, test, coverage_ok; // user provided controls const ColumnVector lb (control(0).column_vector_value()); const ColumnVector ub (control(1).column_vector_value()); const int nt (control(2).int_value()); const int ns (control(3).int_value()); const double rt (control(4).double_value()); const int maxevals (control(5).int_value()); const int neps (control(6).int_value()); const double functol (control(7).double_value()); const double paramtol (control(8).double_value()); const int verbosity (control(9).int_value()); const int minarg (control(10).int_value()); // type checking for minimization parameter done here, since we don't know minarg // until now if (!(f_args_cell(minarg - 1).is_real_matrix() || (f_args_cell(minarg - 1).is_real_scalar()))) { error("samin: minimization must be with respect to a column vector"); return octave_value_list(); } if ((f_args_cell(minarg - 1).is_real_matrix()) && (f_args_cell(minarg - 1).columns() != 1)) { error("samin: minimization must be with respect to a column vector"); return octave_value_list(); } double f, fp, p, pp, fopt, rand_draw, ratio, t; Matrix details(1,3); // record function evaluations, temperatures and function values RowVector info(3); // copy cell contents over to octave_value_list to use feval() k = f_args_cell.length(); f_args(k); // resize only once for (i = 0; i ub(i)) || (x(i) < lb(i))) { 
error("samin: initial parameter %d out of bounds", i+1); return octave_value_list(); } } // Initial obj_value f_return = feval(obj_fn, f_args); f = f_return(0).double_value(); fopt = f; // give it something to compare to func_evals = 0; // total function evaluations (limited by maxeval) details(0,0) = func_evals; details(0,1) = t; details(0,2) = fopt; // main loop, first increase temperature until parameter space covered, then reduce until convergence while(converge==0) { // statistics to report at each temp change, set back to zero nup = 0; nrej = 0; nnew = 0; ndown = 0; lnobds = 0; // repeat nt times then adjust temperature for(m = 0;m < nt;m++) { // repeat ns times, then adjust bounds for(j = 0;j < ns;j++) { // generate new point by taking last and adding a random value // to each of elements, in turn for(h = 0;h < n;h++) { // new Sept 2011, if bounds are same, skip the search for that vbl. Allows restrictions without complicated programming if (lb(h) != ub(h)) { xp = x; rand_draw = octave_rand::scalar(); xp(h) = x(h) + (2.0 * rand_draw - 1.0) * bounds(h); if((xp(h) < lb(h)) || (xp(h) > ub(h))) { rand_draw = octave_rand::scalar(); // change 07-Nov-2007: avoid correlation with hitting bounds xp(h) = lb(h) + (ub(h) - lb(h)) * rand_draw; lnobds = lnobds + 1; } // Evaluate function at new point f_args(minarg - 1) = xp; f_return = feval(obj_fn, f_args); fp = f_return(0).double_value(); func_evals = func_evals + 1; // Accept the new point if the function value decreases if(fp <= f) { x = xp; f = fp; nacc = nacc + 1; // total number of acceptances nacp(h) = nacp(h) + 1; // acceptances for this parameter nup = nup + 1; // If lower than any other point, record as new optimum if(fp < fopt) { xopt = xp; fopt = fp; nnew = nnew + 1; info(0) = func_evals; info(1) = t; info(2) = fp; details = details.stack(info); } } // If the point is higher, use the Metropolis criteria to decide on // acceptance or rejection. else { p = exp(-(fp - f) / t); rand_draw = octave_rand::scalar(); if(rand_draw < p) { x = xp; f = fp; nacc = nacc + 1; nacp(h) = nacp(h) + 1; ndown = ndown + 1; } else nrej = nrej + 1; } } // If maxevals exceeded, terminate the algorithm if(func_evals >= maxevals) { if(verbosity >= 1) { printf("\n================================================\n"); printf("SAMIN results\n"); printf("NO CONVERGENCE: MAXEVALS exceeded\n"); printf("================================================\n"); printf("Convergence tolerances: Func. tol. %e Param. tol. %e\n", functol, paramtol); printf("Obj. fn. value %f\n\n", fopt); printf(" parameter search width\n"); for(i = 0; i < n; i++) printf("%20f%20f\n", xopt(i), bounds(i)); } f_return(3) = details; f_return(2) = 0; f_return(1) = fopt; f_return(0) = xopt; return octave_value_list(f_return); } } } // Adjust bounds so that approximately half of all evaluations are accepted test = 0; for(i = 0;i < n;i++) { if (lb(i) != ub(i)) { ratio = nacp(i) / ns; if(ratio > 0.6) bounds(i) = bounds(i) * (1.0 + 2.0 * (ratio - 0.6) / 0.4); else if(ratio < .4) bounds(i) = bounds(i) / (1.0 + 2.0 * ((0.4 - ratio) / 0.4)); // keep within initial bounds if(bounds(i) >= (ub(i) - lb(i))) { bounds(i) = ub(i) - lb(i); test = test + 1; } } else test = test + 1; // make sure coverage check passes for the fixed parameters } nacp.fill(0.0); // check if we cover parameter space. 
if we have yet to do so if (!coverage_ok) coverage_ok = (test == n); } // intermediate output, if desired if(verbosity == 2) { printf("\nsamin: intermediate results before next temperature change\n"); printf("\ntemperature %e", t); printf("\ncurrent best function value %f", fopt); printf("\ntotal evaluations so far %d", func_evals); printf("\ntotal moves since last temperature reduction %d", nup + ndown + nrej); printf("\ndownhill %d", nup); printf("\naccepted uphill %d", ndown); printf("\nrejected uphill %d", nrej); printf("\nout of bounds trials %d", lnobds); printf("\nnew minima this temperature %d", nnew); printf("\n\n parameter search width\n"); for(i = 0; i < n; i++) printf("%20f%20f\n", xopt(i), bounds(i)); printf("\n"); } // Check for convergence, if we have covered the parameter space if (coverage_ok) { // last value close enough to last neps values? fstar(0) = f; test = 0; for (i = 1; i < neps; i++) test = test + fabs(f - fstar(i)) > functol; test = (test > 0); // if different from zero, function conv. has failed // last value close enough to overall best? if (((fopt - f) <= functol) && (!test)) { // check for bound narrow enough for parameter convergence for (i = 0;i < n;i++) { if (bounds(i) > paramtol) { converge = 0; // no conv. if bounds too wide break; } else converge = 1; } } // check if optimal point is near boundary of parameter space, and change convergence message if so if (converge) if (lnobds > 0) converge = 2; // Like to see the final results? if (converge > 0) { if (verbosity >= 1) { printf("\n================================================\n"); printf("SAMIN results\n\n"); if (converge == 1) printf("==> Normal convergence <==\n\n"); if (converge == 2) { printf("==> WARNING <==: Last point satisfies convergence criteria,\n"); printf("but is near boundary of parameter space.\n"); printf("%d out of %d evaluations were out-of-bounds in the last round.\n", lnobds, (nup+ndown+nrej)); printf("Expand bounds and re-run, unless this is a constrained minimization.\n\n"); } printf("Convergence tolerances:\nFunction: %e\nParameters: %e\n", functol, paramtol); printf("\nObjective function value at minimum: %f\n\n", fopt); printf(" parameter search width\n"); for(i = 0; i < n; i++) printf("%20f%20f\n", xopt(i), bounds(i)); printf("================================================\n"); } f_return(3) = details; f_return(2) = converge; f_return(1) = fopt; f_return(0) = xopt; return f_return; // this breaks out, if we get here } // Reduce temperature, record current function value in the // list of last "neps" values, and loop again t = rt * t; for(i = neps-1; i > 0; i--) fstar(i) = fstar(i-1); f = fopt; x = xopt; } else { // coverage not ok - increase temperature quickly to expand search area, to cover parameter space t = t*t; for(i = neps-1; i > 0; i--) fstar(i) = fstar(i-1); f = fopt; x = xopt; } } } optim-1.4.1/src/numhessian.cc0000644000175000017500000001554312477523726014570 0ustar olafolaf// Copyright (C) 2004, 2006 Michael Creel // // This program is free software; you can redistribute it and/or modify it under // the terms of the GNU General Public License as published by the Free Software // Foundation; either version 3 of the License, or (at your option) any later // version. // // This program is distributed in the hope that it will be useful, but WITHOUT // ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or // FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more // details. 
// // You should have received a copy of the GNU General Public License along with // this program; if not, see . // numhessian: numeric second derivative #include #include #include #include #include // argument checks static bool any_bad_argument(const octave_value_list& args) { if (!args(0).is_string()) { error("numhessian: first argument must be string holding objective function name"); return true; } if (!args(1).is_cell()) { error("numhessian: second argument must cell array of function arguments"); return true; } // minarg, if provided if (args.length() == 3) { int tmp = args(2).int_value(); if (error_state) { error("numhessian: 3rd argument, if supplied, must an integer\n\ that specifies the argument wrt which differentiation is done"); return true; } if ((tmp > args(1).length())||(tmp < 1)) { error("numhessian: 3rd argument must be a positive integer that indicates \n\ which of the elements of the second argument is the\n\ one to differentiate with respect to"); return true; } } return false; } DEFUN_DLD(numhessian, args, , "numhessian(f, {args}, minarg)\n\ \n\ Numeric second derivative of f with respect\n\ to argument \"minarg\".\n\ * first argument: function name (string)\n\ * second argument: all arguments of the function (cell array)\n\ * third argument: (optional) the argument to differentiate w.r.t.\n\ (scalar, default=1)\n\ \n\ If the argument\n\ is a k-vector, the Hessian will be a kxk matrix\n\ \n\ function a = f(x, y)\n\ a = x'*x + log(y);\n\ endfunction\n\ \n\ numhessian(\"f\", {ones(2,1), 1})\n\ ans =\n\ \n\ 2.0000e+00 -7.4507e-09\n\ -7.4507e-09 2.0000e+00\n\ \n\ Now, w.r.t. second argument:\n\ numhessian(\"f\", {ones(2,1), 1}, 2)\n\ ans = -1.0000\n\ ") { int nargin = args.length(); if (!((nargin == 2)|| (nargin == 3))) { error("numhessian: you must supply 2 or 3 arguments"); return octave_value_list(); } // check the arguments if (any_bad_argument(args)) return octave_value_list(); std::string f (args(0).string_value()); Cell f_args_cell (args(1).cell_value()); octave_value_list f_args, f_return; int i, j, k, minarg; bool test; double di, hi, pi, dj, hj, pj, hia, hja, fpp, fmm, fmp, fpm, obj_value, SQRT_EPS, diff; // Default values for controls minarg = 1; // by default, first arg is one over which we minimize // copy cell contents over to octave_value_list to use feval() k = f_args_cell.length(); f_args.resize (k); // resize only once for (i = 0; i 0 && f_return(0).is_double_type ()) obj_value = f_return(0).double_value(); else { error ("numhessian: function must return a scalar of class 'double'"); return octave_value_list (); } diff = exp(log(DBL_EPSILON)/4); SQRT_EPS = sqrt(DBL_EPSILON); for (i = 0; i diff; if (test) hi = (fabs(pi) + SQRT_EPS) * SQRT_EPS; else hi = diff; for (j = 0; j < i; j++) { // off-diagonal elements pj = parameter(j); test = (fabs(pj) + SQRT_EPS) * SQRT_EPS > diff; if (test) hj = (fabs(pj) + SQRT_EPS) * SQRT_EPS; else hj = diff; // +1 +1 parameter(i) = di = pi + hi; parameter(j) = dj = pj + hj; hia = di - pi; hja = dj - pj; f_args(minarg - 1) = parameter; f_return = feval(f, f_args); if (f_return.length () > 0 && f_return (0).is_double_type ()) fpp = f_return(0).double_value(); else { error ("numhessian: function must return a scalar of class 'double'"); return octave_value_list (); } // -1 -1 parameter(i) = di = pi - hi; parameter(j) = dj = pj - hj; hia = hia + pi - di; hja = hja + pj - dj; f_args(minarg - 1) = parameter; f_return = feval(f, f_args); if (f_return.length () > 0 && f_return (0).is_double_type ()) fmm = 
f_return(0).double_value(); else { error ("numhessian: function must return a scalar of class 'double'"); return octave_value_list (); } // +1 -1 parameter(i) = pi + hi; parameter(j) = pj - hj; f_args(minarg - 1) = parameter; f_return = feval(f, f_args); if (f_return.length () > 0 && f_return (0).is_double_type ()) fpm = f_return(0).double_value(); else { error ("numhessian: function must return a scalar of class 'double'"); return octave_value_list (); } // -1 +1 parameter(i) = pi - hi; parameter(j) = pj + hj; f_args(minarg - 1) = parameter; f_return = feval(f, f_args); if (f_return.length () > 0 && f_return (0).is_double_type ()) fmp = f_return(0).double_value(); else { error ("numhessian: function must return a scalar of class 'double'"); return octave_value_list (); } derivative(j,i) = ((fpp - fpm) + (fmm - fmp)) / (hia * hja); derivative(i,j) = derivative(j,i); parameter(j) = pj; } // diagonal elements // +1 +1 parameter(i) = di = pi + 2 * hi; f_args(minarg - 1) = parameter; f_return = feval(f, f_args); if (f_return.length () > 0 && f_return (0).is_double_type ()) fpp = f_return(0).double_value(); else { error ("numhessian: function must return a scalar of class 'double'"); return octave_value_list (); } hia = (di - pi) / 2; // -1 -1 parameter(i) = di = pi - 2 * hi; f_args(minarg - 1) = parameter; f_return = feval(f, f_args); if (f_return.length () > 0 && f_return (0).is_double_type ()) fmm = f_return(0).double_value(); else { error ("numhessian: function must return a scalar of class 'double'"); return octave_value_list (); } hia = hia + (pi - di) / 2; derivative(i,i) = ((fpp - obj_value) + (fmm - obj_value)) / (hia * hia); parameter(i) = pi; } return octave_value(derivative); } optim-1.4.1/src/__bfgsmin.cc.docstrings0000644000175000017500000000035212477524017016501 0ustar olafolaf### This file is generated automatically from the ### corresponding .cc file by a Makefile rule. __bfgsmin @c __bfgsmin __bfgsmin.cc __bfgsmin: backend for bfgs minimization Users should not use this directly. Use bfgsmin.m instead optim-1.4.1/src/__disna_optim__.cc.docstrings0000644000175000017500000000060712477524020017655 0ustar olafolaf### This file is generated automatically from the ### corresponding .cc file by a Makefile rule. __disna_optim__ @c __disna_optim__ __disna_optim__.cc -*- texinfo -*- @deftypefn {Loadable Function} {@var{rcond} =} __disna__ (@var{job}, @var{d}) @deftypefnx {Loadable Function} {@var{rcond} =} __disna__ (@var{job}, @var{d}, @var{m}, @var{n}) Undocumented internal function. @end deftypefn optim-1.4.1/src/munge-texi.pl0000755000175000017500000001112312477523726014517 0ustar olafolaf#!/usr/bin/perl -w # # Copyright (C) 2012-2014 Rik Wehbring # # This file is part of Octave. # # Octave is free software; you can redistribute it and/or modify it # under the terms of the GNU General Public License as published by the # Free Software Foundation; either version 3 of the License, or (at # your option) any later version. # # Octave is distributed in the hope that it will be useful, but WITHOUT # ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or # FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License # for more details. # # You should have received a copy of the GNU General Public License # along with Octave; see the file COPYING. If not, see # . # copied from Octave and slightly modified by Olaf Till # Validate program call die "usage: munge-texi DOCSTRING-FILE ... 
< file" if (@ARGV < 1); # Constant patterns $doc_delim = qr/^\x{1d}/; $tex_delim = qr/\Q-*- texinfo -*-\E/; $comment_line = qr/^\s*(?:$|#)/; # Pre-declare hash size for efficiency keys(%help_text) = 1800; ################################################################################ # Load DOCSTRINGS into memory while expanding @seealso references foreach $DOCSTRING_file (@ARGV) { open (DOCFH, $DOCSTRING_file) or die "Unable to open $DOCSTRING_file\n"; # Skip comments while (defined ($_ = ) and /$comment_line/o) {;} # Validate DOCSTRING file format die "invalid doc file format\n" if (! /$doc_delim/o); do { s/\s*$//; # strip EOL character(s) $symbol = substr ($_,1); $docstring = extract_docstring (); if ($help_text{$symbol}) { warn "ignoring duplicate entry for $symbol\n"; } else { $help_text{$symbol} = $docstring; } } while (! eof); } ################################################################################ # Process .txi to .texi by expanding @DOCSTRING, @EXAMPLEFILE macros # Add warning header print '@c This file is generated automatically by the packages munge-texi.pl.',"\n\n"; TXI_LINE: while () { if (/^\s*\@DOCSTRING\((\S+)\)/) { $func = $1; $docstring = $help_text{$func}; if (! $docstring) { warn "no docstring entry for $func\n"; next TXI_LINE; } $func =~ s/^@/@@/; # Texinfo uses @@ to produce '@' $docstring =~ s/^$tex_delim$/\@anchor{XREF$func}/m; print $docstring,"\n"; next TXI_LINE; } if (/^\s*\@DOCSTRINGVERBATIM\((\S+)\)/) { $func = $1; $docstring = $help_text{$func}; if (! $docstring) { warn "no docstring entry for $func\n"; next TXI_LINE; } $func =~ s/^@/@@/; # Texinfo uses @@ to produce '@' print '@anchor',"{XREF$func}\n"; $docstring =~ s/^@[cC] .*?\n//; print '@verbatim',"\n"; print $docstring,"\n"; print '@end verbatim',"\n"; next TXI_LINE; } if (/^\s*\@EXAMPLEFILE\((\S+)\)/) { $fname = "$1"; print '@verbatim',"\n"; open (EXAMPFH, $fname) or die "unable to open example file $fname\n"; while () { print $_; print "\n" if (eof and substr ($_, -1) ne "\n"); } close (EXAMPFH); print '@end verbatim',"\n\n"; next TXI_LINE; } # pass ordinary lines straight through to output print $_; } ################################################################################ # Subroutines ################################################################################ sub extract_docstring { my ($docstring, $arg_list, $func_list, $repl, $rest_of_line); $cut = 0; DOC_LINE: while (defined ($_ = ) and ! /$doc_delim/o) { # drop text marked for cutting out if (/^\s*\@[cC]\s*BEGIN_CUT_TEXINFO\s*$/) { die "BEGIN_CUT_TEXINFO not matched by END_CUT_TEXINFO" if ($cut); $cut = 1; next DOC_LINE; } if (/^\s*\@[cC]\s*END_CUT_TEXINFO\s*$/) { die "END_CUT_TEXINFO without BEGIN_CUT_TEXINFO" if (! $cut); $cut = 0; next DOC_LINE; } if ($cut) { next DOC_LINE; } # expand any @seealso references if (m'^@seealso\s*{') { # Join multiple lines until full macro body found while (! 
/}/m) { $_ .= ; } ($arg_list, $rest_of_line) = m'^@seealso\s*{(.*)}(.*)?'s; $func_list = $arg_list; $func_list =~ s/\s+//gs; $repl = ""; foreach $func (split (/,/, $func_list)) { $func =~ s/^@/@@/; # Texinfo uses @@ to produce '@' $repl .= "\@ref{XREF$func,,$func}, "; } substr($repl,-2) = ""; # Remove last ', ' $_ = "\@seealso{$repl}$rest_of_line"; } $docstring .= $_; } return $docstring; } optim-1.4.1/doc/0002755000175000017500000000000012477525061012051 5ustar olafolafoptim-1.4.1/doc/optim.txi0000644000175000017500000015227712477523726013751 0ustar olafolaf\input texinfo @c %**start of header @setfilename optim.info @settitle optim_doc @c %**end of header @copying Additional documentation for the optim package for Octave. Copyright @copyright{} @email{Olaf Till } You can redistribute this documentation and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version. This documentation is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this documentation; if not, see . @end copying @include macros.texi @macro mysee @ifhtml see @end ifhtml @end macro @titlepage @title Additional documentation for the optim package for Octave @page @vskip 0pt plus 1filll @insertcopying @end titlepage @c No table of contents. The table would occupy most of the top node in @c html and IMHO misleads the user to use the table instead of the menu @c structure of the nodes, which would let some information unused. @c @c @contents @c ------------------------------------------------------------------ @node Top @top Additional documentation for the optim package for Octave @ifhtml @strong{Note: Some links in this document lead to Octaves core documentation. These links probably don't work in the html version of this document. They should work, however, in the info version accessible from the Octave commandline with @code{optim_doc()}.} @end ifhtml The optim package is a collection of additional functions related to numerical optimization. For Octaves core optimization functions (not contained in this package) @mysee @ref{Optimization,,,octave}. @menu Types of functions in the optim package * Scalar optimization:: Functions for optimization of a scalar objective function. * Residual optimization:: Functions for optimization of a model function returning an array. * Zero finders:: Functions for finding the zero of a scalar or array valued nonlinear user function. * Gradient functions:: Functions for numerical approximation of gradients and Hessians. * Helper functions:: Functions for algebraic tasks common to optimization problems. * Documentation:: Function optim_doc to view documentation. Configuration * Common frontend options:: Options common to all frontends. * Common optimization options:: Options common to all optimization frontends. * Parameter structures:: Handling of structures of optimized parameters. * Additional parameters:: Passing additional parameters to user functions. Indices * Function index:: Index of functions in optim. * Concept index:: Concept index. 
@end menu

@c ------------------------------------------------------------------
@node Scalar optimization
@chapter Functions for optimization of a scalar objective function
@cindex scalar optimization

@menu
Frontend

* nonlin_min:: Interface for scalar non-linear optimization.

Backends

* lm_feasible:: L/M-like optimizer, constraints met throughout optimization.
* octave_sqp:: A wrapper to core Octave's sqp function.
* siman:: Simulated annealing with constraints.
* d2_min:: Newton-like optimizer, no constraints.

Alternative frontend

* fmins:: Frontend for direct search (gradient-less) algorithms.

Standalone functions

* nmsmax:: A Nelder-Mead simplex algorithm.
* mdsmax:: A multidirectional search algorithm.
* adsmax:: An alternating directions algorithm.
* nelder_mead_min:: Another Nelder-Mead algorithm.
* powell:: Direction-set (Powell's) method.
* standalone d2_min:: Unconstrained Newton-like minimization, standalone function.
* bfgsmin:: Unconstrained BFGS algorithm.
* nrm:: Newton-Raphson algorithm.
* cg_min:: A conjugate gradient method.
* brent_line_min:: Linesearch, Brent method.
* line_min:: Linesearch (minimize a function along dx).
* samin:: A simulated annealing algorithm.
* de_min:: A differential evolution algorithm.
* battery:: Repeatedly call bfgsmin.
@end menu

@c ------------------------------------------------------------------
@node nonlin_min
@section Frontend nonlin_min for scalar non-linear optimization
@findex nonlin_min

@DOCSTRING(nonlin_min)

@subheading Settings

The fields of the @var{settings} structure can be set with
@ref{XREFoptimset,,optimset,octave}. For settings common to all
frontends (including those for statistics) @mysee @ref{Common frontend
options}. For additional settings common to all optimization frontends
@mysee @ref{Common optimization options}.

@subsubheading Specific defaults:

@multitable {@code{Algorithm}} {"lm_feasible"}
@item @code{Algorithm}:
@tab "lm_feasible"
@end multitable

@subsubheading Additional settings:

@table @code

@item objf_grad

Function computing the gradient of the objective function with respect
to the parameters. Default: real finite differences. Will be called
with the column vector of parameters and an informational structure as
arguments. If @code{objf_grad} was specified by the user, the
informational structure has the fields @code{f}: value of the
objective function for the current parameters, @code{fixed}: logical
vector indicating which parameters are not optimized, so these partial
derivatives need not be computed and can be set to zero, @code{diffp},
@code{diff_onesided}, @code{lbound}, @code{ubound}: identical to the
user settings of the same names, @code{plabels}: 1-dimensional
cell-array of column-cell-arrays, each column with labels for all
parameters; the first column contains the numerical indices of the
parameters; the second and third columns, present for structure based
parameter handling, @mysee @ref{Parameter structures}, contain the
names of the parameters and the subindices of the parameters, @mysee
@ref{Non-scalar parameters}, respectively.
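For illustration (a sketch, not from the original manual; the function
name and the objective are made up, and it is assumed here that the
gradient is returned as a row vector), a user-supplied gradient
function could look like:

@example
function grad = my_objf_grad (p, info)
  ## gradient of objf (p) = (p(1) - 1)^2 + 10 * (p(2) - p(1)^2)^2
  grad = [2 * (p(1) - 1) - 40 * p(1) * (p(2) - p(1)^2), ...
          20 * (p(2) - p(1)^2)];
  grad(info.fixed) = 0; # fixed parameters need no derivatives
endfunction
@end example

It could then be passed in the settings structure with
@code{optimset ("objf_grad", @@my_objf_grad)}.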
The default gradient function will call the objective function with the second argument set with fields @code{f}: as the @code{f} passed to the gradient function, @code{plabels}: cell-array of 1x1 cell-arrays with the entries of the column-cell-arrays of @code{plabels} as passed to the gradient function, corresponding to the current parameter, @code{side}: @code{0} for one-sided interval, @code{1} or @code{2}, respectively, for the sides of a two-sided interval, and @code{parallel}: logical scalar indicating parallel computation of partial derivatives. This information can be useful if the objective function can omit some computations depending on the currently computed partial derivative. @item objf_hessian Function computing the Hessian of the objective function with respect to the parameters. The default is backend-specific. Will be called with the column vector of parameters as argument. @item inverse_hessian Logical scalar, indicating whether the Hessian function passed by the user actually returns the inverse of the Hessian. @item complex_step_derivative_objf Logical scalar, default: @code{false}. Estimate gradient of objective function with complex step derivative approximation. Use only if you know that your objective function is suitable for this. A user function for the gradient (@code{objf_grad}) must not be specified in this case. @item save_state String with path to a file which will be created for periodic saving of the state of optimization. Useful for very long optimizations which might get interrupted. The format of the saved state will be backend-specific. Currently, only the @qcode{"siman"} backend honours this option. Default: empty string, meaning no saving of state. @item recover_state String with path to a file created due to option @code{save_state}, which is used to recover a saved state before starting optimization. Default: empty string, meaning no recovering of state. @end table @subheading Structure-based parameter handling Please @mysee @ref{Parameter structures}. @subheading Backend information Please @mysee @ref{Scalar optimization} and choose backend from menu. @c ------------------------------------------------------------------ @node lm_feasible @section Default backend lm_feasible of scalar optimization @cindex lm_feasible A Levenberg/Marquardt-like algorithm, attempting to honour constraints throughout the course of optimization. This means that the initial parameters must not violate constraints (to find an initial feasible set of parameters, e.g. core Octave's @code{sqp} can be used (@mysee @ref{octave_sqp}), by specifying an objective function which is constant or which returns a norm of the distances to the initial values). The Hessian is either supplied by the user or is approximated by the BFGS algorithm. Core Octave's @code{sqp} performed better in some tests with @emph{unconstrained} problems. Returned value @var{cvg} will be @code{2} or @code{3} for success and @code{0} or @code{-4} for failure (@mysee @ref{nonlin_min} for meaning). Returned structure @var{outp} will have the fields @code{niter}, @code{nobjf}, and @code{user_interaction}. Backend-specific defaults are: @code{MaxIter}: 20, @code{fract_prec}: @code{zeros (size (parameters))}, @code{max_fract_change}: @code{Inf} for all parameters. The setting @code{TolX} is not honoured. Interpretation of @code{Display}: if set to @qcode{"iter"}, currently only information on applying @code{max_fract_change} is printed.
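As a minimal, hedged sketch of driving this default backend through @code{nonlin_min} (the objective function, start values, and bound below are invented for illustration; note that the start values are chosen to already satisfy the bounds, as this backend requires):

@example
## invented objective: shifted quadratic bowl
objf = @@ (p) sum ((p - [2; 3]) .^ 2);
## a lower bound makes the problem constrained
settings = optimset ("Algorithm", "lm_feasible", "lbound", [0; 0]);
## feasible start values [1; 1]
[p, objf_value, cvg, outp] = nonlin_min (objf, [1; 1], settings);
@end example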
@c ------------------------------------------------------------------ @node octave_sqp @section Backend wrapping core Octave's sqp function @cindex octave_sqp This backend calls the @code{sqp} function of core Octave, a sequential quadratic programming algorithm with BFGS, so that it is usable with the @code{nonlin_min} frontend of the optim package. @code{sqp} honours constraints for the returned result, but not necessarily during the course of optimization. Compared to calling @code{sqp} directly (@mysee @ref{XREFsqp,,sqp,octave}), @itemize @item a different default (numerical) gradient function is used (that of the frontend), @item bounds are set to @code{+-Inf} by default, not to @code{+-realmax}. @end itemize The value of the additional setting @code{octave_sqp_tolerance}, a tolerance for several termination criteria, is passed as the respective argument to Octave's @code{sqp}, which provides the default @code{sqrt (eps)}. The settings @code{TolFun}, @code{TolX}, @code{fract_prec}, @code{max_fract_change}, @code{Display}, and @code{user_interaction} are not honoured. Returned value @var{cvg} will be @code{1} for success and @code{0} or @code{-4} for failure (@mysee @ref{nonlin_min} for meaning). Returned structure @var{outp} will have the fields @code{niter}, @code{nobjf}, and @code{lambda}. The default of @code{MaxIter} is that of Octave's @code{sqp} (100). @c ------------------------------------------------------------------ @node siman @section Simulated annealing backend siman of scalar optimization @cindex siman A simulated annealing (stochastic) optimizer, changing all parameters at once in a single step, which makes it suitable also for non-bound constraints. No gradient or Hessian of the objective function is used. The settings @code{MaxIter}, @code{fract_prec}, @code{TolFun}, @code{TolX}, and @code{max_fract_change} are not honoured. Accepts the additional settings @code{T_init} (initial temperature, default 0.01), @code{T_min} (final temperature, default 1.0e-5), @code{mu_T} (factor of temperature decrease, default 1.005), @code{iters_fixed_T} (iterations within one temperature step, default 10), @code{max_rand_step} (column vector or structure-based configuration of maximum random steps for each parameter, default 0.005 * @var{pin}), @code{stoch_regain_constr} (if @code{true}, regain constraints after a random step, otherwise take a new random value until constraints are met, default @code{false}), @code{trace_steps} (set field @code{trace} of @var{outp} with a matrix with a row for each step, first column iteration number, second column repeat number within iteration, third column value of objective function, rest columns parameter values, default @code{false}), and @code{siman_log} (set field @code{log} of @var{outp} with a matrix with a row for each iteration, first column temperature, second column value of objective function, rest columns numbers of tries with decrease, no decrease but accepted, and no decrease and rejected). Steps with increase @code{diff} of objective function are accepted if @code{rand (1) < exp (- diff / T)}, where @code{T} is the temperature of the current iteration. If regaining of constraints fails, optimization will be aborted and the returned value of @var{cvg} will be @code{0}. Otherwise, @var{cvg} will be @code{1}. Returned structure @var{outp}, in addition to the possible fields @code{trace} and @code{log} described above, will have the fields @code{niter} and @code{user_interaction}.
Interpretation of @code{Display}: if set to @qcode{"iter"}, an informational line is printed after each iteration. If @code{parallel_local} is set to @code{true}, the objective function is evaluated for several parameter combinations in parallel. The course of optimization won't be changed by parallelization, provided the random number generator starts with the same state. To achieve this, some of the parallel results are discarded, causing the speedup to be smaller if the rate of acceptance of results is high. Also, due to overhead, there will be no speedup, and possibly even a slowdown, if the objective function is not computationally expensive enough. Honours options @code{save_state} and @code{recover_state}, described for the frontend. @c ------------------------------------------------------------------ @node d2_min @section Unconstrained Newton-like optimization @cindex d2_min This function features a Newton-like algorithm. The user has to supply a Hessian function. No constraints are honoured. If the supplied Hessian function actually returns the inverse of the Hessian, set @code{inverse_hessian} to @code{true}. Supplying the inverse Hessian is preferable, if possible. Returned value @var{cvg} will be @code{2} or @code{3} for success and @code{0} or @code{-1} for failure (@mysee @ref{nonlin_min} for meaning). Returned structure @var{outp} will have the fields @code{niter}, @code{nobjf}, and @code{user_interaction}. Interpretation of @code{Display}: if set to @qcode{"iter"}, some diagnostics are printed. @c ------------------------------------------------------------------ @node fmins @section Alternative frontend for gradient-less algorithms @findex fmins @DOCSTRING(fmins) @c ------------------------------------------------------------------ @node nmsmax @section A Nelder-Mead simplex algorithm @findex nmsmax @subheading Helptext: @DOCSTRINGVERBATIM(nmsmax) @c ------------------------------------------------------------------ @node mdsmax @section A multidirectional search algorithm @findex mdsmax @subheading Helptext: @DOCSTRINGVERBATIM(mdsmax) @c ------------------------------------------------------------------ @node adsmax @section An alternating directions algorithm @findex adsmax @subheading Helptext: @DOCSTRINGVERBATIM(adsmax) @c ------------------------------------------------------------------ @node nelder_mead_min @section Another Nelder-Mead algorithm @findex nelder_mead_min This function does gradient-less minimization using the Nelder-Mead algorithm. No constraints are honoured. @subheading Helptext: @DOCSTRINGVERBATIM(nelder_mead_min) @c ------------------------------------------------------------------ @node powell @section Direction-set (Powell's) method @findex powell @DOCSTRING(powell) @c ------------------------------------------------------------------ @node standalone d2_min @section Unconstrained Newton-like minimization, standalone function @findex standalone d2_min This function features a Newton-like algorithm. The user has to supply a Hessian function. No constraints are honoured. This function is deprecated. It is now available as a backend for @code{nonlin_min} (@mysee @ref{nonlin_min}). @subheading Helptext: @DOCSTRINGVERBATIM(d2_min) @c ------------------------------------------------------------------ @node bfgsmin @section Unconstrained BFGS algorithm @findex bfgsmin BFGS or limited memory BFGS minimization of a function. No constraints are honoured.
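A hedged usage sketch (the objective function and start value are invented; the calling convention, with the objective given by name and its arguments collected in a cell-array, follows the helptext below):

@example
function obj = myobj (theta)
  obj = sum ((theta - 1) .^ 2);
endfunction
[theta, obj_value, cvg] = bfgsmin ("myobj", @{[0; 0]@});
@end example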
@subheading Helptext: @DOCSTRINGVERBATIM(bfgsmin) @c ------------------------------------------------------------------ @node nrm @section Newton-Raphson algorithm @findex nrm No constraints are honoured. @DOCSTRING(nrm) @c ------------------------------------------------------------------ @node cg_min @section A conjugate gradient method @findex cg_min @DOCSTRING(cg_min) @c ------------------------------------------------------------------ @node brent_line_min @section Linesearch, Brent method @findex brent_line_min @DOCSTRING(brent_line_min) @c ------------------------------------------------------------------ @node line_min @section Linesearch, minimize a function along dx @findex line_min @subheading Helptext: @DOCSTRINGVERBATIM(line_min) @c ------------------------------------------------------------------ @node samin @section A simulated annealing (stochastic) optimizer @findex samin This simulated annealing algorithm implicitly honours bound constraints. @subheading Helptext: @DOCSTRINGVERBATIM(samin) @c ------------------------------------------------------------------ @node de_min @section A differential evolution (stochastic) optimizer @findex de_min @subheading Helptext: @DOCSTRINGVERBATIM(de_min) @c ------------------------------------------------------------------ @node battery @section Repeatedly call bfgsmin @findex battery @subheading Helptext: @DOCSTRINGVERBATIM(battery) @c ------------------------------------------------------------------ @node Residual optimization @chapter Functions for optimization of a model function returning an array @cindex residual optimization Model functions whose parameters are to be optimized may return a vector or array of values. Either these or their differences to some constant values (curve fitting) can be minimized in some sense, often, but not necessarily, by minimizing the sum of their squares. It is usually preferable to use optimizers designed for residual optimization for this purpose. These can exploit information contained in the individual elements of the returned array, which would not be possible if the user calculated a norm (@abbr{e.g@.} sum of squares) of the elements and performed a scalar optimization. @menu Optimization frontends * nonlin_residmin:: The standard interface for non-linear residual minimization. * nonlin_curvefit:: A convenience interface, curve fitting. Optimization backends * lm_svd_feasible:: L/M algorithm with SVD, constraints met throughout optimization. Statistics frontends * residmin_stat:: Statistics for residual minimization. * curvefit_stat:: Statistics for curve fitting. Statistics backends * wls:: Statistics for weighted least squares. Standalone functions * leasqr:: An older function for curve fitting. * expfit:: Prony's method for non-linear exponential fitting. * polyfitinf:: Function polyfitinf for polynomial fitting. * wpolyfit:: Polynomial fitting suitable for polyconf. * polyconf:: Confidence and prediction intervals for polynomial fitting. * LinearRegression:: Function LinearRegression. * wsolve:: Another linear solver. * linprog:: Linear programming. @end menu @c ------------------------------------------------------------------ @node nonlin_residmin @section Frontend nonlin_residmin for non-linear residual minimization @findex nonlin_residmin @c include function help here @DOCSTRING(nonlin_residmin) @subheading Settings The fields of the @var{settings} structure can be set with @ref{XREFoptimset,,optimset,octave}.
For settings common to all frontends (including those for statistics) @mysee @ref{Common frontend options}. For additional settings common to all optimization frontends @mysee @ref{Common optimization options}. @subsubheading Specific defaults: @multitable {@code{Algorithm}} {"lm_svd_feasible"} @item @code{Algorithm}: @tab "lm_svd_feasible" @end multitable @subsubheading Additional settings: @table @code @item weights Array of weights for the residuals. Dimensions must match those of the residuals. @anchor{XREFoptiondfdp} @item dfdp Function computing the Jacobian of the residuals with respect to the parameters, assuming residuals are reshaped to a column vector. Default: real finite differences. Will be called with the column vector of parameters and an informational structure as arguments. If @code{dfdp} was specified by the user, the informational structure has the fields @code{f}: value of residuals for current parameters, reshaped to a column vector, @code{fixed}: logical vector indicating which parameters are not optimized, so these partial derivatives need not be computed and can be set to zero, @code{diffp}, @code{diff_onesided}, @code{lbound}, @code{ubound}: identical to the user settings of these names, @code{plabels}: 1-dimensional cell-array of column-cell-arrays, each column with labels for all parameters; the first column contains the numerical indices of the parameters; the second and third columns, present for structure-based parameter handling, @mysee @ref{Parameter structures}, contain the names of the parameters and the subindices of the parameters, @mysee @ref{Non-scalar parameters}, respectively. The default Jacobian function will call the model function with the second argument set with fields @code{f}: as the @code{f} passed to the Jacobian function, @code{plabels}: cell-array of 1x1 cell-arrays with the entries of the column-cell-arrays of @code{plabels} as passed to the Jacobian function, corresponding to the current parameter, @code{side}: @code{0} for one-sided interval, @code{1} or @code{2}, respectively, for the sides of a two-sided interval, and @code{parallel}: logical scalar indicating parallel computation of partial derivatives. This information can be useful if the model function can omit some computations depending on the currently computed partial derivative. @item complex_step_derivative_f Logical scalar, default: @code{false}. Estimate Jacobian of model function with complex step derivative approximation. Use only if you know that your model function is suitable for this. A user function for the Jacobian (@code{dfdp}) must not be specified in this case. @item plot_cmd Function enabling the backend to plot results or intermediate results. Will be called with the currently computed residuals. Default: plot nothing. This setting is deprecated and will disappear. Please use @code{user_interaction} instead (@mysee @ref{Common optimization options}). @end table @subheading Structure-based parameter handling Please @mysee @ref{Parameter structures}. @subheading Backend information Please @mysee @ref{Residual optimization} and choose backend from menu under `Optimization backends'. @c ------------------------------------------------------------------ @node nonlin_curvefit @section Function nonlin_curvefit() for curve fitting @findex nonlin_curvefit In curve fitting, the model function computes values from a constant set of `independents', and the intention is to minimize the differences of these computed values to a constant set of `observations'.
This can be done with @code{nonlin_residmin}, but it is more convenient to use @code{nonlin_curvefit}, which takes care of passing the constant `independents' to the model function and of calculating the differences to the constant `observations'. However, if in some optimization problem you notice that you end up passing dummy values for the `independents' and zeros for the `observations', you can more naturally use @code{nonlin_residmin} instead of @code{nonlin_curvefit}. @c include function help here @DOCSTRING(nonlin_curvefit) @c replace the cut out text Also, if the setting @code{user_interaction} is given, additional information is passed to these functions, @mysee @ref{Common optimization options}. @c ------------------------------------------------------------------ @node lm_svd_feasible @section Default backend lm_svd_feasible of residual minimization @cindex lm_svd_feasible Levenberg/Marquardt algorithm using singular value decomposition. Constraints must be met by the initial parameters, and the algorithm attempts to keep them met throughout the optimization. Returned value @var{cvg} will be @code{0}, @code{1}, or @code{2}. Returned structure @var{outp} will have the fields @code{niter} and @code{user_interaction}. Backend-specific defaults are: @code{MaxIter}: 20, @code{fract_prec}: @code{zeros (size (parameters))}, @code{max_fract_change}: @code{Inf} for all parameters. The setting @code{TolX} is not honoured. Interpretation of @code{Display}: if set to @qcode{"iter"}, currently some diagnostics are printed. Specific option: @code{lm_svd_feasible_alt_s}: if falling back to nearly gradient descent, do it more like the original Levenberg/Marquardt method, with descent in each gradient component; for testing only. @c ------------------------------------------------------------------ @node residmin_stat @section Statistics for residual minimization @findex residmin_stat @c include function help here @DOCSTRING(residmin_stat) @subheading Further settings The fields of the @var{settings} structure can be set with @ref{XREFoptimset,,optimset,octave}. For settings common to all frontends @mysee @ref{Common frontend options}. @subsubheading Additional settings: @table @code @item objf_type Type of objective function of the optimization; must be specified in many cases. This determines which backends to use. Currently, there are only backends for the type "wls" (weighted least squares). @item residuals @item covd Optional information on the result of optimization, residuals and covariance matrix of data, respectively. @item weights Array of weights applied to the residuals in the previous optimization. Dimensions must match those of the residuals. @item dfdp Can be set in the same way and has the same default as in @code{nonlin_residmin} (@mysee @ref{nonlin_residmin}), but alternatively may already contain the computed Jacobian of the model function at the final parameters in matrix or structure form. @item complex_step_derivative_f Estimate Jacobian of model function with complex step derivative approximation. Use only if you know that your model function is suitable for this. A user function for the Jacobian (@code{dfdp}) must not be specified in this case. @end table @subheading Structure-based parameter handling Please @mysee @ref{Parameter structures}. @subheading Backend information Please @mysee @ref{Residual optimization} and choose backend from menu under `Statistics backends'.
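As a small, hedged sketch of the typical workflow, a curve fit followed by statistics (data, model, and start values are invented; the settings @code{objf_type} and @code{ret_covp} are described in the included function documentation, and @code{curvefit_stat}, described in the next node, is used here because the fit was done with @code{nonlin_curvefit}):

@example
x = (0:0.2:4).';
y = 2.5 * exp (-1.3 * x) + 0.05 * randn (size (x));
mdl = @@ (p, x) p(1) * exp (p(2) * x);
[p, fy, cvg, outp] = nonlin_curvefit (mdl, [1; -1], x, y);
## request the covariance matrix of the fitted parameters
settings = optimset ("objf_type", "wls", "ret_covp", true);
info = curvefit_stat (mdl, p, x, y, settings);
@end example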
@c ------------------------------------------------------------------ @node curvefit_stat @section Statistics for curve fitting @findex curvefit_stat As @code{nonlin_curvefit} can be used instead of @code{nonlin_residmin} for curve fitting (@mysee @ref{nonlin_curvefit}, @mysee @ref{nonlin_residmin}), @code{curvefit_stat} can be used instead of @code{residmin_stat} (@mysee @ref{residmin_stat}) for statistics on the results of curve fitting. @c include function help here @DOCSTRING(curvefit_stat) @c ------------------------------------------------------------------ @node wls @section Statistics for weighted least squares @cindex wls @cindex statistics for weighted least squares The backends for @code{objf_type == "wls"} (currently the only supported type of objective function) compute @code{covd} (due to user request or as a prerequisite for @code{covp} and @code{corp}) as a diagonal matrix by assuming that the variances of data points are proportional to the reciprocal of the squared @code{weights} and guessing the factor of proportionality from the residuals. If @code{covp} is not defined (e.g. because the Jacobian does not have full rank), an attempt is made to still compute its uniquely defined elements, if any. In @code{corp}, interdependent parameters can cause elements of @code{1} or @code{-1}, which in this case are not the real coefficients of correlation, but rather indicate the direction of parameter interdependence. To be consistent with this, an attempt is made (often not successful) to identify parameter interdependence and mark it with elements of @code{1} or @code{-1} in @code{corp} even if the respective elements of @code{covp} cannot be computed. @c ------------------------------------------------------------------ @node leasqr @section An older function for curve fitting @findex leasqr This was a popular function for curve fitting and has been enhanced to honour constraints. @code{nonlin_curvefit} (@mysee @ref{nonlin_curvefit}) now does the same job if used with the default backend, and should be preferred due to its more powerful interface. The statistics returned by @code{leasqr} can also (and partly better) be computed with @code{curvefit_stat} (@mysee @ref{curvefit_stat}). There are currently two things which still only @code{leasqr} does: @itemize @item internally providing a function for plotting fits during optimization, @item returning a pre-computed matrix for determining confidence regions. @end itemize @DOCSTRING(leasqr) @c ------------------------------------------------------------------ @node expfit @section Prony's method for non-linear exponential fitting @findex expfit @subheading Helptext: @DOCSTRINGVERBATIM(expfit) @c ------------------------------------------------------------------ @node polyfitinf @section Function polyfitinf for polynomial fitting. @findex polyfitinf @subheading Helptext: @DOCSTRINGVERBATIM(polyfitinf) @c ------------------------------------------------------------------ @node wpolyfit @section Polynomial fitting suitable for polyconf @findex wpolyfit @DOCSTRING(wpolyfit) See also @ref{XREFpolyfit,,polyfit,octave}.
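A brief, hedged sketch (data invented): like @code{polyfit}, @code{wpolyfit} returns a second output @var{s}, which @code{polyconf}, described next, can consume:

@example
x = linspace (0, 2, 30).';
y = 1 + 2 * x + 0.1 * randn (size (x));
[p, s] = wpolyfit (x, y, 1);
[yf, dyf] = polyconf (p, x, s);   # fit with confidence interval
@end example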
@c ------------------------------------------------------------------ @node polyconf @section Confidence and prediction intervals for polynomial fitting @findex polyconf @subheading Helptext: @DOCSTRINGVERBATIM(polyconf) @c ------------------------------------------------------------------ @node LinearRegression @section Function LinearRegression @findex LinearRegression @DOCSTRING(LinearRegression) See also @ref{XREFregress,,regress,octave}, @ref{XREFleasqr,,leasqr}, @ref{XREFnonlin_curvefit,,nonlin_curvefit}, @ref{XREFpolyfit,,polyfit,octave}, @ref{XREFwpolyfit,,wpolyfit}, @ref{XREFexpfit,,expfit}. @c ------------------------------------------------------------------ @node wsolve @section Function wsolve, another linear solver @findex wsolve @subheading Helptext: @DOCSTRINGVERBATIM(wsolve) @c ------------------------------------------------------------------ @node linprog @section Linear programming @findex linprog @DOCSTRING(linprog) See also @ref{XREFglpk,,glpk,octave}. @c ------------------------------------------------------------------ @node Zero finders @chapter Functions for finding the zero of a nonlinear user function @cindex zero finders There is only one dedicated zero finder in the optim package, which is just a vectorized version of Octave's fzero (@mysee @ref{XREFfzero,,fzero,octave}). @menu * vfzero:: A vectorized version of fzero. @end menu @c ------------------------------------------------------------------ @node vfzero @section A vectorized version of fzero @findex vfzero @DOCSTRING(vfzero) @c ------------------------------------------------------------------ @node Gradient functions @chapter Functions for numerical approximation of gradients and Hessians @cindex gradient functions You should usually not need to use these functions directly or pass them as arguments. They should be chosen and used by optimizer functions, possibly subject to their configuration options. @menu * dfpdp:: Direct user interface to default numerical gradient method of new frontends. * deriv:: Higher order numerical derivatives. * numgradient:: Another numerical gradient function. * numhessian:: Numerical Hessian function. * cdiff:: A string, yielding the numerical gradient if evaluated. * jacobs:: Complex step derivatives.
@end menu @c ------------------------------------------------------------------ @node dfpdp @section Direct user interface to default numerical gradient method of new frontends @findex dfpdp @subheading Helptext: @DOCSTRINGVERBATIM(dfpdp) @c ------------------------------------------------------------------ @node deriv @section Higher order numerical derivatives @findex deriv @DOCSTRING(deriv) @c ------------------------------------------------------------------ @node numgradient @section Another numerical gradient function @findex numgradient @subheading Helptext: @DOCSTRINGVERBATIM(numgradient) @c ------------------------------------------------------------------ @node numhessian @section Numerical Hessian function @findex numhessian @subheading Helptext: @DOCSTRINGVERBATIM(numhessian) @c ------------------------------------------------------------------ @node cdiff @section A string, yielding the numerical gradient if evaluated @findex cdiff @subheading Helptext: @DOCSTRINGVERBATIM(cdiff) @c ------------------------------------------------------------------ @node jacobs @section Complex step derivatives @findex jacobs @DOCSTRING(jacobs) @c ------------------------------------------------------------------ @node Helper functions @chapter Functions for algebraic tasks common to optimization problems @cindex helper functions @menu * cpiv_bard:: A complementary pivoting algorithm. * gjp:: Gauss-Jordan pivoting. @end menu @c ------------------------------------------------------------------ @node cpiv_bard @section A complementary pivoting algorithm @findex cpiv_bard @subheading Helptext: @DOCSTRINGVERBATIM(cpiv_bard) @c ------------------------------------------------------------------ @node gjp @section Gauss-Jordan pivoting @findex gjp @subheading Helptext: @DOCSTRINGVERBATIM(gjp) @c ------------------------------------------------------------------ @node Documentation @chapter Function optim_doc to view documentation @findex optim_doc @DOCSTRING(optim_doc) @c ------------------------------------------------------------------ @node Common frontend options @chapter Options common to all frontends @cindex common options All frontends for optimization and for result statistics (@ref{nonlin_min}, @ref{nonlin_residmin}, @ref{nonlin_curvefit}, @ref{residmin_stat}, @ref{curvefit_stat}) accept the following options, settable with @ref{XREFoptimset,,optimset,octave}. These options are handled within the frontend. @table @code @item diffp Column vector of fractional intervals (doubled for central intervals) supposed to be used by gradient or Jacobian functions performing finite differencing. Default: @code{.001 * ones (size (parameters))}. The default function for finite differencing will use these as absolute intervals for parameters with value zero. @item diff_onesided Logical column vector indicating the parameters for which one-sided intervals should be used by gradient or Jacobian functions performing finite differencing. Default: @code{false (size (parameters))}. @item cstep Scalar step size for complex step derivative approximation of gradients or Jacobians. Default: 1e-20. @item parallel_local Logical scalar, default: @code{false}. If the @code{parallel} package, @code{version >= 2.0.5}, is loaded, estimate gradients of objective function and Jacobians of model function and of constraints in parallel processes. Works for default (real) finite differences and for complex step derivatives.
Due to overhead, a speed advantage can only be expected if objective function, model function or constraint functions are time-consuming enough. Additionally, this setting is also passed to the individual optimization backends, which may also consider this option (see documentation of backends). If this option is set to @code{true}, a warning (ID: @code{optim:parallel_local}) will be issued if no @code{parallel} package of a correct version is loaded. @item fixed Logical column vector indicating which parameters are not optimized, but kept to their initial value. @end table @c ------------------------------------------------------------------ @node Common optimization options @chapter Options common to all optimization frontends @cindex common optimization options All frontends for optimization (@ref{nonlin_min}, @ref{nonlin_residmin}, @ref{nonlin_curvefit}) accept the following options, settable with @ref{XREFoptimset,,optimset,octave}. @subheading Settings handled within the frontend @table @code @item Algorithm String specifying the backend. @item complex_step_derivative_inequc, @item complex_step_derivative_equc Logical scalars, default: @code{false}. Estimate Jacobian of general inequality constraints and equality constraints, respectively, with complex step derivative approximation. Use only if you know that your function of general inequality constraints or function of general equality constraints, respectively, is suitable for this. A user function for the respective Jacobian must not be specified in this case. @end table @subheading Settings passed to the backend Which of these options are actually honoured is noted in the descriptions of the individual backends. @table @code @item lbound, @item ubound Column vectors of lower and upper bounds for parameters. Default: @code{-Inf} and @code{+Inf}, respectively. The bounds are non-strict, i.e. parameters are allowed to be exactly equal to a bound. The default function for gradients or Jacobians will respect bounds (but no further inequality constraints) in finite differencing if the backend respects bounds even during the course of optimization. @item inequc Further inequality constraints. Cell-array containing up to four entries, two entries for linear inequality constraints and/or one or two entries for general inequality constraints. Either linear or general constraints may be the first entries, but the two entries for linear constraints must be adjacent and, if two entries are given for general constraints, they also must be adjacent. The two entries for linear constraints are a matrix (say @code{m}) and a vector (say @code{v}), specifying linear inequality constraints of the form @code{m.' * parameters + v >= 0}. The first entry for general constraints must be a differentiable column-vector valued function (say @code{h}), specifying general inequality constraints of the form @code{h (p[, idx]) >= 0}; @code{p} is the column vector of optimized parameters and the optional argument @code{idx} is a logical index. @code{h} has to return the values of all constraints if @code{idx} is not given. It may choose to return only the indexed constraints if @code{idx} is given (so computation of the other constraints can be spared); in this case, the additional setting @code{f_inequc_idx} has to be set to @code{true}. In gradient determination, this function may be called with an informational third argument, whose content depends on the function for gradient determination.
If a second entry for general inequality constraints is given, it must be a function computing the Jacobian of the constraints with respect to the parameters. For this function, the description of the setting @code{dfdp}, @mysee @ref{XREFoptiondfdp,,dfdp}, applies, with 2 exceptions: 1) it is called with 3 arguments since it has an additional argument @code{idx}, a logical index, at second position, indicating which rows of the Jacobian must be returned (if the function chooses to return only indexed rows, the additional setting @code{df_inequc_idx} has to be set to @code{true}). 2) the default Jacobian function calls @code{h} with 3 arguments, since the argument @code{idx} is also supplied. Note that specifying linear constraints as general constraints will generally waste performance, even if further, non-linear, general constraints are also specified. @item f_inequc_idx, @item df_inequc_idx Indicate that functions for general inequality constraints or their Jacobian, respectively, return only the values or derivatives for the indexed parameters. See description of setting @code{inequc} above. @item equc Equality constraints. Specified the same way as inequality constraints (see @code{inequc} above). @item f_equc_idx, @item df_equc_idx As @code{f_inequc_idx} and @code{df_inequc_idx} above, but for equality constraints. @item cpiv Function for complementary pivoting, usable in algorithms for constraints. Default: @code{cpiv_bard}. Only the default function is supplied with the package. @item TolFun Minimum fractional improvement in objective function (e.g. sum of squares) in an iteration (termination criterion). Default: .0001. @item TolX Minimum fractional change in a norm of the parameters in an iteration (termination criterion). Default: backend-specific. @item MaxIter Maximum number of iterations (termination criterion). Default: backend-specific. @item fract_prec Column vector, minimum fractional changes of corresponding parameters in an iteration (termination criterion if violated in two consecutive iterations). Default: backend-specific. @item max_fract_change Column vector, enforced maximum fractional changes in corresponding parameters in an iteration. Default: backend-specific. @item Display String indicating the degree of verbosity. Default: @qcode{"off"}. Possible values are currently @qcode{"off"} (no messages) and @qcode{"iter"} (some messages after each iteration). Support of this setting and its exact interpretation are backend-specific. @item debug Logical scalar, default: @code{false}. Will be passed to the backend, which might print debugging information if @code{true}. @item FunValCheck If @qcode{"on"}, the output of user functions will be sanity-checked. Default: @qcode{"off"}. @item user_interaction @c This setting has deliberately not been named as its Matlab equivalent @c `OutputFcn' since it differs from the latter by requiring the @c functions to return _two_ outputs. The rationale for the difference @c is that information about the reason for a user-stop should be @c possible to pass in the output. The second output can't be made @c optional without possibly altering the warning state under which the @c user-function runs. Handle to a user function or cell-array with a number of these. Functions must have this interface: @example [@var{stop}, @var{info}] = some_user_function (@var{p}, @var{vals}, @var{state}); @end example If @var{stop} is @code{true}, the algorithm stops. In @var{info} information about the reason for stopping can be returned in a free format.
@var{info} can be set to be empty, but it must be set. Note that this is different from the otherwise similar Matlab setting @code{OutputFcn}. The functions will be called by the algorithms at the start with @var{state} set to @qcode{init}, after each iteration with @var{state} set to @qcode{iter}, and at the end with @var{state} set to @qcode{done}. @var{p} contains the current parameters, and @var{vals} is a structure with other current values; the possible fields are currently: @table @code @item iteration number of the current iteration, @item fval value of objective function (for scalar optimization), @item residual residuals (for residual-based optimization), @item model_y in @code{nonlin_curvefit}, the output of the model function, @item observations in @code{nonlin_curvefit}, the constant observations, @item model_x in @code{nonlin_curvefit}, the constant argument @var{x}. @end table Information about the output of these functions when they were called the last time (possibly causing a stop) will be contained in the output @var{outp} of the frontend in field @code{user_interaction}. Subfield @code{stop} is a vector containing the @var{stop} outputs of each function, subfield @code{info} is a cell-array containing the output @var{info} of each function. In the case of a stop, the output @var{cvg} of the frontend will be @code{-1}. @end table @c ------------------------------------------------------------------ @node Parameter structures @chapter Handling of structures of optimized parameters @cindex parameter structures It can be convenient not to handle the optimized parameters as elements of a vector, but as named fields of a structure. The frontends @code{nonlin_residmin}, @code{nonlin_curvefit}, @code{residmin_stat}, @code{curvefit_stat}, and @code{nonlin_min} can accept parameter information in structure form, and can pass the parameters as a structure to user functions, although the backends still handle the parameters as vectors. To use this feature, the initial parameters must be given in structure form, or the setting @code{param_order} must be given, a cell-array with names of the parameters. If both are done, only the parameters in structure fields named in @code{param_order} will be optimized. If some configuration settings or user functions are not structure-based, specifying @code{param_order} is mandatory even if the initial parameters are given in structure form. If the initial parameters are a structure, the parameters resulting from the optimization will also be returned as a structure. @menu * Structure-based user functions:: Specify which user functions accept parameter structures. * Structure-based gradients and Hessians:: Format of returned values of structure-based gradient and Hessian functions. * Structure-based linear constraints:: Specify structure-based linear constraints. * Structure-based configuration settings:: Parameter-related configuration settings in structure form. * Non-scalar parameters:: Handling named parameter arrays. @end menu @c ------------------------------------------------------------------ @node Structure-based user functions @section Specify which user functions accept parameter structures @cindex structure-based user functions The frontend must be told which user functions accept parameter structures by setting corresponding settings to @code{true}. The names of these settings depend on which user functions are applicable to the respective frontend and are shown in the following table.
@multitable {inequality constraints} {@code{hessian_objf_pstruct}} {@code{nonlin_residmin},} @headitem User function @tab Setting @tab Frontends @item Objective function @tab @code{objf_pstruct} @tab @code{nonlin_min} @item Gradient @tab @code{grad_objf_pstruct} @tab @code{nonlin_min} @item Hessian @tab @code{hessian_objf_pstruct} @tab @code{nonlin_min} @item Model function @tab @code{f_pstruct} @tab @code{nonlin_residmin}, @code{nonlin_curvefit}, @code{residmin_stat}, @code{curvefit_stat} @item Jacobian @tab @code{df_pstruct} @tab @code{nonlin_residmin}, @code{nonlin_curvefit}, @code{residmin_stat}, @code{curvefit_stat} @item General inequality constraints @tab @code{f_inequc_pstruct} @tab @code{nonlin_min}, @code{nonlin_residmin}, @code{nonlin_curvefit} @item Jacobian of general inequality constraints @tab @code{df_inequc_pstruct} @tab @code{nonlin_min}, @code{nonlin_residmin}, @code{nonlin_curvefit} @item General equality constraints @tab @code{f_equc_pstruct} @tab @code{nonlin_min}, @code{nonlin_residmin}, @code{nonlin_curvefit} @item Jacobian of general equality constraints @tab @code{df_equc_pstruct} @tab @code{nonlin_min}, @code{nonlin_residmin}, @code{nonlin_curvefit} @end multitable @c ------------------------------------------------------------------ @node Structure-based gradients and Hessians @section Format of returned values of structure-based gradient and Hessian functions @cindex structure-based gradients @cindex structure-based Hessians Structure-based gradient or Jacobian functions, including Jacobians of general constraints, must return the partial derivatives as fields of a structure under the respective parameter names. For gradients, the partial derivatives are scalar. For Jacobians, the partial derivatives must be column vectors. Structure-based Hessian functions must return the 2nd derivatives as subfields in a two-level structure of parameter names. For example, if the parameter names are @code{a} and @code{b}, the 2nd derivative with respect to @code{a} and @code{b} must be in the field @code{returned_structure.a.b} or @code{returned_structure.b.a} (there is no need to specify both). @c ------------------------------------------------------------------ @node Structure-based linear constraints @section Specify structure-based linear constraints @cindex structure-based linear constraints Linear constraints, otherwise specified with a matrix and a vector@c @c (@xref{}) , can be adapted to structure-based parameter handling by specifying, instead of a matrix, a structure containing the rows of the matrix in fields under the respective parameter names. In this case, rows containing only zeros need not be given. @c ------------------------------------------------------------------ @node Structure-based configuration settings @section Parameter-related configuration settings in structure form @cindex structure-based configuration settings The vector-based settings @code{lbound}, @code{ubound}, @code{fixed}, @code{diffp}, @code{diff_onesided}, @code{fract_prec}, and @code{max_fract_change} can be replaced by the setting @code{param_config}. It is a structure that can contain fields corresponding to parameter names. For each such field, there may be subfields with the same names as the above vector-based settings, but containing a scalar value for the respective parameter. 
For example, if the parameters are named @code{a} and @code{b}, instead of specifying @example settings = optimset ("lbound", [-Inf; 0], "diff_onesided", [true; true]); @end example one can specify @example pconf.b.lbound = 0; pconf.a.diff_onesided = true; pconf.b.diff_onesided = true; settings = optimset ("param_config", pconf); @end example If @code{param_config} is specified, none of the above vector-based settings may be used. @c ------------------------------------------------------------------ @node Non-scalar parameters @section Handling named parameter arrays @cindex non-scalar parameters @cindex named parameter arrays Parameters in named structure fields are allowed to be non-scalar real arrays. In this case, their dimensions must be given by the setting @code{param_dims}, a cell-array of dimension vectors, each containing at least two dimensions; if not given, dimensions are taken from the initial parameters, if these are given in a structure. If there are any vector-based settings or non-structure-based linear constraints, they must correspond to an order of parameters defined as follows: All named parameter arrays are reshaped to vectors. Then, all parameters, scalars and vectors, are concatenated in the order of parameter names, given by the user. Structure-based settings or structure-based initial parameters must contain arrays with dimensions reshapable to those of the respective parameters. @c ------------------------------------------------------------------ @node Additional parameters @chapter Passing additional parameters to user functions @cindex additional parameters Optimizers often require the user to supply functions (@abbr{e.g@.} objective functions, model functions, constraint functions). The interface of these functions --- arguments and returned values --- is defined by the optimizer. Often, a user function needs additional arguments, not covered by the defined interface, which are constant throughout the optimization. These can be supplied by wrapping the user function into an anonymous function. @xref{Anonymous Functions,,,octave}, for further explanation and examples. There are also some older optimizers in the optim package, written when anonymous functions were not available in Octave. Some of these offer an interface to user functions which is itself able to pass additional constant variables in arbitrary argument positions. Newer optimizers should not be written this way, since this is an unnecessary complication. Though it is possible to use global variables to pass additional data to user functions, this is not recommended since it introduces the possibility of name conflicts within the pool of global variables. @c ------------------------------------------------------------------ @node Function index @unnumbered Index of functions in optim @printindex fn @c ------------------------------------------------------------------ @node Concept index @unnumbered Concept index @printindex cp @bye optim-1.4.1/doc/optim-mini-howto-2.tex0000644000175000017500000002165312477523726016165 0ustar olafolaf%% LyX 2.0.3 created this file. For more info, see http://www.lyx.org/. %% Do not edit unless you really know what you are doing. \documentclass[english]{article} \usepackage{helvet} \usepackage[T1]{fontenc} \usepackage[latin9]{inputenc} \usepackage{verbatim} \usepackage{amstext} \usepackage{graphicx} \makeatletter %%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Textclass specific LaTeX commands.
\newenvironment{lyxcode} {\par\begin{list}{}{ \setlength{\rightmargin}{\leftmargin} \setlength{\listparindent}{0pt}% needed for AMS classes \raggedright \setlength{\itemsep}{0pt} \setlength{\parsep}{0pt} \normalfont\ttfamily}% \item[]} {\end{list}} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%% User specified LaTeX commands. \usepackage{dsfont} % No date please \date{} \fontfamily{cmss} \selectfont \makeatother \usepackage{babel} \begin{document} % This is for \LyX \global\long\def\bfrac#1#2{\frac{#1}{#2}} \global\long\def\nth{^{\textrm{th}}} \global\long\def\R{R} \global\long\def\N{N} \global\long\def\Z{Z} \global\long\def\tra{^{T}} \global\long\def\xx{\mathbf{x}} % This is for \LaTeX \fontfamily{cmss}\selectfont \renewcommand{\R}{{\mathds{R}}} \renewcommand{\N}{{\mathds{N}}} \renewcommand{\Z}{{\mathds{Z}}} \renewcommand{\tra}{^{\top}} \renewcommand{\bfrac}[2]{\frac{{\textstyle #1 }}{{\textstyle #2 }}} \title{Mini-HOWTO on using Octave for Unconstrained Nonlinear Optimization% \thanks{Author : Etienne Grossmann \texttt{} (soon replaced by ``Octave-Forge developers''?). This document is free documentation; you can redistribute it and/or modify it under the terms of the GNU Free Documentation License as published by the Free Software Foundation.\protect \\ .~~~This is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.% }} \maketitle \begin{comment} Keywords: nonlinear optimization, octave, tutorial, Nelder-Mead, Conjugate Gradient, Levenberg-Marquardt \end{comment} \emph{This document refers to the frontend function ``minimize'', which is now deprecated and will be removed in a future version of the optim package. An alternative frontend function is ``nonlin\_min'', not described here. The backends of ``minimize'' (``nelder\_mead\_min'', ``bfgsmin'', ``d2\_min'') can also be called directly. This document will be removed in the future.} Nonlinear optimization problems are very common, and when a solution cannot be found analytically, one usually tries to find it numerically. This document shows how to perform unconstrained nonlinear minimization using the Octave language for numerical computation. We assume to be so lucky as to have an initial guess from which to start an iterative method, and so impatient as to avoid as much as possible going into the details of the algorithm. In the following examples, we consider multivariable problems, but the single variable case is solved in exactly the same way. All the algorithms used below return numerical approximations of \emph{local minima} of the optimized function. In the following examples, we minimize a function with a single minimum (Figure~\ref{fig:function}), which is relatively easily found. In practice, the success of optimization algorithms greatly depends on the optimized function and on the starting point. \section*{} \fontfamily{cmss} \selectfont A simple example \begin{figure} \begin{centering} \raisebox{6mm}{\includegraphics[width=0.3\textwidth]{figures/2D_slice-3.eps2}}~ \par\end{centering} \caption{\label{fig:function} 2D slice of the function that is minimized throughout this tutorial.
Although not obvious at first sight, it has a unique minimum.} \end{figure} We will use a call of the type \begin{lyxcode} {[}x\_best,~best\_value,~niter{]}~=~minimize~(func,~x\_init) \end{lyxcode} to find the minimum of \[ \begin{array}{cccc} f\,: & \left(x_{1},x_{2},x_{3}\right)\in\R^{3} & \longrightarrow & \left(x_{1}-1\right)^{2}/9+\left(x_{2}-1\right)^{2}/9+\left(x_{3}-1\right)^{2}/9\\ & & & -\cos\left(x_{1}-1\right)-\cos\left(x_{2}-1\right)-\cos\left(x_{3}-1\right). \end{array} \] The following commands should find a local minimum of \ensuremath{f()} , using the Nelder-Mead (aka ``downhill simplex'') algorithm and starting from a randomly chosen point \texttt{x0}~: \begin{lyxcode} function~cost~=~foo~(xx) ~~xx-{}-;~~ ~~cost~=~sum~(-cos(xx)+xx.\textasciicircum{}2/9); endfunction x0~=~{[}-1,~3,~-2{]}; {[}x,v,n{]}~=~minimize~(\textquotedbl{}foo\textquotedbl{},~x0) \end{lyxcode} The output should look like~: \begin{lyxcode} x~= ~~1.00000~1.00000~1.00000 v~=~-3.0000 n~=~248 \end{lyxcode} This means that a minimum has been found at \ensuremath{\left(1,1,1\right)} and that the value at that point is \ensuremath{-3} . This is correct, since all the points of the form \ensuremath{x_{1}=1+2i\pi,\, x_{2}=1+2j\pi,\, x_{3}=1+2k\pi} , for some \ensuremath{i,j,k\in\N} , minimize \ensuremath{f()} . The number of function evaluations, 248, is also returned. Note that this number depends on the starting point. You will most likely obtain different numbers if you change \texttt{x0}. The Nelder-Mead algorithm is quite robust, but unfortunately it is not very efficient. For high-dimensional problems, its execution time may become prohibitive. \section*{} \fontfamily{cmss} \selectfont Using the first differential Fortunately, when a function, like \ensuremath{f()} above, is differentiable, more efficient optimization algorithms can be used. If \texttt{minimize()} is given the differential of the optimized function, using the \texttt{\textquotedbl{}df\textquotedbl{}} option, it will use a conjugate gradient method. \begin{lyxcode} \#\#~Function~returning~partial~derivatives function~dc~=~diffoo~(x) ~~~~x~=~x(:)'~-~1; ~~~~dc~=~sin~(x)~+~2{*}x/9; endfunction {[}x,~v,~n{]}~=~minimize~(\textquotedbl{}foo\textquotedbl{},~x0,~\textquotedbl{}df\textquotedbl{},~\textquotedbl{}diffoo\textquotedbl{}) \end{lyxcode} This produces the output~: \begin{lyxcode} x~= ~~1.00000~1.00000~1.00000 v~=~-3~ n~= ~~108~6 \end{lyxcode} The same minimum has been found, but only 108 function evaluations were needed, together with 6 evaluations of the differential. Here, \texttt{diffoo()} takes the same argument as \texttt{foo()} and returns the partial derivatives of \ensuremath{f()} with respect to the corresponding variables. It doesn't matter if it returns a row or column vector or a matrix, as long as the \ensuremath{i\nth} element of \texttt{diffoo(x)} is the partial derivative of \ensuremath{f()} with respect to \ensuremath{x_{i}} . \section*{} \fontfamily{cmss} \selectfont Using numerical approximations of the first differential Sometimes, the minimized function is differentiable, but actually writing down its differential is more work than one would like. Numerical differentiation offers a solution which is less efficient in terms of computation cost, but easy to implement. The \texttt{\textquotedbl{}ndiff\textquotedbl{}} option of \texttt{minimize()} uses numerical differentiation to execute exactly the same algorithm as in the previous example.
However, because a numerical approximation of the differential is used, the output may differ slightly~: \begin{lyxcode} {[}x,~v,~n{]}~=~minimize~(\textquotedbl{}foo\textquotedbl{},~x0,~\textquotedbl{}ndiff\textquotedbl{}) \end{lyxcode} which yields~: \begin{lyxcode} x~= ~~1.00000~1.00000~1.00000 v~=~-3~ n~= ~~78~6 \end{lyxcode} Note that each time the differential is numerically approximated, \texttt{foo()} is called 6 times (twice per input element), so that \texttt{foo()} is evaluated a total of (78+6{*}6=) 114 times in this example. \section*{} \fontfamily{cmss} \selectfont Using the first and second differentials When the function is twice differentiable and one knows how to compute its first and second differentials, still more efficient algorithms can be used (in our case, a variant of Levenberg-Marquardt). The option \texttt{\textquotedbl{}d2f\textquotedbl{}} allows one to specify a function that returns the value, the first differential, and the second differential of the minimized function. Entering the commands~: \begin{lyxcode} function~{[}c,~dc,~d2c{]}~=~d2foo~(x) ~~~~c~=~foo(x); ~~~~dc~=~diffoo(x); ~~~~d2c~=~diag~(cos~(x(:)-1)~+~2/9); end {[}x,v,n{]}~=~minimize~(\textquotedbl{}foo\textquotedbl{},~x0,~\textquotedbl{}d2f\textquotedbl{},~\textquotedbl{}d2foo\textquotedbl{})~ \end{lyxcode} produces the output~: \begin{lyxcode} x~= ~~1.0000~1.0000~1.0000 v~=~-3 n~= ~~34~5 \end{lyxcode} This time, 34 function evaluations and 5 evaluations of \texttt{d2foo()} were needed. \section*{} \fontfamily{cmss} \selectfont Summary We have just seen the most basic ways of solving nonlinear unconstrained optimization problems. The online help system of Octave (try e.g. ``\texttt{help minimize}'') will yield information on other issues, such as \emph{passing extra arguments} to the minimized function, \emph{controlling the termination} of the optimization process, choosing the algorithm, etc. \begin{lyxcode} \end{lyxcode} \end{document} optim-1.4.1/doc/Makefile0000644000175000017500000000206012477523726013513 0ustar olafolafsinclude ../../../Makeconf TEX = optim-mini-howto-2.tex PDF = $(patsubst %.tex,%.pdf,$(TEX)) HTML = $(patsubst %.tex,html/%/index.html,$(TEX)) all : $(PDF) html .PHONY: html html : $(HTML) echo '<html><body>' > html/index.html; \ echo '
<a href="optim-mini-howto-2/index.html">
optim-mini-howto-2
</a>
' >> html/index.html; \ echo '</body></html>' >> html/index.html %.pdf : %.tex latex -interaction=nonstopmode $< > /dev/null 2>&1 latex -interaction=nonstopmode $< > /dev/null 2>&1 $(DVIPDF) $(@:.pdf=.dvi) # Note verbosity=0 as well as making latex2html quieter, has the side-effect # of not including a url to the raw text, which it'll get wrong html/%/index.html : %.tex if [ -e `which latex2html` ] ; then \ latex2html -verbosity=0 -local_icons $< ; \ if [ ! -e "html" ]; then \ mkdir html; \ fi; \ mv -f $(patsubst html/%/index.html,%,$@) html ; \ fi clean: rm -fr $(patsubst %.tex,%,$(TEX)) html *.log rm -f $(PDF) *~ rm -f $(patsubst %.tex,%.aux,$(TEX)) rm -f $(patsubst %.tex,%.out,$(TEX)) rm -f $(patsubst %.tex,%.dvi,$(TEX)) optim-1.4.1/doc/macros.texi0000644000175000017500000000472112477523726014240 0ustar olafolaf@c Copyright (C) 2012-2013 John W. Eaton @c @c This file is part of Octave. @c @c Octave is free software; you can redistribute it and/or modify it @c under the terms of the GNU General Public License as published by the @c Free Software Foundation; either version 3 of the License, or (at @c your option) any later version. @c @c Octave is distributed in the hope that it will be useful, but WITHOUT @c ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or @c FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License @c for more details. @c @c You should have received a copy of the GNU General Public License @c along with Octave; see the file COPYING. If not, see @c <http://www.gnu.org/licenses/>. @c The following macro marks words that aspell should ignore during @c spellchecking. Within Texinfo it has no effect as it merely replaces @c the macro call with the argument itself. @macro nospell {arg} \arg\ @end macro @c The following macro works around the Info/plain text expansion of @code{XXX} @c which is `XXX'. This looks particularly bad when the macro body is @c single or double-quoted text, such as a property value `"position"' @ifinfo @rmacro qcode{arg} \arg\ @end rmacro @end ifinfo @ifnotinfo @rmacro qcode{arg} @code{\arg\} @end rmacro @end ifnotinfo @c The following macro is used for the on-line help system, but we don't @c want lots of `See also: foo, bar, and baz' strings cluttering the @c printed manual (that information should be in the supporting text for @c each group of functions and variables). @c @c Implementation Note: @c For TeX, @vskip produces a nice separation. @c For Texinfo, '@sp 1' should work, but in practice produces ugly results @c for HTML. We use a simple blank line to produce the correct behavior. @macro seealso {args} @iftex @vskip 2pt @end iftex @ifnottex @end ifnottex @ifnotinfo @noindent @strong{See also:} \args\. @end ifnotinfo @ifinfo @noindent See also: \args\. @end ifinfo @end macro @c The following macro works around a situation where the Info/plain text @c expansion of the @code{XXX} macro is `XXX'. The use of the apostrophe @c can be confusing if the code segment itself ends with a transpose operator. @ifinfo @macro tcode{arg} \arg\ @end macro @end ifinfo @ifnotinfo @macro tcode{arg} @code{\arg\} @end macro @end ifnotinfo @c FIXME: someday, when Texinfo 5.X is standard, we might replace this with @c @backslashchar, which is a new addition to Texinfo.
@macro xbackslashchar
\\
@end macro
optim-1.4.1/doc/figures/0002755000175000017500000000000012477523726013523 5ustar olafolafoptim-1.4.1/doc/figures/2D_slice-3.eps20000644000175000017500000003642312477523726016110 0ustar olafolaf%!PS-Adobe-3.0 EPSF-3.0
%%Creator: (ImageMagick)
%%Title: (2D_slice-3.eps2)
%%CreationDate: (Fri Oct 4 15:04:09 2002)
%%BoundingBox: 0 0 307 276
%%LanguageLevel: 2
%%Pages: 1
%%EndComments
%% [ImageMagick PostScript prolog and 11757 bytes of binary raster image
%% data omitted: 307x276 pixel figure 2D_slice-3.eps2]
%%EOF
optim-1.4.1/doc/optim.info0000644000175000017500000037634012477524021014064 0ustar olafolafThis is ../doc/optim.info, produced by makeinfo version 4.13 from
../doc/optim.texi.

Additional documentation for the optim package for Octave.

Copyright (C)

You can redistribute this documentation and/or modify it under the
terms of the GNU General Public License as published by the Free
Software Foundation; either version 3 of the License, or (at your
option) any later version.

This documentation is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License
along with this documentation; if not, see
<http://www.gnu.org/licenses/>.


File: optim.info, Node: Top, Next: Scalar optimization, Up: (dir)

Additional documentation for the optim package for Octave
*********************************************************

The optim package is a collection of additional functions related to
numerical optimization. For Octave's core optimization functions (not
contained in this package) *note Optimization: (octave)Optimization.

* Menu:

Types of functions in the optim package

* Scalar optimization:: Functions for optimization of a scalar objective function.
* Residual optimization:: Functions for optimization of a model function returning an array.
* Zero finders:: Functions for finding the zero of a scalar or array valued nonlinear user function.
* Gradient functions:: Functions for numerical approximation of gradients and Hessians.
* Helper functions:: Functions for algebraic tasks common to optimization problems.
* Documentation:: Function optim_doc to view documentation.

Configuration

* Common frontend options:: Options common to all frontends.
* Common optimization options:: Options common to all optimization frontends.
* Parameter structures:: Handling of structures of optimized parameters.
* Additional parameters:: Passing additional parameters to user functions.

Indices

* Function index:: Index of functions in optim.
* Concept index:: Concept index.


File: optim.info, Node: Scalar optimization, Next: Residual optimization, Prev: Top, Up: Top

1 Functions for optimization of a scalar objective function
***********************************************************

* Menu:

Frontend

* nonlin_min:: Interface for scalar non-linear optimization.

Backends

* lm_feasible:: L/M-like optimizer, constraints met throughout optimization.
* octave_sqp:: A wrapper to core Octave's sqp function.
* siman:: Simulated annealing with constraints.
* d2_min:: Newton-like optimizer, no constraints.

Alternative frontend

* fmins:: Frontend for direct search (gradient-less) algorithms.

Standalone functions

* nmsmax:: A Nelder-Mead simplex algorithm.
* mdsmax:: A multidirectional search algorithm.
* adsmax:: An alternating directions algorithm.
* nelder_mead_min:: Another Nelder-Mead algorithm.
* powell:: Direction-set (Powell's) method.
* standalone d2_min:: Unconstrained Newton-like minimization, standalone function.
* bfgsmin:: Unconstrained BFGS algorithm.
* nrm:: Newton-Raphson algorithm.
* cg_min:: A conjugate gradient method.
* brent_line_min:: Linesearch, Brent method.
* line_min:: Linesearch (minimize a function along dx).
* samin:: A simulated annealing algorithm.
* de_min:: A differential evolution algorithm.
* battery:: Repeatedly call bfgsmin.


File: optim.info, Node: nonlin_min, Next: lm_feasible, Up: Scalar optimization

1.1 Frontend nonlin_min for scalar non-linear optimization
==========================================================

 -- Function File: [P, OBJF, CVG, OUTP] = nonlin_min (F, PIN)
 -- Function File: [P, OBJF, CVG, OUTP] = nonlin_min (F, PIN, SETTINGS)

Frontend for nonlinear minimization of a scalar objective function.
The functions supplied by the user have a minimal interface; any
additionally needed constants can be supplied by wrapping the user
functions into anonymous functions.

The following description applies to usage with vector-based parameter
handling. Differences in usage for structure-based parameter handling
will be explained separately.

F: objective function.
It gets a column vector of real parameters as argument. In gradient
determination, this function may be called with an informational second
argument, whose content depends on the function for gradient
determination.

PIN: real column vector of initial parameters.

SETTINGS: structure whose fields stand for optional settings referred
to below. The fields can be set by `optimset()'.

The returned values are the column vector of final parameters P, the
final value of the objective function OBJF, an integer CVG indicating
if and how optimization succeeded or failed, and a structure OUTP with
additional information, currently with possible fields: `niter', the
number of iterations, `nobjf', the number of objective function calls
(indirect calls by gradient function not counted), `lambda', the lambda
of constraints at the result, and `user_interaction', information on
user stops (see settings). The backend may define additional fields.
CVG is greater than zero for success and less than or equal to zero for
failure; its possible values depend on the backend used and currently
can be `0' (maximum number of iterations exceeded), `1' (success
without further specification of criteria), `2' (parameter change less
than specified precision in two consecutive iterations), `3'
(improvement in objective function less than specified), `-1'
(algorithm aborted by a user function), or `-4' (algorithm got stuck).

Settings
--------

The fields of the SETTINGS structure can be set with *note optimset:
(octave)XREFoptimset.

For settings common to all frontends (including those for statistics)
*note Common frontend options::.

For additional settings common to all optimization frontends *note
Common optimization options::.

Specific defaults:
..................

`Algorithm': "lm_feasible"

Additional settings:
....................

`objf_grad'
Function computing the gradient of the objective function with respect
to the parameters. Default: real finite differences. Will be called
with the column vector of parameters and an informational structure as
arguments. If `objf_grad' was specified by the user, the informational
structure has the fields `f': value of objective function for current
parameters, `fixed': logical vector indicating which parameters are not
optimized, so these partial derivatives need not be computed and can be
set to zero, `diffp', `diff_onesided', `lbound', `ubound': identical to
the user settings of these names, `plabels': 1-dimensional cell-array
of column-cell-arrays, each column with labels for all parameters; the
first column contains the numerical indices of the parameters; the
second and third columns, present for structure based parameter
handling, *note Parameter structures::, contain the names of the
parameters and the subindices of the parameters, *note Non-scalar
parameters::, respectively. The default gradient function will call the
objective function with the second argument set with fields `f': as the
`f' passed to the gradient function, `plabels': cell-array of 1x1
cell-arrays with the entries of the column-cell-arrays of `plabels' as
passed to the gradient function corresponding to current parameter,
`side': `0' for one-sided interval, `1' or `2', respectively, for the
sides of a two-sided interval, and `parallel': logical scalar
indicating parallel computation of partial derivatives. This
information can be useful if the objective function can omit some
computations depending on the currently computed partial derivative.
`objf_hessian'
Function computing the Hessian of the objective function with respect
to the parameters. The default is backend specific. Will be called with
the column vector of parameters as argument.

`inverse_hessian'
Logical scalar, indicating whether the Hessian function passed by the
user actually returns the inverse of the Hessian.

`complex_step_derivative_objf'
Logical scalar, default: `false'. Estimate gradient of objective
function with complex step derivative approximation. Use only if you
know that your objective function is suitable for this. No user
function for the gradient (`objf_grad') may be specified.

`save_state'
String with path to a file which will be created for periodical saving
of the state of optimization. Useful for very long optimizations which
might get interrupted. The format of the saved state will be
backend-specific. Currently, only the "siman" backend honours this
option. Default: empty string, meaning no saving of state.

`recover_state'
String with path to a file created due to option `save_state', which is
used to recover a saved state before starting optimization. Default:
empty string, meaning no recovering of state.

Structure based parameter handling
----------------------------------

Please *note Parameter structures::.

Backend information
-------------------

Please *note Scalar optimization:: and choose backend from menu.


File: optim.info, Node: lm_feasible, Next: octave_sqp, Prev: nonlin_min, Up: Scalar optimization

1.2 Default backend lm_feasible of scalar optimization
======================================================

A Levenberg/Marquardt-like algorithm, attempting to honour constraints
throughout the course of optimization. This means that the initial
parameters must not violate constraints (to find an initial feasible
set of parameters, e.g. core Octave's `sqp' can be used ( *note
octave_sqp::), by specifying an objective function which is constant or
which returns a norm of the distances to the initial values). The
Hessian is either supplied by the user or is approximated by the BFGS
algorithm. Core Octave's `sqp' performed better in some tests with
_unconstrained_ problems.

Returned value CVG will be `2' or `3' for success and `0' or `-4' for
failure ( *note nonlin_min:: for meaning). Returned structure OUTP will
have the fields `niter', `nobjf', and `user_interaction'.

Backend-specific defaults are: `MaxIter': 20, `fract_prec': `zeros
(size (parameters))', `max_fract_change': `Inf' for all parameters.

The setting `TolX' is not honoured.

Interpretation of `Display': if set to "iter", currently only
information on applying `max_fract_change' is printed.


File: optim.info, Node: octave_sqp, Next: siman, Prev: lm_feasible, Up: Scalar optimization

1.3 Backend wrapping core Octave's sqp function
===============================================

This backend calls the `sqp' function of core Octave, a sequential
quadratic programming algorithm with BFGS, so that it is usable with
the `nonlin_min' frontend of the optim package. `sqp' honours
constraints for the returned result, but not necessarily during the
course of optimization.

Compared to calling `sqp' directly ( *note sqp: (octave)XREFsqp.),

* a different default (numerical) gradient function is used (that of
the frontend),

* bounds are set to `+-Inf' by default, not to `+-realmax'.

The value of the additional setting `octave_sqp_tolerance', a tolerance
for several termination criteria, is passed as the respective argument
to Octave's `sqp', which provides the default `sqrt (eps)'.
The settings `TolFun', `TolX', `fract_prec', `max_fract_change',
`Display', and `user_interaction' are not honoured.

Returned value CVG will be `1' for success and `0' or `-4' for failure
( *note nonlin_min:: for meaning). Returned structure OUTP will have
the fields `niter', `nobjf', and `lambda'.

The default of `MaxIter' is that of Octave's `sqp' (100).


File: optim.info, Node: siman, Next: d2_min, Prev: octave_sqp, Up: Scalar optimization

1.4 Simulated annealing backend siman of scalar optimization
============================================================

A simulated annealing (stochastic) optimizer, changing all parameters
at once in a single step, and thus suitable also for non-bound
constraints. No gradient or hessian of the objective function is used.

The settings `MaxIter', `fract_prec', `TolFun', `TolX', and
`max_fract_change' are not honoured.

Accepts the additional settings `T_init' (initial temperature, default
0.01), `T_min' (final temperature, default 1.0e-5), `mu_T' (factor of
temperature decrease, default 1.005), `iters_fixed_T' (iterations
within one temperature step, default 10), `max_rand_step' (column
vector or structure-based configuration of maximum random steps for
each parameter, default 0.005 * PIN), `stoch_regain_constr' (if `true',
regain constraints after a random step, otherwise take new random value
until constraints are met, default `false'), `trace_steps' (set field
`trace' of OUTP with a matrix with a row for each step, first column
iteration number, second column repeat number within iteration, third
column value of objective function, rest columns parameter values,
default `false'), and `siman_log' (set field `log' of OUTP with a
matrix with a row for each iteration, first column temperature, second
column value of objective function, rest columns numbers of tries with
decrease, no decrease but accepted, and no decrease and rejected;
default `false').

Steps with increase `diff' of objective function are accepted if `rand
(1) < exp (- diff / T)', where `T' is the temperature of the current
iteration. If regaining of constraints failed, optimization will be
aborted and returned value of CVG will be `0'. Otherwise, CVG will be
`1'.

Returned structure OUTP, additionally to the possible fields `trace'
and `log' described above, will have the fields `niter' and
`user_interaction'.

Interpretation of `Display': if set to "iter", an informational line is
printed after each iteration.

If `parallel_local' is set to `true', the objective function is
evaluated for several parameter combinations in parallel. The course of
optimization won't be changed by parallelization, provided the random
number generator starts with the same state. To achieve this, some of
the parallel results are discarded, causing the speedup to be smaller
if the rate of acceptance of results is high. Also, due to overhead,
there won't be any speedup, and there may even be a slowdown, if the
objective function is not computationally expensive enough.

Honours options `save_state' and `recover_state', described for the
frontend.


File: optim.info, Node: d2_min, Next: fmins, Prev: siman, Up: Scalar optimization

1.5 Unconstrained Newton-like optimization
==========================================

This function features a Newton-like algorithm. The user has to supply
a Hessian function. No constraints are honoured.

If the supplied Hessian function actually returns the inverse of the
Hessian, set `inverse_hessian' to `true'. Supplying the inverse Hessian
is preferable, if possible.
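As an illustration, here is a minimal sketch of selecting this backend
through `nonlin_min' and supplying the derivatives of a simple
quadratic objective (the objective and its derivatives are made up for
the example; the option names are those documented above):

     f = @ (p) sum ((p - [1; 2]) .^ 2);   # quadratic, minimum at [1; 2]
     g = @ (p) 2 * (p - [1; 2]);          # gradient of f
     h = @ (p) 2 * eye (2);               # (constant) Hessian of f
     s = optimset ("Algorithm", "d2_min",
                   "objf_grad", g, "objf_hessian", h);
     [p, objf, cvg] = nonlin_min (f, [0; 0], s)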
Returned value CVG will be `2' or `3' for success and `0' or `-1' for
failure ( *note nonlin_min:: for meaning). Returned structure OUTP will
have the fields `niter', `nobjf', and `user_interaction'.

Interpretation of `Display': if set to "iter", some diagnostics are
printed.


File: optim.info, Node: fmins, Next: nmsmax, Prev: d2_min, Up: Scalar optimization

1.6 Alternative frontend for gradient-less algorithms
=====================================================

 -- Function File: [X] = fmins (F,X0,OPTIONS,GRAD,P1,P2, ...)
Find the minimum of a function of several variables. By default the
method used is the Nelder & Mead simplex algorithm.

Example usage: fmins(inline('(x(1)-5).^2+(x(2)-8).^4'),[0;0])

*Inputs*

F
A string containing the name of the function to minimize

X0
A vector of initial parameters of the function F.

OPTIONS
Vector with control parameters (not all parameters are used)

options(1) - Show progress (if 1, default is 0, no progress)
options(2) - Relative size of simplex (default 1e-3)
options(6) - Optimization algorithm
  if options(6)==0 - Nelder & Mead simplex (default)
  if options(6)==1 - Multidirectional search Method
  if options(6)==2 - Alternating Directions search
options(5)
  if options(6)==0 && options(5)==0 - regular simplex
  if options(6)==0 && options(5)==1 - right-angled simplex
    Comment: the default is set to "right-angled simplex". This works
    better for me on a broad range of problems, although the default
    in nmsmax is "regular simplex"
options(10) - Maximum number of function evaluations

GRAD
Unused (For compatibility with Matlab)

P1, P2, ...
Optional parameters for function F


File: optim.info, Node: nmsmax, Next: mdsmax, Prev: fmins, Up: Scalar optimization

1.7 A Nelder-Mead simplex algorithm
===================================

Helptext:
---------

NMSMAX Nelder-Mead simplex method for direct search optimization.
[x, fmax, nf] = NMSMAX(FUN, x0, STOPIT, SAVIT) attempts to maximize the
function FUN, using the starting vector x0. The Nelder-Mead direct
search method is used. Output arguments:
x = vector yielding largest function value found,
fmax = function value at x,
nf = number of function evaluations.
The iteration is terminated when either
- the relative size of the simplex is <= STOPIT(1) (default 1e-3),
- STOPIT(2) function evaluations have been performed (default inf,
  i.e., no limit), or
- a function value equals or exceeds STOPIT(3) (default inf, i.e., no
  test on function values).
The form of the initial simplex is determined by STOPIT(4):
STOPIT(4) = 0: regular simplex (sides of equal length, the default)
STOPIT(4) = 1: right-angled simplex.
Progress of the iteration is not shown if STOPIT(5) = 0 (default 1).
STOPIT(6) indicates the direction (i.e., minimization or maximization).
Default is 1, maximization. Set STOPIT(6)=-1 for minimization.
If a non-empty fourth parameter string SAVIT is present, then
`SAVE SAVIT x fmax nf' is executed after each inner iteration.
NB: x0 can be a matrix. In the output argument, in SAVIT saves, and in
function calls, x has the same shape as x0.
NMSMAX(fun, x0, STOPIT, SAVIT, P1, P2,...) allows additional arguments
to be passed to fun, via feval(fun,x,P1,P2,...).

References:
N. J. Higham, Optimization by direct search in matrix computations,
SIAM J. Matrix Anal. Appl, 14(2): 317-333, 1993.
C. T. Kelley, Iterative Methods for Optimization, Society for
Industrial and Applied Mathematics, Philadelphia, PA, 1999.
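As a usage sketch (the objective is made up for the example; the STOPIT
entries follow the description above):

     fun = @ (x) (x(1) - 1) ^ 2 + (x(2) + 2) ^ 2;  # minimum at [1; -2]
     stopit = [1e-4, inf, inf, 0, 0, -1];  # tolerance, no evaluation limit,
                                           # no target value, regular simplex,
                                           # quiet, minimize instead of maximize
     [x, fval, nf] = nmsmax (fun, [0; 0], stopit)

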
File: optim.info, Node: mdsmax, Next: adsmax, Prev: nmsmax, Up: Scalar optimization 1.8 A multidirectional search algorithm ======================================= Helptext: --------- MDSMAX Multidirectional search method for direct search optimization. [x, fmax, nf] = MDSMAX(FUN, x0, STOPIT, SAVIT) attempts to maximize the function FUN, using the starting vector x0. The method of multidirectional search is used. Output arguments: x = vector yielding largest function value found, fmax = function value at x, nf = number of function evaluations. The iteration is terminated when either - the relative size of the simplex is <= STOPIT(1) (default 1e-3), - STOPIT(2) function evaluations have been performed (default inf, i.e., no limit), or - a function value equals or exceeds STOPIT(3) (default inf, i.e., no test on function values). The form of the initial simplex is determined by STOPIT(4): STOPIT(4) = 0: regular simplex (sides of equal length, the default), STOPIT(4) = 1: right-angled simplex. Progress of the iteration is not shown if STOPIT(5) = 0 (default 1). If a non-empty fourth parameter string SAVIT is present, then `SAVE SAVIT x fmax nf' is executed after each inner iteration. NB: x0 can be a matrix. In the output argument, in SAVIT saves, and in function calls, x has the same shape as x0. MDSMAX(fun, x0, STOPIT, SAVIT, P1, P2,...) allows additional arguments to be passed to fun, via feval(fun,x,P1,P2,...). This implementation uses 2n^2 elements of storage (two simplices), where x0 is an n-vector. It is based on the algorithm statement in [2, sec.3], modified so as to halve the storage (with a slight loss in readability). References: [1] V. J. Torczon, Multi-directional search: A direct search algorithm for parallel machines, Ph.D. Thesis, Rice University, Houston, Texas, 1989. [2] V. J. Torczon, On the convergence of the multidirectional search algorithm, SIAM J. Optimization, 1 (1991), pp. 123-145. [3] N. J. Higham, Optimization by direct search in matrix computations, SIAM J. Matrix Anal. Appl, 14(2): 317-333, 1993. [4] N. J. Higham, Accuracy and Stability of Numerical Algorithms, Second edition, Society for Industrial and Applied Mathematics, Philadelphia, PA, 2002; sec. 20.5.  File: optim.info, Node: adsmax, Next: nelder_mead_min, Prev: mdsmax, Up: Scalar optimization 1.9 An alternating directions algorithm ======================================= Helptext: --------- ADSMAX Alternating directions method for direct search optimization. [x, fmax, nf] = ADSMAX(FUN, x0, STOPIT, SAVIT, P) attempts to maximize the function FUN, using the starting vector x0. The alternating directions direct search method is used. Output arguments: x = vector yielding largest function value found, fmax = function value at x, nf = number of function evaluations. The iteration is terminated when either - the relative increase in function value between successive iterations is <= STOPIT(1) (default 1e-3), - STOPIT(2) function evaluations have been performed (default inf, i.e., no limit), or - a function value equals or exceeds STOPIT(3) (default inf, i.e., no test on function values). Progress of the iteration is not shown if STOPIT(5) = 0 (default 1). If a non-empty fourth parameter string SAVIT is present, then `SAVE SAVIT x fmax nf' is executed after each inner iteration. By default, the search directions are the co-ordinate directions. The columns of a fifth parameter matrix P specify alternative search directions (P = EYE is the default). NB: x0 can be a matrix. 
In the output argument, in SAVIT saves, and in function calls, x has
the same shape as x0.
ADSMAX(fun, x0, STOPIT, SAVIT, P, P1, P2,...) allows additional
arguments to be passed to fun, via feval(fun,x,P1,P2,...).

Reference:
N. J. Higham, Optimization by direct search in matrix computations,
SIAM J. Matrix Anal. Appl, 14(2): 317-333, 1993.
N. J. Higham, Accuracy and Stability of Numerical Algorithms, Second
edition, Society for Industrial and Applied Mathematics, Philadelphia,
PA, 2002; sec. 20.5.


File: optim.info, Node: nelder_mead_min, Next: powell, Prev: adsmax, Up: Scalar optimization

1.10 Another Nelder-Mead algorithm
==================================

This function does gradient-less minimization using the Nelder-Mead
algorithm. No constraints are honoured.

Helptext:
---------

[x0,v,nev] = nelder_mead_min (f,args,ctl) - Nelder-Mead minimization

Minimize 'f' using the Nelder-Mead algorithm. This function is inspired
by the one found in the book "Numerical Recipes".

ARGUMENTS
---------
f    : string : Name of function. Must return a real value
args : list   : Arguments passed to f.
  or matrix   : f's only argument
ctl  : vector : (Optional) Control variables, described below
  or struct

RETURNED VALUES
---------------
x0  : matrix : Local minimum of f
v   : real   : Value of f in x0
nev : number : Number of function evaluations

CONTROL VARIABLE : (optional) may be named arguments (i.e. "name",value
pairs), a struct, or a vector of length <= 6, where NaN's are ignored.
Default values are shown in <angle brackets>.

OPT.      VECTOR
NAME      POS
ftol,f    N/A  : Stopping criterion : stop search when values at simplex
                 vertices are all alike, as tested by
                 f > (max_i (f_i) - min_i (f_i)) /max(max(|f_i|),1)
                 where f_i are the values of f at the vertices. <10*eps>
rtol,r    N/A  : Stop search when biggest radius of simplex, using
                 infinity-norm, is small, as tested by :
                 ctl(2) > Radius <10*eps>
vtol,v    N/A  : Stop search when volume of simplex is small, tested by
                 ctl(2) > Vol
crit,c    ctl(1) : Set one stopping criterion, 'ftol' (c=1), 'rtol'
                 (c=2) or 'vtol' (c=3) to the value of the 'tol'
                 option. <1>
tol, t    ctl(2) : Threshold in termination test chosen by 'crit' <10*eps>
narg      ctl(3) : Position of the minimized argument in args <1>
maxev     ctl(4) : Maximum number of function evaluations. This number
                 may be slightly exceeded.
isz       ctl(5) : Size of initial simplex, which is : <1>
                 { x + e_i | i in 0..N }
                 Where x == args{narg} is the initial value
                 e_0 == zeros (size (x)),
                 e_i(j) == 0 if j != i and e_i(i) == ctl(5)
                 e_i has same size as x
                 Set ctl(5) to the distance you expect between the
                 starting point and the minimum.
rst       ctl(6) : When a minimum is found the algorithm restarts next
                 to it until the minimum does not improve anymore.
                 ctl(6) is the maximum number of restarts. Set ctl(6) to
                 zero if you know the function is well-behaved or if you
                 don't mind not getting a true minimum. <0>
verbose, v       Be more or less verbose (quiet=0) <0>
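A minimal usage sketch (the objective function is made up for the
example; following the helptext, f is passed by name and its only
argument as a matrix):

     function v = myobj (x)   # simple quadratic, minimum at [1; 2; 3]
       v = sum ((x - [1; 2; 3]) .^ 2);
     endfunction
     [x0, v, nev] = nelder_mead_min ("myobj", [0; 0; 0])

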
Implements a direction-set (Powell's) method for multidimensional minimization of a function without calculation of the gradient [1, 2] Arguments --------- * F: name of function to minimize (string or handle), which should accept one input variable (see example for how to pass on additional input arguments) * P0: An initial value of the function argument to minimize * OPTIONS: an optional structure, which can be generated by optimset, with some or all of the following fields: - MaxIter: maximum iterations (positive integer, or -1 or Inf for unlimited (default)) - TolFun: minimum amount by which function value must decrease in each iteration to continue (default is 1E-8) - MaxFunEvals: maximum function evaluations (positive integer, or -1 or Inf for unlimited (default)) - SearchDirections: an n*n matrix whose columns contain the initial set of (presumably orthogonal) directions to minimize along, where n is the number of elements in the argument to be minimized for; or an n*1 vector of magnitudes for the initial directions (defaults to the set of unit direction vectors) Examples -------- y = @(x, s) x(1) ^ 2 + x(2) ^ 2 + s; o = optimset('MaxIter', 100, 'TolFun', 1E-10); s = 1; [x_optim, y_min, conv, iters, nevs] = powell(@(x) y(x, s), [1 0.5], o); %pass y wrapped in an anonymous function so that all other arguments to y, which are held constant, are set %should return something like x_optim = [4E-14 3E-14], y_min = 1, conv = 1, iters = 2, nevs = 24 Returns: -------- * P: the minimizing value of the function argument * OBJ_VALUE: the value of F() at P * CONVERGENCE: 1 if normal convergence, 0 if not * ITERS: number of iterations performed * NEVS: number of function evaluations References ---------- 1. Powell MJD (1964), An efficient method for finding the minimum of a function of several variables without calculating derivatives, `Computer Journal', 7 :155-162 2. Press, WH; Teukolsky, SA; Vetterling, WT; Flannery, BP (1992). `Numerical Recipes in Fortran: The Art of Scientific Computing' (2nd Ed.). New York: Cambridge University Press (Section 10.5)  File: optim.info, Node: standalone d2_min, Next: bfgsmin, Prev: powell, Up: Scalar optimization 1.12 Unconstrained Newton-like minimization, standalone function ================================================================ This function features a Newton-like algorithm. The user has to supply a Hessian function. No constraints are honoured. This function is deprecated. It is now available as a backend for `nonlin_min' ( *note nonlin_min::). Helptext: --------- [x,v,nev,h,args] = d2_min(f,d2f,args,ctl,code) - Newton-like minimization Minimize f(x) using 1st and 2nd derivatives. Any function w/ second derivatives can be minimized, as in Newton. f(x) decreases at each iteration, as in Levenberg-Marquardt. This function is inspired from the Levenberg-Marquardt algorithm found in the book "Numerical Recipes". ARGUMENTS : f : string : Cost function's name d2f : string : Name of function returning the cost (1x1), its differential (1xN) and its second differential or it's pseudo-inverse (NxN) (see ctl(5) below) : [v,dv,d2v] = d2f (x). args : list : f and d2f's arguments. By default, minimize the 1st or matrix : argument. ctl : vector : Control arguments (see below) or struct code : string : code will be evaluated after each outer loop that produced some (any) improvement. Variables visible from "code" include "x", the best parameter found, "v" the best value and "args", the list of all arguments. All can be modified. 
This option can be used to re-parameterize the argument space during
optimization.

CONTROL VARIABLE ctl : (optional). May be a struct or a vector of
length 5 or less where NaNs are ignored. Default values are shown in
<angle brackets>.

FIELD     VECTOR
NAME      POS
ftol, f   N/A    : Stop search when value doesn't improve, as tested by
                   f > Deltaf/max(|f(x)|,1) where Deltaf is the decrease
                   in f observed in the last iteration. <10*sqrt(eps)>
utol, u   N/A    : Stop search when updates are small, as tested by
                   u > max { dx(i)/max(|x(i)|,1) | i in 1..N } where dx
                   is the change in the x that occurred in the last
                   iteration.
dtol, d   N/A    : Stop search when derivative is small, as tested by
                   d > norm (dv)
crit, c   ctl(1) : Set one stopping criterion, 'ftol' (c=1), 'utol'
                   (c=2) or 'dtol' (c=3) to the value of the 'tol'
                   option. <1>
tol, t    ctl(2) : Threshold in termination test chosen by 'crit' <10*eps>
narg, n   ctl(3) : Position of the minimized argument in args <1>
maxev,m   ctl(4) : Maximum number of function evaluations
maxout,m         : Maximum number of outer loops
id2f, i   ctl(5) : 0 if d2f returns the 2nd derivatives, 1 if <0>
                   it returns its pseudo-inverse.
verbose, v N/A   : Be more or less verbose (quiet=0) <0>


File: optim.info, Node: bfgsmin, Next: nrm, Prev: standalone d2_min, Up: Scalar optimization

1.13 Unconstrained BFGS algorithm
=================================

BFGS or limited memory BFGS minimization of a function. No constraints
are honoured.

Helptext:
---------

bfgsmin: bfgs or limited memory bfgs minimization of function

Usage: [x, obj_value, convergence, iters] = bfgsmin(f, args, control)

The function must be of the form
[value, return_2,..., return_m] = f(arg_1, arg_2,..., arg_n)
By default, minimization is w.r.t. arg_1, but it can be done w.r.t. any
argument that is a vector. Numeric derivatives are used unless analytic
derivatives are supplied. See bfgsmin_example.m for methods.

Arguments:
* f: name of function to minimize (string)
* args: a cell array that holds all arguments of the function
  The argument with respect to which minimization is done MUST be a
  vector
* control: an optional cell array of 1-8 elements. If a cell array
  shorter than 8 elements is provided, the missing trailing elements
  are given default values.
  * elem 1: maximum iterations (positive integer, or -1 or Inf for
    unlimited (default))
  * elem 2: verbosity
    0 = no screen output (default)
    1 = only final results
    2 = summary every iteration
    3 = detailed information
  * elem 3: convergence criterion
    1 = strict (function, gradient and param change) (default)
    0 = weak - only function convergence required
  * elem 4: arg in f_args with respect to which minimization is done
    (default is first)
  * elem 5: (optional) Memory limit for lbfgs. If it's a positive
    integer then lbfgs will be used. Otherwise ordinary bfgs is used
  * elem 6: function change tolerance, default 1e-12
  * elem 7: parameter change tolerance, default 1e-6
  * elem 8: gradient tolerance, default 1e-5

Returns:
* x: the minimizer
* obj_value: the value of f() at x
* convergence: 1 if normal conv, other values if not
* iters: number of iterations performed

Example: see bfgsmin_example.m


File: optim.info, Node: nrm, Next: cg_min, Prev: bfgsmin, Up: Scalar optimization

1.14 Newton-Raphson algorithm
=============================

No constraints are honoured.

 -- Function File: XMIN = nrm (F,X0)
Using X0 as a starting point find a minimum of the scalar function F.
The Newton-Raphson method is used.
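A minimal usage sketch (the function is made up for the example):

     f = @ (x) (x - 3) .^ 2 + 1;   # scalar function, minimum at x = 3
     xmin = nrm (f, 0)

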
File: optim.info, Node: cg_min, Next: brent_line_min, Prev: nrm, Up: Scalar optimization

1.15 A conjugate gradient method
================================

 -- Function File: [X0,V,NEV] = cg_min ( F,DF,ARGS,CTL )
NonLinear Conjugate Gradient method to minimize function F.

Arguments
---------
* F : string : Name of function. Return a real value
* DF : string : Name of f's derivative. Returns a (R*C) x 1 vector
* ARGS: cell : Arguments passed to f.
* CTL : 5-vec : (Optional) Control variables, described below

Returned values
---------------
* X0 : matrix : Local minimum of f
* V : real : Value of f in x0
* NEV : 1 x 2 : Number of evaluations of f and of df

Control Variables
-----------------
* CTL(1) : 1 or 2 : Select stopping criterion amongst :
* CTL(1)==0 : Default value
* CTL(1)==1 : Stopping criterion : Stop search when value doesn't
improve, as tested by ctl(2) > Deltaf/max(|f(x)|,1) where Deltaf is the
decrease in f observed in the last iteration (each iteration consists
of R*C line searches).
* CTL(1)==2 : Stopping criterion : Stop search when updates are small,
as tested by ctl(2) > max { dx(i)/max(|x(i)|,1) | i in 1..N } where dx
is the change in the x that occurred in the last iteration.
* CTL(2) : Threshold used in stopping tests. Default=10*eps
* CTL(2)==0 : Default value
* CTL(3) : Position of the minimized argument in args Default=1
* CTL(3)==0 : Default value
* CTL(4) : Maximum number of function evaluations Default=inf
* CTL(4)==0 : Default value
* CTL(5) : Type of optimization:
* CTL(5)==1 : "Fletcher-Reves" method
* CTL(5)==2 : "Polak-Ribiere" (Default)
* CTL(5)==3 : "Hestenes-Stiefel" method

CTL may have length smaller than 4. Default values will be used if ctl
is not passed or if nan values are given.

Example:
--------

function r=df( l ) b=[1;0;-1]; r = -( 2*l{1} - 2*b + rand(size(l{1}))); endfunction
function r=ff( l ) b=[1;0;-1]; r = (l{1}-b)' * (l{1}-b); endfunction
ll = { [10; 2; 3] };
ctl(5) = 3;
[x0,v,nev]=cg_min( "ff", "df", ll, ctl )

Comment: In general, the BFGS method seems to perform better in many
cases, but requires more computation per iteration. See also
http://en.wikipedia.org/wiki/Nonlinear_conjugate_gradient.

See also: *note bfgsmin: XREFbfgsmin.


File: optim.info, Node: brent_line_min, Next: line_min, Prev: cg_min, Up: Scalar optimization

1.16 Linesearch, Brent method
=============================

 -- Function File: [S,V,N] = brent_line_min ( F,DX,ARGS,CTL )
Line minimization of f along dx.

Finds minimum of f on line x0 + dx*w | a < w < b by bracketing. a and b
are passed through argument ctl.

Arguments
---------
* F : string : Name of function. Must return a real value
* ARGS : cell : Arguments passed to f or RxC : f's only argument. x0
must be at ARGS{ CTL(2) }
* CTL : 5 : (optional) Control variables, described below.

Returned values
---------------
* S : 1 : Minimum is at x0 + s*dx
* V : 1 : Value of f at x0 + s*dx
* NEV : 1 : Number of function evaluations

Control Variables
-----------------
* CTL(1) : Upper bound for error on s Default=sqrt(eps)
* CTL(2) : Position of minimized argument in args Default= 1
* CTL(3) : Maximum number of function evaluations Default= inf
* CTL(4) : a Default=-inf
* CTL(5) : b Default= inf

Default values will be used if ctl is not passed or if nan values are
given.
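A minimal usage sketch (the objective is made up for the example; the
second argument is taken here to be the search direction dx, as
described above):

     f = @ (x) sum ((x - 1) .^ 2);   # minimum where x0 + s*dx == [1; 1]
     x0 = [0; 0];
     dx = [1; 1];
     [s, v, nev] = brent_line_min (f, dx, {x0})   # s should be close to 1

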
File: optim.info, Node: line_min, Next: samin, Prev: brent_line_min, Up: Scalar optimization

1.17 Linesearch, minimize a function along dx
=============================================

Helptext:
---------

[a,fx,nev] = line_min (f, dx, args, narg, h, nev_max) - Minimize f() along dx

INPUT
----------
f    : string  : Name of minimized function
dx   : matrix  : Direction along which f() is minimized
args : cell    : Arguments of f
narg : integer : Position of minimized variable in args. Default=1
h    : scalar  : Step size to use for centered finite difference
                 approximation of first and second derivatives.
                 Default=1E-3.
nev_max : integer : Maximum number of function evaluations. Default=30

OUTPUT
---------
a    : scalar  : Value for which f(x+a*dx) is a minimum (*)
fx   : scalar  : Value of f(x+a*dx) at minimum (*)
nev  : integer : Number of function evaluations (*)

The notation f(x+a*dx) assumes that args == {x}.

Reference: David G Luenberger's Linear and Nonlinear Programming


File: optim.info, Node: samin, Next: de_min, Prev: line_min, Up: Scalar optimization

1.18 A simulated annealing (stochastic) optimizer
=================================================

This simulated annealing algorithm implicitly honours bound
constraints.

Helptext:
---------

samin: simulated annealing minimization of a function. See
samin_example.m

usage: [x, obj, convergence, details] = samin("f", {args}, {control})

Arguments:
* "f": function name (string)
* {args}: a cell array that holds all arguments of the function,
* {control}: a cell array with 11 elements
  * LB - vector of lower bounds
  * UB - vector of upper bounds
  * nt - integer: # of iterations between temperature reductions
  * ns - integer: # of iterations between bounds adjustments
  * rt - (0 < rt <1): temperature reduction factor
  * maxevals - integer: limit on function evaluations
  * neps - integer: number of values final result is compared to
  * functol - (> 0): the required tolerance level for function value
    comparisons
  * paramtol - (> 0): the required tolerance level for parameters
  * verbosity - scalar: 0, 1, or 2.
    * 0 = no screen output
    * 1 = only final results to screen
    * 2 = summary every temperature change
  * minarg - integer: which of function args is minimization over?

Returns:
* x: the minimizer
* obj: the value of f() at x
* convergence:
  0 if no convergence within maxevals function evaluations
  1 if normal convergence to a point interior to the parameter space
  2 if convergence to point very near bounds of parameter space
    (suggest re-running with looser bounds)
* details: a px3 matrix. p is the number of times improvements were
  found. The columns record information at the time an improvement was
  found
  * first: cumulative number of function evaluations
  * second: temperature
  * third: function value

Example: see samin_example


File: optim.info, Node: de_min, Next: battery, Prev: samin, Up: Scalar optimization

1.19 A differential evolution (stochastic) optimizer
====================================================

Helptext:
---------

de_min: global optimisation using differential evolution

Usage: [x, obj_value, nfeval, convergence] = de_min(fcn, control)

minimization of a user-supplied function with respect to x(1:D), using
the differential evolution (DE) method based on an algorithm by Rainer
Storn (http://www.icsi.berkeley.edu/~storn/code.html)
See: http://www.softcomputing.net/tevc2009_1.pdf

Arguments:
---------------
fcn string : Name of function.
Must return a real value
control vector : (Optional) Control variables, described below
  or struct

Returned values:
----------------
x          vector : parameter vector of best solution
obj_value  scalar : objective function value of best solution
nfeval     scalar : number of function evaluations
convergence :  1 = best below value to reach (VTR)
               0 = population has reached defined quality (tol)
              -1 = some values are close to constraints/boundaries
              -2 = max number of iterations reached (maxiter)
              -3 = max number of functions evaluations reached (maxnfe)

Control variable: (optional) may be named arguments (i.e. "name",value
pairs), a struct, or a vector, where NaN's are ignored.

XVmin : vector of lower bounds of initial population
        *** note: by default these are no constraints ***
XVmax : vector of upper bounds of initial population
constr : 1 -> enforce the bounds not just for the initial population
const : data vector (remains fixed during the minimization)
NP : number of population members
F : difference factor from interval [0, 2]
CR : crossover probability constant from interval [0, 1]
strategy : 1 --> DE/best/1/exp           7 --> DE/best/1/bin
           2 --> DE/rand/1/exp           8 --> DE/rand/1/bin
           3 --> DE/target-to-best/1/exp 9 --> DE/target-to-best/1/bin
           4 --> DE/best/2/exp           10--> DE/best/2/bin
           5 --> DE/rand/2/exp           11--> DE/rand/2/bin
           6 --> DEGL/SAW/exp            else DEGL/SAW/bin
refresh : intermediate output will be produced after "refresh"
          iterations. No intermediate output will be produced if
          refresh is < 1
VTR : Stopping criterion: "Value To Reach"
      de_min will stop when obj_value <= VTR. Use this if you know
      which value you expect.
tol : Stopping criterion: "tolerance"
      stops if (best-worst)/max(1,worst) < tol
      This basically stops if the whole population is "good".
maxnfe : maximum number of function evaluations
maxiter : maximum number of iterations (generations)

The algorithm seems to work well only if [XVmin,XVmax] covers the
region where the global minimum is expected. DE is also somewhat
sensitive to the choice of the difference factor F. A good initial
guess is to choose F from interval [0.5, 1], e.g. 0.8. CR, the
crossover probability constant from interval [0, 1], helps to maintain
the diversity of the population and is rather uncritical, but strongly
affects the convergence speed. If the parameters are correlated, high
values of CR work better. The reverse is true for no correlation.
Experiments suggest that /bin likes to have a slightly larger CR than
/exp. The number of population members NP is also not very critical. A
good initial guess is 10*D. Depending on the difficulty of the problem,
NP can be lower than 10*D, or must be higher than 10*D to achieve
convergence.
Default Values:
---------------
XVmin = [-2];
XVmax = [ 2];
constr= 0;
const = [];
NP    = 10 *D
F     = 0.8;
CR    = 0.9;
strategy = 12;
refresh  = 0;
VTR   = -Inf;
tol   = 1.e-3;
maxnfe  = 1e6;
maxiter = 1000;

Example to find the minimum of the Rosenbrock saddle:
----------------------------------------------------
Define f as:
  function result = f(x);
    result = 100 * (x(2) - x(1)^2)^2 + (1 - x(1))^2;
  end
Then type:
  ctl.XVmin = [-2 -2];
  ctl.XVmax = [ 2 2];
  [x, obj_value, nfeval, convergence] = de_min (@f, ctl);

Keywords: global-optimisation optimisation minimisation


File: optim.info, Node: battery, Prev: de_min, Up: Scalar optimization

1.20 Repeatedly call bfgsmin
============================

Helptext:
---------

battery.m: repeatedly call bfgs using a battery of start values, to
attempt to find global min of a nonconvex function

INPUTS:
func: function to minimize
args: args of function
minarg: argument to minimize w.r.t. (usually = 1)
startvals: kxp matrix of values to try for sure (don't include all
zeros, that's automatic)
max iters per start value
number of additional random start values to try

OUTPUT: theta - the best value found - NOT iterated to convergence


File: optim.info, Node: Residual optimization, Next: Zero finders, Prev: Scalar optimization, Up: Top

2 Functions for optimization of a model function returning an array
*******************************************************************

Model functions whose parameters are to be optimized may return a
vector or array of values. Either these or their differences from some
constant values (curve fitting) can be minimized in some sense, often,
but not necessarily, by minimizing the sum of their squares. It is
usually preferable to use optimizers designed for residual optimization
for this purpose. These can exploit information contained in the
individual elements of the returned array, which would not be possible
if the user calculated a norm (e.g. sum of squares) of the elements and
performed a scalar optimization.

* Menu:

Optimization frontends

* nonlin_residmin:: The standard interface for non-linear residual minimization.
* nonlin_curvefit:: A convenience interface, curve fitting.

Optimization backends

* lm_svd_feasible:: L/M algorithm with SVD, constraints met throughout optimization.

Statistics frontends

* residmin_stat:: Statistics for residual minimization.
* curvefit_stat:: Statistics for curve fitting.

Statistics backends

* wls:: Statistics for weighted least squares.

Standalone functions

* leasqr:: An older function for curve fitting.
* expfit:: Prony's method for non-linear exponential fitting.
* polyfitinf:: Function polyfitinf for polynomial fitting.
* wpolyfit:: Polynomial fitting suitable for polyconf.
* polyconf:: Confidence and prediction intervals for polynomial fitting.
* LinearRegression:: Function LinearRegression.
* wsolve:: Another linear solver.
* linprog:: Linear programming.


File: optim.info, Node: nonlin_residmin, Next: nonlin_curvefit, Up: Residual optimization

2.1 Frontend nonlin_residmin for non-linear residual minimization
=================================================================

 -- Function File: [P, RESID, CVG, OUTP] = nonlin_residmin (F, PIN)
 -- Function File: [P, RESID, CVG, OUTP] = nonlin_residmin (F, PIN, SETTINGS)

Frontend for nonlinear minimization of residuals returned by a model
function. The functions supplied by the user have a minimal interface;
any additionally needed constants (e.g. observed values) can be
supplied by wrapping the user functions into anonymous functions.
The following description applies to usage with vector-based parameter
handling. Differences in usage for structure-based parameter handling
will be explained separately.

F: function returning the array of residuals. It gets a column vector
of real parameters as argument. In gradient determination, this
function may be called with an informational second argument, whose
content depends on the function for gradient determination.

PIN: real column vector of initial parameters.

SETTINGS: structure whose fields stand for optional settings referred
to below. The fields can be set by `optimset()'.

The returned values are the column vector of final parameters P, the
final array of residuals RESID, an integer CVG indicating if and how
optimization succeeded or failed, and a structure OUTP with additional
information, currently with the fields: `niter', the number of
iterations and `user_interaction', information on user stops (see
settings). The backend may define additional fields. CVG is greater
than zero for success and less than or equal to zero for failure; its
possible values depend on the backend used and currently can be `0'
(maximum number of iterations exceeded), `2' (parameter change less
than specified precision in two consecutive iterations), `3'
(improvement in objective function - e.g. sum of squares - less than
specified), or `-1' (algorithm aborted by a user function).

See also: *note nonlin_curvefit: XREFnonlin_curvefit.

Settings
--------

The fields of the SETTINGS structure can be set with *note optimset:
(octave)XREFoptimset.

For settings common to all frontends (including those for statistics)
*note Common frontend options::.

For additional settings common to all optimization frontends *note
Common optimization options::.

Specific defaults:
..................

`Algorithm': "lm_svd_feasible"

Additional settings:
....................

`weights'
Array of weights for the residuals. Dimensions must match.

`dfdp'
Function computing the Jacobian of the residuals with respect to the
parameters, assuming residuals are reshaped to a column vector.
Default: real finite differences. Will be called with the column vector
of parameters and an informational structure as arguments. If `dfdp'
was specified by the user, the informational structure has the fields
`f': value of residuals for current parameters, reshaped to a column
vector, `fixed': logical vector indicating which parameters are not
optimized, so these partial derivatives need not be computed and can be
set to zero, `diffp', `diff_onesided', `lbound', `ubound': identical to
the user settings of these names, `plabels': 1-dimensional cell-array
of column-cell-arrays, each column with labels for all parameters; the
first column contains the numerical indices of the parameters; the
second and third columns, present for structure based parameter
handling, *note Parameter structures::, contain the names of the
parameters and the subindices of the parameters, *note Non-scalar
parameters::, respectively. The default Jacobian function will call the
model function with the second argument set with fields `f': as the `f'
passed to the Jacobian function, `plabels': cell-array of 1x1
cell-arrays with the entries of the column-cell-arrays of `plabels' as
passed to the Jacobian function corresponding to current parameter,
`side': `0' for one-sided interval, `1' or `2', respectively, for the
sides of a two-sided interval, and `parallel': logical scalar
indicating parallel computation of partial derivatives. This
information can be useful if the model function can omit some
computations depending on the currently computed partial derivative.
This information can be useful if the model function can omit some computations depending on the currently computed partial derivative. `complex_step_derivative_f' Logical scalar, default: `false'. Estimate Jacobian of model function with complex step derivative approximation. Use only if you know that your model function is suitable for this. In this case, no user function for the Jacobian (`dfdp') may be specified. `plot_cmd' Function enabling backend to plot results or intermediate results. Will be called with the currently computed residuals. Default: plot nothing. This setting is deprecated and will disappear. Please use `user_interaction' instead ( *note Common optimization options::). Structure based parameter handling ---------------------------------- Please *note Parameter structures::. Backend information ------------------- Please *note Residual optimization:: and choose backend from menu under `Optimization backends'.  File: optim.info, Node: nonlin_curvefit, Next: lm_svd_feasible, Prev: nonlin_residmin, Up: Residual optimization 2.2 Function nonlin_curvefit() for curve fitting ================================================ In curve fitting, the model function computes values from a constant set of `independents', and the intention is to minimize the differences of these computed values to a constant set of `observations'. This can be done with `nonlin_residmin', but it is more convenient to use `nonlin_curvefit', which takes care of passing the constant `independents' to the model function and of calculating the differences to the constant `observations'. However, if in some optimization problem you notice that you end up passing dummy values for the `independents' and zeros for the `observations', you can more naturally use `nonlin_residmin' instead of `nonlin_curvefit'. -- Function File: [P, FY, CVG, OUTP] = nonlin_curvefit (F, PIN, X, Y) -- Function File: [P, FY, CVG, OUTP] = nonlin_curvefit (F, PIN, X, Y, SETTINGS) Frontend for nonlinear fitting of values, computed by a model function, to observed values. Please refer to the description of `nonlin_residmin'. The differences to `nonlin_residmin' are the additional arguments X (independent values, mostly, but not necessarily, an array of the same dimensions or the same number of rows as Y) and Y (array of observations), the returned value FY (final guess for observed values) instead of RESID, that the model function has a second obligatory argument which will be set to X and is supposed to return guesses for the observations (with the same dimensions), and that the possibly user-supplied function for the jacobian of the model function has also a second obligatory argument which will be set to X. See also: *note nonlin_residmin: XREFnonlin_residmin. Also, if the setting `user_interaction' is given, additional information is passed to these functions, *note Common optimization options::.  File: optim.info, Node: lm_svd_feasible, Next: residmin_stat, Prev: nonlin_curvefit, Up: Residual optimization 2.3 Default backend lm_svd_feasible of residual minimization ============================================================ Levenberg/Marquardt algorithm using singular value decomposition. Constraints must be met by the initial parameters, and the algorithm attempts to keep them met throughout the optimization. Returned value CVG will be `0', `1', or `2'. Returned structure OUTP will have the fields `niter' and `user_interaction'.
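Since this is the default backend of `nonlin_residmin' and `nonlin_curvefit', it normally need not be requested explicitly; if desired, it (or another backend) can be selected with the `Algorithm' setting ( *note Common optimization options::), as in this sketch:

     settings = optimset ("Algorithm", "lm_svd_feasible");
     [p, resid, cvg, outp] = nonlin_residmin (f, pin, settings);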
Backend-specific defaults are: `MaxIter': 20, `fract_prec': `zeros (size (parameters))', `max_fract_change': `Inf' for all parameters. The setting `TolX' is not honoured. Interpretation of `Display': if set to "iter", currently some diagnostics are printed. Specific option: `lm_svd_feasible_alt_s': if falling back to nearly gradient descent, do it more like original Levenberg/Marquardt method, with descent in each gradient component; for testing only.  File: optim.info, Node: residmin_stat, Next: curvefit_stat, Prev: lm_svd_feasible, Up: Residual optimization 2.4 Statistics for residual minimization ======================================== -- Function File: INFO = residmin_stat (F, P, SETTINGS) Frontend for computation of statistics for a residual-based minimization. SETTINGS is a structure whose fields can be set by `optimset'. With SETTINGS the computation of certain statistics is requested by setting the fields `ret_<statistic>' to `true'. The respective statistics will be returned in a structure as fields with name `<statistic>'. Depending on the requested statistic and on the additional information provided in SETTINGS, F and P may be empty. Otherwise, F is the model function of an optimization (the interface of F is described e.g. in `nonlin_residmin', please see there), and P is a real column vector with parameters resulting from the same optimization. Currently, the following statistics (or general information) can be requested: `dfdp': Jacobian of model function with respect to parameters. `covd': Covariance matrix of data (typically guessed by applying a factor to the covariance matrix of the residuals). `covp': Covariance matrix of final parameters. `corp': Correlation matrix of final parameters. See also: *note curvefit_stat: XREFcurvefit_stat. Further settings ---------------- The fields of the SETTINGS structure can be set with *note optimset: (octave)XREFoptimset. For settings common to all frontends *note Common frontend options::. Additional settings: .................... `objf_type' Type of objective function of the optimization; must be specified in many cases. This determines which backends to use. Currently, there are only backends for the type "wls" (weighted least squares). `residuals' `covd' Optional information on the result of optimization, residuals and covariance matrix of data, respectively. `weights' Array of weights applied to the residuals in the previous optimization. Dimensions must match those of the residuals. `dfdp' Can be set in the same way and has the same default as in `nonlin_residmin' ( *note nonlin_residmin::), but alternatively may already contain the computed Jacobian of the model function at the final parameters in matrix- or structure-form. `complex_step_derivative_f' Estimate Jacobian of model function with complex step derivative approximation. Use only if you know that your model function is suitable for this. In this case, no user function for the Jacobian (`dfdp') may be specified. Structure based parameter handling ---------------------------------- Please *note Parameter structures::. Backend information ------------------- Please *note Residual optimization:: and choose backend from menu under `Statistics backends'.
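For example, a sketch of requesting the correlation matrix of the final parameters P of a preceding `nonlin_residmin' run with model function F (weighted least squares is the only currently supported `objf_type'; the statistic `corp' is requested by setting `ret_corp'):

     settings = optimset ("objf_type", "wls", "ret_corp", true);
     info = residmin_stat (F, P, settings);
     info.corp    % correlation matrix of the final parameters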
File: optim.info, Node: curvefit_stat, Next: wls, Prev: residmin_stat, Up: Residual optimization 2.5 Statistics for curve fitting ================================ As `nonlin_curvefit' can be used instead of `nonlin_residmin' for curve fitting ( *note nonlin_curvefit::, *note nonlin_residmin::), `curvefit_stat' can be used instead of `residmin_stat' ( *note residmin_stat::) for statistics on the results of curve fitting. -- Function File: INFO = curvefit_stat (F, P, X, Y, SETTINGS) Frontend for computation of statistics for fitting of values, computed by a model function, to observed values. Please refer to the description of `residmin_stat'. The only differences to `residmin_stat' are the additional arguments X (independent values) and Y (observations), that the model function F, if provided, has a second obligatory argument which will be set to X and is supposed to return guesses for the observations (with the same dimensions), and that the possibly user-supplied function for the jacobian of the model function has also a second obligatory argument which will be set to X. See also: *note residmin_stat: XREFresidmin_stat.  File: optim.info, Node: wls, Next: leasqr, Prev: curvefit_stat, Up: Residual optimization 2.6 Statistics for weighted least squares ========================================= The backends for `objf_type == "wls"' (currently the only supported type of objective function) compute `covd' (due to user request or as a prerequisite for `covp' and `corp') as a diagonal matrix by assuming that the variances of data points are proportional to the reciprocal of the squared `weights' and guessing the factor of proportionality from the residuals. If `covp' is not defined (e.g. because the Jacobian has no full rank), an attempt is made to still compute its uniquely defined elements, if any. In `corp', interdependent parameters can cause elements of `1' or `-1', which in this case are not the real coefficients of correlation, but rather indicate the direction of parameter interdependence. To be consistent with this, an attempt is made (often not successful) to identify parameter interdependence and mark it with elements of `1' or `-1' in `corp' even if the respective elements of `covp' cannot be computed.  File: optim.info, Node: leasqr, Next: expfit, Prev: wls, Up: Residual optimization 2.7 An older function for curve fitting ======================================= This was a popular function for curve fitting and has been enhanced to honour constraints. `nonlin_curvefit' ( *note nonlin_curvefit::) now does the same job if used with the default backend, and should be preferred due to its more powerful interface. The statistics returned by `leasqr' can also (and partially better) be computed with `curvefit_stat' ( *note curvefit_stat::). There are currently two things which still only `leasqr' does: * internally providing a function for plotting fits during optimization, * returning a pre-computed matrix for determining confidence regions. -- Function File: leasqr (X, Y, PIN, F) -- Function File: leasqr (X, Y, PIN, F, STOL) -- Function File: leasqr (X, Y, PIN, F, STOL, NITER) -- Function File: leasqr (X, Y, PIN, F, STOL, NITER, WT) -- Function File: leasqr (X, Y, PIN, F, STOL, NITER, WT, DP) -- Function File: leasqr (X, Y, PIN, F, STOL, NITER, WT, DP, DFDP) -- Function File: leasqr (X, Y, PIN, F, STOL, NITER, WT, DP, DFDP, OPTIONS) -- Function File: [F, P, CVG, ITER, CORP, COVP, COVR, STDRESID, Z, R2] = leasqr (...) Levenberg-Marquardt nonlinear regression.
Input arguments: X Vector or matrix of independent variables. Y Vector or matrix of observed values. PIN Vector of initial parameters to be adjusted by leasqr. F Name of function or function handle. The function must be of the form `y = f(x, p)', with y, x, p of the form Y, X, PIN. STOL Scalar tolerance on fractional improvement in scalar sum of squares, i.e., `sum ((WT .* (Y-F))^2)'. Set to 0.0001 if empty or not given. NITER Maximum number of iterations. Set to 20 if empty or not given. WT Statistical weights (same dimensions as Y). These should be set to be proportional to `sqrt (Y) ^-1', i.e., the covariance matrix of the data is assumed to be proportional to a diagonal matrix with diagonal equal to `(WT.^2)^-1'. The constant of proportionality will be estimated. Set to `ones (size (Y))' if empty or not given. DP Fractional increment of P for numerical partial derivatives. Set to `0.001 * ones (size (PIN))' if empty or not given. * dp(j) > 0 means central differences on j-th parameter p(j). * dp(j) < 0 means one-sided differences on j-th parameter p(j). * dp(j) = 0 holds p(j) fixed, i.e., leasqr won't change initial guess: pin(j) DFDP Name of partial derivative function in quotes or function handle. If not given or empty, set to `dfdp', a slow but general partial derivatives function. The function must be of the form `prt = dfdp (x, f, p, dp, F [,bounds])'. For backwards compatibility, the function will only be called with an extra 'bounds' argument if the 'bounds' option is explicitly specified to leasqr (see dfdp.m). OPTIONS Structure with multiple options. The following fields are recognized: fract_prec Column vector (same length as PIN) of desired fractional precisions in parameter estimates. Iterations are terminated if change in parameter vector (chg) relative to current parameter estimate is less than the corresponding elements in 'fract_prec', i.e., `all (abs (chg) < abs (options.fract_prec .* current_parm_est))' on two consecutive iterations. Defaults to `zeros (size (PIN))'. max_fract_change Column vector (same length as PIN) of maximum fractional step changes in parameter vector. Fractional change in elements of parameter vector is constrained to be at most 'max_fract_change' between successive iterations, i.e., `abs (chg(i)) = abs (min([chg(i), options.max_fract_change(i) * current param estimate]))'. Defaults to `Inf * ones (size (PIN))'. inequc Cell-array containing up to four entries, two entries for linear inequality constraints and/or one or two entries for general inequality constraints. Initial parameters must satisfy these constraints. Either linear or general constraints may be the first entries, but the two entries for linear constraints must be adjacent and, if two entries are given for general constraints, they also must be adjacent. The two entries for linear constraints are a matrix (say m) and a vector (say v), specifying linear inequality constraints of the form `m.' * parameters + v >= 0'. If the constraints are just bounds, it is suggested to specify them in 'options.bounds' instead, since then some sanity tests are performed, and since the function 'dfdp.m' is guaranteed not to violate constraints during determination of the numeric gradient only for those constraints specified as 'bounds' (possibly with violations due to a certain inaccuracy, however, except if no constraints except bounds are specified).
The first entry for general constraints must be a differentiable vector valued function (say h), specifying general inequality constraints of the form `h (p[, idx]) >= 0'; p is the column vector of optimized parameters and the optional argument idx is a logical index. h has to return the values of all constraints if idx is not given, and has to return only the indexed constraints if idx is given (so computation of the other constraints can be spared). If a second entry for general constraints is given, it must be a function (say dh) which returns a matrix whose rows contain the gradients of the constraint function h with respect to the optimized parameters. It has the form jac_h = dh (vh, p, dp, h, idx[, bounds]); p is the column vector of optimized parameters, and idx is a logical index -- only the rows indexed by idx must be returned (so computation of the others can be spared). The other arguments of dh are for the case that dh computes numerical gradients: vh is the column vector of the current values of the constraint function h, with idx already applied. h is a function h (p) to compute the values of the constraints for parameters p; it will return only the values indexed by idx. dp is a suggestion for relative step width, having the same value as the argument 'dp' of leasqr above. If bounds were specified to leasqr, they are provided in the argument bounds of dh, to enable their consideration in determination of numerical gradients. If dh is not specified to leasqr, numerical gradients are computed in the same way as with 'dfdp.m' (see above). If some constraints are linear, they should be specified as linear constraints (or bounds, if applicable) for reasons of performance, even if general constraints are also specified. bounds Two-column-matrix, one row for each parameter in PIN. Each row contains a minimal and maximal value for each parameter. Default: [-Inf, Inf] in each row. If this field is used with an existing user-side function for 'dFdp' (see above) the function's interface might have to be changed. equc Equality constraints, specified the same way as inequality constraints (see field 'options.inequc'). Initial parameters must satisfy these constraints. Note that there is possibly a certain inaccuracy in honoring constraints, except if only bounds are specified. _Warning_: If constraints (or bounds) are set, returned guesses of CORP, COVP, and Z are generally invalid, even if no constraints are active for the final parameters. If equality constraints are specified, CORP, COVP, and Z are not guessed at all. cpiv Function for complementary pivot algorithm for inequality constraints. Defaults to cpiv_bard. No different function is supplied. For backwards compatibility, OPTIONS can also be a matrix whose first and second columns contain the values of fract_prec and max_fract_change, respectively. Output: F Column vector of values computed: f = F(x,p). P Column vector of trial or final parameters, i.e., the solution. CVG Scalar: = 1 if convergence, = 0 otherwise. ITER Scalar number of iterations used. CORP Correlation matrix for parameters. COVP Covariance matrix of the parameters. COVR Diag(covariance matrix of the residuals). STDRESID Standardized residuals. Z Matrix that defines confidence region (see comments in the source). R2 Coefficient of multiple determination, intercept form. Not suitable for non-real residuals. References: Bard, Nonlinear Parameter Estimation, Academic Press, 1974. Draper and Smith, Applied Regression Analysis, John Wiley and Sons, 1981.
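A minimal usage sketch (the data and the exponential model are made up for illustration):

     x = (1:10)';
     y = 3 * exp (-0.4 * x);                  % hypothetical observations
     F = @(x, p) p(1) * exp (p(2) * x);       % model of the form y = f (x, p)
     [f, p, cvg, iter] = leasqr (x, y, [1; -0.1], F);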
File: optim.info, Node: expfit, Next: polyfitinf, Prev: leasqr, Up: Residual optimization 2.8 Prony's method for non-linear exponential fitting ===================================================== Helptext: --------- USAGE [alpha,c,rms] = expfit( deg, x1, h, y ) Prony's method for non-linear exponential fitting Fit function: \sum_1^{deg} c(i)*exp(alpha(i)*x) Elements of data vector y must correspond to equidistant x-values starting at x1 with stepsize h The method is fully compatible with complex linear coefficients c, complex nonlinear coefficients alpha and complex input arguments y, x1, non-zero h. Fit-order deg must be a real positive integer. Returns linear coefficients c, nonlinear coefficients alpha and root mean square error rms. This method is known to be more stable than 'brute-force' non-linear least squares fitting. Example x0 = 0; step = 0.05; xend = 5; x = x0:step:xend; y = 2*exp(1.3*x)-0.5*exp(2*x); error = (rand(1,length(y))-0.5)*1e-4; [alpha,c,rms] = expfit(2,x0,step,y+error) alpha = 2.0000 1.3000 c = -0.50000 2.00000 rms = 0.00028461 The fit is very sensitive to the number of data points. It doesn't perform very well for small data sets. Theoretically, you need at least 2*deg data points, but if there are errors on the data, you certainly need more. Be aware that this is a very (very,very) ill-posed problem. By the way, this algorithm relies heavily on computing the roots of a polynomial. I used 'roots.m', if there is something better please use that code. Demo for a complex fit-function: deg= 2; N= 20; x1= -(1+i), x= linspace(x1,1+i/2,N).'; h = x(2) - x(1) y= (2+i)*exp( (-1-2i)*x ) + (-1+3i)*exp( (2+3i)*x ); A= 5e-2; y+= A*(randn(N,1)+randn(N,1)*i); % add complex noise [alpha,c,rms]= expfit( deg, x1, h, y )  File: optim.info, Node: polyfitinf, Next: wpolyfit, Prev: expfit, Up: Residual optimization 2.9 Function polyfitinf for polynomial fitting. =============================================== Helptext: --------- function [A,REF,HMAX,H,R,EQUAL] = polyfitinf(M,N,K,X,Y,EPSH,MAXIT,REF0) Best polynomial approximation in discrete uniform norm INPUT VARIABLES: M : degree of the fitting polynomial N : number of data points X(N) : x-coordinates of data points Y(N) : y-coordinates of data points K : character of the polynomial: K = 0 : mixed parity polynomial K = 1 : odd polynomial ( X(1) must be > 0 ) K = 2 : even polynomial ( X(1) must be >= 0 ) EPSH : tolerance for leveling. A useful value for 24-bit mantissa is EPSH = 2.0E-7 MAXIT : upper limit for number of exchange steps REF0(M2): initial alternating set ( N-vector ). This is an OPTIONAL argument. The length M2 is given by: M2 = M + 2 , if K = 0 M2 = integer part of (M+3)/2 , if K = 1 M2 = 2 + M/2 (M must be even) , if K = 2 OUTPUT VARIABLES: A : polynomial coefficients of the best approximation in order of increasing powers: p*(x) = A(1) + A(2)*x + A(3)*x^2 + ... REF : selected alternating set of points HMAX : maximum deviation ( uniform norm of p* - f ) H : pointwise approximation errors R : total number of iterations EQUAL : success or failure of algorithm EQUAL=1 : successful EQUAL=0 : convergence not achieved EQUAL=-1: input error EQUAL=-2: algorithm failure Relies on function EXCH, provided below.
Example: M = 5; N = 10000; K = 0; EPSH = 10^-12; MAXIT = 10; X = linspace(-1,1,N); % uniformly spaced nodes on [-1,1] k=1; Y = abs(X).^k; % the function Y to approximate [A,REF,HMAX,H,R,EQUAL] = polyfitinf(M,N,K,X,Y,EPSH,MAXIT); p = polyval(A,X); plot(X,Y,X,p) % p is the best approximation Note: using an even value of M, e.g., M=2, in the example above, makes the algorithm fail with EQUAL=-2, because of collocation, which appears because both the approximating function and the polynomial are even functions. The way around it is to approximate only the right half of the function, setting K = 2 : even polynomial. For example: N = 10000; K = 2; EPSH = 10^-12; MAXIT = 10; X = linspace(0,1,N); for i = 1:2 k = 2*i-1; Y = abs(X).^k; for j = 1:4 M = 2^j; [~,~,HMAX] = polyfitinf(M,N,K,X,Y,EPSH,MAXIT); approxerror(i,j) = HMAX; end end disp('Table 3.1 from Approximation theory and methods, M.J.D.POWELL, p. 27'); disp(' '); disp(' n K=1 K=3'); disp(' '); format short g; disp([(2.^(1:4))' approxerror']); ALGORITHM: Computation of the polynomial that best approximates the data (X,Y) in the discrete uniform norm, i.e. the polynomial with the minimum value of max{ | p(x_i) - y_i | , x_i in X }. That polynomial, also known as minimax polynomial, is obtained by the exchange algorithm, a finite iterative process requiring, at most, (n choose p) iterations (usually p = M + 2; see also function EXCH). Since this number can be very large, the routine may not converge within MAXIT iterations. The other possibility of failure occurs when there is insufficient floating point precision for the input data chosen. CREDITS: This routine was developed and modified as computer assignments in Approximation Theory courses by Prof. Andrew Knyazev, University of Colorado Denver, USA. Team Fall 98 (Revision 1.0): Chanchai Aniwathananon Crhistopher Mehl David A. Duran Saulo P. Oliveira Team Spring 11 (Revision 1.1): Manuchehr Aminian The algorithm and the comments are based on a FORTRAN code written by Joseph C. Simpson. The code is available on Netlib repository: http://www.netlib.org/toms/501 See also: Communications of the ACM, V14, pp.355-356(1971) NOTES: 1) A may contain the collocation polynomial 2) If MAXIT is exceeded, REF contains a new reference set 3) M, EPSH and REF can be altered during the execution 4) To keep consistency with the original code, EPSH can be negative. However, the use of REF0 is *NOT* determined by EPSH < 0, but only by its inclusion as an input parameter. Some parts of the code can still take advantage of vectorization. Revision 1.0 from 1998 is a direct human translation of the FORTRAN code http://www.netlib.org/toms/501 Revision 1.1 is a clean-up and technical update. Tested on MATLAB Version 7.11.0.584 (R2010b) and GNU Octave Version 3.2.4  File: optim.info, Node: wpolyfit, Next: polyconf, Prev: polyfitinf, Up: Residual optimization 2.10 Polynomial fitting suitable for polyconf ============================================= -- Function File: [P, S] = wpolyfit (X, Y, DY, N) Return the coefficients of a polynomial P(X) of degree N that minimizes `sumsq (p(x(i)) - y(i))', to best fit the data in the least squares sense. The standard errors of the observations Y, if present, are given in DY. The returned value P contains the polynomial coefficients suitable for use in the function polyval. The structure S returns information necessary to compute uncertainty in the model.
To compute the predicted values of y with uncertainty use [y,dy] = polyconf(p,x,s,'ci'); You can see the effects of different confidence intervals and prediction intervals by calling the wpolyfit internal plot function with your fit: feval('wpolyfit:plt',x,y,dy,p,s,0.05,'pi') Use DY=[] if uncertainty is unknown. You can use a chi^2 test to reject the polynomial fit: p = 1-chi2cdf(s.normr^2,s.df); p is the probability of seeing a chi^2 value higher than that which was observed assuming the data are normally distributed around the fit. If p < 0.01, you can reject the fit at the 1% level. You can use an F test to determine if a higher order polynomial improves the fit: [poly1,S1] = wpolyfit(x,y,dy,n); [poly2,S2] = wpolyfit(x,y,dy,n+1); F = (S1.normr^2 - S2.normr^2)/(S1.df-S2.df)/(S2.normr^2/S2.df); p = 1-f_cdf(F,S1.df-S2.df,S2.df); p is the probability of observing the improvement in chi^2 obtained by adding the extra parameter to the fit. If p < 0.01, you can reject the lower order polynomial at the 1% level. You can estimate the uncertainty in the polynomial coefficients themselves using dp = sqrt(sumsq(inv(s.R'))'/s.df)*s.normr; but the high degree of covariance amongst them makes this a questionable operation. -- Function File: [P, S, MU] = wpolyfit (...) If an additional output `mu = [mean(x),std(x)]' is requested then the X values are centered and normalized prior to computing the fit. This will give more stable numerical results. To compute a predicted Y from the returned model use `y = polyval(p, (x-mu(1))/mu(2))' -- Function File: wpolyfit (...) If no output arguments are requested, then wpolyfit plots the data, the fitted line and polynomials defining the standard error range. Example x = linspace(0,4,20); dy = (1+rand(size(x)))/2; y = polyval([2,3,1],x) + dy.*randn(size(x)); wpolyfit(x,y,dy,2); -- Function File: wpolyfit (..., 'origin') If 'origin' is specified, then the fitted polynomial will go through the origin. This is generally ill-advised. Use with caution. Hocking, RR (2003). Methods and Applications of Linear Models. New Jersey: John Wiley and Sons, Inc. See also: *note polyconf: XREFpolyconf. See also *note polyfit: (octave)XREFpolyfit.  File: optim.info, Node: polyconf, Next: LinearRegression, Prev: wpolyfit, Up: Residual optimization 2.11 Confidence and prediction intervals for polynomial fitting =============================================================== Helptext: --------- [y,dy] = polyconf(p,x,s) Produce prediction intervals for the fitted y. The vector p and structure s are returned from polyfit or wpolyfit. The x values are where you want to compute the prediction interval. polyconf(...,['ci'|'pi']) Produce a confidence interval (range of likely values for the mean at x) or a prediction interval (range of likely values seen when measuring at x). The prediction interval tells you the width of the distribution at x. This should be the same regardless of the number of measurements you have for the value at x. The confidence interval tells you how well you know the mean at x. It should get smaller as you increase the number of measurements. Error bars in the physical sciences usually show a 1-alpha confidence value of erfc(1/sqrt(2)), representing one standard deviation of uncertainty in the mean. polyconf(...,1-alpha) Control the width of the interval. If asking for the prediction interval 'pi', the default is .05 for the 95% prediction interval.
If asking for the confidence interval 'ci', the default is erfc(1/sqrt(2)) for a one standard deviation confidence interval. Example: [p,s] = polyfit(x,y,1); xf = linspace(x(1),x(end),150); [yf,dyf] = polyconf(p,xf,s,'ci'); plot(xf,yf,'g-;fit;',xf,yf+dyf,'g.;;',xf,yf-dyf,'g.;;',x,y,'xr;data;'); plot(x,y-polyval(p,x),';residuals;',xf,dyf,'g-;;',xf,-dyf,'g-;;');  File: optim.info, Node: LinearRegression, Next: wsolve, Prev: polyconf, Up: Residual optimization 2.12 Function LinearRegression ============================== -- Function File: [P,E_VAR,R,P_VAR,Y_VAR] = LinearRegression (F,Y) -- Function File: [P,E_VAR,R,P_VAR,Y_VAR] = LinearRegression (F,Y,W) General linear regression: determine the parameters p_j (j=1,2,...,m) such that the function f(x) = sum_(j=1,...,m) p_j*f_j(x) is the best fit to the given values y_i = f(x_i) parameters: * F is an n*m matrix with the values of the basis functions at the support points. In column j give the values of f_j at the points x_i (i=1,2,...,n) * Y is a column vector of length n with the given values * W is a column vector of length n with the weights of the data points return values: * P is the vector of length m with the estimated values of the parameters * E_VAR is the estimated variance of the difference between fitted and measured values * R is the weighted norm of the residual * P_VAR is the estimated variance of the parameters p_j * Y_VAR is the estimated variance of the dependent variables Caution: do NOT request Y_VAR for large data sets, as an n by n matrix is generated See also *note regress: (octave)XREFregress, *note leasqr: XREFleasqr, *note nonlin_curvefit: XREFnonlin_curvefit, *note polyfit: (octave)XREFpolyfit, *note wpolyfit: XREFwpolyfit, *note expfit: XREFexpfit.  File: optim.info, Node: wsolve, Next: linprog, Prev: LinearRegression, Up: Residual optimization 2.13 Function wsolve, another linear solver =========================================== Helptext: --------- [x,s] = wsolve(A,y,dy) Solve a potentially over-determined system with uncertainty in the values. A x = y +/- dy Use QR decomposition for increased accuracy. Estimate the uncertainty for the solution from the scatter in the data. The returned structure s contains normr = sqrt( A x - y ), weighted by dy R such that R'R = A'A df = n-p, n = rows of A, p = columns of A See polyconf for details on how to use s to compute dy. The covariance matrix is inv(R'*R). If you know that the parameters are independent, then uncertainty is given by the diagonal of the covariance matrix, or dx = sqrt(N*sumsq(inv(s.R'))') where N = normr^2/df, or N = 1 if df = 0. Example 1: weighted system A=[1,2,3;2,1,3;1,1,1]; xin=[1;2;3]; dy=[0.2;0.01;0.1]; y=A*xin+randn(size(dy)).*dy; [x,s] = wsolve(A,y,dy); dx = sqrt(sumsq(inv(s.R'))'); res = [xin, x, dx] Example 2: weighted overdetermined system y = x1 + 2*x2 + 3*x3 + e A = fullfact([3,3,3]); xin=[1;2;3]; y = A*xin; dy = rand(size(y))/50; y+=dy.*randn(size(y)); [x,s] = wsolve(A,y,dy); dx = s.normr*sqrt(sumsq(inv(s.R'))'/s.df); res = [xin, x, dx] Note there is a counter-intuitive result that scaling the uncertainty in the data does not affect the uncertainty in the fit. Indeed, if you perform a monte carlo simulation with x,y datasets selected from a normal distribution centered on y with width 10*dy instead of dy you will see that the variance in the parameters indeed increases by a factor of 100.
However, if the error bars really do increase by a factor of 10 you should expect a corresponding increase in the scatter of the data, which will increase the variance computed by the fit.  File: optim.info, Node: linprog, Prev: wsolve, Up: Residual optimization 2.14 Linear programming ======================= -- Function File: X = linprog (F, A, B) -- Function File: X = linprog (F, A, B, AEQ, BEQ) -- Function File: X = linprog (F, A, B, AEQ, BEQ, LB, UB) -- Function File: [X, FVAL] = linprog (...) Solve a linear problem. Finds min (f' * x) (both f and x are column vectors) subject to A * x <= b Aeq * x = beq lb <= x <= ub If not specified, AEQ and BEQ default to empty matrices. If not specified, the lower bound LB defaults to minus infinity and the upper bound UB defaults to infinity. See also *note glpk: (octave)XREFglpk.  File: optim.info, Node: Zero finders, Next: Gradient functions, Prev: Residual optimization, Up: Top 3 Functions for finding the zero of a nonlinear user function ************************************************************* There is only one dedicated zero finder in the optim package, which is just a vectorized version of Octave's fzero ( *note fzero: (octave)XREFfzero.). * Menu: * vfzero:: A vectorized version of fzero.  File: optim.info, Node: vfzero, Up: Zero finders 3.1 A vectorized version of fzero ================================= -- Function File: vfzero (FUN, X0) -- Function File: vfzero (FUN, X0, OPTIONS) -- Function File: [X, FVAL, INFO, OUTPUT] = vfzero (...) A variant of `fzero'. Finds a zero of a vector-valued multivariate function where each output element only depends on the input element with the same index (so the Jacobian is diagonal). FUN should be a handle or name of a function returning a column vector. X0 should be a two-column matrix, each row specifying two points which bracket a zero of the respective output element of FUN. If X0 is a single-column matrix then several nearby and distant values are probed in an attempt to obtain a valid bracketing. If this is not successful, the function fails. OPTIONS is a structure specifying additional options. Currently, `vfzero' recognizes these options: `"FunValCheck"', `"OutputFcn"', `"TolX"', `"MaxIter"', `"MaxFunEvals"'. For a description of these options, see optimset. On exit, the function returns X, the approximate zero and FVAL, the function value thereof. INFO is a column vector of exit flags that can have these values: * 1 The algorithm converged to a solution. * 0 Maximum number of iterations or function evaluations has been reached. * -1 The algorithm has been terminated from user output function. * -5 The algorithm may have converged to a singular point. OUTPUT is a structure containing runtime information about the `fzero' algorithm. Fields in the structure are: * iterations Number of iterations through loop. * nfev Number of function evaluations. * bracketx A two-column matrix with the final bracketing of the zero along the x-axis. * brackety A two-column matrix with the final bracketing of the zero along the y-axis.  File: optim.info, Node: Gradient functions, Next: Helper functions, Prev: Zero finders, Up: Top 4 Functions for numerical approximation of gradients and Hessians ***************************************************************** You should not usually need to use these functions directly or pass them as arguments. They should be chosen and used by optimizer functions, possibly subject to their configuration options.
* Menu: * dfpdp:: Direct user interface to default numerical gradient method of new frontends. * deriv:: Higher order numerical derivatives. * numgradient:: Another numerical gradient function. * numhessian:: Numerical Hessian function. * cdiff:: A string, yielding the numerical gradient if evaluated. * jacobs:: Complex step derivatives.  File: optim.info, Node: dfpdp, Next: deriv, Up: Gradient functions 4.1 Direct user interface to default numerical gradient method of new frontends =============================================================================== Helptext: --------- function jac = dfpdp (p, func[, hook]) Returns Jacobian of func (p) with respect to p with finite differencing. The optional argument hook is a structure which can contain the following fields at the moment: hook.f: value of func(p) for p as given in the arguments hook.diffp: positive vector of fractional steps from given p in finite differencing (actual steps may be smaller if bounds are given). The default is .001 * ones (size (p)). hook.diff_onesided: logical vector, indexing elements of p for which only one-sided differences should be computed (faster); even if not one-sided, differences might not be exactly central if bounds are given. The default is false (size (p)). hook.fixed: logical vector, indexing elements of p for which zero should be returned instead of the guessed partial derivatives (useful in optimization if some parameters are not optimized, but are 'fixed'). hook.lbound, hook.ubound: vectors of lower and upper parameter bounds (or -Inf or +Inf, respectively) to be respected in finite differencing. The consistency of bounds is not checked.  File: optim.info, Node: deriv, Next: numgradient, Prev: dfpdp, Up: Gradient functions 4.2 Higher order numerical derivatives ====================================== -- Function File: DX = deriv (F, X0) -- Function File: DX = deriv (F, X0, H) -- Function File: DX = deriv (F, X0, H, O) -- Function File: DX = deriv (F, X0, H, O, N) Calculate the derivative of function F. F must be a function handle or the name of a function that takes X0 and returns a variable of equal length and orientation. X0 must be a numeric vector or scalar. H defines the step taken for the derivative calculation. Defaults to 1e-7. O defines the order of the calculation. Supported values are 2 (h^2 order) or 4 (h^4 order). Defaults to 2. N defines the derivative order. Defaults to the 1st derivative of the function. Can be up to the 4th derivative. Reference: Numerical Methods for Mathematics, Science, and Engineering by John H. Mathews.  File: optim.info, Node: numgradient, Next: numhessian, Prev: deriv, Up: Gradient functions 4.3 Another numerical gradient function ======================================= Helptext: --------- numgradient(f, {args}, minarg) Numeric central difference gradient of f with respect to argument "minarg". * first argument: function name (string) * second argument: all arguments of the function (cell array) * third argument: (optional) the argument to differentiate w.r.t. (scalar, default=1) "f" may be vector-valued.
If \"f\" returns an n-vector, and the argument is a k-vector, the gradient will be an nxk matrix Example: function a = f(x); a = [x'*x; 2*x]; endfunction numgradient(\"f\", {ones(2,1)}) ans = 2.00000 2.00000 2.00000 0.00000 0.00000 2.00000  File: optim.info, Node: numhessian, Next: cdiff, Prev: numgradient, Up: Gradient functions 4.4 Numerical Hessian function ============================== Helptext: --------- numhessian(f, {args}, minarg) Numeric second derivative of f with respect to argument \"minarg\". * first argument: function name (string) * second argument: all arguments of the function (cell array) * third argument: (optional) the argument to differentiate w.r.t. (scalar, default=1) If the argument is a k-vector, the Hessian will be a kxk matrix function a = f(x, y) a = x'*x + log(y); endfunction numhessian(\"f\", {ones(2,1), 1}) ans = 2.0000e+00 -7.4507e-09 -7.4507e-09 2.0000e+00 Now, w.r.t. second argument: numhessian(\"f\", {ones(2,1), 1}, 2) ans = -1.0000  File: optim.info, Node: cdiff, Next: jacobs, Prev: numhessian, Up: Gradient functions 4.5 A string, yielding the numerical gradient if evaluated ========================================================== Helptext: --------- c = cdiff (func,wrt,N,dfunc,stack,dx) - Code for num. differentiation = "function df = dfunc (var1,..,dvar,..,varN) .. endfunction Returns a string of octave code that defines a function 'dfunc' that returns the derivative of 'func' with respect to it's 'wrt'th argument. The derivatives are obtained by symmetric finite difference. dfunc()'s return value is in the same format as that of ndiff() func : string : name of the function to differentiate wrt : int : position, in argument list, of the differentiation variable. Default:1 N : int : total number of arguments taken by 'func'. If N=inf, dfunc will take variable argument list. Default:wrt dfunc : string : Name of the octave function that returns the derivatives. Default:['d',func] stack : string : Indicates whether 'func' accepts vertically (stack="rstack") or horizontally (stack="cstack") arguments. Any other string indicates that 'func' does not allow stacking. Default:'' dx : real : Step used in the symmetric difference scheme. Default:10*sqrt(eps) See also : ndiff, eval, todisk  File: optim.info, Node: jacobs, Prev: cdiff, Up: Gradient functions 4.6 Complex step derivatives ============================ -- Function File: Df = jacobs (X, F) -- Function File: Df = jacobs (X, F, HOOK) Calculate the jacobian of a function using the complex step method. Let F be a user-supplied function. Given a point X at which we seek for the Jacobian, the function `jacobs' returns the Jacobian matrix `d(f(1), ..., df(end))/d(x(1), ..., x(n))'. The function uses the complex step method and thus can be applied to real analytic functions. The optional argument HOOK is a structure with additional options. HOOK can have the following fields: * `h' - can be used to define the magnitude of the complex step and defaults to 1e-20; steps larger than 1e-3 are not allowed. * `fixed' - is a logical vector internally usable by some optimization functions; it indicates for which elements of X no gradient should be computed, but zero should be returned. 
For example: f = @(x) [x(1)^2 + x(2); x(2)*exp(x(1))]; Df = jacobs ([1, 2], f)  File: optim.info, Node: Helper functions, Next: Documentation, Prev: Gradient functions, Up: Top 5 Functions for algebraic tasks common to optimization problems *************************************************************** * Menu: * cpiv_bard:: A complementary pivoting algorithm. * gjp:: Gauss-Jordan pivoting.  File: optim.info, Node: cpiv_bard, Next: gjp, Up: Helper functions 5.1 A complementary pivoting algorithm ====================================== Helptext: --------- [lb, idx, ridx, mv] = cpiv_bard (v, m[, incl]) v: column vector; m: matrix; incl (optional): index. length (v) must equal rows (m). Finds column vectors w and l with w == v + m * l, w >= 0, l >= 0, l.' * w == 0. Chooses idx, w, and l so that l(~idx) == 0, l(idx) == -inv (m(idx, idx)) * v(idx), w(idx) roughly == 0, and w(~idx) == v(~idx) + m(idx, ~idx).' * l(idx). idx indexes at least everything indexed by incl, but l(incl) may be < 0. lb: l(idx) (column vector); idx: logical index, defined above; ridx: ~idx & w roughly == 0; mv: [m, v] after performing a Gauss-Jordan 'sweep' (with gjp.m) on each diagonal element indexed by idx. Except the handling of incl (which enables handling of equality constraints in the calling code), this is called solving the 'complementary pivot problem' (Cottle, R. W. and Dantzig, G. B., 'Complementary pivot theory of mathematical programming', Linear Algebra and Appl. 1, 102--125). References for the current algorithm: Bard, Y.: Nonlinear Parameter Estimation, p. 147--149, Academic Press, New York and London 1974; Bard, Y., 'An eclectic approach to nonlinear programming', Proc. ANU Sem. Optimization, Canberra, Austral. Nat. Univ.).  File: optim.info, Node: gjp, Prev: cpiv_bard, Up: Helper functions 5.2 Gauss-Jordan pivoting ========================= Helptext: --------- m = gjp (m, k[, l]) m: matrix; k, l: row- and column-index of pivot, l defaults to k. Gauss-Jordan pivot as defined in Bard, Y.: Nonlinear Parameter Estimation, p. 296, Academic Press, New York and London 1974. In the pivot column, this seems not quite the same as the usual Gauss-Jordan(-Clasen) pivot. Bard gives Beaton, A. E., 'The use of special matrix operators in statistical calculus' Research Bulletin RB-64-51 (1964), Educational Testing Service, Princeton, New Jersey as a reference, but this article is not easily accessible. Another reference, whose definition of gjp differs from Bard's by some signs, is Clarke, R. B., 'Algorithm AS 178: The Gauss-Jordan sweep operator with detection of collinearity', Journal of the Royal Statistical Society, Series C (Applied Statistics) (1982), 31(2), 166--168.  File: optim.info, Node: Documentation, Next: Common frontend options, Prev: Helper functions, Up: Top 6 Function optim_doc to view documentation ****************************************** -- Function File: optim_doc () -- Function File: optim_doc (KEYWORD) Show optim package documentation. Runs the info viewer Octave is configured with on the documentation in info format of the installed optim package. Without argument, the top node of the documentation is displayed. With an argument, the respective index entry is searched for and its node displayed.
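For example (a sketch):

     optim_doc ();            % display the top node of this documentation
     optim_doc ("leasqr");    % look up the index entry for leasqr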
File: optim.info, Node: Common frontend options, Next: Common optimization options, Prev: Documentation, Up: Top 7 Options common to all frontends ********************************* All frontends for optimization and for result statistics (*note nonlin_min::, *note nonlin_residmin::, *note nonlin_curvefit::, *note residmin_stat::, *note curvefit_stat::) accept the following options, settable with *note optimset: (octave)XREFoptimset. These options are handled within the frontend. `diffp' Column vector of fractional intervals (doubled for central intervals) supposed to be used by gradient or Jacobian functions performing finite differencing. Default: `.001 * ones (size (parameters))'. The default function for finite differencing will use these as absolute intervals for parameters with value zero. `diff_onesided' Logical column vector indicating the parameters for which one-sided intervals should be used by gradient or Jacobian functions performing finite differencing. Default: `false (size (parameters))'. `cstep' Scalar step size for complex step derivative approximation of gradients or Jacobians. Default: 1e-20. `parallel_local' Logical scalar, default: `false'. If the `parallel' package, `version >= 2.0.5', is loaded, estimate gradients of objective function and Jacobians of model function and of constraints in parallel processes. Works for default (real) finite differences and for complex step derivatives. Due to overhead, a speed advantage can only be expected if objective function, model function or constraint functions are time consuming enough. Additionally, this setting is also passed to the individual optimization backends, which may also consider this option (see documentation of backends). If this option is set to `true', a warning (ID: `optim:parallel_local') will be issued if no `parallel' package of a correct version is loaded. `fixed' Logical column vector indicating which parameters are not optimized, but kept to their initial value.  File: optim.info, Node: Common optimization options, Next: Parameter structures, Prev: Common frontend options, Up: Top 8 Options common to all optimization frontends ********************************************** All frontends for optimization (*note nonlin_min::, *note nonlin_residmin::, *note nonlin_curvefit::) accept the following options, settable with *note optimset: (octave)XREFoptimset. Settings handled within the frontend ------------------------------------ `Algorithm' String specifying the backend. `complex_step_derivative_inequc,' `complex_step_derivative_equc' Logical scalars, default: `false'. Estimate Jacobian of general inequality constraints and equality constraints, respectively, with complex step derivative approximation. Use only if you know that your function of general inequality constraints or function of general equality constraints, respectively, is suitable for this. In this case, no user function for the respective Jacobian may be specified. Settings passed to the backend ------------------------------ Which of these options are actually honored is noted in the descriptions of the individual backends. `lbound,' `ubound' Column vectors of lower and upper bounds for parameters. Default: `-Inf' and `+Inf', respectively. The bounds are non-strict, i.e. parameters are allowed to be exactly equal to a bound. The default function for gradients or Jacobians will respect bounds (but no further inequality constraints) in finite differencing if the backend respects bounds even during the course of optimization.
`inequc' Further inequality constraints. Cell-array containing up to four entries, two entries for linear inequality constraints and/or one or two entries for general inequality constraints. Either linear or general constraints may be the first entries, but the two entries for linear constraints must be adjacent and, if two entries are given for general constraints, they also must be adjacent. The two entries for linear constraints are a matrix (say `m') and a vector (say `v'), specifying linear inequality constraints of the form `m.' * parameters + v >= 0'. The first entry for general constraints must be a differentiable column-vector valued function (say `h'), specifying general inequality constraints of the form `h (p[, idx]) >= 0'; `p' is the column vector of optimized parameters and the optional argument `idx' is a logical index. `h' has to return the values of all constraints if `idx' is not given. It may choose to return only the indexed constraints if `idx' is given (so computation of the other constraints can be spared); in this case, the additional setting `f_inequc_idx' has to be set to `true'. In gradient determination, this function may be called with an informational third argument, whose content depends on the function for gradient determination. If a second entry for general inequality constraints is given, it must be a function computing the jacobian of the constraints with respect to the parameters. For this function, the description of the setting `dfdp', *note dfdp: XREFoptiondfdp, applies, with 2 exceptions: 1) it is called with 3 arguments since it has an additional argument `idx', a logical index, at second position, indicating which rows of the jacobian must be returned (if the function chooses to return only indexed rows, the additional setting `df_inequc_idx' has to be set to `true'). 2) the default jacobian function calls `h' with 3 arguments, since the argument `idx' is also supplied. Note that specifying linear constraints as general constraints will generally waste performance, even if further, non-linear, general constraints are also specified. `f_inequc_idx,' `df_inequc_idx' Indicate that functions for general inequality constraints or their jacobian, respectively, return only the values or derivatives for the indexed parameters. See description of setting `inequc' above. `equc' Equality constraints. Specified the same way as inequality constraints (see `inequc' above). `f_equc_idx,' `df_equc_idx' As `f_inequc_idx' and `df_inequc_idx' above, but for equality constraints. `cpiv' Function for complementary pivoting, usable in algorithms for constraints. Default: `cpiv_bard'. Only the default function is supplied with the package. `TolFun' Minimum fractional improvement in objective function (e.g. sum of squares) in an iteration (termination criterion). Default: .0001. `TolX' Minimum fractional change in a norm of the parameters in an iteration (termination criterion). Default: backend-specific. `MaxIter' Maximum number of iterations (termination criterion). Default: backend-specific. `fract_prec' Column vector, minimum fractional changes of corresponding parameters in an iteration (termination criterion if violated in two consecutive iterations). Default: backend-specific. `max_fract_change' Column vector, enforced maximum fractional changes in corresponding parameters in an iteration. Default: backend-specific. `Display' String indicating the degree of verbosity. Default: "off".
Possible values are currently "off" (no messages) and "iter" (some messages after each iteration). Support of this setting and its exact interpretation are backend-specific. `debug' Logical scalar, default: `false'. Will be passed to the backend, which might print debugging information if `true'. `FunValCheck' If "on", the output of user functions will be sanity-checked. Default: "off". `user_interaction' Handle to a user function or cell-array with a number of these. Functions must have this interface: [STOP, INFO] = some_user_function (P, VALS, STATE); If STOP is `true', the algorithm stops. In INFO information about the reason for stopping can be returned in a free format. INFO can be set to be empty, but it must be set. Note that this is different from the otherwise similar Matlab setting `OutputFcn'. The functions will be called by the algorithms at the start with STATE set to init, after each iteration with STATE set to iter, and at the end with STATE set to done. P contains the current parameters, and VALS is a structure with other current values; the possible fields are currently: `iteration' number of the current iteration, `fval' value of objective function (for scalar optimization), `residual' residuals (for residual-based optimization), `model_y' in `nonlin_curvefit', the output of the model function, `observations' in `nonlin_curvefit', the constant observations, `model_x' in `nonlin_curvefit', the constant argument X. Information about the output of these functions when they were called the last time (possibly causing a stop) will be contained in the output OUTP of the frontend in field `user_interaction'. Subfield `stop' is a vector containing the STOP outputs of each function, subfield `info' is a cell-array containing the output INFO of each function. In the case of a stop, the output CVG of the frontend will be `-1'.  File: optim.info, Node: Parameter structures, Next: Additional parameters, Prev: Common optimization options, Up: Top 9 Handling of structures of optimized parameters ************************************************ It can be convenient not to handle the optimized parameters as elements of a vector, but as named fields of a structure. The frontends `nonlin_residmin', `nonlin_curvefit', `residmin_stat', `curvefit_stat', and `nonlin_min' can accept parameter information in structure form, and can pass the parameters as a structure to user functions, although the backends still handle the parameters as vectors. To use this feature, the initial parameters must be given in structure form, or the setting `param_order' must be given, a cell-array with names of the parameters. If both are done, only the parameters in structure fields named in `param_order' will be optimized. If there are still some non-structure-based configuration settings or user functions, specifying `param_order' is mandatory even if the initial parameters are given in structure form. If the initial parameters are a structure, the parameters being the optimization result will also be returned as a structure; a brief example of a structure-based call is shown after the following menu. * Menu: * Structure-based user functions:: Specify which user functions accept parameter structures. * Structure-based gradients and Hessians:: Format of returned values of structure-based gradient and Hessian functions. * Structure-based linear constraints:: Specify structure-based linear constraints. * Structure-based configuration settings:: Parameter-related configuration settings in structure form. * Non-scalar parameters:: Handling named parameter arrays.
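As a brief illustration (a sketch; the parameter names `a' and `b' and the objective function are made up), `nonlin_min' can be called with structure-based parameters like this:

     pin.a = 1;
     pin.b = 2;
     settings = optimset ("objf_pstruct", true);
     objf = @(p) (p.a - 1)^2 + 10 * (p.b - p.a^2)^2;    % Rosenbrock-like objective
     [p, objf_value] = nonlin_min (objf, pin, settings);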
File: optim.info, Node: Structure-based user functions, Next: Structure-based gradients and Hessians, Up: Parameter structures 9.1 Specify which user functions accept parameter structures ============================================================ The frontend must be told which user functions accept parameter structures by setting corresponding settings to `true'. The names of these settings depend on which user functions are applicable to the respective frontend and are shown in the following table. User function Setting Frontends ----------------------------------------------------------------------- Objective function `objf_pstruct' `nonlin_min' Gradient `grad_objf_pstruct' `nonlin_min' Hessian `hessian_objf_pstruct' `nonlin_min' Model function `f_pstruct' `nonlin_residmin', `nonlin_curvefit', `residmin_stat', `curvefit_stat' Jacobian `df_pstruct' `nonlin_residmin', `nonlin_curvefit', `residmin_stat', `curvefit_stat' General inequality `f_inequc_pstruct' `nonlin_min', constraints `nonlin_residmin', `nonlin_curvefit' Jacobian of general `df_inequc_pstruct' `nonlin_min', inequality constraints `nonlin_residmin', `nonlin_curvefit' General equality `f_equc_pstruct' `nonlin_min', constraints `nonlin_residmin', `nonlin_curvefit' Jacobian of general `df_equc_pstruct' `nonlin_min', equality constraints `nonlin_residmin', `nonlin_curvefit'  File: optim.info, Node: Structure-based gradients and Hessians, Next: Structure-based linear constraints, Prev: Structure-based user functions, Up: Parameter structures 9.2 Format of returned values of structure-based gradient and Hessian functions =============================================================================== Structure-based gradient or Jacobian functions, including Jacobians of general constraints, must return the partial derivatives as fields of a structure under the respective parameter names. For gradients, the partial derivatives are scalar. For Jacobians, the partial derivatives must be column vectors. Structure-based Hessian functions must return the 2nd derivatives as subfields in a two-level structure of parameter names. For example, if the parameter names are `a' and `b', the 2nd derivative with respect to `a' and `b' must be in the field `returned_structure.a.b' or `returned_structure.b.a' (there is no need to specify both).  File: optim.info, Node: Structure-based linear constraints, Next: Structure-based configuration settings, Prev: Structure-based gradients and Hessians, Up: Parameter structures 9.3 Specify structure-based linear constraints ============================================== Linear constraints, otherwise specified with a matrix and a vector, can be adapted to structure-based parameter handling by specifying, instead of a matrix, a structure containing the rows of the matrix in fields under the respective parameter names. In this case, rows containing only zeros need not be given.  File: optim.info, Node: Structure-based configuration settings, Next: Non-scalar parameters, Prev: Structure-based linear constraints, Up: Parameter structures 9.4 Parameter-related configuration settings in structure form ============================================================== The vector-based settings `lbound', `ubound', `fixed', `diffp', `diff_onesided', `fract_prec', and `max_fract_change' can be replaced by the setting `param_config'. It is a structure that can contain fields corresponding to parameter names. 
For each such field, there may be subfields with the same names as the above vector-based settings, but containing a scalar value for the respective parameter. For example, if the parameters are named `a' and `b', instead of specifying settings = optimset ("lbound", [-Inf; 0], "diff_onesided", [true; true]); one can specify pconf.b.lbound = 0; pconf.a.diff_onesided = true; pconf.b.diff_onesided = true; settings = optimset ("param_config", pconf); If `param_config' is specified, none of the above vector-based settings may be used.  File: optim.info, Node: Non-scalar parameters, Prev: Structure-based configuration settings, Up: Parameter structures 9.5 Handling named parameter arrays =================================== Parameters in named structure fields are allowed to be non-scalar real arrays. In this case, their dimensions must be given by the setting `param_dims', a cell-array of dimension vectors, each containing at least two dimensions; if not given, dimensions are taken from the initial parameters, if these are given in a structure. If there are any vector-based settings or not structure-based linear constraints, they must correspond to an order of parameters defined as follows: All named parameter arrays are reshaped to vectors. Then, all parameters, scalars and vectors, are concatenated in the order of parameter names, given by the user. Structure-based settings or structure-based initial parameters must contain arrays with dimensions reshapable to those of the respective parameters.  File: optim.info, Node: Additional parameters, Next: Function index, Prev: Parameter structures, Up: Top 10 Passing additional parameters to user functions ************************************************** Optimizers often require the user to supply functions (e.g. objective functions, model functions, constraint functions). The interface of these functions -- arguments and returned values -- is defined by the optimizer. Often, a user function needs additional arguments, not covered by the defined interface, which are constant throughout the optimization. These can be supplied by wrapping the user function into an anonymous function. *Note Anonymous Functions: (octave)Anonymous Functions, for further explanation and examples. There are also some older optimizers in the optim package, written when anonymous functions were not available in Octave. Some of these offer an interface to user functions which is itself able to pass additional constant variables in arbitrary argument positions. Newer optimizers should not be written this way, since this is an unnecessary complication. Though it is possible to use global variables to pass additional data to user functions, this is not recommended since it introduces the possibility of name conflicts within the pool of global variables.  File: optim.info, Node: Function index, Next: Concept index, Prev: Additional parameters, Up: Top Index of functions in optim *************************** [index] * Menu: * adsmax: adsmax. (line 6) * battery: battery. (line 6) * bfgsmin: bfgsmin. (line 6) * brent_line_min: brent_line_min. (line 6) * cdiff: cdiff. (line 6) * cg_min: cg_min. (line 6) * cpiv_bard: cpiv_bard. (line 6) * curvefit_stat: curvefit_stat. (line 6) * de_min: de_min. (line 6) * deriv: deriv. (line 6) * dfpdp: dfpdp. (line 6) * expfit: expfit. (line 6) * fmins: fmins. (line 6) * gjp: gjp. (line 6) * jacobs: jacobs. (line 6) * leasqr: leasqr. (line 6) * line_min: line_min. (line 6) * LinearRegression: LinearRegression. (line 6) * linprog: linprog. 
(line 6) * mdsmax: mdsmax. (line 6) * nelder_mead_min: nelder_mead_min. (line 6) * nmsmax: nmsmax. (line 6) * nonlin_curvefit: nonlin_curvefit. (line 6) * nonlin_min: nonlin_min. (line 6) * nonlin_residmin: nonlin_residmin. (line 6) * nrm: nrm. (line 6) * numgradient: numgradient. (line 6) * numhessian: numhessian. (line 6) * optim_doc: Documentation. (line 6) * polyconf: polyconf. (line 6) * polyfitinf: polyfitinf. (line 6) * powell: powell. (line 6) * residmin_stat: residmin_stat. (line 6) * samin: samin. (line 6) * standalone d2_min: standalone d2_min. (line 6) * vfzero: vfzero. (line 6) * wpolyfit: wpolyfit. (line 6) * wsolve: wsolve. (line 6)  File: optim.info, Node: Concept index, Prev: Function index, Up: Top Concept index ************* [index] * Menu: * additional parameters: Additional parameters. (line 6) * common optimization options: Common optimization options. (line 6) * common options: Common frontend options. (line 6) * d2_min: d2_min. (line 6) * gradient functions: Gradient functions. (line 6) * helper functions: Helper functions. (line 6) * lm_feasible: lm_feasible. (line 6) * lm_svd_feasible: lm_svd_feasible. (line 6) * named parameter arrays: Non-scalar parameters. (line 6) * non-scalar parameters: Non-scalar parameters. (line 6) * octave_sqp: octave_sqp. (line 6) * parameter structures: Parameter structures. (line 6) * residual optimization: Residual optimization. (line 6) * scalar optimization: Scalar optimization. (line 6) * siman: siman. (line 6) * statistics for weighted least squares: wls. (line 6) * structure-based configuration settings: Structure-based configuration settings. (line 6) * structure-based gradients: Structure-based gradients and Hessians. (line 6) * structure-based Hessians: Structure-based gradients and Hessians. (line 6) * structure-based linear constraints: Structure-based linear constraints. (line 6) * structure-based user functions: Structure-based user functions. (line 6) * wls: wls. (line 6) * zero finders: Zero finders. 
(line 6)  Tag Table: Node: Top814 Node: Scalar optimization2578 Node: nonlin_min4381 Ref: XREFnonlin_min4585 Node: lm_feasible10441 Node: octave_sqp11730 Node: siman13014 Node: d2_min15737 Node: fmins16488 Ref: XREFfmins16687 Node: nmsmax18101 Ref: XREFnmsmax18285 Node: mdsmax20220 Ref: XREFmdsmax20413 Node: adsmax22735 Ref: XREFadsmax22937 Node: nelder_mead_min24817 Ref: XREFnelder_mead_min25118 Node: powell27799 Ref: XREFpowell27985 Node: standalone d2_min30636 Ref: XREFd2_min31118 Node: bfgsmin33854 Ref: XREFbfgsmin34129 Node: nrm35840 Ref: XREFnrm36020 Node: cg_min36174 Ref: XREFcg_min36337 Node: brent_line_min38845 Ref: XREFbrent_line_min39007 Node: line_min40193 Ref: XREFline_min40407 Node: samin41199 Ref: XREFsamin41488 Node: de_min43133 Ref: XREFde_min43352 Node: battery47691 Ref: XREFbattery47848 Node: Residual optimization48311 Node: nonlin_residmin50444 Ref: XREFnonlin_residmin50673 Ref: XREFoptiondfdp53222 Node: nonlin_curvefit55910 Ref: XREFnonlin_curvefit56798 Node: lm_svd_feasible57976 Node: residmin_stat58991 Ref: XREFresidmin_stat59190 Node: curvefit_stat61985 Ref: XREFcurvefit_stat62419 Node: wls63194 Node: leasqr64312 Ref: XREFleasqr65089 Node: expfit74579 Ref: XREFexpfit74806 Node: polyfitinf76389 Ref: XREFpolyfitinf76606 Node: wpolyfit81373 Ref: XREFwpolyfit81567 Node: polyconf84554 Ref: XREFpolyconf84811 Node: LinearRegression86221 Node: wsolve87840 Ref: XREFwsolve88054 Node: linprog89749 Ref: XREFlinprog89877 Node: Zero finders90490 Node: vfzero90941 Ref: XREFvfzero91064 Node: Gradient functions92974 Node: dfpdp93884 Ref: XREFdfpdp94139 Node: deriv95220 Ref: XREFderiv95392 Node: numgradient96212 Ref: XREFnumgradient96412 Node: numhessian96994 Ref: XREFnumhessian97176 Node: cdiff97766 Ref: XREFcdiff97999 Node: jacobs99353 Ref: XREFjacobs99486 Node: Helper functions100516 Node: cpiv_bard100870 Ref: XREFcpiv_bard101043 Node: gjp102222 Ref: XREFgjp102369 Node: Documentation103185 Ref: XREFoptim_doc103381 Node: Common frontend options103782 Node: Common optimization options105906 Node: Parameter structures113472 Node: Structure-based user functions115269 Node: Structure-based gradients and Hessians117454 Node: Structure-based linear constraints118438 Node: Structure-based configuration settings119030 Node: Non-scalar parameters120169 Node: Additional parameters121170 Node: Function index122492 Node: Concept index125450  End Tag Table optim-1.4.1/doc/development/0002755000175000017500000000000012477523726014401 5ustar olafolafoptim-1.4.1/doc/development/interfaces.txt0000644000175000017500000002003212477523726017260 0ustar olafolafCopyright (C) 2010 Olaf Till License terms: You can redistribute and/or modify this text under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version. This text is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this text; If not, see . ################################################################### Optimization Interfaces It is planned (at least by me) to have common front-ends for some optimization backends. This document should describe some of the interfaces. 
(While frontends are supposed to honour options set by optimset, keeping Matlab compatibility with respect to option names, but possibly using additional options (not present in Matlab), the interface to the backend does not pass the options as returned by optimset, but uses different variables.) This document is possibly not up-to-date. The interface of backends in fitting of residuals (e.g. curve fitting) is (example implementation is in __nonlin_residmin__.m): function [p, resid, cvg, outp] = __backend_name__ (f, pin, hook) where in the arguments 'f' is a handle of the residual function, accepting as a single argument the vector of optimized parameters, and returning a matrix of residuals; (constant variables as, e.g., independents and observed values in curve fitting, can be provided by the front-end if 'f' is an anonymous function), 'pin' is the vector of initial parameters, and 'hook' is a structure with additional information; some fields of 'hook' may not be obligatory, and additional fields may be defined in the future (the fields will be described below), and in the returned values 'p' is the vector of computed parameters, 'resid' are the residuals for the result, 'cvg' is a flag (described below) characterizing the result, and 'outp' is a structure (described below) with additional information; some fields of 'outp' may not always be present, and additional fields may be defined in the future. The structure 'hook' can have the following fields: (Constraints are violated if constraint functions return vector elements which belong to inequality constraints and are smaller than zero or a small positive value, or vector elements which belong to equality constraints and have an absolute value larger than or equal to a small positive value.) 'mc' (required): matrix (possibly empty) of the function "mc.' * parameters + vc" of linear constraints, 'vc' (required): vector (possibly empty) of the function "mc.' * parameters + vc" of linear constraints. If bounds have been specified, they are contained in 'mc' and 'vc' _before_ all other linear constraints. 
'n_gencstr' (required): number of general constraints (except the linear constraints given by 'mc' and 'vc'), 'f_cstr' (required): handle of function of all constraints, accepting as arguments the vector of parameters and a logical index of constraints, and returning a vector of constraint values, the linear constraints being the first elements; if no second argument is given, the function returns the vector of all constraints; if a second argument is given, only the values of the indicated constraints are returned (possibly sparing computation of the others), 'eq_idx' (required): logical index of equality constraints (as opposed to inequality constraints) within the vector of all constraints returned by 'f_cstr' with an index of all true as second argument, 'df_cstr' (required): handle of function for jacobian of all constraints, accepting as arguments the vector of parameters, a logical index of constraints, and a structure 'dfdp_hook' (described below) with additional information; the returned jacobian only contains rows for those constraints indicated by the logical vector, 'bounds' (required): 2-column matrix with bounds, one row per parameter, lacking bounds are indicated by -Inf or Inf, respectively; each bound (except -Inf or Inf) is also contained in 'mc' and 'vc' above; all (bounds(:, 1) <= bounds(:, 2)) is guaranteed, but any (bounds(:, 1) == bounds(:, 2)) is possible, 'pin_cstr' (required): structure with values of constraints (possibly empty vectors) for initial parameters, field 'inequ.lin_except_bounds': linear inequality constraints except bounds, field 'inequ.gen': general inequality constraints, field 'equ.lin': linear equality constraints, field 'equ.gen': general equality constraints (so the backend can decide what to do if initial parameters violate constraints), 'f_pin' (required): returned value of f (pin), 'dfdp' (required): handle of function returning the jacobian of the residual function, accepting as arguments the vector of parameters and a structure 'dfdp_hook' (described below) with additional information, 'dfdp_pin' (optional): returned value of dfdp for initial parameters and dfdp_hook with fields as settable by the front end; may be different from what the backend would compute, but might be used by the backend; could be useful if user requests checks of dfdp return value, as suggested by some existing Matlab optimset option --- these checks can be done better in the frontend, 'cpiv' (required): handle of function for complementary pivoting, the interface is not yet documented here and might change; for an example see "cpiv_bard.m" and its usage by "leasqr.m", 'max_fract_change' (required, elements possibly NA): column vector of maximum fractional changes in parameters between iterations, Inf for unlimited, 'fract_prec' (required, elements possibly NA): column vector of desired fractional precisions in parameter estimates (0 for disabled); typically, backends will abort optimization if fractional change is less than this in two successive iterations, 'TolFun' (required): (as corresponding Matlab compatible optimset option) scalar, minimum fractional improvement in sum of squared residuals between iterations; criterion for aborting optimization, 'MaxIter' (required, but may be empty): (as corresponding Matlab compatible optimset option) maximum number of iterations, 'weights' (required): weights for the residuals, same matrix size, 'fixed' (required): logical vector, indicates that parameters are not optimized, but keep their values, 'Display' (required): as corresponding
Matlab compatible optimset option, 'plot_cmd': function for plotting (intermediate) results, accepting current computed residuals as argument. The returned value 'cvg' has the same meaning as 'exitflag' in Matlab's 'lsqcurvefit'. The returned value 'outp' is a structure with --- possibly --- the following fields: 'niter': number of iterations. The fields 'diffp', 'diff_onesided', 'bounds', and 'plabels' in the 'dfdp_hook' structure are not set in the backend, but can be set (e.g. by wrapping the jacobian function) in the frontend. The frontend also must correctly adapt the field 'fixed' (considering the parameters not seen by the backend). If _pstruct is set for the respective jacobian function, these values are passed as structures with each original row in a field with the name of a parameter. The structure 'dfdp_hook' can have the following fields: 'diffp': positive column vector, relative parameter change (or absolute, if a parameter is zero) in finite differencing for derivatives, 'diff_onesided': logical vector, indicates usage of one-sided intervals in finite differencing for derivatives, 'fixed' (required): logical vector, indicating which of the passed parameters is not optimized, but fixed, so the respective derivatives need not be computed but should be filled in with zero, 'bounds': as in structure 'hook' above; if given, bounds should not be violated even in finite differencing, 'f': returned value of the passed residual function for the current parameters; can spare re-computation in finite differencing. 'plabels': a 2-dimensional cell array with parameter labels, one row for each parameter. optim-1.4.1/doc/html/0002755000175000017500000000000012477525061013015 5ustar olafolafoptim-1.4.1/doc/html/wsolve.html0000644000175000017500000001014012477524752015222 0ustar olafolaf wsolve - optim_doc

2.13 Function wsolve, another linear solver

Helptext:

[x,s] = wsolve(A,y,dy)

Solve a potentially over-determined system with uncertainty in
the values. 

    A x = y +/- dy

Use QR decomposition for increased accuracy.  Estimate the 
uncertainty for the solution from the scatter in the data.

The returned structure s contains

   normr = norm( A x - y ), weighted by dy
   R such that R'R = A'A
   df = n-p, n = rows of A, p = columns of A

See polyconf for details on how to use s to compute dy.
The covariance matrix is inv(R'*R).  If you know that the
parameters are independent, then uncertainty is given by
the diagonal of the covariance matrix, or 

   dx = sqrt(N*sumsq(inv(s.R'))')

where N = normr^2/df, or N = 1 if df = 0.

Example 1: weighted system

   A=[1,2,3;2,1,3;1,1,1]; xin=[1;2;3]; 
   dy=[0.2;0.01;0.1]; y=A*xin+randn(size(dy)).*dy;
   [x,s] = wsolve(A,y,dy);
   dx = sqrt(sumsq(inv(s.R'))');
   res = [xin, x, dx]

Example 2: weighted overdetermined system  y = x1 + 2*x2 + 3*x3 + e

   A = fullfact([3,3,3]); xin=[1;2;3];
   y = A*xin; dy = rand(size(y))/50; y+=dy.*randn(size(y));
   [x,s] = wsolve(A,y,dy);
   dx = s.normr*sqrt(sumsq(inv(s.R'))'/s.df);
   res = [xin, x, dx]

Note there is a counter-intuitive result that scaling the
uncertainty in the data does not affect the uncertainty in
the fit.  Indeed, if you perform a Monte Carlo simulation
with x,y datasets selected from a normal distribution centered
on y with width 10*dy instead of dy you will see that the
variance in the parameters indeed increases by a factor of 100.
However, if the error bars really do increase by a factor of 10
you should expect a corresponding increase in the scatter of 
the data, which will increase the variance computed by the fit.

optim-1.4.1/doc/html/Common-optimization-options.html0000644000175000017500000002600012477524752021352 0ustar olafolaf Common optimization options - optim_doc

8 Options common to all optimization frontends

All frontends for optimization (nonlin_min, nonlin_residmin, nonlin_curvefit) accept the following options, settable with optimset.

Settings handled within the frontend

Algorithm
String specifying the backend.
complex_step_derivative_inequc,
complex_step_derivative_equc
Logical scalars, default: false. Estimate Jacobian of general inequality constraints and equality constraints, respectively, with complex step derivative approximation. Use only if you know that your function of general inequality constraints or function of general equality constraints, respectively, is suitable for this. In this case, no user function for the respective Jacobian may be specified.

Settings passed to the backend

Which of these options are actually honored is noted in the descriptions of the individual backends.

lbound,
ubound
Column vectors of lower and upper bounds for parameters. Default: -Inf and +Inf, respectively. The bounds are non-strict, i.e. parameters are allowed to be exactly equal to a bound. The default function for gradients or Jacobians will respect bounds (but no further inequality constraints) in finite differencing if the backend respects bounds even during the course of optimization.
inequc
Further inequality constraints. Cell-array containing up to four entries, two entries for linear inequality constraints and/or one or two entries for general inequality constraints. Either linear or general constraints may be the first entries, but the two entries for linear constraints must be adjacent and, if two entries are given for general constraints, they also must be adjacent. The two entries for linear constraints are a matrix (say m) and a vector (say v), specifying linear inequality constraints of the form m.' * parameters + v >= 0. The first entry for general constraints must be a differentiable column-vector valued function (say h), specifying general inequality constraints of the form h (p[, idx]) >= 0; p is the column vector of optimized parameters and the optional argument idx is a logical index. h has to return the values of all constraints if idx is not given. It may choose to return only the indexed constraints if idx is given (so computation of the other constraints can be spared); in this case, the additional setting f_inequc_idx has to be set to true. In gradient determination, this function may be called with an informational third argument, whose content depends on the function for gradient determination. If a second entry for general inequality constraints is given, it must be a function computing the jacobian of the constraints with respect to the parameters. For this function, the description of the setting dfdp, see dfdp, applies, with 2 exceptions: 1) it is called with 3 arguments since it has an additional argument idx, a logical index, at second position, indicating which rows of the jacobian must be returned (if the function chooses to return only indexed rows, the additional setting df_inequc_idx has to be set to true). 2) the default jacobian function calls h with 3 arguments, since the argument idx is also supplied. Note that specifying linear constraints as general constraints will generally waste performance, even if further, non-linear, general constraints are also specified.
f_inequc_idx,
df_inequc_idx
Indicate that functions for general inequality constraints or their jacobian, respectively, return only the values or derivatives for the indexed constraints. See description of setting inequc above.
equc
Equality constraints. Specified the same way as inequality constraints (see inequc above).
f_equc_idx,
df_equc_idx
As f_inequc_idx and df_inequc_idx above, but for equality constraints.
cpiv
Function for complementary pivoting, usable in algorithms for constraints. Default: cpiv_bard. Only the default function is supplied with the package.
TolFun
Minimum fractional improvement in objective function (e.g. sum of squares) in an iteration (termination criterion). Default: .0001.
TolX
Minimum fractional change in a norm of the parameters in an iteration (termination criterion). Default: backend-specific.
MaxIter
Maximum number of iterations (termination criterion). Default: backend-specific.
fract_prec
Column vector, minimum fractional changes of corresponding parameters in an iteration (termination criterion if violated in two consecutive iterations). Default: backend-specific.
max_fract_change
Column vector, enforced maximum fractional changes in corresponding parameters in an iteration. Default: backend-specific.
Display
String indicating the degree of verbosity. Default: "off". Possible values are currently "off" (no messages) and "iter" (some messages after each iteration). Support of this setting and its exact interpretation are backend-specific.
debug
Logical scalar, default: false. Will be passed to the backend, which might print debugging information if true.
FunValCheck
If "on", the output of user functions will be sanity-checked. Default: "off".
user_interaction
Handle to a user function or cell-array with a number of these. Functions must have this interface:
          [stop, info] = some_user_function (p, vals, state);

If stop is true, the algorithm stops. In info, information about the reason for stopping can be returned in a free format. info can be set to be empty, but it must be set. Note that this is different from the otherwise similar Matlab setting OutputFcn. The functions will be called by the algorithms at the start with state set to init, after each iteration with state set to iter, and at the end with state set to done. p contains the current parameters, and vals is a structure with other current values; the possible fields are currently:

iteration
number of the current iteration,
fval
value of objective function (for scalar optimization),
residual
residuals (for residual-based optimization),
model_y
in nonlin_curvefit, the output of the model function,
observations
in nonlin_curvefit, the constant observations,
model_x
in nonlin_curvefit, the constant argument x.
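
For illustration, here is a minimal sketch of a user_interaction function (the function name and the stopping threshold are invented for this example; it assumes scalar optimization, where vals has the field fval):

          function [stop, info] = stop_on_small_fval (p, vals, state)
            info = [];               # must always be set, may be empty
            stop = false;
            if (strcmp (state, "iter") && vals.fval < 1e-6)
              stop = true;
              info = sprintf ("fval below threshold at iteration %d", ...
                              vals.iteration);
            endif
          endfunction

It could then be activated with, e.g., settings = optimset ("user_interaction", @stop_on_small_fval).
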
Information about the output of these functions when they were called the last time (possibly causing a stop) will be contained in the output outp of the frontend in field user_interaction. Subfield stop is a vector containing the stop outputs of each function, subfield info is a cell-array containing the output info of each function. In the case of a stop, the output cvg of the frontend will be -1. optim-1.4.1/doc/html/mdsmax.html0000644000175000017500000001117712477524752015207 0ustar olafolaf mdsmax - optim_doc
optim-1.4.1/doc/html/mdsmax.html0000644000175000017500000001117712477524752015207 0ustar olafolaf mdsmax - optim_doc



1.8 A multidirectional search algorithm

Helptext:

MDSMAX  Multidirectional search method for direct search optimization.
       [x, fmax, nf] = MDSMAX(FUN, x0, STOPIT, SAVIT) attempts to
       maximize the function FUN, using the starting vector x0.
       The method of multidirectional search is used.
       Output arguments:
              x    = vector yielding largest function value found,
              fmax = function value at x,
              nf   = number of function evaluations.
       The iteration is terminated when either
              - the relative size of the simplex is <= STOPIT(1)
                (default 1e-3),
              - STOPIT(2) function evaluations have been performed
                (default inf, i.e., no limit), or
              - a function value equals or exceeds STOPIT(3)
                (default inf, i.e., no test on function values).
       The form of the initial simplex is determined by STOPIT(4):
         STOPIT(4) = 0: regular simplex (sides of equal length, the default),
         STOPIT(4) = 1: right-angled simplex.
       Progress of the iteration is not shown if STOPIT(5) = 0 (default 1).
       If a non-empty fourth parameter string SAVIT is present, then
       `SAVE SAVIT x fmax nf' is executed after each inner iteration.
       NB: x0 can be a matrix.  In the output argument, in SAVIT saves,
           and in function calls, x has the same shape as x0.
       MDSMAX(fun, x0, STOPIT, SAVIT, P1, P2,...) allows additional
       arguments to be passed to fun, via feval(fun,x,P1,P2,...).

This implementation uses 2n^2 elements of storage (two simplices), where x0
is an n-vector.  It is based on the algorithm statement in [2, sec.3],
modified so as to halve the storage (with a slight loss in readability).

References:
[1] V. J. Torczon, Multi-directional search: A direct search algorithm for
    parallel machines, Ph.D. Thesis, Rice University, Houston, Texas, 1989.
[2] V. J. Torczon, On the convergence of the multidirectional search
    algorithm, SIAM J. Optimization, 1 (1991), pp. 123-145.
[3] N. J. Higham, Optimization by direct search in matrix computations,
    SIAM J. Matrix Anal. Appl, 14(2): 317-333, 1993.
[4] N. J. Higham, Accuracy and Stability of Numerical Algorithms,
       Second edition, Society for Industrial and Applied Mathematics,
       Philadelphia, PA, 2002; sec. 20.5.
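
As a usage sketch (the objective function and tolerance values below are illustrative assumptions, not taken from the helptext or references above):

    fun = @(x) -sumsq (x - [1; -2]);   # maximum 0 at x = [1; -2]
    stopit = [1e-4, Inf, Inf, 0, 0];   # simplex tolerance, no evaluation
                                       # limit, no target value, regular
                                       # simplex, no progress output
    [x, fmax, nf] = mdsmax (fun, [0; 0], stopit);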

optim-1.4.1/doc/html/nonlin_005fresidmin.html0000644000175000017500000002160212477524752017472 0ustar olafolaf nonlin_residmin - optim_doc

2.1 Frontend nonlin_residmin for non-linear residual minimization

— Function File: [p, resid, cvg, outp] = nonlin_residmin (f, pin)
— Function File: [p, resid, cvg, outp] = nonlin_residmin (f, pin, settings)

Frontend for nonlinear minimization of residuals returned by a model function.

The functions supplied by the user have a minimal interface; any additionally needed constants (e.g. observed values) can be supplied by wrapping the user functions into anonymous functions.

The following description applies to usage with vector-based parameter handling. Differences in usage for structure-based parameter handling will be explained separately.

f: function returning the array of residuals. It gets a column vector of real parameters as argument. In gradient determination, this function may be called with an informational second argument, whose content depends on the function for gradient determination.

pin: real column vector of initial parameters.

settings: structure whose fields stand for optional settings referred to below. The fields can be set by optimset().

The returned values are the column vector of final parameters p, the final array of residuals resid, an integer cvg indicating if and how optimization succeeded or failed, and a structure outp with additional information, currently with the fields: niter, the number of iterations, and user_interaction, information on user stops (see settings). The backend may define additional fields. cvg is greater than zero for success and less than or equal to zero for failure; its possible values depend on the used backend and currently can be 0 (maximum number of iterations exceeded), 2 (parameter change less than specified precision in two consecutive iterations), 3 (improvement in objective function – e.g. sum of squares – less than specified), or -1 (algorithm aborted by a user function).

See also: nonlin_curvefit.
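
As a minimal usage sketch (the model and data are invented for illustration), fitting a two-parameter exponential decay by residual minimization:

          x = (0:10)';
          y = 2 * exp (-0.5 * x) + 0.01 * randn (11, 1);
          f = @(p) p(1) * exp (-p(2) * x) - y;   # residuals; x and y wrapped in
          pin = [1; 1];                          # initial parameters
          [p, resid, cvg, outp] = nonlin_residmin (f, pin);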

Settings

The fields of the settings structure can be set with optimset.

For settings common to all frontends (including these for statistics) see Common frontend options.

For additional settings common to all optimization frontends see Common optimization options.

Specific defaults:

Algorithm: "lm_svd_feasible"

Additional settings:
weights
Array of weights for the residuals. Dimensions must match.
dfdp
Function computing the Jacobian of the residuals with respect to the parameters, assuming residuals are reshaped to a column vector. Default: real finite differences. Will be called with the column vector of parameters and an informational structure as arguments. If dfdp was specified by the user, the informational structure has the fields f: value of residuals for current parameters, reshaped to a column vector, fixed: logical vector indicating which parameters are not optimized, so these partial derivatives need not be computed and can be set to zero, diffp, diff_onesided, lbound, ubound: identical to the user settings of this name, plabels: 1-dimensional cell-array of column-cell-arrays, each column with labels for all parameters; the first column contains the numerical indices of the parameters; the second and third columns, present for structure-based parameter handling (see Parameter structures), contain the names of the parameters and the subindices of the parameters (see Non-scalar parameters), respectively. The default Jacobian function will call the model function with the second argument set with fields f: as the f passed to the Jacobian function, plabels: cell-array of 1x1 cell-arrays with the entries of the column-cell-arrays of plabels as passed to the Jacobian function corresponding to the current parameter, side: 0 for one-sided interval, 1 or 2, respectively, for the sides of a two-sided interval, and parallel: logical scalar indicating parallel computation of partial derivatives. This information can be useful if the model function can omit some computations depending on the currently computed partial derivative.

complex_step_derivative_f
Logical scalar, default: false. Estimate Jacobian of model function with complex step derivative approximation. Use only if you know that your model function is suitable for this. In this case, no user function for the Jacobian (dfdp) may be specified.
plot_cmd
Function enabling backend to plot results or intermediate results. Will be called with current computed residuals. Default: plot nothing. This setting is deprecated and will disappear. Please use user_interaction instead (see Common optimization options).

Structure-based parameter handling

Please see Parameter structures.

Backend information

Please see Residual optimization and choose backend from menu under `Optimization backends'. optim-1.4.1/doc/html/curvefit_005fstat.html0000644000175000017500000000773212477524752017175 0ustar olafolaf curvefit_stat - optim_doc



2.5 Statistics for curve fitting

As nonlin_curvefit can be used instead of nonlin_residmin for curve fitting (see nonlin_curvefit, see nonlin_residmin), curvefit_stat can be used instead of residmin_stat (see residmin_stat) for statistics on the results of curve fitting.

— Function File: info = curvefit_stat (f, p, x, y, settings)

Frontend for computation of statistics for fitting of values, computed by a model function, to observed values.

Please refer to the description of residmin_stat. The only differences from residmin_stat are the additional arguments x (independent values) and y (observations), that the model function f, if provided, has a second obligatory argument which will be set to x and is supposed to return guesses for the observations (with the same dimensions), and that the possibly user-supplied function for the jacobian of the model function also has a second obligatory argument which will be set to x.

See also: residmin_stat.

optim-1.4.1/doc/html/nrm.html0000644000175000017500000000521612477524752014507 0ustar olafolaf nrm - optim_doc



1.14 Newton-Raphson algorithm

No constraints are honoured.

— Function File: xmin = nrm (f,x0)

Using x0 as a starting point, find a minimum of the scalar function f. The Newton-Raphson method is used.
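
A hedged usage sketch (the objective function is invented for illustration):

          f = @(x) (x - 3)^2 + 1;
          xmin = nrm (f, 0)     # should return a value close to 3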

optim-1.4.1/doc/html/XREFfmins.html0000644000175000017500000000010212477524752015501 0ustar olafolaf optim-1.4.1/doc/html/Common-frontend-options.html0000644000175000017500000001107012477524752020444 0ustar olafolaf Common frontend options - optim_doc



7 Options common to all frontends

All frontends for optimization and for result statistics (nonlin_min, nonlin_residmin, nonlin_curvefit, residmin_stat, curvefit_stat) accept the following options, settable with optimset.

These options are handled within the frontend.

diffp
Column vector of fractional intervals (doubled for central intervals) supposed to be used by gradient or Jacobian functions performing finite differencing. Default: .001 * ones (size (parameters)). The default function for finite differencing will use these as absolute intervals for parameters with value zero.
diff_onesided
Logical column vector indicating the parameters for which one-sided intervals should be used by gradient or Jacobian functions performing finite differencing. Default: false (size (parameters)).
cstep
Scalar step size for complex step derivative approximation of gradients or Jacobians. Default: 1e-20.
parallel_local
Logical scalar, default: false. If the parallel package, version >= 2.0.5, is loaded, estimate gradients of objective function and Jacobians of model function and of constraints in parallel processes. Works for default (real) finite differences and for complex step derivatives. Due to overhead, a speed advantage can only be expected if objective function, model function or constraint functions are time consuming enough. Additionally, this setting is also passed to the individual optimization backends, which may also consider this option (see documentation of backends). If this option is set to true, a warning (ID: optim:parallel_local) will be issued if no parallel package of a correct version is loaded.
fixed
Logical column vector indicating which parameters are not optimized, but kept to their initial value.
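
For illustration, a hedged sketch setting some of these options for a hypothetical two-parameter problem (the values are arbitrary):

          settings = optimset ("diffp", [1e-4; 1e-3], ...
                               "diff_onesided", [true; false], ...
                               "fixed", [false; true]);

Such a settings structure can then be passed to one of the frontends listed above, e.g. nonlin_residmin or nonlin_min.
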
optim-1.4.1/doc/html/XREFwsolve.html0000644000175000017500000000010412477524752015706 0ustar olafolaf optim-1.4.1/doc/html/XREFsamin.html0000644000175000017500000000010212477524752015474 0ustar olafolaf optim-1.4.1/doc/html/powell.html0000644000175000017500000001247612477524752015223 0ustar olafolaf powell - optim_doc

1.11 Direction-set (Powell's) method

— Function File: [p, obj_value, convergence, iters, nevs] = powell (f, p0, control)

Multidimensional minimization (direction-set method). Implements a direction-set (Powell's) method for multidimensional minimization of a function without calculation of the gradient [1, 2].

Arguments

  • f: name of function to minimize (string or handle), which should accept one input variable (see example for how to pass on additional input arguments)
  • p0: An initial value of the function argument to minimize
  • options: an optional structure, which can be generated by optimset, with some or all of the following fields:
    • MaxIter: maximum iterations (positive integer, or -1 or Inf for unlimited (default))
    • TolFun: minimum amount by which function value must decrease in each iteration to continue (default is 1E-8)
    • MaxFunEvals: maximum function evaluations (positive integer, or -1 or Inf for unlimited (default))
    • SearchDirections: an n*n matrix whose columns contain the initial set of (presumably orthogonal) directions to minimize along, where n is the number of elements in the argument to be minimized for; or an n*1 vector of magnitudes for the initial directions (defaults to the set of unit direction vectors)

Examples

          y = @(x, s) x(1) ^ 2 + x(2) ^ 2 + s;
          o = optimset('MaxIter', 100, 'TolFun', 1E-10);
          s = 1;
          [x_optim, y_min, conv, iters, nevs] = powell(@(x) y(x, s), [1 0.5], o); %pass y wrapped in an anonymous function so that all other arguments to y, which are held constant, are set
          %should return something like x_optim = [4E-14 3E-14], y_min = 1, conv = 1, iters = 2, nevs = 24
          

Returns:

  • p: the minimizing value of the function argument
  • obj_value: the value of f() at p
  • convergence: 1 if normal convergence, 0 if not
  • iters: number of iterations performed
  • nevs: number of function evaluations

References

  1. Powell MJD (1964), An efficient method for finding the minimum of a function of several variables without calculating derivatives, Computer Journal, 7 :155-162
  2. Press, WH; Teukolsky, SA; Vetterling, WT; Flannery, BP (1992). Numerical Recipes in Fortran: The Art of Scientific Computing (2nd Ed.). New York: Cambridge University Press (Section 10.5)

optim-1.4.1/doc/html/numgradient.html0000644000175000017500000000572112477524752016231 0ustar olafolaf numgradient - optim_doc



4.3 Another numerical gradient function

Helptext:

numgradient(f, {args}, minarg)

Numeric central difference gradient of f with respect
to argument \"minarg\".
* first argument: function name (string)
* second argument: all arguments of the function (cell array)
* third argument: (optional) the argument to differentiate w.r.t.
	(scalar, default=1)

\"f\" may be vector-valued. If \"f\" returns
an n-vector, and the argument is a k-vector, the gradient
will be an nxk matrix

Example:
function a = f(x);
	a = [x'*x; 2*x];
endfunction
numgradient(\"f\", {ones(2,1)})
ans =

  2.00000  2.00000
  2.00000  0.00000
  0.00000  2.00000


optim-1.4.1/doc/html/bfgsmin.html0000644000175000017500000001025312477524752015335 0ustar olafolaf bfgsmin - optim_doc



1.13 Unconstrained BFGS algorithm

BFGS or limited memory BFGS minimization of a function. No constraints are honoured.

Helptext:

bfgsmin: bfgs or limited memory bfgs minimization of function

Usage: [x, obj_value, convergence, iters] = bfgsmin(f, args, control)

The function must be of the form
[value, return_2,..., return_m] = f(arg_1, arg_2,..., arg_n)
By default, minimization is w.r.t. arg_1, but it can be done
w.r.t. any argument that is a vector. Numeric derivatives are
used unless analytic derivatives are supplied. See bfgsmin_example.m
for methods.

Arguments:
* f: name of function to minimize (string)
* args: a cell array that holds all arguments of the function
	The argument with respect to which minimization is done
	MUST be a vector
* control: an optional cell array of 1-8 elements. If a cell
  array shorter than 8 elements is provided, the missing trailing
  elements take their default values.
	* elem 1: maximum iterations  (positive integer, or -1 or Inf for unlimited (default))
	* elem 2: verbosity
		0 = no screen output (default)
		1 = only final results
		2 = summary every iteration
		3 = detailed information
	* elem 3: convergence criterion
		1 = strict (function, gradient and param change) (default)
		0 = weak - only function convergence required
	* elem 4: arg in f_args with respect to which minimization is done (default is first)
	* elem 5: (optional) Memory limit for lbfgs. If it's a positive integer
		then lbfgs will be used. Otherwise ordinary bfgs is used
	* elem 6: function change tolerance, default 1e-12
	* elem 7: parameter change tolerance, default 1e-6
	* elem 8: gradient tolerance, default 1e-5

Returns:
* x: the minimizer
* obj_value: the value of f() at x
* convergence: 1 if normal conv, other values if not
* iters: number of iterations performed

Example: see bfgsmin_example.m
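
In addition, a minimal hedged sketch (the objective function is invented for illustration):

    function obj = example_obj (x)
      obj = sumsq (x - [1; 2]);   # minimum 0 at x = [1; 2]
    endfunction

    control = {100, 1};           # at most 100 iterations, final results only
    [x, obj_value, convergence, iters] = bfgsmin ("example_obj", {[0; 0]}, control);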

optim-1.4.1/doc/html/nelder_005fmead_005fmin.html0000644000175000017500000001232512477524752020002 0ustar olafolaf nelder_mead_min - optim_doc



1.10 Another Nelder-Mead algorithm

This function does gradient-less minimization using the Nelder-Mead algorithm. No constraints are honoured.

Helptext:

[x0,v,nev] = nelder_mead_min (f,args,ctl) - Nelder-Mead minimization

Minimize 'f' using the Nelder-Mead algorithm. This function is inspired
by the one found in the book "Numerical Recipes".

ARGUMENTS
---------
f     : string : Name of function. Must return a real value
args  : list   : Arguments passed to f.
     or matrix : f's only argument
ctl   : vector : (Optional) Control variables, described below
     or struct

RETURNED VALUES
---------------
x0  : matrix   : Local minimum of f
v   : real     : Value of f in x0
nev : number   : Number of function evaluations

CONTROL VARIABLE : (optional) may be named arguments (i.e. "name",value
------------------ pairs), a struct, or a vector of length <= 6, where
                   NaN's are ignored. Default values are written <value>.
 OPT.   VECTOR
 NAME    POS
ftol,f  N/A    : Stopping criterion : stop search when values at simplex
                 vertices are all alike, as tested by 

                  f > (max_i (f_i) - min_i (f_i)) /max(max(|f_i|),1)

                 where f_i are the values of f at the vertices.  <10*eps>

rtol,r  N/A    : Stop search when biggest radius of simplex, using
                 infinity-norm, is small, as tested by :

             ctl(2) > Radius                                     <10*eps>

vtol,v  N/A    : Stop search when volume of simplex is small, tested by
           
             ctl(2) > Vol

crit,c ctl(1)  : Set one stopping criterion, 'ftol' (c=1), 'rtol' (c=2)
                 or 'vtol' (c=3) to the value of the 'tol' option.    <1>

tol, t ctl(2)  : Threshold in termination test chosen by 'crit'  <10*eps>

narg  ctl(3)  : Position of the minimized argument in args            <1>
maxev ctl(4)  : Maximum number of function evaluations. This number <inf>
                may be slightly exceeded.
isz   ctl(5)  : Size of initial simplex, which is :                   <1>

               { x + e_i | i in 0..N } 

               Where x == args{narg} is the initial value 
                e_0    == zeros (size (x)), 
                e_i(j) == 0 if j != i and e_i(i) == ctl(5)
                e_i    has same size as x

               Set ctl(5) to the distance you expect between the starting
               point and the minimum.

rst   ctl(6)   : When a minimum is found the algorithm restarts next to
                 it until the minimum does not improve anymore. ctl(6) is
                 the maximum number of restarts. Set ctl(6) to zero if
                 you know the function is well-behaved or if you don't
                 mind not getting a true minimum.                     <0>

verbose, v     Be more or less verbose (quiet=0)                      <0>
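
A hedged usage sketch (objective function and control values invented for illustration), passing the control variables as a struct:

    f = @(x) sumsq (x - [1; 2]);
    ctl.isz = 1;          # expected distance of the minimum from the start
    ctl.verbose = 0;      # quiet
    [x0, v, nev] = nelder_mead_min (f, [0; 0], ctl);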

optim-1.4.1/doc/html/XREFwpolyfit.html0000644000175000017500000000011012477524752016241 0ustar olafolaf optim-1.4.1/doc/html/XREFpowell.html0000644000175000017500000000010412477524752015671 0ustar olafolaf optim-1.4.1/doc/html/XREFline_005fmin.html0000644000175000017500000000011612477524752016557 0ustar olafolaf optim-1.4.1/doc/html/XREFbattery.html0000644000175000017500000000010612477524752016043 0ustar olafolaf optim-1.4.1/doc/html/XREFnumhessian.html0000644000175000017500000000011412477524752016542 0ustar olafolaf optim-1.4.1/doc/html/Structure_002dbased-linear-constraints.html0000644000175000017500000000621612477524752023255 0ustar olafolaf Structure-based linear constraints - optim_doc

9.3 Specify structure-based linear constraints

Linear constraints, otherwise specified with a matrix and a vector , can be adapted to structure-based parameter handling by specifying, instead of a matrix, a structure containing the rows of the matrix in fields under the respective parameter names. In this case, rows containing only zeros need not be given. optim-1.4.1/doc/html/wls.html0000644000175000017500000000667212477524752014527 0ustar olafolaf wls - optim_doc



2.6 Statistics for weighted least squares

The backends for objf_type == "wls" (currently the only supported type of objective function) compute covd (due to user request or as a prerequisite for covp and corp) as a diagonal matrix by assuming that the variances of data points are proportional to the reciprocal of the squared weights and guessing the factor of proportionality from the residuals. If covp is not defined (e.g. because the Jacobian has no full rank), an attempt is made to still compute its uniquely defined elements, if any. In corp, interdependent parameters can cause elements of 1 or -1, which in this case are not the real coefficients of correlation, but rather indicate the direction of parameter interdependence. To be consistent with this, an attempt is made (often not successful) to identify parameter interdependence and mark it with elements of 1 or -1 in corp even if the respective elements of covp can not be computed. optim-1.4.1/doc/html/cdiff.html0000644000175000017500000000732612477524752014772 0ustar olafolaf cdiff - optim_doc



4.5 A string, yielding the numerical gradient if evaluated

Helptext:

c = cdiff (func,wrt,N,dfunc,stack,dx) - Code for num. differentiation
  = "function df = dfunc (var1,..,dvar,..,varN) .. endfunction

Returns a string of octave code that defines a function 'dfunc' that
returns the derivative of 'func' with respect to its 'wrt'th
argument.

The derivatives are obtained by symmetric finite difference.

dfunc()'s return value is in the same format as that of  ndiff()

func  : string : name of the function to differentiate

wrt   : int    : position, in argument list, of the differentiation
                 variable.                                Default:1

N     : int    : total number of arguments taken by 'func'. 
                 If N=inf, dfunc will take variable argument list.
                                                        Default:wrt

dfunc : string : Name of the octave function that returns the
                  derivatives.                   Default:['d',func]

stack : string : Indicates whether 'func' accepts vertically
                 (stack="rstack") or horizontally (stack="cstack")
                 arguments. Any other string indicates that 'func'
                 does not allow stacking.                Default:''

dx    : real   : Step used in the symmetric difference scheme.
                                                 Default:10*sqrt(eps)

See also : ndiff, eval, todisk

optim-1.4.1/doc/html/XREFgjp.html0000644000175000017500000000007612477524752015157 0ustar olafolaf optim-1.4.1/doc/html/dfpdp.html0000644000175000017500000000645312477524752015014 0ustar olafolaf dfpdp - optim_doc



4.1 Direct user interface to default numerical gradient method of new frontends

Helptext:

function jac = dfpdp (p, func[, hook])

Returns Jacobian of func (p) with respect to p with finite
differencing. The optional argument hook is a structure which can
contain the following fields at the moment:

hook.f: value of func(p) for p as given in the arguments

hook.diffp: positive vector of fractional steps from given p in
finite differencing (actual steps may be smaller if bounds are
given). The default is .001 * ones (size (p)).

hook.diff_onesided: logical vector, indexing elements of p for
which only one-sided differences should be computed (faster); even
if not one-sided, differences might not be exactly central if
bounds are given. The default is false (size (p)).

hook.fixed: logical vector, indexing elements of p for which zero
should be returned instead of the guessed partial derivatives
(useful in optimization if some parameters are not optimized, but
are 'fixed').

hook.lbound, hook.ubound: vectors of lower and upper parameter
bounds (or -Inf or +Inf, respectively) to be respected in finite
differencing. The consistency of bounds is not checked.
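
A hedged usage sketch (the function and hook values are invented for
illustration):

    func = @(p) [p(1)^2 + p(2); sin(p(1))];
    hook.diff_onesided = [true; false];   # one-sided step for p(1) only
    jac = dfpdp ([1; 2], func, hook)      # 2x2 Jacobian estimate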

optim-1.4.1/doc/html/XREFcpiv_005fbard.html0000644000175000017500000000012012477524752016711 0ustar olafolaf optim-1.4.1/doc/html/XREFjacobs.html0000644000175000017500000000010412477524752015630 0ustar olafolaf optim-1.4.1/doc/html/index.html0000644000175000017500000001162712477524752015025 0ustar olafolaf optim_doc

optim_doc



Additional documentation for the optim package for Octave

Note: Some links in this document lead to Octave's core documentation. These links probably don't work in the html version of this document. They should work, however, in the info version accessible from the Octave command line with optim_doc().

The optim package is a collection of additional functions related to numerical optimization. For Octave's core optimization functions (not contained in this package), see Optimization.

Types of functions in the optim package

Configuration

Indices

optim-1.4.1/doc/html/Structure_002dbased-user-functions.html0000644000175000017500000001142612477524752022421 0ustar olafolaf Structure-based user functions - optim_doc

9.1 Specify which user functions accept parameter structures

The frontend must be told which user functions accept parameter structures by setting corresponding settings to true. The names of these settings depend on which user functions are applicable to the respective frontend and are shown in the following table.

User function                                Setting                Frontends
------------------------------------------------------------------------------
Objective function                           objf_pstruct           nonlin_min
Gradient                                     grad_objf_pstruct      nonlin_min
Hessian                                      hessian_objf_pstruct   nonlin_min
Model function                               f_pstruct              nonlin_residmin, nonlin_curvefit, residmin_stat, curvefit_stat
Jacobian                                     df_pstruct             nonlin_residmin, nonlin_curvefit, residmin_stat, curvefit_stat
General inequality constraints               f_inequc_pstruct       nonlin_min, nonlin_residmin, nonlin_curvefit
Jacobian of general inequality constraints   df_inequc_pstruct      nonlin_min, nonlin_residmin, nonlin_curvefit
General equality constraints                 f_equc_pstruct         nonlin_min, nonlin_residmin, nonlin_curvefit
Jacobian of general equality constraints     df_equc_pstruct        nonlin_min, nonlin_residmin, nonlin_curvefit
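
For illustration, a hedged sketch (model, data, and parameter names invented) telling nonlin_residmin that the model function takes a parameter structure:

          x = (0:10)';  y = 2 * exp (-0.5 * x);     # invented data
          f = @(p) p.a * exp (-p.b * x) - y;        # f gets a structure
          pin.a = 1;
          pin.b = 1;
          settings = optimset ("f_pstruct", true);
          [p, resid, cvg] = nonlin_residmin (f, pin, settings);
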
optim-1.4.1/doc/html/XREFpolyconf.html0000644000175000017500000000011012477524752016215 0ustar olafolaf optim-1.4.1/doc/html/XREFnonlin_005fcurvefit.html0000644000175000017500000000013412477524752020171 0ustar olafolaf optim-1.4.1/doc/html/jacobs.html0000644000175000017500000000676012477524752015161 0ustar olafolaf jacobs - optim_doc



4.6 Complex step derivatives

— Function File: Df = jacobs (x, f)
— Function File: Df = jacobs (x, f, hook)

Calculate the jacobian of a function using the complex step method.

Let f be a user-supplied function. Given a point x at which we seek the Jacobian, the function jacobs returns the Jacobian matrix d(f(1), ..., f(end))/d(x(1), ..., x(n)). The function uses the complex step method and thus can be applied to real analytic functions.

The optional argument hook is a structure with additional options. hook can have the following fields:

  • h - can be used to define the magnitude of the complex step and defaults to 1e-20; steps larger than 1e-3 are not allowed.
  • fixed - is a logical vector internally usable by some optimization functions; it indicates for which elements of x no gradient should be computed, but zero should be returned.

For example:

          f = @(x) [x(1)^2 + x(2); x(2)*exp(x(1))];
          Df = jacobs ([1, 2], f)
optim-1.4.1/doc/html/leasqr.html0000644000175000017500000003127312477524752015204 0ustar olafolaf leasqr - optim_doc



2.7 An older function for curve fitting

This was a popular function for curve fitting and has been enhanced to honour constraints. nonlin_curvefit (see nonlin_curvefit) now does the same job if used with the default backend, and should be preferred due to its more powerful interface. The statistics returned by leasqr can also (and partially better) be computed with curvefit_stat (see curvefit_stat). There are currently two things which still only leasqr does:

  • internally providing a function for plotting fits during optimization,
  • returning a pre-computed matrix for determining confidence regions.

— Function File: leasqr (x, y, pin, F)
— Function File: leasqr (x, y, pin, F, stol)
— Function File: leasqr (x, y, pin, F, stol, niter)
— Function File: leasqr (x, y, pin, F, stol, niter, wt)
— Function File: leasqr (x, y, pin, F, stol, niter, wt, dp)
— Function File: leasqr (x, y, pin, F, stol, niter, wt, dp, dFdp)
— Function File: leasqr (x, y, pin, F, stol, niter, wt, dp, dFdp, options)
— Function File: [f, p, cvg, iter, corp, covp, covr, stdresid, Z, r2] = leasqr (...)

Levenberg-Marquardt nonlinear regression.

Input arguments:

x
Vector or matrix of independent variables.
y
Vector or matrix of observed values.
pin
Vector of initial parameters to be adjusted by leasqr.
F
Name of function or function handle. The function must be of the form y = f(x, p), with y, x, p of the form y, x, pin.
stol
Scalar tolerance on fractional improvement in scalar sum of squares, i.e., sum ((wt .* (y-f)).^2). Set to 0.0001 if empty or not given.
niter
Maximum number of iterations. Set to 20 if empty or not given.
wt
Statistical weights (same dimensions as y). These should be set to be proportional to sqrt (y) ^-1, i.e., the covariance matrix of the data is assumed to be proportional to diagonal with diagonal equal to (wt.^2)^-1. The constant of proportionality will be estimated. Set to ones (size (y)) if empty or not given.
dp
Fractional increment of p for numerical partial derivatives. Set to 0.001 * ones (size (pin)) if empty or not given.
  • dp(j) > 0 means central differences on j-th parameter p(j).
  • dp(j) < 0 means one-sided differences on j-th parameter p(j).
  • dp(j) = 0 holds p(j) fixed, i.e., leasqr won't change initial guess: pin(j)

dFdp
Name of partial derivative function in quotes or function handle. If not given or empty, set to dfdp, a slow but general partial derivatives function. The function must be of the form prt = dfdp (x, f, p, dp, F [,bounds]). For backwards compatibility, the function will only be called with an extra 'bounds' argument if the 'bounds' option is explicitly specified to leasqr (see dfdp.m).
options
Structure with multiple options. The following fields are recognized:
fract_prec
Column vector (same length as pin) of desired fractional precisions in parameter estimates. Iterations are terminated if change in parameter vector (chg) relative to current parameter estimate is less than their corresponding elements in 'fract_prec', i.e., all (abs (chg) < abs (options.fract_prec .* current_parm_est)) on two consecutive iterations. Defaults to zeros (size (pin)).
max_fract_change
Column vector (same length as pin) of maximum fractional step changes in parameter vector. Fractional change in elements of parameter vector is constrained to be at most 'max_fract_change' between successive iterations, i.e., abs (chg(i)) = abs (min([chg(i), options.max_fract_change(i) * current param estimate])). Defaults to Inf * ones (size (pin)).
inequc
Cell-array containing up to four entries, two entries for linear inequality constraints and/or one or two entries for general inequality constraints. Initial parameters must satisfy these constraints. Either linear or general constraints may be the first entries, but the two entries for linear constraints must be adjacent and, if two entries are given for general constraints, they also must be adjacent. The two entries for linear constraints are a matrix (say m) and a vector (say v), specifying linear inequality constraints of the form `m.' * parameters + v >= 0'. If the constraints are just bounds, it is suggested to specify them in 'options.bounds' instead, since then some sanity tests are performed, and since the function 'dfdp.m' is guaranteed not to violate constraints during determination of the numeric gradient only for those constraints specified as 'bounds' (possibly with violations due to a certain inaccuracy, however, except if no constraints except bounds are specified). The first entry for general constraints must be a differentiable vector valued function (say h), specifying general inequality constraints of the form `h (p[, idx]) >= 0'; p is the column vector of optimized parameters and the optional argument idx is a logical index. h has to return the values of all constraints if idx is not given, and has to return only the indexed constraints if idx is given (so computation of the other constraints can be spared). If a second entry for general constraints is given, it must be a function (say dh) which returns a matrix whose rows contain the gradients of the constraint function h with respect to the optimized parameters. It has the form jac_h = dh (vh, p, dp, h, idx[, bounds]); p is the column vector of optimized parameters, and idx is a logical index — only the rows indexed by idx must be returned (so computation of the others can be spared). The other arguments of dh are for the case that dh computes numerical gradients: vh is the column vector of the current values of the constraint function h, with idx already applied. h is a function h (p) to compute the values of the constraints for parameters p; it will return only the values indexed by idx. dp is a suggestion for relative step width, having the same value as the argument 'dp' of leasqr above. If bounds were specified to leasqr, they are provided in the argument bounds of dh, to enable their consideration in determination of numerical gradients. If dh is not specified to leasqr, numerical gradients are computed in the same way as with 'dfdp.m' (see above). If some constraints are linear, they should be specified as linear constraints (or bounds, if applicable) for reasons of performance, even if general constraints are also specified.
bounds
Two-column matrix, one row for each parameter in pin. Each row contains a minimal and maximal value for the respective parameter. Default: [-Inf, Inf] in each row. If this field is used with an existing user-side function for 'dFdp' (see above), the function's interface might have to be changed.
equc
Equality constraints, specified the same way as inequality constraints (see field 'options.inequc'). Initial parameters must satisfy these constraints. Note that there is possibly a certain inaccuracy in honoring constraints, except if only bounds are specified. Warning: If constraints (or bounds) are set, returned guesses of corp, covp, and Z are generally invalid, even if no constraints are active for the final parameters. If equality constraints are specified, corp, covp, and Z are not guessed at all.
cpiv
Function for complementary pivot algorithm for inequality constraints. Defaults to cpiv_bard; no alternative function is currently supplied.

For backwards compatibility, options can also be a matrix whose first and second columns contain the values of fract_prec and max_fract_change, respectively.
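For the field 'inequc', for instance, a single linear constraint p(1) + p(2) >= 1 can be passed like this (a minimal sketch; the numbers are illustrative):

     m = [1; 1];               % one column per linear constraint
     v = -1;                   % m.' * p + v >= 0  means  p(1) + p(2) >= 1
     options.inequc = {m, v};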

Output:

f
Column vector of values computed: f = F(x,p).
p
Column vector of trial or final parameters, i.e., the solution.
cvg
Scalar: = 1 if convergence, = 0 otherwise.
iter
Scalar number of iterations used.
corp
Correlation matrix for parameters.
covp
Covariance matrix of the parameters.
covr
Diag(covariance matrix of the residuals).
stdresid
Standardized residuals.
Z
Matrix that defines confidence region (see comments in the source).
r2
Coefficient of multiple determination, intercept form.

Not suitable for non-real residuals.
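
A minimal usage sketch of leasqr (data and model illustrative):

     x = (1:10)';
     y = 2 * exp (-0.5 * x) + 0.01 * randn (10, 1);
     F = @(x, p) p(1) * exp (p(2) * x);    % model function
     pin = [1; -1];                        % initial parameters
     [fy, p, cvg, iter] = leasqr (x, y, pin, F);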

References: Bard, Nonlinear Parameter Estimation, Academic Press, 1974. Draper and Smith, Applied Regression Analysis, John Wiley and Sons, 1981.

2.9 Function polyfitinf for polynomial fitting.

Helptext:

function [A,REF,HMAX,H,R,EQUAL] = polyfitinf(M,N,K,X,Y,EPSH,MAXIT,REF0)

  Best polynomial approximation in discrete uniform norm

  INPUT VARIABLES:

  M       : degree of the fitting polynomial
  N       : number of data points
  X(N)    : x-coordinates of data points
  Y(N)    : y-coordinates of data points
  K       : character of the polynomial:
                  K = 0 : mixed parity polynomial
                  K = 1 : odd polynomial  ( X(1) must be >  0 )
                  K = 2 : even polynomial ( X(1) must be >= 0 )
  EPSH    : tolerance for leveling. A useful value for 24-bit
            mantissa is EPSH = 2.0E-7
  MAXIT   : upper limit for number of exchange steps
  REF0(M2): initial alternating set ( N-vector ). This is an
            OPTIONAL argument. The length M2 is given by:
                  M2 = M + 2                      , if K = 0
                  M2 = integer part of (M+3)/2    , if K = 1
                  M2 = 2 + M/2 (M must be even)   , if K = 2

  OUTPUT VARIABLES:

  A       : polynomial coefficients of the best approximation
            in order of increasing powers:
                  p*(x) = A(1) + A(2)*x + A(3)*x^2 + ...
  REF     : selected alternating set of points
  HMAX    : maximum deviation ( uniform norm of p* - f )
  H       : pointwise approximation errors
  R       : total number of iterations
  EQUAL   : success or failure of the algorithm
                  EQUAL=1 :  successful
                  EQUAL=0 :  convergence not achieved
                  EQUAL=-1:  input error
                  EQUAL=-2:  algorithm failure

  Relies on function EXCH, provided below.

  Example: 
  M = 5; N = 10000; K = 0; EPSH = 10^-12; MAXIT = 10;
  X = linspace(-1,1,N);   % uniformly spaced nodes on [-1,1]
  k=1; Y = abs(X).^k;     % the function Y to approximate
  [A,REF,HMAX,H,R,EQUAL] = polyfitinf(M,N,K,X,Y,EPSH,MAXIT);
  p = polyval(A,X); plot(X,Y,X,p) % p is the best approximation

  Note: using an even value of M, e.g., M=2, in the example above, makes
  the algorithm fail with EQUAL=-2 because of collocation, which occurs
  because both the approximated function and the polynomial are even
  functions. The way around it is to approximate only the right half
  of the function, setting K = 2 : even polynomial. For example: 

N = 10000; K = 2; EPSH = 10^-12; MAXIT = 10;  X = linspace(0,1,N);
for i = 1:2
    k = 2*i-1; Y = abs(X).^k;
    for j = 1:4
        M = 2^j;
        [~,~,HMAX] = polyfitinf(M,N,K,X,Y,EPSH,MAXIT);
        approxerror(i,j) = HMAX;
    end
end
disp('Table 3.1 from Approximation theory and methods, M.J.D.POWELL, p. 27');
disp(' ');
disp('            n          K=1          K=3'); 
disp(' '); format short g;
disp([(2.^(1:4))' approxerror']);

  ALGORITHM:

  Computation of the polynomial that best approximates the data (X,Y)
  in the discrete uniform norm, i.e. the polynomial with the  minimum
  value of max{ | p(x_i) - y_i | , x_i in X } . That polynomial, also
  known as minimax polynomial, is obtained by the exchange algorithm,
  a finite iterative process requiring, at most, nchoosek(N, p)
  iterations (the binomial coefficient; usually p = M + 2. See also
  function EXCH). Since this number can be very large, the routine may
  not converge within MAXIT iterations. The other possibility of
  failure occurs when there is insufficient floating point precision
  for the input data chosen.

  CREDITS: This routine was developed and modified as 
  computer assignments in Approximation Theory courses by 
  Prof. Andrew Knyazev, University of Colorado Denver, USA.

  Team Fall 98 (Revision 1.0):
          Chanchai Aniwathananon
          Crhistopher Mehl
          David A. Duran
          Saulo P. Oliveira

  Team Spring 11 (Revision 1.1): Manuchehr Aminian

  The algorithm and the comments are based on a FORTRAN code written
  by Joseph C. Simpson. The code is available on Netlib repository:
  http://www.netlib.org/toms/501
  See also: Communications of the ACM, V14, pp.355-356(1971)

  NOTES:

  1) A may contain the collocation polynomial
  2) If MAXIT is exceeded, REF contains a new reference set
  3) M, EPSH and REF can be altered during the execution
  4) For consistency with the original code, EPSH can be
  negative. However, the use of REF0 is *NOT* determined by
  EPSH < 0, but only by its inclusion as an input parameter.

  Some parts of the code can still take advantage of vectorization.  

  Revision 1.0 from 1998 is a direct human translation of 
  the FORTRAN code http://www.netlib.org/toms/501
  Revision 1.1 is a clean-up and technical update.  
  Tested on MATLAB Version 7.11.0.584 (R2010b) and 
  GNU Octave Version 3.2.4


1.7 A Nelder-Mead simplex algorithm

Helptext:

NMSMAX  Nelder-Mead simplex method for direct search optimization.
       [x, fmax, nf] = NMSMAX(FUN, x0, STOPIT, SAVIT) attempts to
       maximize the function FUN, using the starting vector x0.
       The Nelder-Mead direct search method is used.
       Output arguments:
              x    = vector yielding largest function value found,
              fmax = function value at x,
              nf   = number of function evaluations.
       The iteration is terminated when either
              - the relative size of the simplex is <= STOPIT(1)
                (default 1e-3),
              - STOPIT(2) function evaluations have been performed
                (default inf, i.e., no limit), or
              - a function value equals or exceeds STOPIT(3)
                (default inf, i.e., no test on function values).
       The form of the initial simplex is determined by STOPIT(4):
          STOPIT(4) = 0: regular simplex (sides of equal length, the default)
          STOPIT(4) = 1: right-angled simplex.
       Progress of the iteration is not shown if STOPIT(5) = 0 (default 1).
          STOPIT(6) indicates the direction of optimization (i.e.,
                  minimization or maximization). Default is 1, maximization;
                  set STOPIT(6) = -1 for minimization.
       If a non-empty fourth parameter string SAVIT is present, then
       `SAVE SAVIT x fmax nf' is executed after each inner iteration.
       NB: x0 can be a matrix.  In the output argument, in SAVIT saves,
           and in function calls, x has the same shape as x0.
       NMSMAX(fun, x0, STOPIT, SAVIT, P1, P2,...) allows additional
       arguments to be passed to fun, via feval(fun,x,P1,P2,...).
References:
N. J. Higham, Optimization by direct search in matrix computations,
   SIAM J. Matrix Anal. Appl, 14(2): 317-333, 1993.
C. T. Kelley, Iterative Methods for Optimization, Society for Industrial
   and Applied Mathematics, Philadelphia, PA, 1999.


9.5 Handling named parameter arrays

Parameters in named structure fields are allowed to be non-scalar real arrays. In this case, their dimensions must be given by the setting param_dims, a cell-array of dimension vectors, each containing at least two dimensions; if not given, dimensions are taken from the initial parameters, if these are given in a structure.

If there are any vector-based settings or non-structure-based linear constraints, they must correspond to an order of parameters defined as follows:

All named parameter arrays are reshaped to vectors. Then, all parameters, scalars and vectors, are concatenated in the order of parameter names, given by the user.

Structure-based settings or structure-based initial parameters must contain arrays with dimensions reshapable to those of the respective parameters.
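
An illustrative sketch (data and names hypothetical): a 2x2 parameter array A and a scalar b, with dimensions taken from the structure-based initial parameters:

     x0 = [1; 2];
     y0 = [0; 1];
     pin.A = eye (2);                           % 2x2 parameter array
     pin.b = 0.5;                               % scalar parameter
     objf = @(p) sumsq (p.A * x0 + p.b - y0);   % p passed as a structure
     [p, objfv] = nonlin_min (objf, pin);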

2 Functions for optimization of a model function returning an array

Model functions whose parameters are to be optimized may return a vector or array of values. Either these or their differences from some constant values (curve fitting) can be minimized in some sense, often, but not necessarily, by minimizing the sum of their squares. It is usually preferable to use optimizers designed for residual optimization for this purpose. These can exploit information contained in the individual elements of the returned array, which would not be possible if the user calculated a norm (e.g. the sum of squares) of the elements and performed a scalar optimization.
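
For instance, a least squares curve fit with nonlin_curvefit hands the optimizer the vector of model values directly, instead of a scalar sum of squares (a minimal sketch; data and model are illustrative):

     x = (0:0.1:5)';
     y = 2.5 * exp (-1.3 * x) + 0.05 * randn (size (x));
     model = @(p, x) p(1) * exp (p(2) * x);
     [p, fy] = nonlin_curvefit (model, [1; -1], x, y);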

Optimization frontends

Optimization backends

Statistics frontends

Statistics backends

Standalone functions


2.4 Statistics for residual minimization

— Function File: info = residmin_stat (f, p, settings)

Frontend for computation of statistics for a residual-based minimization.

settings is a structure whose fields can be set by optimset. With settings the computation of certain statistics is requested by setting the fields ret_<name_of_statistic> to true. The respective statistics will be returned in a structure as fields with name <name_of_statistic>. Depending on the requested statistic and on the additional information provided in settings, f and p may be empty. Otherwise, f is the model function of an optimization (the interface of f is described e.g. in nonlin_residmin, please see there), and p is a real column vector with parameters resulting from the same optimization.

Currently, the following statistics (or general information) can be requested:

dfdp: Jacobian of model function with respect to parameters.

covd: Covariance matrix of data (typically guessed by applying a factor to the covariance matrix of the residuals).

covp: Covariance matrix of final parameters.

corp: Correlation matrix of final parameters.

See also: curvefit_stat.
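
A minimal sketch of requesting these statistics after a previous call [p_res, res] = nonlin_residmin (resid_fun, pin) (the names resid_fun and p_res are illustrative):

     settings = optimset ("ret_covp", true, "ret_corp", true, ...
                          "objf_type", "wls");
     info = residmin_stat (resid_fun, p_res, settings);
     covp = info.covp;   % covariance matrix of final parameters
     corp = info.corp;   % correlation matrix of final parameters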

Further settings

The fields of the settings structure can be set with optimset.

For settings common to all frontends see Common frontend options.

Additional settings:
objf_type
Type of objective function of the optimization; must be specified in many cases. This determines which backends to use. Currently, there are only backends for the type "wls" (weighted least squares).
residuals
covd
Optional information on the result of optimization, residuals and covariance matrix of data, respectively.
weights
Array of weights applied to the residuals in the previous optimization. Dimensions must match those of the residuals.
dfdp
Can be set in the same way and has the same default as in nonlin_residmin (see nonlin_residmin), but alternatively may already contain the computed Jacobian of the model function at the final parameters in matrix- or structure-form.
complex_step_derivative_f
Estimate the Jacobian of the model function with the complex step derivative approximation. Use only if you know that your model function is suitable for this. If this is set, no user function for the Jacobian (dfdp) may be specified.

Structure based parameter handling

Please see Parameter structures.

Backend information

Please see Residual optimization and choose backend from menu under `Statistics backends'.

2.3 Default backend lm_svd_feasible of residual minimization

Levenberg/Marquardt algorithm using singular value decomposition. Constraints must be met by the initial parameters, and the algorithm attempts to keep them met throughout the optimization.

Returned value cvg will be 0, 1, or 2. Returned structure outp will have the fields niter and user_interaction.

Backend-specific defaults are: MaxIter: 20, fract_prec: zeros (size (parameters)), max_fract_change: Inf for all parameters. The setting TolX is not honoured.

Interpretation of Display: if set to "iter", currently some diagnostics are printed.

Specific option: lm_svd_feasible_alt_s: if falling back to nearly gradient descent, do it more like the original Levenberg/Marquardt method, with descent in each gradient component; for testing only.
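
This backend is the default for nonlin_residmin and nonlin_curvefit; a sketch of selecting it explicitly and raising the iteration limit (resid_fun and pin are placeholders):

     settings = optimset ("Algorithm", "lm_svd_feasible", "MaxIter", 100);
     [p, resid, cvg, outp] = nonlin_residmin (resid_fun, pin, settings);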

4.2 Higher order numerical derivatives

— Function File: dx = deriv (f, x0)
— Function File: dx = deriv (f, x0, h)
— Function File: dx = deriv (f, x0, h, O)
— Function File: dx = deriv (f, x0, h, O, N)

Calculate the derivative of the function f.

f must be a function handle or the name of a function that takes x0 and returns a variable of equal length and orientation. x0 must be a numeric vector or scalar.

h defines the step taken for the derivative calculation. Defaults to 1e-7.

O defines the order of the calculation. Supported values are 2 (h^2 order) or 4 (h^4 order). Defaults to 2.

N defines the derivative order. Defaults to the 1st derivative of the function. Can be up to the 4th derivative.
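
For example (a sketch; function and values illustrative), the first derivative of x^3 at x = 2 is 12:

     dx = deriv (@(x) x.^3, 2);             % default h = 1e-7, order h^2
     dx4 = deriv (@(x) x.^3, 2, 1e-5, 4);   % h^4-order approximation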

Reference: Numerical Methods for Mathematics, Science, and Engineering by John H. Mathews.


2.14 Linear programming

— Function File: x = linprog (f, A, b)
— Function File: x = linprog (f, A, b, Aeq, beq)
— Function File: x = linprog (f, A, b, Aeq, beq, lb, ub)
— Function File: [x, fval] = linprog (...)

Solve a linear problem.

Finds

          min (f' * x)

(both f and x are column vectors) subject to

          A   * x <= b
          Aeq * x  = beq
          lb <= x <= ub

If not specified, Aeq and beq default to empty matrices.

If not specified, the lower bound lb defaults to minus infinity and the upper bound ub defaults to infinity.
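
A minimal usage sketch (data illustrative): minimize -x(1) - 2*x(2), i.e. maximize x(1) + 2*x(2), over a simple polytope:

     f = [-1; -2];
     A = [1, 1; 2, 1];
     b = [4; 5];
     lb = [0; 0];
     ub = [Inf; Inf];
     [x, fval] = linprog (f, A, b, [], [], lb, ub);
     % expected: x = [0; 4], fval = -8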

See also glpk.

1.6 Alternative frontend for gradient-less algorithms

— Function File: [x] = fmins (f,X0,options,grad,P1,P2, ...)

Find the minimum of a function of several variables. By default the method used is the Nelder & Mead simplex algorithm.

Example usage: fmins(inline('(x(1)-5).^2+(x(2)-8).^4'),[0;0])

Inputs

f
A string containing the name of the function to minimize
X0
A vector of initial parameters for the function f.
options
Vector with control parameters (not all parameters are used)
          options(1) - Show progress (if 1, default is 0, no progress)
          options(2) - Relative size of simplex (default 1e-3)
          options(6) - Optimization algorithm
             if options(6)==0 - Nelder & Mead simplex (default)
             if options(6)==1 - Multidirectional search Method
             if options(6)==2 - Alternating Directions search
          options(5)
             if options(6)==0 && options(5)==0 - regular simplex
             if options(6)==0 && options(5)==1 - right-angled simplex
                Comment: the default is set to "right-angled simplex".
                  this works better for me on a broad range of problems,
                  although the default in nmsmax is "regular simplex"
          options(10) - Maximum number of function evaluations
     

grad
Unused (For compatibility with Matlab)
P1, P2, ...
Optional parameters for function f
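
A minimal sketch of selecting the multidirectional search method with a function-evaluation limit (objective and option values illustrative; assuming a function handle is accepted in place of a name and that unset entries fall back to the defaults above):

     opts = zeros (1, 10);
     opts(2) = 1e-3;     % relative simplex size tolerance
     opts(6) = 1;        % multidirectional search
     opts(10) = 500;     % maximum number of function evaluations
     x = fmins (@(x) (x(1) - 5)^2 + (x(2) - 8)^4, [0; 0], opts);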


Index of functions in optim


1.2 Default backend lm_feasible of scalar optimization

A Levenberg/Marquardt-like algorithm, attempting to honour constraints throughout the course of optimization. This means that the initial parameters must not violate constraints (to find an initial feasible set of parameters, e.g. core Octave's sqp can be used (see octave_sqp), by specifying an objective function which is constant or which returns a norm of the distances to the initial values). The Hessian is either supplied by the user or is approximated by the BFGS algorithm. Core Octave's sqp performed better in some tests with unconstrained problems.

Returned value cvg will be 2 or 3 for success and 0 or -4 for failure (see nonlin_min for meaning). Returned structure outp will have the fields niter, nobjf, and user_interaction.

Backend-specific defaults are: MaxIter: 20, fract_prec: zeros (size (parameters)), max_fract_change: Inf for all parameters. The setting TolX is not honoured.

Interpretation of Display: if set to "iter", currently only information on applying max_fract_change is printed.
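
A minimal sketch with bound constraints (objective and bounds illustrative; "lm_feasible" is the default Algorithm of nonlin_min anyway):

     objf = @(p) sumsq (p - [1; 2]);
     settings = optimset ("Algorithm", "lm_feasible", ...
                          "lbound", [0; 0], "ubound", [1.5; 1.5]);
     [p, objfv, cvg] = nonlin_min (objf, [0.5; 0.5], settings);
     % expected: p approximately [1; 1.5]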

10 Passing additional parameters to user functions

Optimizers often require the user to supply functions (e.g. objective functions, model functions, constraint functions). The interface of these functions — arguments and returned values — is defined by the optimizer. Often, a user function needs additional arguments, not covered by the defined interface, which are constant throughout the optimization. These can be supplied by wrapping the user function into an anonymous function. See Anonymous Functions, for further explanation and examples.
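
For example (a sketch with illustrative data), fixed data x and y can be captured by an anonymous residual function for nonlin_residmin:

     x = (0:10)';
     y = 2 * exp (-0.3 * x);
     resid = @(p) p(1) * exp (-p(2) * x) - y;   % x and y are constants here
     [p, res] = nonlin_residmin (resid, [1; 0.1]);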

There are also some older optimizers in the optim package, written when anonymous functions were not available in Octave. Some of these offer an interface to user functions which is itself able to pass additional constant variables in arbitrary argument positions. Newer optimizers should not be written this way, since this is an unnecessary complication.

Though it is possible to use global variables to pass additional data to user functions, this is not recommended since it introduces the possibility of name conflicts within the pool of global variables.

2.10 Polynomial fitting suitable for polyconf

— Function File: [p, s] = wpolyfit (x, y, dy, n)

Return the coefficients of a polynomial p(x) of degree n that minimizes sumsq (p(x(i)) - y(i)), to best fit the data in the least squares sense. The standard errors of the observations y, if present, are given in dy.

The returned value p contains the polynomial coefficients suitable for use in the function polyval. The structure s returns information necessary to compute uncertainty in the model.

To compute the predicted values of y with uncertainty use

          [y,dy] = polyconf(p,x,s,'ci');

You can see the effects of different confidence intervals and prediction intervals by calling the wpolyfit internal plot function with your fit:

          feval('wpolyfit:plt',x,y,dy,p,s,0.05,'pi')

Use dy=[] if uncertainty is unknown.

You can use a chi^2 test to reject the polynomial fit:

          p = 1-chi2cdf(s.normr^2,s.df);

p is the probability of seeing a chi^2 value higher than that which was observed assuming the data are normally distributed around the fit. If p < 0.01, you can reject the fit at the 1% level.

You can use an F test to determine if a higher order polynomial improves the fit:

          [poly1,S1] = wpolyfit(x,y,dy,n);
          [poly2,S2] = wpolyfit(x,y,dy,n+1);
          F = (S1.normr^2 - S2.normr^2)/(S1.df-S2.df)/(S2.normr^2/S2.df);
          p = 1-f_cdf(F,S1.df-S2.df,S2.df);

p is the probability of observing the improvement in chi^2 obtained by adding the extra parameter to the fit. If p < 0.01, you can reject the lower order polynomial at the 1% level.

You can estimate the uncertainty in the polynomial coefficients themselves using

          dp = sqrt(sumsq(inv(s.R'))'/s.df)*s.normr;

but the high degree of covariance amongst them makes this a questionable operation.

— Function File: [p, s, mu] = wpolyfit (...)

If an additional output mu = [mean(x),std(x)] is requested then the x values are centered and normalized prior to computing the fit. This will give more stable numerical results. To compute a predicted y from the returned model use y = polyval(p, (x-mu(1))/mu(2)).

— Function File: wpolyfit (...)

If no output arguments are requested, then wpolyfit plots the data, the fitted line and polynomials defining the standard error range.

Example

          x = linspace(0,4,20);
          dy = (1+rand(size(x)))/2;
          y = polyval([2,3,1],x) + dy.*randn(size(x));
          wpolyfit(x,y,dy,2);
— Function File: wpolyfit (..., 'origin')

If 'origin' is specified, then the fitted polynomial will go through the origin. This is generally ill-advised. Use with caution.

Hocking, RR (2003). Methods and Applications of Linear Models. New Jersey: John Wiley and Sons, Inc.

See also: polyconf.

See also polyfit.

1 Functions for optimization of a scalar objective function

Frontend

Backends

Alternative frontend

Standalone functions


1.9 An alternating directions algorithm

Helptext:

ADSMAX  Alternating directions method for direct search optimization.
       [x, fmax, nf] = ADSMAX(FUN, x0, STOPIT, SAVIT, P) attempts to
       maximize the function FUN, using the starting vector x0.
       The alternating directions direct search method is used.
       Output arguments:
              x    = vector yielding largest function value found,
              fmax = function value at x,
              nf   = number of function evaluations.
       The iteration is terminated when either
              - the relative increase in function value between successive
                iterations is <= STOPIT(1) (default 1e-3),
              - STOPIT(2) function evaluations have been performed
                (default inf, i.e., no limit), or
              - a function value equals or exceeds STOPIT(3)
                (default inf, i.e., no test on function values).
       Progress of the iteration is not shown if STOPIT(5) = 0 (default 1).
       If a non-empty fourth parameter string SAVIT is present, then
       `SAVE SAVIT x fmax nf' is executed after each inner iteration.
       By default, the search directions are the co-ordinate directions.
       The columns of a fifth parameter matrix P specify alternative search
       directions (P = EYE is the default).
       NB: x0 can be a matrix.  In the output argument, in SAVIT saves,
           and in function calls, x has the same shape as x0.
       ADSMAX(fun, x0, STOPIT, SAVIT, P, P1, P2,...) allows additional
       arguments to be passed to fun, via feval(fun,x,P1,P2,...).
    Reference:
    N. J. Higham, Optimization by direct search in matrix computations,
       SIAM J. Matrix Anal. Appl, 14(2): 317-333, 1993.
    N. J. Higham, Accuracy and Stability of Numerical Algorithms,
       Second edition, Society for Industrial and Applied Mathematics,
       Philadelphia, PA, 2002; sec. 20.5.


6 Function optim_doc to view documentation

— Function File: optim_doc ()
— Function File: optim_doc (keyword)

Show optim package documentation.

Runs the info viewer Octave is configured with on the documentation in info format of the installed optim package. Without argument, the top node of the documentation is displayed. With an argument, the respective index entry is searched for and its node displayed.
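
For example, to jump to the index entry of leasqr (assuming it is present in the index):

     optim_doc ("leasqr");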


1.15 A conjugate gradient method

— Function File: [x0,v,nev] = cg_min (f, df, args, ctl)

NonLinear Conjugate Gradient method to minimize function f.

Arguments

  • f : string : Name of function. Return a real value
  • df : string : Name of f's derivative. Returns a (R*C) x 1 vector
  • args: cell : Arguments passed to f.
  • ctl : 5-vec : (Optional) Control variables, described below

Returned values

  • x0 : matrix : Local minimum of f
  • v : real : Value of f in x0
  • nev : 1 x 2 : Number of evaluations of f and of df

Control Variables

  • ctl(1) : 1 or 2 : Select stopping criterion amongst :
  • ctl(1)==0 : Default value
  • ctl(1)==1 : Stopping criterion : Stop search when value doesn't improve, as tested by ctl(2) > Deltaf/max(|f(x)|,1) where Deltaf is the decrease in f observed in the last iteration (each iteration consists of R*C line searches).
  • ctl(1)==2 : Stopping criterion : Stop search when updates are small, as tested by ctl(2) > max { dx(i)/max(|x(i)|,1) | i in 1..N } where dx is the change in the x that occured in the last iteration.
  • ctl(2) : Threshold used in stopping tests. Default=10*eps
  • ctl(2)==0 : Default value
  • ctl(3) : Position of the minimized argument in args Default=1
  • ctl(3)==0 : Default value
  • ctl(4) : Maximum number of function evaluations Default=inf
  • ctl(4)==0 : Default value
  • ctl(5) : Type of optimization:
  • ctl(5)==1 : "Fletcher-Reves" method
  • ctl(5)==2 : "Polak-Ribiere" (Default)
  • ctl(5)==3 : "Hestenes-Stiefel" method

ctl may have length smaller than 4. Default values will be used if ctl is not passed or if nan values are given.

Example:

     function r = df (l)
       b = [1; 0; -1];
       r = -(2*l{1} - 2*b + rand (size (l{1})));
     endfunction

     function r = ff (l)
       b = [1; 0; -1];
       r = (l{1} - b)' * (l{1} - b);
     endfunction

     ll = { [10; 2; 3] };
     ctl(5) = 3;
     [x0, v, nev] = cg_min ("ff", "df", ll, ctl)

Comment: In general, the BFGS method seems to perform better in many cases, but requires more computation per iteration. See also http://en.wikipedia.org/wiki/Nonlinear_conjugate_gradient.

See also: bfgsmin.


9 Handling of structures of optimized parameters

It can be convenient not to handle the optimized parameters as elements of a vector, but as named fields of a structure. The frontends nonlin_residmin, nonlin_curvefit, residmin_stat, curvefit_stat, and nonlin_min can accept parameter information in structure form, and can pass the parameters as a structure to user functions, although the backends still handle the parameters as vectors.

To use this feature, the initial parameters must be given in structure form, or the setting param_order must be given, a cell-array with names of the parameters. If both are done, only the parameters in structure fields named in param_order will be optimized. If there are still some non-structure-based configuration settings or user functions, specifying param_order is mandatory even if the initial parameters are given in structure form.

If the initial parameters are a structure, the parameters being the optimization result will also be returned as a structure.
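
A minimal sketch (objective illustrative):

     pin.a = 1;
     pin.b = 2;
     objf = @(p) (p.a - 1)^2 + (p.b - 3)^2;   % p passed as a structure
     [p, objfv] = nonlin_min (objf, pin);
     % p.a and p.b contain the optimized values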


2.12 Function LinearRegression

— Function File: [p,e_var,r,p_var,y_var] = LinearRegression (F,y)
— Function File: [p,e_var,r,p_var,y_var] = LinearRegression (F,y,w)

General linear regression: determine the parameters p_j (j=1,2,...,m) such that the function f(x) = sum_(j=1,...,m) p_j*f_j(x) is the best fit to the given values y_i = f(x_i). Parameters:

  • F is an n*m matrix with the values of the basis functions at the support points. In column j give the values of f_j at the points x_i (i=1,2,...,n)
  • y is a column vector of length n with the given values
  • w is a column vector of length n with the weights of the data points
return values:
  • p is the vector of length m with the estimated values of the parameters
  • e_var is the estimated variance of the difference between fitted and measured values
  • r is the weighted norm of the residual
  • p_var is the estimated variance of the parameters p_j
  • y_var is the estimated variance of the dependent variables
Caution: do NOT request y_var for large data sets, as an n by n matrix is generated

See also regress, leasqr, nonlin_curvefit, polyfit, wpolyfit, expfit.
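
A minimal usage sketch (data illustrative), fitting y = p(1) + p(2)*x:

     x = (0:0.2:5)';
     y = 2 + 0.5 * x + 0.1 * randn (size (x));
     F = [ones(size (x)), x];                    % basis functions 1 and x
     [p, e_var, r, p_var] = LinearRegression (F, y);
     % p(1) approximately 2, p(2) approximately 0.5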

5.2 Gauss-Jordan pivoting

Helptext:

m = gjp (m, k[, l])

m: matrix; k, l: row- and column-index of pivot, l defaults to k.

Gauss-Jordan pivot as defined in Bard, Y.: Nonlinear Parameter
Estimation, p. 296, Academic Press, New York and London 1974. In
the pivot column, this seems not quite the same as the usual
Gauss-Jordan(-Clasen) pivot. Bard gives Beaton, A. E., 'The use of
special matrix operators in statistical calculus' Research Bulletin
RB-64-51 (1964), Educational Testing Service, Princeton, New Jersey
as a reference, but this article is not easily accessible. Another
reference, whose definition of gjp differs from Bard's by some
signs, is Clarke, R. B., 'Algorithm AS 178: The Gauss-Jordan sweep
operator with detection of collinearity', Journal of the Royal
Statistical Society, Series C (Applied Statistics) (1982), 31(2),
166--168.
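
A minimal sketch (matrix illustrative; the exact result follows Bard's sign conventions described above):

     m = [2, 1; 1, 3];
     m1 = gjp (m, 1);    % pivot on element (1,1)
     m2 = gjp (m1, 2);   % continue with element (2,2)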


9.4 Parameter-related configuration settings in structure form

The vector-based settings lbound, ubound, fixed, diffp, diff_onesided, fract_prec, and max_fract_change can be replaced by the setting param_config. It is a structure that can contain fields corresponding to parameter names. For each such field, there may be subfields with the same names as the above vector-based settings, but containing a scalar value for the respective parameter.

For example, if the parameters are named a and b, instead of specifying

     settings = optimset ("lbound", [-Inf; 0],
                          "diff_onesided", [true; true]);

one can specify

     pconf.b.lbound = 0;
     pconf.a.diff_onesided = true;
     pconf.b.diff_onesided = true;
     settings = optimset ("param_config", pconf);

If param_config is specified, none of the above vector-based settings may be used.

1.12 Unconstrained Newton-like minimization, standalone function

This function features a Newton-like algorithm. The user has to supply a Hessian function. No constraints are honoured.

This function is deprecated. It is now available as a backend for nonlin_min (see nonlin_min).

Helptext:

[x,v,nev,h,args] = d2_min(f,d2f,args,ctl,code) - Newton-like minimization

Minimize f(x) using 1st and 2nd derivatives. Any function w/ second
derivatives can be minimized, as in Newton. f(x) decreases at each
iteration, as in Levenberg-Marquardt. This function is inspired by the
Levenberg-Marquardt algorithm found in the book "Numerical Recipes".

ARGUMENTS :
f    : string : Cost function's name

d2f  : string : Name of function returning the cost (1x1), its
                differential (1xN) and its second differential or its
                pseudo-inverse (NxN) (see ctl(5) below) :

                [v,dv,d2v] = d2f (x).

args : list   : f and d2f's arguments. By default, minimize the 1st
    or matrix : argument.

ctl  : vector : Control arguments (see below)
     or struct

code : string : code will be evaluated after each outer loop that
                produced some (any) improvement. Variables visible from
                "code" include "x", the best parameter found, "v" the
                best value and "args", the list of all arguments. All can
                be modified. This option can be used to re-parameterize 
                the argument space during optimization

CONTROL VARIABLE ctl : (optional). May be a struct or a vector of length
---------------------- 5 or less where NaNs are ignored. Default values
                       are written <value>.
FIELD  VECTOR
NAME    POS

ftol, f N/A    : Stop search when value doesn't improve, as tested by

                  f > Deltaf/max(|f(x)|,1)

            where Deltaf is the decrease in f observed in the last
            iteration.                                     <10*sqrt(eps)>

utol, u N/A    : Stop search when updates are small, as tested by

                  u > max { dx(i)/max(|x(i)|,1) | i in 1..N }

            where  dx is the change in the x that occurred in the last
            iteration.                                              <NaN>

dtol, d N/A    : Stop search when derivative is small, as tested by

                  d > norm (dv)                                     <eps>

crit, c ctl(1) : Set one stopping criterion, 'ftol' (c=1), 'utol' (c=2)
                 or 'dtol' (c=3) to the value of by the 'tol' option. <1>

tol, t  ctl(2) : Threshold in termination test chosen by 'crit'  <10*eps>

narg, n ctl(3) : Position of the minimized argument in args           <1>
maxev,m ctl(4) : Maximum number of function evaluations             <inf>
maxout,m       : Maximum number of outer loops                      <inf>
id2f, i ctl(5) : 0 if d2f returns the 2nd derivatives, 1 if           <0>
                 it returns its pseudo-inverse.

verbose, v N/A : Be more or less verbose (quiet=0)                    <0>

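A minimal sketch (objective illustrative; assuming function handles are accepted in place of function name strings):

     f = @(x) (x(1) - 1)^2 + (x(2) + 2)^2;
     d2f = @(x) deal (f (x), [2*(x(1) - 1), 2*(x(2) + 2)], 2 * eye (2));
     [x, v, nev] = d2_min (f, d2f, {[0; 0]});
     % expected: x approximately [1; -2]
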

9.2 Format of returned values of structure-based gradient and Hessian functions

Structure-based gradient or Jacobian functions, including Jacobians of general constraints, must return the partial derivatives as fields of a structure under the respective parameter names. For gradients, the partial derivatives are scalar. For Jacobians, the partial derivatives must be column vectors.

Structure-based Hessian functions must return the 2nd derivatives as subfields in a two-level structure of parameter names. For example, if the parameter names are a and b, the 2nd derivative with respect to a and b must be in the field returned_structure.a.b or returned_structure.b.a (there is no need to specify both).
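
A sketch for parameters named a and b (function names hypothetical):

     function grad = objf_grad (p)
       grad.a = 2 * (p.a - 1);   % scalar partial derivative w.r.t. a
       grad.b = 2 * (p.b - 3);
     endfunction

     function hess = objf_hess (p)
       hess.a.a = 2;
       hess.a.b = 0;   % mixed derivative; hess.b.a need not be set
       hess.b.b = 2;
     endfunction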

Concept index


1.20 Repeatedly call bfgsmin

Helptext:

battery.m: repeatedly call bfgs using a battery of 
start values, to attempt to find global min
of a nonconvex function

INPUTS:
func: function to minimize
args: args of function
minarg: argument to minimize w.r.t. (usually = 1)
startvals: kxp matrix of values to try for sure (don't include all zeros, that's automatic)
max iters per start value
number of additional random start values to try

OUTPUT: theta - the best value found - NOT iterated to convergence


1.5 Unconstrained Newton-like optimization

This function features a Newton-like algorithm. The user has to supply a Hessian function. No constraints are honoured. If the supplied Hessian function actually returns the inverse of the Hessian, set inverse_hessian to true. Supplying the inverse Hessian is preferable, if possible.

Returned value cvg will be 2 or 3 for success and 0 or -1 for failure (see nonlin_min for meaning). Returned structure outp will have the fields niter, nobjf, and user_interaction.

Interpretation of Display: if set to "iter", some diagnostics are printed.

1.19 A differential evolution (stochastic) optimizer

Helptext:

de_min: global optimisation using differential evolution

Usage: [x, obj_value, nfeval, convergence] = de_min(fcn, control)

minimization of a user-supplied function with respect to x(1:D),
using the differential evolution (DE) method based on an algorithm
by  Rainer Storn (http://www.icsi.berkeley.edu/~storn/code.html)
See: http://www.softcomputing.net/tevc2009_1.pdf


Arguments:  
---------------
fcn        string : Name of function. Must return a real value
control    vector : (Optional) Control variables, described below
        or struct

Returned values:
----------------
x          vector : parameter vector of best solution
obj_value  scalar : objective function value of best solution
nfeval     scalar : number of function evaluations
convergence       : 1 = best below value to reach (VTR)
                    0 = population has reached defined quality (tol)
                   -1 = some values are close to constraints/boundaries
                   -2 = max number of iterations reached (maxiter)
                   -3 = max number of functions evaluations reached (maxnfe)

Control variable:   (optional) may be named arguments (i.e. "name",value
----------------    pairs), a struct, or a vector, where
                    NaN's are ignored.

XVmin        : vector of lower bounds of initial population
               *** note: by default these are no constraints ***
XVmax        : vector of upper bounds of initial population
constr       : 1 -> enforce the bounds not just for the initial population
const        : data vector (remains fixed during the minimization)
NP           : number of population members
F            : difference factor from interval [0, 2]
CR           : crossover probability constant from interval [0, 1]
strategy     : 1 --> DE/best/1/exp           7 --> DE/best/1/bin
               2 --> DE/rand/1/exp           8 --> DE/rand/1/bin
               3 --> DE/target-to-best/1/exp 9 --> DE/target-to-best/1/bin
               4 --> DE/best/2/exp           10--> DE/best/2/bin
               5 --> DE/rand/2/exp           11--> DE/rand/2/bin
               6 --> DEGL/SAW/exp            else  DEGL/SAW/bin
refresh      : intermediate output will be produced after "refresh"
               iterations. No intermediate output will be produced
               if refresh is < 1
VTR          : Stopping criterion: "Value To Reach"
               de_min will stop when obj_value <= VTR.
               Use this if you know which value you expect.
tol          : Stopping criterion: "tolerance"
               stops if (best-worst)/max(1,worst) < tol
               This stops basically if the whole population is "good".
maxnfe       : maximum number of function evaluations
maxiter      : maximum number of iterations (generations)

      The algorithm seems to work well only if [XVmin,XVmax] covers the 
      region where the global minimum is expected.
      DE is also somewhat sensitive to the choice of the
      difference factor F. A good initial guess is to choose F from
      interval [0.5, 1], e.g. 0.8.
      CR, the crossover probability constant from interval [0, 1]
      helps to maintain the diversity of the population and is
      rather uncritical but strongly affects the convergence speed.
      If the parameters are correlated, high values of CR work better.
      The reverse is true for no correlation.
      Experiments suggest that /bin likes to have a slightly
      larger CR than /exp.
      The number of population members NP is also not very critical. A
      good initial guess is 10*D. Depending on the difficulty of the
      problem NP can be lower than 10*D or must be higher than 10*D
      to achieve convergence.

Default Values:
---------------
XVmin = [-2];
XVmax = [ 2];
constr= 0;
const = [];
NP    = 10 *D
F     = 0.8;
CR    = 0.9;
strategy = 12;
refresh  = 0;
VTR   = -Inf;
tol   = 1.e-3;
maxnfe  = 1e6;
maxiter = 1000;


Example to find the minimum of the Rosenbrock saddle:
----------------------------------------------------
Define f as:
                   function result = f(x);
                     result = 100 * (x(2) - x(1)^2)^2 + (1 - x(1))^2;
                   end
Then type:

	ctl.XVmin = [-2 -2];
	ctl.XVmax = [ 2  2];
	[x, obj_value, nfeval, convergence] = de_min (@f, ctl);

Keywords: global-optimisation optimisation minimisation


1.18 A simulated annealing (stochastic) optimizer

This simulated annealing algorithm implicitly honours bound constraints.

Helptext:

samin: simulated annealing minimization of a function. See samin_example.m

usage: [x, obj, convergence, details] = samin("f", {args}, {control})

Arguments:
* \"f\": function name (string)
* {args}: a cell array that holds all arguments of the function,
* {control}: a cell array with 11 elements
	* LB  - vector of lower bounds
	* UB - vector of upper bounds
	* nt - integer: # of iterations between temperature reductions
	* ns - integer: # of iterations between bounds adjustments
	* rt - (0 < rt <1): temperature reduction factor
	* maxevals - integer: limit on function evaluations
	* neps - integer:  number of values final result is compared to
	* functol -   (> 0): the required tolerance level for function value
	                   comparisons
	* paramtol -  (> 0): the required tolerance level for parameters
	* verbosity - scalar: 0, 1, or 2.
		* 0 = no screen output
		* 1 = only final results to screen
		* 2 = summary every temperature change
	* minarg - integer: which of function args is minimization over?

Returns:
* x: the minimizer
* obj: the value of f() at x
* convergence:
	0 if no convergence within maxevals function evaluations
	1 if normal convergence to a point interior to the parameter space
	2 if convergence to point very near bounds of parameter space
	  (suggest re-running with looser bounds)
* details: a px3 matrix. p is the number of times improvements were found.
           The columns record information at the time an improvement was found
           * first: cumulative number of function evaluations
           * second: temperature
           * third: function value

Example: see samin_example
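
A compact sketch (control values illustrative; see samin_example for a full demonstration):

     function r = obj (x)
       r = sumsq (x);
     endfunction

     LB = [-5; -5];  UB = [5; 5];
     % control: {LB, UB, nt, ns, rt, maxevals, neps, functol, paramtol,
     %           verbosity, minarg}
     control = {LB, UB, 5, 5, 0.5, 1e4, 5, 1e-8, 1e-5, 1, 1};
     [x, obj_val, cvg] = samin ("obj", {[1; 1]}, control);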



2.8 Prony's method for non-linear exponential fitting

Helptext:

USAGE  [alpha,c,rms] = expfit( deg, x1, h, y )

Prony's method for non-linear exponential fitting

Fit function:   \sum_1^{deg} c(i)*exp(alpha(i)*x)

Elements of data vector y must correspond to
equidistant x-values starting at x1 with stepsize h

The method is fully compatible with complex linear
coefficients c, complex nonlinear coefficients alpha
and complex input arguments y, x1, non-zero h .
Fit-order deg  must be a real positive integer.

Returns linear coefficients c, nonlinear coefficients
alpha and root mean square error rms. This method is
known to be more stable than 'brute-force' non-linear
least squares fitting.

Example
   x0 = 0; step = 0.05; xend = 5; x = x0:step:xend;
   y = 2*exp(1.3*x)-0.5*exp(2*x);
   error = (rand(1,length(y))-0.5)*1e-4;
   [alpha,c,rms] = expfit(2,x0,step,y+error)

 alpha =
   2.0000
   1.3000
 c =
   -0.50000
    2.00000
 rms = 0.00028461

The fit is very sensitive to the number of data points.
It doesn't perform very well for small data sets.
Theoretically, you need at least 2*deg data points, but
if there are errors on the data, you certainly need more.

Be aware that this is a very (very,very) ill-posed problem.
By the way, this algorithm relies heavily on computing the
roots of a polynomial. I used 'roots.m', if there is
something better please use that code.

Demo for a complex fit-function:
deg= 2; N= 20; x1= -(1+i), x= linspace(x1,1+i/2,N).';
h = x(2) - x(1)
y= (2+i)*exp( (-1-2i)*x ) + (-1+3i)*exp( (2+3i)*x );
A= 5e-2; y+= A*(randn(N,1)+randn(N,1)*i); % add complex noise
[alpha,c,rms]= expfit( deg, x1, h, y )


5 Functions for algebraic tasks common to optimization problems


2.11 Confidence and prediction intervals for polynomial fitting

Helptext:

[y,dy] = polyconf(p,x,s)

  Produce prediction intervals for the fitted y. The vector p 
  and structure s are returned from polyfit or wpolyfit. The 
  x values are where you want to compute the prediction interval.

polyconf(...,['ci'|'pi'])

  Produce a confidence interval (range of likely values for the
  mean at x) or a prediction interval (range of likely values 
  seen when measuring at x).  The prediction interval tells
  you the width of the distribution at x.  This should be the same
  regardless of the number of measurements you have for the value
  at x.  The confidence interval tells you how well you know the
  mean at x.  It should get smaller as you increase the number of
  measurements.  Error bars in the physical sciences usually show 
  a 1-alpha confidence value of erfc(1/sqrt(2)), representing
  one standard deviation of uncertainty in the mean.

polyconf(...,1-alpha)

  Control the width of the interval. If asking for the prediction
  interval 'pi', the default is .05 for the 95% prediction interval.
  If asking for the confidence interval 'ci', the default is
  erfc(1/sqrt(2)) for a one standard deviation confidence interval.

Example:
 [p,s] = polyfit(x,y,1);
 xf = linspace(x(1),x(end),150);
 [yf,dyf] = polyconf(p,xf,s,'ci');
 plot(xf,yf,'g-;fit;',xf,yf+dyf,'g.;;',xf,yf-dyf,'g.;;',x,y,'xr;data;');
 plot(x,y-polyval(p,x),';residuals;',xf,dyf,'g-;;',xf,-dyf,'g-;;');


1.16 Linesearch, Brent method

— Function File: [s,v,n] = brent_line_min (f, df, args, ctl)

Line minimization of f along df

Finds minimum of f on line x0 + dx*w | a < w < b by bracketing. a and b are passed through argument ctl.

Arguments

  • f : string : Name of function. Must return a real value
  • args : cell : Arguments passed to f or RxC : f's only argument. x0 must be at args{ ctl(2) }
  • ctl : 5 : (optional) Control variables, described below.

Returned values

  • s : 1 : Minimum is at x0 + s*dx
  • v : 1 : Value of f at x0 + s*dx
  • nev : 1 : Number of function evaluations

Control Variables

  • ctl(1) : Upper bound for error on s Default=sqrt(eps)
  • ctl(2) : Position of minimized argument in args Default= 1
  • ctl(3) : Maximum number of function evaluations Default= inf
  • ctl(4) : a Default=-inf
  • ctl(5) : b Default= inf

Default values will be used if ctl is not passed or if nan values are given.
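
A minimal sketch (assuming, per the description above, that the second argument supplies the direction dx):

     f = @(x) sumsq (x - [1; 2]);
     x0 = [0; 0];
     dx = [1; 2];
     [s, v, nev] = brent_line_min (f, dx, {x0});
     % minimum along the line at s = 1, i.e. x0 + s*dx = [1; 2]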


1.4 Simulated annealing backend siman of scalar optimization

A simulated annealing (stochastic) optimizer, changing all parameters at once in a single step, so being suitable for non-bound constraints.

No gradient or hessian of the objective function is used. The settings MaxIter, fract_prec, TolFun, TolX, and max_fract_change are not honoured.

Accepts the following additional settings:
T_init
Initial temperature (default 0.01).
T_min
Final temperature (default 1.0e-5).
mu_T
Factor of temperature decrease (default 1.005).
iters_fixed_T
Iterations within one temperature step (default 10).
max_rand_step
Column vector or structure-based configuration of maximum random steps for each parameter (default 0.005 * pin).
stoch_regain_constr
If true, regain constraints after a random step, otherwise take new random values until constraints are met (default false).
trace_steps
If true, set field trace of outp to a matrix with a row for each step: first column iteration number, second column repeat number within the iteration, third column value of the objective function, remaining columns parameter values (default false).
siman_log
If true, set field log of outp to a matrix with a row for each iteration: first column temperature, second column value of the objective function, remaining columns the numbers of tries with decrease, without decrease but accepted, and without decrease and rejected (default false).

Steps with increase diff of objective function are accepted if rand (1) < exp (- diff / T), where T is the temperature of the current iteration.

If regaining of constraints failed, optimization will be aborted and returned value of cvg will be 0. Otherwise, cvg will be 1. Returned structure outp, additionally to the possible fields trace and log described above, will have the fields niter and user_interaction.

Interpretation of Display: if set to "iter", an informational line is printed after each iteration.

If parallel_local is set to true, the objective function is evaluated for several parameter combinations in parallel. The course of optimization won't be changed by parallelization, provided the random number generator starts with the same state. To achieve this, some of the parallel results are discarded, causing the speedup to be smaller if the rate of acceptance of results is high. Also, due to overhead, there won't be any speedup, and there may even be a slowdown, if the objective function is not computationally expensive enough.

Honours options save_state and recover_state, described for the frontend.
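
A sketch of selecting this backend with some of the settings above (objective objf_fun and values illustrative):

     settings = optimset ("Algorithm", "siman", "iters_fixed_T", 20, ...
                          "max_rand_step", 0.1 * ones (2, 1), ...
                          "trace_steps", true);
     [p, objf, cvg, outp] = nonlin_min (objf_fun, [0; 0], settings);
     % outp.trace contains one row per step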

3.1 A vectorized version of fzero

— Function File: vfzero (fun, x0)
— Function File: vfzero (fun, x0, options)
— Function File: [x, fval, info, output] = vfzero (...)

A variant of fzero. Finds a zero of a vector-valued multivariate function where each output element only depends on the input element with the same index (so the Jacobian is diagonal).

fun should be a handle or name of a function returning a column vector. x0 should be a two-column matrix, each row specifying two points which bracket a zero of the respective output element of fun.

If x0 is a single-column matrix then several nearby and distant values are probed in an attempt to obtain a valid bracketing. If this is not successful, the function fails. options is a structure specifying additional options. Currently, vfzero recognizes these options: "FunValCheck", "OutputFcn", "TolX", "MaxIter", "MaxFunEvals". For a description of these options, see optimset.

On exit, the function returns x, the approximate zero and fval, the function value thereof. info is a column vector of exit flags that can have these values:

  • 1 The algorithm converged to a solution.
  • 0 Maximum number of iterations or function evaluations has been reached.
  • -1 The algorithm has been terminated from user output function.
  • -5 The algorithm may have converged to a singular point.

output is a structure containing runtime information about the fzero algorithm. Fields in the structure are:

  • iterations Number of iterations through loop.
  • nfev Number of function evaluations.
  • bracketx A two-column matrix with the final bracketing of the zero along the x-axis.
  • brackety A two-column matrix with the final bracketing of the zero along the y-axis.


1.1 Frontend nonlin_min for scalar non-linear optimization

— Function File: [p, objf, cvg, outp] = nonlin_min (f, pin)
— Function File: [p, objf, cvg, outp] = nonlin_min (f, pin, settings)

Frontend for nonlinear minimization of a scalar objective function.

The functions supplied by the user have a minimal interface; any additionally needed constants can be supplied by wrapping the user functions into anonymous functions.

The following description applies to usage with vector-based parameter handling. Differences in usage for structure-based parameter handling will be explained separately.

f: objective function. It gets a column vector of real parameters as argument. In gradient determination, this function may be called with an informational second argument, whose content depends on the function for gradient determination.

pin: real column vector of initial parameters.

settings: structure whose fields stand for optional settings referred to below. The fields can be set by optimset().

The returned values are the column vector of final parameters p, the final value of the objective function objf, an integer cvg indicating if and how optimization succeeded or failed, and a structure outp with additional information, currently with possible fields: niter, the number of iterations, nobjf, the number of objective function calls (indirect calls by gradient function not counted), lambda, the lambda of constraints at the result, and user_interaction, information on user stops (see settings). The backend may define additional fields. cvg is greater than zero for success and less than or equal to zero for failure; its possible values depend on the used backend and currently can be 0 (maximum number of iterations exceeded), 1 (success without further specification of criteria), 2 (parameter change less than specified precision in two consecutive iterations), 3 (improvement in objective function less than specified), -1 (algorithm aborted by a user function), or -4 (algorithm got stuck).
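
A minimal usage sketch with the default settings (the objective function and values are illustrative, not from the package):

  ## default backend is "lm_feasible"; no constraints are used here
  f = @(p) (p(1) - 2) ^ 2 + (p(2) + 1) ^ 2;
  pin = [0; 0];
  [p, objf, cvg, outp] = nonlin_min (f, pin);
  ## expect p close to [2; -1], objf close to 0, and cvg > 0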

Settings

The fields of the settings structure can be set with optimset.

For settings common to all frontends (including those for statistics) see Common frontend options.

For additional settings common to all optimization frontends see Common optimization options.

Specific defaults:

Algorithm: "lm_feasible"

Additional settings:
objf_grad
Function computing the gradient of the objective function with respect to the parameters. Default: real finite differences. Will be called with the column vector of parameters and an informational structure as arguments. If dfdp was specified by the user, the informational structure has the fields f: value of objective function for current parameters, fixed: logical vector indicating which parameters are not optimized, so these partial derivatives need not be computed and can be set to zero, diffp, diff_onesided, lbound, ubound: identical to the user settings of this name, and plabels: 1-dimensional cell-array of column-cell-arrays, each column with labels for all parameters; the first column contains the numerical indices of the parameters; the second and third columns, present for structure-based parameter handling (see Parameter structures), contain the names of the parameters and the subindices of the parameters (see Non-scalar parameters), respectively. The default gradient function will call the objective function with the second argument set with fields f: as the f passed to the gradient function, plabels: cell-array of 1x1 cell-arrays with the entries of the column-cell-arrays of plabels as passed to the jacobian function corresponding to the current parameter, side: 0 for a one-sided interval, 1 or 2, respectively, for the sides of a two-sided interval, and parallel: logical scalar indicating parallel computation of partial derivatives. This information can be useful if the model function can omit some computations depending on the currently computed partial derivative. (A sketch of supplying analytic derivatives follows this list.)

objf_hessian
Function computing the Hessian of the objective function with respect to the parameters. The default is backend specific. Will be called with the column vector of parameters as argument.
inverse_hessian
Logical scalar, indicating whether the Hessian function passed by the user actually returns the inverse of the Hessian.
complex_step_derivative_objf
Logical scalar, default: false. Estimate the gradient of the objective function with a complex step derivative approximation. Use only if you know that your objective function is suitable for this. A user function for the gradient (objf_grad) must not be specified at the same time.
save_state
String with path to a file which will be created for periodical saving of the state of optimization. Useful for very long optimizations which might get interrupted. The format of the saved state will be backend-specific. Currently, only the "siman" backend honours this option. Default: empty string, meaning no saving of state.
recover_state
String with path to a file created due to option save_state, which is used to recover a saved state before starting optimization. Default: empty string, meaning no recovering of state.
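
For illustration, a hedged sketch of supplying analytic derivatives through the settings above (the objective, its derivatives, and the orientation of the returned gradient are assumptions, not package specifications):

  ## quadratic objective with known gradient and constant Hessian
  f = @(p) 0.5 * (p' * p) - [1; 2]' * p;
  settings = optimset ("objf_grad", @(p, info) (p - [1; 2])', ...
                       "objf_hessian", @(p) eye (2));
  [p, objf, cvg] = nonlin_min (f, [0; 0], settings);
  ## expect p close to [1; 2]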

Structure based parameter handling

Please see Parameter structures.

Backend information

Please see Scalar optimization and choose a backend from the menu.

2.2 Function nonlin_curvefit() for curve fitting

In curve fitting, the model function computes values from a constant set of `independents', and the intention is to minimize the differences between these computed values and a constant set of `observations'. This can be done with nonlin_residmin, but it is more convenient to use nonlin_curvefit, which takes care of passing the constant `independents' to the model function and of calculating the differences to the constant `observations'.

However, if in some optimization problem you notice that you end up passing dummy values for the `independents' and zeros for the `observations', it is more natural to use nonlin_residmin instead of nonlin_curvefit.

— Function File: [p, fy, cvg, outp] = nonlin_curvefit (f, pin, x, y)
— Function File: [p, fy, cvg, outp] = nonlin_curvefit (f, pin, x, y, settings)

Frontend for nonlinear fitting of values, computed by a model function, to observed values.

Please refer to the description of nonlin_residmin. The differences to nonlin_residmin are: the additional arguments x (independent values; mostly, but not necessarily, an array of the same dimensions or the same number of rows as y) and y (array of observations); the returned value fy (final guess for the observed values) instead of resid; that the model function has a second obligatory argument, which will be set to x, and is supposed to return guesses for the observations (with the same dimensions); and that the possibly user-supplied function for the jacobian of the model function also has a second obligatory argument, which will be set to x.

See also: nonlin_residmin.

Also, if the setting user_interaction is given, additional information is passed to these functions, see Common optimization options.
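
A hedged usage sketch (model and data are illustrative, not from the package demos): fitting a two-parameter exponential model to synthetic observations.

  ## model function: the second obligatory argument is the independents x
  f = @(p, x) p(1) * exp (p(2) * x);
  x = (1:5)';
  y = 2 * exp (-0.5 * x);           # synthetic observations
  pin = [1; -1];                    # initial guess
  [p, fy, cvg] = nonlin_curvefit (f, pin, x, y);
  ## expect p close to [2; -0.5] and fy close to y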



3 Functions for finding the zero of a nonlinear user function

There is only one dedicated zero finder in the optim package, which is just a vectorized version of Octave's fzero (see fzero).




5.1 A complementary pivoting algorithm

Helptext:

[lb, idx, ridx, mv] = cpiv_bard (v, m[, incl])

v: column vector; m: matrix; incl (optional): index. length (v)
must equal rows (m). Finds column vectors w and l with w == v + m *
l, w >= 0, l >= 0, l.' * w == 0. Chooses idx, w, and l so that
l(~idx) == 0, l(idx) == -inv (m(idx, idx)) * v(idx), w(idx) roughly
== 0, and w(~idx) == v(~idx) + m(idx, ~idx).' * l(idx). idx indexes
at least everything indexed by incl, but l(incl) may be < 0. lb:
l(idx) (column vector); idx: logical index, defined above; ridx:
~idx & w roughly == 0; mv: [m, v] after performing a Gauss-Jordan
'sweep' (with gjp.m) on each diagonal element indexed by idx.
Except the handling of incl (which enables handling of equality
constraints in the calling code), this is called solving the
'complementary pivot problem' (Cottle, R. W. and Dantzig, G. B.,
'Complementary pivot theory of mathematical programming', Linear
Algebra and Appl. 1, 102--125). References for the current
algorithm: Bard, Y.: Nonlinear Parameter Estimation, p. 147--149,
Academic Press, New York and London 1974; Bard, Y., 'An eclectic
approach to nonlinear programming', Proc. ANU Sem. Optimization,
Canberra, Austral. Nat. Univ.
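
An illustrative sketch of a call (the numbers are assumptions chosen to satisfy the definitions in the helptext above, not from the package tests):

  ## with m = eye (2) and v = [-1; 2], the complementary solution is
  ## l = [1; 0], w = [0; 2]: w == v + m * l, w >= 0, l >= 0, l' * w == 0
  v = [-1; 2];
  m = eye (2);
  [lb, idx, ridx, mv] = cpiv_bard (v, m);
  ## idx should flag the first element; lb should hold l(idx), i.e. 1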




1.3 Backend wrapping core Octave's sqp function

This backend calls the sqp function of core Octave, a sequential quadratic programming algorithm with BFGS, so that it is usable with the nonlin_min frontend of the optim package. sqp honours constraints for the returned result, but not necessarily during the course of optimization.

Compared to calling sqp directly (see sqp),

  • a different default (numerical) gradient function is used (that of the frontend),
  • bounds are set to +-Inf by default, not to +-realmax.

The value of the additional setting octave_sqp_tolerance, a tolerance applied to several termination criteria, is passed as the respective argument to Octave's sqp, which provides the default sqrt (eps). The settings TolFun, TolX, fract_prec, max_fract_change, Display, and user_interaction are not honoured.

Returned value cvg will be 1 for success and 0 or -4 for failure (see nonlin_min for meaning). Returned structure outp will have the fields niter, nobjf, and lambda.

The default of MaxIter is that of Octave's sqp (100).
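
A minimal usage sketch, selecting this backend through the nonlin_min frontend (the objective function and tolerance value are illustrative):

  f = @(p) (p(1) - 1) ^ 2 + (p(2) + 2) ^ 2;
  settings = optimset ("Algorithm", "octave_sqp", ...
                       "octave_sqp_tolerance", 1e-8);
  [p, objf, cvg] = nonlin_min (f, [0; 0], settings);
  ## expect p close to [1; -2] and cvg == 1 on success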



4 Functions for numerical approximation of gradients and Hessians

You should not usually need to use these functions directly or pass them as arguments. They should be chosen and used by optimizer functions, possibly subject to their configuration options.




4.4 Numerical Hessian function

Helptext:

numhessian(f, {args}, minarg)

Numeric second derivative of f with respect
to argument \"minarg\".
* first argument: function name (string)
* second argument: all arguments of the function (cell array)
* third argument: (optional) the argument to differentiate w.r.t.
	(scalar, default=1)

If the argument
is a k-vector, the Hessian will be a kxk matrix

function a = f(x, y)
	a = x'*x + log(y);
endfunction

numhessian(\"f\", {ones(2,1), 1})
ans =

    2.0000e+00   -7.4507e-09
   -7.4507e-09    2.0000e+00

Now, w.r.t. second argument:
numhessian(\"f\", {ones(2,1), 1}, 2)
ans = -1.0000


optim-1.4.1/doc/html/line_005fmin.html0000644000175000017500000000637612477524752016110 0ustar olafolaf line_min - optim_doc



1.17 Linesearch, minimize a function along dx

Helptext:

[a,fx,nev] = line_min (f, dx, args, narg, h, nev_max) - Minimize f() along dx

INPUT ----------
f    : string  : Name of minimized function
dx   : matrix  : Direction along which f() is minimized
args : cell    : Arguments of f
narg : integer : Position of minimized variable in args.  Default=1
h    : scalar  : Step size to use for centered finite difference
approximation of first and second derivatives. Default=1E-3.
nev_max : integer : Maximum number of function evaluations.  Default=30

OUTPUT ---------
a    : scalar  : Value for which f(x+a*dx) is a minimum (*)
fx   : scalar  : Value of f(x+a*dx) at minimum (*)
nev  : integer : Number of function evaluations

(*) The notation f(x+a*dx) assumes that args == {x}.

Reference: David G Luenberger's Linear and Nonlinear Programming
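
A hedged usage sketch following the conventions above (the function and values are illustrative; args == {x} as noted):

  ## minimize g(x) = sum (x .^ 2) along direction dx, starting at x
  function v = g (x)
    v = sum (x .^ 2);
  endfunction
  x  = [1; 1];
  dx = -[1; 1];                   # descent direction
  [a, fx, nev] = line_min ("g", dx, {x});
  ## expect a close to 1 and fx close to 0 (minimum at x + a*dx == 0)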

optim-1.4.1/doc/optim-mini-howto-2.lyx0000644000175000017500000003734512477523726016206 0ustar olafolaf#LyX 2.0 created this file. For more info see http://www.lyx.org/ \lyxformat 413 \begin_document \begin_header \textclass article \begin_preamble \usepackage{dsfont} % No date please \date{} \fontfamily{cmss} \selectfont \end_preamble \use_default_options false \maintain_unincluded_children false \language english \language_package default \inputencoding auto \fontencoding global \font_roman default \font_sans helvet \font_typewriter default \font_default_family default \use_non_tex_fonts false \font_sc false \font_osf false \font_sf_scale 100 \font_tt_scale 100 \graphics default \default_output_format default \output_sync 0 \bibtex_command default \index_command default \paperfontsize default \spacing single \use_hyperref false \papersize default \use_geometry false \use_amsmath 1 \use_esint 0 \use_mhchem 0 \use_mathdots 1 \cite_engine basic \use_bibtopic false \use_indices false \paperorientation portrait \suppress_date false \use_refstyle 0 \index Index \shortcut idx \color #008000 \end_index \secnumdepth 3 \tocdepth 3 \paragraph_separation indent \paragraph_indentation default \quotes_language english \papercolumns 1 \papersides 1 \paperpagestyle default \tracking_changes false \output_changes false \html_math_output 0 \html_css_as_file 0 \html_be_strict false \end_header \begin_body \begin_layout Standard \begin_inset ERT status collapsed \begin_layout Plain Layout % This is for \backslash LyX \end_layout \end_inset \end_layout \begin_layout Standard \begin_inset FormulaMacro \newcommand{\bfrac}[2]{\frac{#1}{#2}} \end_inset \end_layout \begin_layout Standard \begin_inset FormulaMacro \newcommand{\nth}{^{\textrm{th}}} \end_inset \end_layout \begin_layout Standard \begin_inset FormulaMacro \newcommand{\R}{R} \end_inset \end_layout \begin_layout Standard \begin_inset FormulaMacro \renewcommand{\N}{N} \end_inset \end_layout \begin_layout Standard \begin_inset FormulaMacro \renewcommand{\Z}{Z} \end_inset \end_layout \begin_layout Standard \begin_inset FormulaMacro \renewcommand{\tra}{^{T}} \end_inset \end_layout \begin_layout Standard \begin_inset FormulaMacro \renewcommand{\xx}{\mathbf{x}} \end_inset \end_layout \begin_layout Standard \begin_inset ERT status collapsed \begin_layout Plain Layout % This is for \backslash LaTeX \end_layout \begin_layout Plain Layout \end_layout \begin_layout Plain Layout \backslash fontfamily{cmss} \backslash selectfont \end_layout \begin_layout Plain Layout \end_layout \begin_layout Plain Layout \backslash renewcommand{ \backslash R}{{ \backslash mathds{R}}} \end_layout \begin_layout Plain Layout \end_layout \begin_layout Plain Layout \backslash renewcommand{ \backslash N}{{ \backslash mathds{N}}} \end_layout \begin_layout Plain Layout \end_layout \begin_layout Plain Layout \backslash renewcommand{ \backslash Z}{{ \backslash mathds{Z}}} \end_layout \begin_layout Plain Layout \end_layout \begin_layout Plain Layout \backslash renewcommand{ \backslash tra}{^{ \backslash top}} \end_layout \begin_layout Plain Layout \end_layout \begin_layout Plain Layout \backslash renewcommand{ \backslash bfrac}[2]{ \backslash frac{{ \backslash textstyle #1 }}{{ \backslash textstyle #2 }}} \end_layout \end_inset \end_layout \begin_layout Title Mini-HOWTO on using Octave for Unconstrained Nonlinear Optimization \begin_inset Foot status collapsed \begin_layout Plain Layout Author : Etienne Grossmann \family typewriter \family default (soon replaced by \begin_inset Quotes eld 
\end_inset Octave-Forge developers \begin_inset Quotes erd \end_inset ?). This document is free documentation; you can redistribute it and/or modify it under the terms of the GNU Free Documentation License as published by the Free Software Foundation. \begin_inset Newline newline \end_inset . \begin_inset space ~ \end_inset \begin_inset space ~ \end_inset \begin_inset space ~ \end_inset This is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. \end_layout \end_inset \end_layout \begin_layout Standard \begin_inset Note Comment status collapsed \begin_layout Plain Layout Keywords: nonlinear optimization, octave, tutorial, Nelder-Mead, Conjugate Gradient, Levenberg-Marquardt \end_layout \end_inset \end_layout \begin_layout Standard \emph on This document refers to the frontend function \begin_inset Quotes eld \end_inset minimize \begin_inset Quotes erd \end_inset , which is now deprecated and will be removed in a future version of the optim package. An alternative frontend function is \begin_inset Quotes eld \end_inset nonlin_min \begin_inset Quotes erd \end_inset , not described here. The backends of \begin_inset Quotes eld \end_inset minimize \begin_inset Quotes erd \end_inset ( \begin_inset Quotes eld \end_inset nelder_mead_min \begin_inset Quotes erd \end_inset , \begin_inset Quotes eld \end_inset bfgsmin \begin_inset Quotes erd \end_inset , \begin_inset Quotes eld \end_inset d2_min \begin_inset Quotes erd \end_inset ) can also be called directly. This document will be removed in the future. \end_layout \begin_layout Standard Nonlinear optimization problems are very common and when a solution cannot be found analytically, one usually tries to find it numerically. This document shows how to perform unconstrained nonlinear minimization using the Octave language for numerical computation. We assume to be so lucky as to have an initial guess from which to start an iterative method, and so impatient as to avoid as much as possible going into the details of the algorithm. In the following examples, we consider multivariable problems, but the single variable case is solved in exactly the same way. \end_layout \begin_layout Standard All the algorithms used below return numerical approximations of \emph on local minima \emph default of the optimized function. In the following examples, we minimize a function with a single minimum (Figure \begin_inset space ~ \end_inset \begin_inset CommandInset ref LatexCommand ref reference "fig:function" \end_inset ), which is relatively easily found. In practice, success of optimization algorithms greatly depend on the optimized function and on the starting point. 
\end_layout \begin_layout Section* \end_layout \begin_layout Standard \begin_inset ERT status collapsed \begin_layout Plain Layout \backslash fontfamily{cmss} \backslash selectfont A simple example \end_layout \end_inset \end_layout \begin_layout Standard \begin_inset Float figure wide false sideways false status open \begin_layout Plain Layout \align center \begin_inset ERT status collapsed \begin_layout Plain Layout \backslash raisebox{6mm}{ \end_layout \end_inset \begin_inset Graphics filename figures/2D_slice-3.eps2 width 30text% \end_inset \begin_inset ERT status collapsed \begin_layout Plain Layout } \end_layout \end_inset \begin_inset space ~ \end_inset \end_layout \begin_layout Plain Layout \begin_inset Caption \begin_layout Plain Layout \begin_inset CommandInset label LatexCommand label name "fig:function" \end_inset 2D slice of the function that is minimized throughout this tutorial. Although not obvious at first sight, it has a unique minimum. \end_layout \end_inset \end_layout \end_inset \end_layout \begin_layout Standard We will use a call of the type \end_layout \begin_layout LyX-Code [x_best, best_value, niter] = minimize (func, x_init) \end_layout \begin_layout Standard to find the minimum of \begin_inset Formula \[ \begin{array}{cccc} f\,: & \left(x_{1},.x_{2},x_{3}\right)\in\R^{3} & \longrightarrow & \left(x_{1}-1\right)^{2}/9+\left(x_{3}-1\right)^{2}/9+\left(x_{3}-1\right)^{2}/9\\ & & & -\cos\left(x_{1}-1\right)-\cos\left(x_{2}-1\right)-\cos\left(x_{3}-1\right). \end{array} \] \end_inset \end_layout \begin_layout Standard The following commands should find a local minimum of \begin_inset Formula \ensuremath{f()} \end_inset , using the Nelder-Mead (aka \begin_inset Quotes eld \end_inset downhill simplex \begin_inset Quotes erd \end_inset ) algorithm and starting from a randomly chosen point \family typewriter x0 \family default \begin_inset space ~ \end_inset : \end_layout \begin_layout LyX-Code function cost = foo (xx) \end_layout \begin_layout LyX-Code xx--; \end_layout \begin_layout LyX-Code cost = sum (-cos(xx)+xx.^2/9); \end_layout \begin_layout LyX-Code endfunction \end_layout \begin_layout LyX-Code x0 = [-1, 3, -2]; \end_layout \begin_layout LyX-Code [x,v,n] = minimize ("foo", x0) \end_layout \begin_layout Standard The output should look like \begin_inset space ~ \end_inset : \end_layout \begin_layout LyX-Code x = \end_layout \begin_layout LyX-Code 1.00000 1.00000 1.00000 \end_layout \begin_layout LyX-Code \end_layout \begin_layout LyX-Code v = -3.0000 \end_layout \begin_layout LyX-Code n = 248 \end_layout \begin_layout Standard This means that a minimum has been found in \begin_inset Formula \ensuremath{\left(1,1,1\right)} \end_inset and that the value at that point is \begin_inset Formula \ensuremath{-3} \end_inset . This is correct, since all the points of the form \begin_inset Formula \ensuremath{x_{1}=1+2i\pi,\, x_{2}=1+2j\pi,\, x_{3}=1+2k\pi} \end_inset , for some \begin_inset Formula \ensuremath{i,j,k\in\N} \end_inset , minimize \begin_inset Formula \ensuremath{f()} \end_inset . The number of function evaluations, 248, is also returned. Note that this number depends on the starting point. You will most likely obtain different numbers if you change \family typewriter x0 \family default . \end_layout \begin_layout Standard The Nelder-Mead algorithm is quite robust, but unfortunately it is not very efficient. For high-dimensional problems, its execution time may become prohibitive. 
\end_layout \begin_layout Section* \end_layout \begin_layout Standard \begin_inset ERT status collapsed \begin_layout Plain Layout \backslash fontfamily{cmss} \backslash selectfont Using the first differential \end_layout \end_inset \end_layout \begin_layout Standard Fortunately, when a function, like \begin_inset Formula \ensuremath{f()} \end_inset above, is differentiable, more efficient optimization algorithms can be used. If \family typewriter minimize() \family default is given the differential of the optimized function, using the \family typewriter "df" \family default option, it will use a conjugate gradient method. \end_layout \begin_layout LyX-Code ## Function returning partial derivatives \end_layout \begin_layout LyX-Code function dc = diffoo (x) \end_layout \begin_layout LyX-Code x = x(:)' - 1; \end_layout \begin_layout LyX-Code dc = sin (x) + 2*x/9; \end_layout \begin_layout LyX-Code endfunction \end_layout \begin_layout LyX-Code [x, v, n] = minimize ("foo", x0, "df", "diffoo") \end_layout \begin_layout Standard This produces the output \begin_inset space ~ \end_inset : \end_layout \begin_layout LyX-Code x = \end_layout \begin_layout LyX-Code 1.00000 1.00000 1.00000 \end_layout \begin_layout LyX-Code v = -3 \end_layout \begin_layout LyX-Code n = \end_layout \begin_layout LyX-Code 108 6 \end_layout \begin_layout Standard The same minimum has been found, but only 108 function evaluations were needed, together with 6 evaluations of the differential. Here, \family typewriter diffoo() \family default takes the same argument as \family typewriter foo() \family default and returns the partial derivatives of \begin_inset Formula \ensuremath{f()} \end_inset with respect to the corresponding variables. It doesn't matter if it returns a row or column vector or a matrix, as long as the \begin_inset Formula \ensuremath{i\nth} \end_inset element of \family typewriter diffoo(x) \family default is the partial derivative of \begin_inset Formula \ensuremath{f()} \end_inset with respect to \begin_inset Formula \ensuremath{x_{i}} \end_inset . \end_layout \begin_layout Section* \end_layout \begin_layout Standard \begin_inset ERT status collapsed \begin_layout Plain Layout \backslash fontfamily{cmss} \backslash selectfont Using numerical approximations of the first differential \end_layout \end_inset \end_layout \begin_layout Standard Sometimes, the minimized function is differentiable, but actually writing down its differential is more work than one would like. Numerical differentiation offers a solution which is less efficient in terms of computation cost, but easy to implement. The \family typewriter "ndiff" \family default option of \family typewriter minimize() \family default uses numerical differentiation to execute exactly the same algorithm as in the previous example. 
However, because numerical approximation of the differentia is used, the outpud may differ slightly \begin_inset space ~ \end_inset : \end_layout \begin_layout LyX-Code [x, v, n] = minimize ("foo", x0, "ndiff") \end_layout \begin_layout Standard wich yields \begin_inset space ~ \end_inset : \end_layout \begin_layout LyX-Code x = \end_layout \begin_layout LyX-Code 1.00000 1.00000 1.00000 \end_layout \begin_layout LyX-Code v = -3 \end_layout \begin_layout LyX-Code n = \end_layout \begin_layout LyX-Code 78 6 \end_layout \begin_layout Standard Note that each time the differential is numerically approximated, \family typewriter foo() \family default is called 6 times (twice per input element), so that \family typewriter foo() \family default is evaluated a total of (78+6*6=) 114 times in this example. \end_layout \begin_layout Section* \end_layout \begin_layout Standard \begin_inset ERT status collapsed \begin_layout Plain Layout \backslash fontfamily{cmss} \backslash selectfont Using the first and second differentials \end_layout \end_inset \end_layout \begin_layout Standard When the function is twice differentiable and one knows how to compute its first and second differentials, still more efficient algorithms can be used (in our case, a variant of Levenberg-Marquardt). The option \family typewriter "d2f" \family default allows to specify a function that returns the value of the function, the first and second differentials of the minimized function. Entering the commands \begin_inset space ~ \end_inset : \end_layout \begin_layout LyX-Code function [c, dc, d2c] = d2foo (x) \end_layout \begin_layout LyX-Code c = foo(x); \end_layout \begin_layout LyX-Code dc = diffoo(x); \end_layout \begin_layout LyX-Code d2c = diag (cos (x(:)-1) + 2/9); \end_layout \begin_layout LyX-Code end \end_layout \begin_layout LyX-Code [x,v,n] = minimize ("foo", x0, "d2f", "d2foo") \end_layout \begin_layout Standard produces the output \begin_inset space ~ \end_inset : \end_layout \begin_layout LyX-Code x = \end_layout \begin_layout LyX-Code 1.0000 1.0000 1.0000 \end_layout \begin_layout LyX-Code v = -3 \end_layout \begin_layout LyX-Code n = \end_layout \begin_layout LyX-Code 34 5 \end_layout \begin_layout Standard This time, 34 function evaluations, and 5 evaluations of \family typewriter d2foo() \family default were needed. \end_layout \begin_layout Section* \end_layout \begin_layout Standard \begin_inset ERT status collapsed \begin_layout Plain Layout \backslash fontfamily{cmss} \backslash selectfont Summary \end_layout \end_inset \end_layout \begin_layout Standard We have just seen the most basic ways of solving nonlinear unconstrained optimization problems. The online help system of Octave (try e.g. \begin_inset Quotes eld \end_inset \family typewriter help minimize \family default \begin_inset Quotes erd \end_inset ) will yield information on other issues, such as \emph on passing extra arguments \emph default to the minimized function, \emph on controling the termination \emph default of the optimization process, choosing the algorithm etc. \end_layout \begin_layout LyX-Code \end_layout \end_body \end_document optim-1.4.1/doc/optim.texi0000644000175000017500000036634512477524021014106 0ustar olafolaf@c This file is generated automatically by the packages munge-texi.pl. \input texinfo @c %**start of header @setfilename optim.info @settitle optim_doc @c %**end of header @copying Additional documentation for the optim package for Octave. 
Copyright @copyright{} @email{Olaf Till } You can redistribute this documentation and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version. This documentation is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this documentation; if not, see . @end copying @include macros.texi @macro mysee @ifhtml see @end ifhtml @end macro @titlepage @title Additional documentation for the optim package for Octave @page @vskip 0pt plus 1filll @insertcopying @end titlepage @c No table of contents. The table would occupy most of the top node in @c html and IMHO misleads the user to use the table instead of the menu @c structure of the nodes, which would let some information unused. @c @c @contents @c ------------------------------------------------------------------ @node Top @top Additional documentation for the optim package for Octave @ifhtml @strong{Note: Some links in this document lead to Octaves core documentation. These links probably don't work in the html version of this document. They should work, however, in the info version accessible from the Octave commandline with @code{optim_doc()}.} @end ifhtml The optim package is a collection of additional functions related to numerical optimization. For Octaves core optimization functions (not contained in this package) @mysee @ref{Optimization,,,octave}. @menu Types of functions in the optim package * Scalar optimization:: Functions for optimization of a scalar objective function. * Residual optimization:: Functions for optimization of a model function returning an array. * Zero finders:: Functions for finding the zero of a scalar or array valued nonlinear user function. * Gradient functions:: Functions for numerical approximation of gradients and Hessians. * Helper functions:: Functions for algebraic tasks common to optimization problems. * Documentation:: Function optim_doc to view documentation. Configuration * Common frontend options:: Options common to all frontends. * Common optimization options:: Options common to all optimization frontends. * Parameter structures:: Handling of structures of optimized parameters. * Additional parameters:: Passing additional parameters to user functions. Indices * Function index:: Index of functions in optim. * Concept index:: Concept index. @end menu @c ------------------------------------------------------------------ @node Scalar optimization @chapter Functions for optimization of a scalar objective function @cindex scalar optimization @menu Frontend * nonlin_min:: Interface for scalar non-linear optimization. Backends * lm_feasible:: L/M-like optimizer, constraints met throughout optimization. * octave_sqp:: A wrapper to core Octaves sqp function. * siman:: Simulated annealing with constraints. * d2_min:: Newton-like optimizer, no constraints. Alternative frontend * fmins:: Frontend for direct search (gradient-less) algorithms. Standalone functions * nmsmax:: A Nelder-Mead simplex algorithm. * mdsmax:: A multidirectional search algorithm. * adsmax:: An alternating directions algorithm. * nelder_mead_min:: Another Nelder-Mead algorithm. * powell:: Direction-set (Powell's) method. 
* standalone d2_min:: Unconstrained Newton-like minimization, standalone function. * bfgsmin:: Unconstrained BFGS algorithm. * nrm:: Newton-Raphson algorithm. * cg_min:: A conjugate gradient method. * brent_line_min:: Linesearch, Brent method. * line_min:: Linesearch (minimize a function along dx). * samin:: A simulated annealing algorithm. * de_min:: A differential evolution algorithm. * battery:: Repeatedly call bfgsmin. @end menu @c ------------------------------------------------------------------ @node nonlin_min @section Frontend nonlin_min for scalar non-linear optimization @findex nonlin_min @c nonlin_min ../inst/nonlin_min.m @anchor{XREFnonlin_min} @deftypefn {Function File} {[@var{p}, @var{objf}, @var{cvg}, @var{outp}] =} nonlin_min (@var{f}, @var{pin}) @deftypefnx {Function File} {[@var{p}, @var{objf}, @var{cvg}, @var{outp}] =} nonlin_min (@var{f}, @var{pin}, @var{settings}) Frontend for nonlinear minimization of a scalar objective function. The functions supplied by the user have a minimal interface; any additionally needed constants can be supplied by wrapping the user functions into anonymous functions. The following description applies to usage with vector-based parameter handling. Differences in usage for structure-based parameter handling will be explained separately. @var{f}: objective function. It gets a column vector of real parameters as argument. In gradient determination, this function may be called with an informational second argument, whose content depends on the function for gradient determination. @var{pin}: real column vector of initial parameters. @var{settings}: structure whose fields stand for optional settings referred to below. The fields can be set by @code{optimset()}. The returned values are the column vector of final parameters @var{p}, the final value of the objective function @var{objf}, an integer @var{cvg} indicating if and how optimization succeeded or failed, and a structure @var{outp} with additional information, curently with possible fields: @code{niter}, the number of iterations, @code{nobjf}, the number of objective function calls (indirect calls by gradient function not counted), @code{lambda}, the lambda of constraints at the result, and @code{user_interaction}, information on user stops (see settings). The backend may define additional fields. @var{cvg} is greater than zero for success and less than or equal to zero for failure; its possible values depend on the used backend and currently can be @code{0} (maximum number of iterations exceeded), @code{1} (success without further specification of criteria), @code{2} (parameter change less than specified precision in two consecutive iterations), @code{3} (improvement in objective function less than specified), @code{-1} (algorithm aborted by a user function), or @code{-4} (algorithm got stuck). @c The following block will be cut out in the package info file. @end deftypefn @subheading Settings The fields of the @var{settings} structure can be set with @ref{XREFoptimset,,optimset,octave}. For settings common to all frontends (including these for statistics) @mysee @ref{Common frontend options}. For additional settings common to all optimization frontends @mysee @ref{Common optimization options}. @subsubheading Specific defaults: @multitable {@code{Algorithm}} {"lm_feasible"} @item @code{Algorithm}: @tab "lm_feasible" @end multitable @subsubheading Additional settings: @table @code @item objf_grad Function computing the gradient of the objective function with respect to the parameters. 
Default: real finite differences. Will be called with the column vector of parameters and an informational structure as arguments. If @code{dfdp} was specified by the user, the informational structure has the fields @code{f}: value of objective function for current parameters, @code{fixed}: logical vector indicating which parameters are not optimized, so these partial derivatives need not be computed and can be set to zero, @code{diffp}, @code{diff_onesided}, @code{lbound}, @code{ubound}: identical to the user settings of this name, @code{plabels}: 1-dimensional cell-array of column-cell-arrays, each column with labels for all parameters; the first column contains the numerical indices of the parameters; the second and third columns, present for structure based parameter handling, @mysee @ref{Parameter structures}, contain the names of the parameters and the subindices of the parameters, @mysee @ref{Non-scalar parameters}, respectively. The default gradient function will call the objective function with the second argument set with fields @code{f}: as the @code{f} passed to the gradient function, @code{plabels}: cell-array of 1x1 cell-arrays with the entries of the column-cell-arrays of @code{plabels} as passed to the jacobian function corresponding to current parameter, @code{side}: @code{0} for one-sided interval, @code{1} or @code{2}, respectively, for the sides of a two-sided interval, and @code{parallel}: logical scalar indicating parallel computation of partial derivatives. This information can be useful if the model function can omit some computations depending on the currently computed partial derivative. @item objf_hessian Function computing the Hessian of the objective function with respect to the parameters. The default is backend specific. Will be called with the column vector of parameters as argument. @item inverse_hessian Logical scalar, indicating whether the Hessian function passed by the user actually returns the inverse of the Hessian. @item complex_step_derivative_objf Logical scalar, default: @code{false}. Estimate gradient of objective function with complex step derivative approximation. Use only if you know that your objective function is suitable for this. No user function for the gradient (@code{objf_grad}) must be specified. @item save_state String with path to a file which will be created for periodical saving of the state of optimization. Useful for very long optimizations which might get interrupted. The format of the saved state will be backend-specific. Currently, only the @qcode{"siman"} backend honours this option. Default: empty string, meaning no saving of state. @item recover_state String with path to a file created due to option @code{save_state}, which is used to recover a saved state before starting optimization. Default: empty string, meaning no recovering of state. @end table @subheading Structure based parameter handling Please @mysee @ref{Parameter structures}. @subheading Backend information Please @mysee @ref{Scalar optimization} and choose backend from menu. @c ------------------------------------------------------------------ @node lm_feasible @section Default backend lm_feasible of scalar optimization @cindex lm_feasible A Levenberg/Marquardt-like algorithm, attempting to honour constraints throughout the course of optimization. This means that the initial parameters must not violate constraints (to find an initial feasible set of parameters, e.g. 
core Octaves @code{sqp} can be used (@mysee @ref{octave_sqp}), by specifying an objective function which is constant or which returns a norm of the distances to the initial values). The Hessian is either supplied by the user or is approximated by the BFGS algorithm. Core Octaves @code{sqp} performed better in some tests with @emph{unconstrained} problems. Returned value @var{cvg} will be @code{2} or @code{3} for success and @code{0} or @code{-4} for failure (@mysee @ref{nonlin_min} for meaning). Returned structure @var{outp} will have the fields @code{niter}, @code{nobjf}, and @code{user_interaction}. Backend-specific defaults are: @code{MaxIter}: 20, @code{fract_prec}: @code{zeros (size (parameters))}, @code{max_fract_change}: @code{Inf} for all parameters. The setting @code{TolX} is not honoured. Interpretation of @code{Display}: if set to @qcode{"iter"}, currently only information on applying @code{max_fract_change} is printed. @c ------------------------------------------------------------------ @node octave_sqp @section Backend wrapping core Octaves sqp function @cindex octave_sqp This backend calls the @code{sqp} function of core Octave, a sequential quadratic programming algorithm with BFGS, so that it is usable with the @code{nonlin_min} frontend of the optim package. @code{sqp} honours constraints for the returned result, but not necessarily during the course of optimization. Compared to calling @code{sqp} directly (@mysee @ref{XREFsqp,,sqp,octave}), @itemize @item a different default (numerical) gradient function is used (that of the frontend), @item bounds are set to @code{+-Inf} by default, not to @code{+-realmax}. @end itemize The value of the additional setting @code{octave_sqp_tolerance}, a tolerance to several termination criteria, is passed as the respective argument to Octaves @code{sqp}, which provides the default @code{sqrt (eps)}. The settings @code{TolFun}, @code{TolX}, @code{fract_prec}, @code{max_fract_change}, @code{Display}, and @code{user_interaction} are not honoured. Returned value @var{cvg} will be @code{1} for success and @code{0} or @code{-4} for failure (@mysee @ref{nonlin_min} for meaning). Returned structure @var{outp} will have the fields @code{niter}, @code{nobjf}, and @code{lambda}. The default of @code{MaxIter} is that of Octaves @code{sqp} (100). @c ------------------------------------------------------------------ @node siman @section Simulated annealing backend siman of scalar optimization @cindex siman A simulated annealing (stochastic) optimizer, changing all parameters at once in a single step, so being suitable for non-bound constraints. No gradient or hessian of the objective function is used. The settings @code{MaxIter}, @code{fract_prec}, @code{TolFun}, @code{TolX}, and @code{max_fract_change} are not honoured. 
Accepts the additional settings @code{T_init} (initial temperature, default 0.01), @code{T_min} (final temperature, default 1.0e-5), @code{mu_T} (factor of temperature decrease, default 1.005), @code{iters_fixed_T} (iterations within one temperature step, default 10), @code{max_rand_step} (column vector or structure-based configuration of maximum random steps for each parameter, default 0.005 * @var{pin}), @code{stoch_regain_constr} (if @code{true}, regain constraints after a random step, otherwise take new random value until constraints are met, default @code{false}), @code{trace_steps} (set field @code{trace} of @var{outp} with a matrix with a row for each step, first column iteration number, second column repeat number within iteration, third column value of objective function, rest columns parameter values, default @code{false}), and @code{siman_log} (set field @code{log} of @var{outp} with a matrix with a row for each iteration, first column temperature, second column value of objective function, rest columns numbers of tries with decrease, no decrease but accepted, and no decrease and rejected. Steps with increase @code{diff} of objective function are accepted if @code{rand (1) < exp (- diff / T)}, where @code{T} is the temperature of the current iteration. If regaining of constraints failed, optimization will be aborted and returned value of @var{cvg} will be @code{0}. Otherwise, @var{cvg} will be @code{1}. Returned structure @var{outp}, additionally to the possible fields @code{trace} and @code{log} described above, will have the fields @code{niter} and @code{user_interaction}. Interpretation of @code{Display}: if set to @qcode{"iter"}, an informational line is printed after each iteration. If @code{parallel_local} is set to @code{true}, the objective function is evaluated for several parameter combinations in parallel. The course of optimization won't be changed by parallelization, provided the random number generator starts with the same state. To achieve this, some of the parallel results are discarded, causing the speedup to be smaller if the rate of acceptance of results is high. Also, due to overhead, there won't be any speedup, but even a slowdown, if the objective function is not computationally extensive enough. Honours options @code{save_state} and @code{recover_state}, described for the frontend. @c ------------------------------------------------------------------ @node d2_min @section Unconstrained Newton-like optimization @cindex d2_min This function features a Newton-like algorithm. The user has to supply a Hessian function. No constraints are honoured. If the supplied Hessian function actually returns the inverse of the Hessian, set @code{inverse_hessian} to @code{true}. Supplying the inverse Hessian is preferable, if possible. Returned value @var{cvg} will be @code{2} or @code{3} for success and @code{0} or @code{-1} for failure (@mysee @ref{nonlin_min} for meaning). Returned structure @var{outp} will have the fields @code{niter}, @code{nobjf}, and @code{user_interaction}. Interpretation of @code{Display}: if set to @qcode{"iter"}, some diagnostics are printed. @c ------------------------------------------------------------------ @node fmins @section Alternative frontend for gradient-less algorithms @findex fmins @c fmins ../inst/fmins.m @anchor{XREFfmins} @deftypefn {Function File} {[@var{x}] =} fmins (@var{f},@var{X0},@var{options},@var{grad},@var{P1},@var{P2}, ...) Find the minimum of a funtion of several variables. 
By default the method used is the Nelder&Mead Simplex algorithm Example usage: fmins(inline('(x(1)-5).^2+(x(2)-8).^4'),[0;0]) @strong{Inputs} @c use @asis and explicite @var in @table to avoid makeinfo warning @c `unlikely character , in @var' for `P1, P2, ...'. @table @asis @item @var{f} A string containing the name of the function to minimize @item @var{X0} A vector of initial parameters fo the function @var{f}. @item @var{options} Vector with control parameters (not all parameters are used) @verbatim options(1) - Show progress (if 1, default is 0, no progress) options(2) - Relative size of simplex (default 1e-3) options(6) - Optimization algorithm if options(6)==0 - Nelder & Mead simplex (default) if options(6)==1 - Multidirectional search Method if options(6)==2 - Alternating Directions search options(5) if options(6)==0 && options(5)==0 - regular simplex if options(6)==0 && options(5)==1 - right-angled simplex Comment: the default is set to "right-angled simplex". this works better for me on a broad range of problems, although the default in nmsmax is "regular simplex" options(10) - Maximum number of function evaluations @end verbatim @item @var{grad} Unused (For compatibility with Matlab) @item @var{P1}, @var{P2}, ... Optional parameters for function @var{f} @end table @end deftypefn @c ------------------------------------------------------------------ @node nmsmax @section A Nelder-Mead simplex algorithm @findex nmsmax @subheading Helptext: @anchor{XREFnmsmax} @verbatim NMSMAX Nelder-Mead simplex method for direct search optimization. [x, fmax, nf] = NMSMAX(FUN, x0, STOPIT, SAVIT) attempts to maximize the function FUN, using the starting vector x0. The Nelder-Mead direct search method is used. Output arguments: x = vector yielding largest function value found, fmax = function value at x, nf = number of function evaluations. The iteration is terminated when either - the relative size of the simplex is <= STOPIT(1) (default 1e-3), - STOPIT(2) function evaluations have been performed (default inf, i.e., no limit), or - a function value equals or exceeds STOPIT(3) (default inf, i.e., no test on function values). The form of the initial simplex is determined by STOPIT(4): STOPIT(4) = 0: regular simplex (sides of equal length, the default) STOPIT(4) = 1: right-angled simplex. Progress of the iteration is not shown if STOPIT(5) = 0 (default 1). STOPIT(6) indicates the direction (ie. minimization or maximization.) Default is 1, maximization. set STOPIT(6)=-1 for minimization If a non-empty fourth parameter string SAVIT is present, then `SAVE SAVIT x fmax nf' is executed after each inner iteration. NB: x0 can be a matrix. In the output argument, in SAVIT saves, and in function calls, x has the same shape as x0. NMSMAX(fun, x0, STOPIT, SAVIT, P1, P2,...) allows additional arguments to be passed to fun, via feval(fun,x,P1,P2,...). References: N. J. Higham, Optimization by direct search in matrix computations, SIAM J. Matrix Anal. Appl, 14(2): 317-333, 1993. C. T. Kelley, Iterative Methods for Optimization, Society for Industrial and Applied Mathematics, Philadelphia, PA, 1999. @end verbatim @c ------------------------------------------------------------------ @node mdsmax @section A multidirectional search algorithm @findex mdsmax @subheading Helptext: @anchor{XREFmdsmax} @verbatim MDSMAX Multidirectional search method for direct search optimization. [x, fmax, nf] = MDSMAX(FUN, x0, STOPIT, SAVIT) attempts to maximize the function FUN, using the starting vector x0. 
The method of multidirectional search is used. Output arguments: x = vector yielding largest function value found, fmax = function value at x, nf = number of function evaluations. The iteration is terminated when either - the relative size of the simplex is <= STOPIT(1) (default 1e-3), - STOPIT(2) function evaluations have been performed (default inf, i.e., no limit), or - a function value equals or exceeds STOPIT(3) (default inf, i.e., no test on function values). The form of the initial simplex is determined by STOPIT(4): STOPIT(4) = 0: regular simplex (sides of equal length, the default), STOPIT(4) = 1: right-angled simplex. Progress of the iteration is not shown if STOPIT(5) = 0 (default 1). If a non-empty fourth parameter string SAVIT is present, then `SAVE SAVIT x fmax nf' is executed after each inner iteration. NB: x0 can be a matrix. In the output argument, in SAVIT saves, and in function calls, x has the same shape as x0. MDSMAX(fun, x0, STOPIT, SAVIT, P1, P2,...) allows additional arguments to be passed to fun, via feval(fun,x,P1,P2,...). This implementation uses 2n^2 elements of storage (two simplices), where x0 is an n-vector. It is based on the algorithm statement in [2, sec.3], modified so as to halve the storage (with a slight loss in readability). References: [1] V. J. Torczon, Multi-directional search: A direct search algorithm for parallel machines, Ph.D. Thesis, Rice University, Houston, Texas, 1989. [2] V. J. Torczon, On the convergence of the multidirectional search algorithm, SIAM J. Optimization, 1 (1991), pp. 123-145. [3] N. J. Higham, Optimization by direct search in matrix computations, SIAM J. Matrix Anal. Appl, 14(2): 317-333, 1993. [4] N. J. Higham, Accuracy and Stability of Numerical Algorithms, Second edition, Society for Industrial and Applied Mathematics, Philadelphia, PA, 2002; sec. 20.5. @end verbatim @c ------------------------------------------------------------------ @node adsmax @section An alternating directions algorithm @findex adsmax @subheading Helptext: @anchor{XREFadsmax} @verbatim ADSMAX Alternating directions method for direct search optimization. [x, fmax, nf] = ADSMAX(FUN, x0, STOPIT, SAVIT, P) attempts to maximize the function FUN, using the starting vector x0. The alternating directions direct search method is used. Output arguments: x = vector yielding largest function value found, fmax = function value at x, nf = number of function evaluations. The iteration is terminated when either - the relative increase in function value between successive iterations is <= STOPIT(1) (default 1e-3), - STOPIT(2) function evaluations have been performed (default inf, i.e., no limit), or - a function value equals or exceeds STOPIT(3) (default inf, i.e., no test on function values). Progress of the iteration is not shown if STOPIT(5) = 0 (default 1). If a non-empty fourth parameter string SAVIT is present, then `SAVE SAVIT x fmax nf' is executed after each inner iteration. By default, the search directions are the co-ordinate directions. The columns of a fifth parameter matrix P specify alternative search directions (P = EYE is the default). NB: x0 can be a matrix. In the output argument, in SAVIT saves, and in function calls, x has the same shape as x0. ADSMAX(fun, x0, STOPIT, SAVIT, P, P1, P2,...) allows additional arguments to be passed to fun, via feval(fun,x,P1,P2,...). Reference: N. J. Higham, Optimization by direct search in matrix computations, SIAM J. Matrix Anal. Appl, 14(2): 317-333, 1993. N. J. 
Higham, Accuracy and Stability of Numerical Algorithms, Second edition, Society for Industrial and Applied Mathematics, Philadelphia, PA, 2002; sec. 20.5. @end verbatim @c ------------------------------------------------------------------ @node nelder_mead_min @section Another Nelder-Mead algorithm @findex nelder_mead_min This function does gradient-less minimization using the Nelder-Mead algorithm. No constraints are honoured. @subheading Helptext: @anchor{XREFnelder_mead_min} @verbatim [x0,v,nev] = nelder_mead_min (f,args,ctl) - Nelder-Mead minimization Minimize 'f' using the Nelder-Mead algorithm. This function is inspired from the that found in the book "Numerical Recipes". ARGUMENTS --------- f : string : Name of function. Must return a real value args : list : Arguments passed to f. or matrix : f's only argument ctl : vector : (Optional) Control variables, described below or struct RETURNED VALUES --------------- x0 : matrix : Local minimum of f v : real : Value of f in x0 nev : number : Number of function evaluations CONTROL VARIABLE : (optional) may be named arguments (i.e. "name",value ------------------ pairs), a struct, or a vector of length <= 6, where NaN's are ignored. Default values are written . OPT. VECTOR NAME POS ftol,f N/A : Stopping criterion : stop search when values at simplex vertices are all alike, as tested by f > (max_i (f_i) - min_i (f_i)) /max(max(|f_i|),1) where f_i are the values of f at the vertices. <10*eps> rtol,r N/A : Stop search when biggest radius of simplex, using infinity-norm, is small, as tested by : ctl(2) > Radius <10*eps> vtol,v N/A : Stop search when volume of simplex is small, tested by ctl(2) > Vol crit,c ctl(1) : Set one stopping criterion, 'ftol' (c=1), 'rtol' (c=2) or 'vtol' (c=3) to the value of the 'tol' option. <1> tol, t ctl(2) : Threshold in termination test chosen by 'crit' <10*eps> narg ctl(3) : Position of the minimized argument in args <1> maxev ctl(4) : Maximum number of function evaluations. This number may be slightly exceeded. isz ctl(5) : Size of initial simplex, which is : <1> { x + e_i | i in 0..N } Where x == args{narg} is the initial value e_0 == zeros (size (x)), e_i(j) == 0 if j != i and e_i(i) == ctl(5) e_i has same size as x Set ctl(5) to the distance you expect between the starting point and the minimum. rst ctl(6) : When a minimum is found the algorithm restarts next to it until the minimum does not improve anymore. ctl(6) is the maximum number of restarts. Set ctl(6) to zero if you know the function is well-behaved or if you don't mind not getting a true minimum. <0> verbose, v Be more or less verbose (quiet=0) <0> @end verbatim @c ------------------------------------------------------------------ @node powell @section Direction-set (Powell's) method @findex powell @c powell ../inst/powell.m @anchor{XREFpowell} @deftypefn {Function File} {[@var{p}, @var{obj_value}, @var{convergence}, @var{iters}, @var{nevs}] =} powell (@var{f}, @var{p0}, @var{control}) Multidimensional minimization (direction-set method). 
Implements a direction-set (Powell's) method for multidimensional minimization of a function without calculation of the gradient [1, 2] @subheading Arguments @itemize @bullet @item @var{f}: name of function to minimize (string or handle), which should accept one input variable (see example for how to pass on additional input arguments) @item @var{p0}: An initial value of the function argument to minimize @item @var{options}: an optional structure, which can be generated by optimset, with some or all of the following fields: @itemize @minus @item MaxIter: maximum iterations (positive integer, or -1 or Inf for unlimited (default)) @item TolFun: minimum amount by which function value must decrease in each iteration to continue (default is 1E-8) @item MaxFunEvals: maximum function evaluations (positive integer, or -1 or Inf for unlimited (default)) @item SearchDirections: an n*n matrix whose columns contain the initial set of (presumably orthogonal) directions to minimize along, where n is the number of elements in the argument to be minimized for; or an n*1 vector of magnitudes for the initial directions (defaults to the set of unit direction vectors) @end itemize @end itemize @subheading Examples @example @group y = @@(x, s) x(1) ^ 2 + x(2) ^ 2 + s; o = optimset('MaxIter', 100, 'TolFun', 1E-10); s = 1; [x_optim, y_min, conv, iters, nevs] = powell(@@(x) y(x, s), [1 0.5], o); %pass y wrapped in an anonymous function so that all other arguments to y, which are held constant, are set %should return something like x_optim = [4E-14 3E-14], y_min = 1, conv = 1, iters = 2, nevs = 24 @end group @end example @subheading Returns: @itemize @bullet @item @var{p}: the minimizing value of the function argument @item @var{obj_value}: the value of @var{f}() at @var{p} @item @var{convergence}: 1 if normal convergence, 0 if not @item @var{iters}: number of iterations performed @item @var{nevs}: number of function evaluations @end itemize @subheading References @enumerate @item Powell MJD (1964), An efficient method for finding the minimum of a function of several variables without calculating derivatives, @cite{Computer Journal}, 7 :155-162 @item Press, WH; Teukolsky, SA; Vetterling, WT; Flannery, BP (1992). @cite{Numerical Recipes in Fortran: The Art of Scientific Computing} (2nd Ed.). New York: Cambridge University Press (Section 10.5) @end enumerate @end deftypefn @c ------------------------------------------------------------------ @node standalone d2_min @section Unconstrained Newton-like minimization, standalone function @findex standalone d2_min This function features a Newton-like algorithm. The user has to supply a Hessian function. No constraints are honoured. This function is deprecated. It is now available as a backend for @code{nonlin_min} (@mysee @ref{nonlin_min}). @subheading Helptext: @anchor{XREFd2_min} @verbatim [x,v,nev,h,args] = d2_min(f,d2f,args,ctl,code) - Newton-like minimization Minimize f(x) using 1st and 2nd derivatives. Any function w/ second derivatives can be minimized, as in Newton. f(x) decreases at each iteration, as in Levenberg-Marquardt. This function is inspired from the Levenberg-Marquardt algorithm found in the book "Numerical Recipes". ARGUMENTS : f : string : Cost function's name d2f : string : Name of function returning the cost (1x1), its differential (1xN) and its second differential or it's pseudo-inverse (NxN) (see ctl(5) below) : [v,dv,d2v] = d2f (x). args : list : f and d2f's arguments. By default, minimize the 1st or matrix : argument. 
ctl : vector : Control arguments (see below) or struct code : string : code will be evaluated after each outer loop that produced some (any) improvement. Variables visible from "code" include "x", the best parameter found, "v" the best value, and "args", the list of all arguments. All can be modified. This option can be used to re-parameterize the argument space during optimization CONTROL VARIABLE ctl : (optional). May be a struct or a vector of length ---------------------- 5 or less where NaNs are ignored. Default values are written <like this>. FIELD VECTOR NAME POS ftol, f N/A : Stop search when value doesn't improve, as tested by f > Deltaf/max(|f(x)|,1) where Deltaf is the decrease in f observed in the last iteration. <10*sqrt(eps)> utol, u N/A : Stop search when updates are small, as tested by u > max { dx(i)/max(|x(i)|,1) | i in 1..N } where dx is the change in the x that occurred in the last iteration. dtol, d N/A : Stop search when derivative is small, as tested by d > norm (dv) crit, c ctl(1) : Set one stopping criterion, 'ftol' (c=1), 'utol' (c=2) or 'dtol' (c=3) to the value of the 'tol' option. <1> tol, t ctl(2) : Threshold in termination test chosen by 'crit' <10*eps> narg, n ctl(3) : Position of the minimized argument in args <1> maxev,m ctl(4) : Maximum number of function evaluations maxout,m : Maximum number of outer loops id2f, i ctl(5) : 0 if d2f returns the 2nd derivatives, 1 if <0> it returns its pseudo-inverse. verbose, v N/A : Be more or less verbose (quiet=0) <0> @end verbatim @c ------------------------------------------------------------------ @node bfgsmin @section Unconstrained BFGS algorithm @findex bfgsmin BFGS or limited memory BFGS minimization of a function. No constraints are honoured. @subheading Helptext: @anchor{XREFbfgsmin} @verbatim bfgsmin: bfgs or limited memory bfgs minimization of function Usage: [x, obj_value, convergence, iters] = bfgsmin(f, args, control) The function must be of the form [value, return_2,..., return_m] = f(arg_1, arg_2,..., arg_n) By default, minimization is w.r.t. arg_1, but it can be done w.r.t. any argument that is a vector. Numeric derivatives are used unless analytic derivatives are supplied. See bfgsmin_example.m for methods. Arguments: * f: name of function to minimize (string) * args: a cell array that holds all arguments of the function The argument with respect to which minimization is done MUST be a vector * control: an optional cell array of 1-8 elements. If a cell array shorter than 8 elements is provided, the trailing elements are given default values. * elem 1: maximum iterations (positive integer, or -1 or Inf for unlimited (default)) * elem 2: verbosity 0 = no screen output (default) 1 = only final results 2 = summary every iteration 3 = detailed information * elem 3: convergence criterion 1 = strict (function, gradient and param change) (default) 0 = weak - only function convergence required * elem 4: arg in f_args with respect to which minimization is done (default is first) * elem 5: (optional) Memory limit for lbfgs. If it's a positive integer then lbfgs will be used.
Otherwise ordinary bfgs is used * elem 6: function change tolerance, default 1e-12 * elem 7: parameter change tolerance, default 1e-6 * elem 8: gradient tolerance, default 1e-5 Returns: * x: the minimizer * obj_value: the value of f() at x * convergence: 1 if normal conv, other values if not * iters: number of iterations performed Example: see bfgsmin_example.m @end verbatim @c ------------------------------------------------------------------ @node nrm @section Newton-Raphson algorithm @findex nrm No constraints are honoured. @c nrm ../inst/nrm.m @anchor{XREFnrm} @deftypefn {Function File} {@var{xmin} =} nrm (@var{f},@var{x0}) Using @var{x0} as a starting point, find a minimum of the scalar function @var{f}. The Newton-Raphson method is used. @end deftypefn @c ------------------------------------------------------------------ @node cg_min @section A conjugate gradient method @findex cg_min @c cg_min ../inst/cg_min.m @anchor{XREFcg_min} @deftypefn {Function File} {[@var{x0},@var{v},@var{nev}]} cg_min ( @var{f},@var{df},@var{args},@var{ctl} ) NonLinear Conjugate Gradient method to minimize function @var{f}. @subheading Arguments @itemize @bullet @item @var{f} : string : Name of function. Return a real value @item @var{df} : string : Name of f's derivative. Returns a (R*C) x 1 vector @item @var{args}: cell : Arguments passed to f.@* @item @var{ctl} : 5-vec : (Optional) Control variables, described below @end itemize @subheading Returned values @itemize @bullet @item @var{x0} : matrix : Local minimum of f @item @var{v} : real : Value of f in x0 @item @var{nev} : 1 x 2 : Number of evaluations of f and of df @end itemize @subheading Control Variables @itemize @bullet @item @var{ctl}(1) : 1 or 2 : Select stopping criterion amongst : @item @var{ctl}(1)==0 : Default value @item @var{ctl}(1)==1 : Stopping criterion : Stop search when value doesn't improve, as tested by @math{ ctl(2) > Deltaf/max(|f(x)|,1) } where Deltaf is the decrease in f observed in the last iteration (each iteration consists of R*C line searches). @item @var{ctl}(1)==2 : Stopping criterion : Stop search when updates are small, as tested by @math{ ctl(2) > max @{ dx(i)/max(|x(i)|,1) | i in 1..N @}} where dx is the change in the x that occurred in the last iteration. @item @var{ctl}(2) : Threshold used in stopping tests. Default=10*eps @item @var{ctl}(2)==0 : Default value @item @var{ctl}(3) : Position of the minimized argument in args Default=1 @item @var{ctl}(3)==0 : Default value @item @var{ctl}(4) : Maximum number of function evaluations Default=inf @item @var{ctl}(4)==0 : Default value @item @var{ctl}(5) : Type of optimization: @item @var{ctl}(5)==1 : "Fletcher-Reeves" method @item @var{ctl}(5)==2 : "Polak-Ribiere" (Default) @item @var{ctl}(5)==3 : "Hestenes-Stiefel" method @end itemize @var{ctl} may have length smaller than 5. Default values will be used if ctl is not passed or if nan values are given. @subheading Example: function r=df( l ) b=[1;0;-1]; r = -( 2*l@{1@} - 2*b + rand(size(l@{1@}))); endfunction @* function r=ff( l ) b=[1;0;-1]; r = (l@{1@}-b)' * (l@{1@}-b); endfunction @* ll = @{ [10; 2; 3] @}; @* ctl(5) = 3; @* [x0,v,nev]=cg_min( "ff", "df", ll, ctl ) @* Comment: In general, the BFGS method seems to perform better in many cases, but requires more computation per iteration. See also http://en.wikipedia.org/wiki/Nonlinear_conjugate_gradient.
@seealso{@ref{XREFbfgsmin,,bfgsmin}} @end deftypefn @c ------------------------------------------------------------------ @node brent_line_min @section Linesearch, Brent method @findex brent_line_min @c brent_line_min ../inst/brent_line_min.m @anchor{XREFbrent_line_min} @deftypefn {Function File} {[@var{s},@var{v},@var{n}]} brent_line_min ( @var{f},@var{df},@var{args},@var{ctl} ) Line minimization of f along df. Finds minimum of f on line @math{ x0 + dx*w | a < w < b } by bracketing. a and b are passed through argument ctl. @subheading Arguments @itemize @bullet @item @var{f} : string : Name of function. Must return a real value @item @var{args} : cell : Arguments passed to f or RxC : f's only argument. x0 must be at @var{args}@{ @var{ctl}(2) @} @item @var{ctl} : 5 : (optional) Control variables, described below. @end itemize @subheading Returned values @itemize @bullet @item @var{s} : 1 : Minimum is at x0 + s*dx @item @var{v} : 1 : Value of f at x0 + s*dx @item @var{nev} : 1 : Number of function evaluations @end itemize @subheading Control Variables @itemize @bullet @item @var{ctl}(1) : Upper bound for error on s Default=sqrt(eps) @item @var{ctl}(2) : Position of minimized argument in args Default= 1 @item @var{ctl}(3) : Maximum number of function evaluations Default= inf @item @var{ctl}(4) : a Default=-inf @item @var{ctl}(5) : b Default= inf @end itemize Default values will be used if ctl is not passed or if nan values are given. @end deftypefn @c ------------------------------------------------------------------ @node line_min @section Linesearch, minimize a function along dx @findex line_min @subheading Helptext: @anchor{XREFline_min} @verbatim [a,fx,nev] = line_min (f, dx, args, narg, h, nev_max) - Minimize f() along dx INPUT ---------- f : string : Name of minimized function dx : matrix : Direction along which f() is minimized args : cell : Arguments of f narg : integer : Position of minimized variable in args. Default=1 h : scalar : Step size to use for centered finite difference approximation of first and second derivatives. Default=1E-3. nev_max : integer : Maximum number of function evaluations. Default=30 OUTPUT --------- a : scalar : Value for which f(x+a*dx) is a minimum (*) fx : scalar : Value of f(x+a*dx) at minimum (*) nev : integer : Number of function evaluations (*) The notation f(x+a*dx) assumes that args == {x}. Reference: David G Luenberger's Linear and Nonlinear Programming @end verbatim @c ------------------------------------------------------------------ @node samin @section A simulated annealing (stochastic) optimizer @findex samin This simulated annealing algorithm implicitly honours bound constraints. @subheading Helptext: @anchor{XREFsamin} @verbatim samin: simulated annealing minimization of a function. See samin_example.m usage: [x, obj, convergence, details] = samin("f", {args}, {control}) Arguments: * "f": function name (string) * {args}: a cell array that holds all arguments of the function, * {control}: a cell array with 11 elements * LB - vector of lower bounds * UB - vector of upper bounds * nt - integer: # of iterations between temperature reductions * ns - integer: # of iterations between bounds adjustments * rt - (0 < rt < 1): temperature reduction factor * maxevals - integer: limit on function evaluations * neps - integer: number of values final result is compared to * functol - (> 0): the required tolerance level for function value comparisons * paramtol - (> 0): the required tolerance level for parameters * verbosity - scalar: 0, 1, or 2.
* 0 = no screen output * 1 = only final results to screen * 2 = summary every temperature change * minarg - integer: which of function args is minimization over? Returns: * x: the minimizer * obj: the value of f() at x * convergence: 0 if no convergence within maxevals function evaluations 1 if normal convergence to a point interior to the parameter space 2 if convergence to point very near bounds of parameter space (suggest re-running with looser bounds) * details: a px3 matrix. p is the number of times improvements were found. The columns record information at the time an improvement was found * first: cumulative number of function evaluations * second: temperature * third: function value Example: see samin_example @end verbatim @c ------------------------------------------------------------------ @node de_min @section A differential evolution (stochastic) optimizer @findex de_min @subheading Helptext: @anchor{XREFde_min} @verbatim de_min: global optimisation using differential evolution Usage: [x, obj_value, nfeval, convergence] = de_min(fcn, control) minimization of a user-supplied function with respect to x(1:D), using the differential evolution (DE) method based on an algorithm by Rainer Storn (http://www.icsi.berkeley.edu/~storn/code.html) See: http://www.softcomputing.net/tevc2009_1.pdf Arguments: --------------- fcn string : Name of function. Must return a real value control vector : (Optional) Control variables, described below or struct Returned values: ---------------- x vector : parameter vector of best solution obj_value scalar : objective function value of best solution nfeval scalar : number of function evaluations convergence : 1 = best below value to reach (VTR) 0 = population has reached defined quality (tol) -1 = some values are close to constraints/boundaries -2 = max number of iterations reached (maxiter) -3 = max number of function evaluations reached (maxnfe) Control variable: (optional) may be named arguments (i.e. "name",value ---------------- pairs), a struct, or a vector, where NaN's are ignored. XVmin : vector of lower bounds of initial population *** note: by default these are no constraints *** XVmax : vector of upper bounds of initial population constr : 1 -> enforce the bounds not just for the initial population const : data vector (remains fixed during the minimization) NP : number of population members F : difference factor from interval [0, 2] CR : crossover probability constant from interval [0, 1] strategy : 1 --> DE/best/1/exp 7 --> DE/best/1/bin 2 --> DE/rand/1/exp 8 --> DE/rand/1/bin 3 --> DE/target-to-best/1/exp 9 --> DE/target-to-best/1/bin 4 --> DE/best/2/exp 10--> DE/best/2/bin 5 --> DE/rand/2/exp 11--> DE/rand/2/bin 6 --> DEGL/SAW/exp else DEGL/SAW/bin refresh : intermediate output will be produced after "refresh" iterations. No intermediate output will be produced if refresh is < 1 VTR : Stopping criterion: "Value To Reach" de_min will stop when obj_value <= VTR. Use this if you know which value you expect. tol : Stopping criterion: "tolerance" stops if (best-worst)/max(1,worst) < tol This stops basically if the whole population is "good". maxnfe : maximum number of function evaluations maxiter : maximum number of iterations (generations) The algorithm seems to work well only if [XVmin,XVmax] covers the region where the global minimum is expected. DE is also somewhat sensitive to the choice of the difference factor F. A good initial guess is to choose F from interval [0.5, 1], e.g. 0.8.
CR, the crossover probability constant from interval [0, 1], helps to maintain the diversity of the population and is rather uncritical, but strongly affects the convergence speed. If the parameters are correlated, high values of CR work better. The reverse is true for no correlation. Experiments suggest that /bin likes to have a slightly larger CR than /exp. The number of population members NP is also not very critical. A good initial guess is 10*D. Depending on the difficulty of the problem, NP can be lower than 10*D or must be higher than 10*D to achieve convergence. Default Values: --------------- XVmin = [-2]; XVmax = [ 2]; constr= 0; const = []; NP = 10 *D F = 0.8; CR = 0.9; strategy = 12; refresh = 0; VTR = -Inf; tol = 1.e-3; maxnfe = 1e6; maxiter = 1000; Example to find the minimum of the Rosenbrock saddle: ---------------------------------------------------- Define f as: function result = f(x); result = 100 * (x(2) - x(1)^2)^2 + (1 - x(1))^2; end Then type: ctl.XVmin = [-2 -2]; ctl.XVmax = [ 2 2]; [x, obj_value, nfeval, convergence] = de_min (@f, ctl); Keywords: global-optimisation optimisation minimisation @end verbatim @c ------------------------------------------------------------------ @node battery @section Repeatedly call bfgsmin @findex battery @subheading Helptext: @anchor{XREFbattery} @verbatim battery.m: repeatedly call bfgs using a battery of start values, to attempt to find global min of a nonconvex function INPUTS: func: function to minimize args: args of function minarg: argument to minimize w.r.t. (usually = 1) startvals: kxp matrix of values to try for sure (don't include all zeros, that's automatic) max iters per start value number of additional random start values to try OUTPUT: theta - the best value found - NOT iterated to convergence @end verbatim @c ------------------------------------------------------------------ @node Residual optimization @chapter Functions for optimization of a model function returning an array @cindex residual optimization Model functions whose parameters are to be optimized may return a vector or array of values. Either these or their differences to some constant values (curve fitting) can be minimized in some sense, often, but not necessarily, by minimizing the sum of their squares. It is usually preferable to use optimizers designed for residual optimization for this purpose. These can exploit information contained in the individual elements of the returned array, which would not be possible if the user calculated a norm (@abbr{e.g@.} sum of squares) of the elements and performed a scalar optimization. @menu Optimization frontends * nonlin_residmin:: The standard interface for non-linear residual minimization. * nonlin_curvefit:: A convenience interface, curve fitting. Optimization backends * lm_svd_feasible:: L/M algorithm with SVD, constraints met throughout optimization. Statistics frontends * residmin_stat:: Statistics for residual minimization. * curvefit_stat:: Statistics for curve fitting. Statistics backends * wls:: Statistics for weighted least squares. Standalone functions * leasqr:: An older function for curve fitting. * expfit:: Prony's method for non-linear exponential fitting. * polyfitinf:: Function polyfitinf for polynomial fitting. * wpolyfit:: Polynomial fitting suitable for polyconf. * polyconf:: Confidence and prediction intervals for polynomial fitting. * LinearRegression:: Function LinearRegression. * wsolve:: Another linear solver. * linprog:: Linear programming.
@end menu @c ------------------------------------------------------------------ @node nonlin_residmin @section Frontend nonlin_residmin for non-linear residual minimization @findex nonlin_residmin @c include function help here @c nonlin_residmin ../inst/nonlin_residmin.m @anchor{XREFnonlin_residmin} @deftypefn {Function File} {[@var{p}, @var{resid}, @var{cvg}, @var{outp}] =} nonlin_residmin (@var{f}, @var{pin}) @deftypefnx {Function File} {[@var{p}, @var{resid}, @var{cvg}, @var{outp}] =} nonlin_residmin (@var{f}, @var{pin}, @var{settings}) Frontend for nonlinear minimization of residuals returned by a model function. The functions supplied by the user have a minimal interface; any additionally needed constants (e.g. observed values) can be supplied by wrapping the user functions into anonymous functions. The following description applies to usage with vector-based parameter handling. Differences in usage for structure-based parameter handling will be explained separately. @var{f}: function returning the array of residuals. It gets a column vector of real parameters as argument. In gradient determination, this function may be called with an informational second argument, whose content depends on the function for gradient determination. @var{pin}: real column vector of initial parameters. @var{settings}: structure whose fields stand for optional settings referred to below. The fields can be set by @code{optimset()}. The returned values are the column vector of final parameters @var{p}, the final array of residuals @var{resid}, an integer @var{cvg} indicating if and how optimization succeeded or failed, and a structure @var{outp} with additional information, currently with the fields: @code{niter}, the number of iterations, and @code{user_interaction}, information on user stops (see settings). The backend may define additional fields. @var{cvg} is greater than zero for success and less than or equal to zero for failure; its possible values depend on the backend used and currently can be @code{0} (maximum number of iterations exceeded), @code{2} (parameter change less than specified precision in two consecutive iterations), @code{3} (improvement in objective function -- e.g. sum of squares -- less than specified), or @code{-1} (algorithm aborted by a user function). @c The following block will be cut out in the package info file. @seealso{@ref{XREFnonlin_curvefit,,nonlin_curvefit}} @end deftypefn @subheading Settings The fields of the @var{settings} structure can be set with @ref{XREFoptimset,,optimset,octave}. For settings common to all frontends (including those for statistics) @mysee @ref{Common frontend options}. For additional settings common to all optimization frontends @mysee @ref{Common optimization options}. @subsubheading Specific defaults: @multitable {@code{Algorithm}} {"lm_svd_feasible"} @item @code{Algorithm}: @tab "lm_svd_feasible" @end multitable @subsubheading Additional settings: @table @code @item weights Array of weights for the residuals. Dimensions must match. @anchor{XREFoptiondfdp} @item dfdp Function computing the Jacobian of the residuals with respect to the parameters, assuming residuals are reshaped to a column vector. Default: real finite differences. Will be called with the column vector of parameters and an informational structure as arguments.
If @code{dfdp} was specified by the user, the informational structure has the fields @code{f}: value of residuals for current parameters, reshaped to a column vector, @code{fixed}: logical vector indicating which parameters are not optimized, so these partial derivatives need not be computed and can be set to zero, @code{diffp}, @code{diff_onesided}, @code{lbound}, @code{ubound}: identical to the user settings of this name, @code{plabels}: 1-dimensional cell-array of column-cell-arrays, each column with labels for all parameters; the first column contains the numerical indices of the parameters; the second and third columns, present for structure based parameter handling, @mysee @ref{Parameter structures}, contain the names of the parameters and the subindices of the parameters, @mysee @ref{Non-scalar parameters}, respectively. The default jacobian function will call the model function with the second argument set with fields @code{f}: as the @code{f} passed to the jacobian function, @code{plabels}: cell-array of 1x1 cell-arrays with the entries of the column-cell-arrays of @code{plabels} as passed to the jacobian function corresponding to current parameter, @code{side}: @code{0} for one-sided interval, @code{1} or @code{2}, respectively, for the sides of a two-sided interval, and @code{parallel}: logical scalar indicating parallel computation of partial derivatives. This information can be useful if the model function can omit some computations depending on the currently computed partial derivative. @item complex_step_derivative_f Logical scalar, default: @code{false}. Estimate Jacobian of model function with complex step derivative approximation. Use only if you know that your model function is suitable for this. In this case, no user function for the Jacobian (@code{dfdp}) may be specified. @item plot_cmd Function enabling backend to plot results or intermediate results. Will be called with current computed residuals. Default: plot nothing. This setting is deprecated and will disappear. Please use @code{user_interaction} instead (@mysee @ref{Common optimization options}). @end table @subheading Structure based parameter handling Please @mysee @ref{Parameter structures}. @subheading Backend information Please @mysee @ref{Residual optimization} and choose backend from menu under `Optimization backends'. @c ------------------------------------------------------------------ @node nonlin_curvefit @section Function nonlin_curvefit() for curve fitting @findex nonlin_curvefit In curve fitting, the model function computes values from a constant set of `independents', and the intention is to minimize the differences of these computed values to a constant set of `observations'. This can be done with @code{nonlin_residmin}, but it is more convenient to use @code{nonlin_curvefit}, which takes care of passing the constant `independents' to the model function and of calculating the differences to the constant `observations'. However, if in some optimization problem you notice that you end up passing dummy values for the `independents' and zeros for the `observations', you can more naturally use @code{nonlin_residmin} instead of @code{nonlin_curvefit}.
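For orientation, here is a minimal sketch of a typical call; the model, data, and start values are made up for this example, and the full interface is given in the function description below:

@example
@group
## fit y = p(1) * exp (p(2) * x) to noisy data
x = (0:0.1:3)';
y = 2.5 * exp (-1.3 * x) + 0.05 * randn (size (x));
f = @@(p, x) p(1) * exp (p(2) * x); # model function
pin = [1; -1];                      # initial parameters
[p, fy, cvg, outp] = nonlin_curvefit (f, pin, x, y);
@end group
@end example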
@c include function help here @c nonlin_curvefit ../inst/nonlin_curvefit.m @anchor{XREFnonlin_curvefit} @deftypefn {Function File} {[@var{p}, @var{fy}, @var{cvg}, @var{outp}] =} nonlin_curvefit (@var{f}, @var{pin}, @var{x}, @var{y}) @deftypefnx {Function File} {[@var{p}, @var{fy}, @var{cvg}, @var{outp}] =} nonlin_curvefit (@var{f}, @var{pin}, @var{x}, @var{y}, @var{settings}) Frontend for nonlinear fitting of values, computed by a model function, to observed values. Please refer to the description of @code{nonlin_residmin}. The differences to @code{nonlin_residmin} are the additional arguments @var{x} (independent values, mostly, but not necessarily, an array of the same dimensions or the same number of rows as @var{y}) and @var{y} (array of observations), the returned value @var{fy} (final guess for observed values) instead of @var{resid}, that the model function has a second obligatory argument which will be set to @var{x} and is supposed to return guesses for the observations (with the same dimensions), and that the possibly user-supplied function for the jacobian of the model function also has a second obligatory argument which will be set to @var{x}. @c The following block will be cut out in the package info file. @seealso{@ref{XREFnonlin_residmin,,nonlin_residmin}} @end deftypefn @c replace the cut out text Also, if the setting @code{user_interaction} is given, additional information is passed to these functions, @mysee @ref{Common optimization options}. @c ------------------------------------------------------------------ @node lm_svd_feasible @section Default backend lm_svd_feasible of residual minimization @cindex lm_svd_feasible Levenberg/Marquardt algorithm using singular value decomposition. Constraints must be met by the initial parameters, and the algorithm attempts to keep them met throughout the optimization. Returned value @var{cvg} will be @code{0}, @code{1}, or @code{2}. Returned structure @var{outp} will have the fields @code{niter} and @code{user_interaction}. Backend-specific defaults are: @code{MaxIter}: 20, @code{fract_prec}: @code{zeros (size (parameters))}, @code{max_fract_change}: @code{Inf} for all parameters. The setting @code{TolX} is not honoured. Interpretation of @code{Display}: if set to @qcode{"iter"}, currently some diagnostics are printed. Specific option: @code{lm_svd_feasible_alt_s}: if falling back to nearly gradient descent, do it more like original Levenberg/Marquardt method, with descent in each gradient component; for testing only. @c ------------------------------------------------------------------ @node residmin_stat @section Statistics for residual minimization @findex residmin_stat @c include function help here @c residmin_stat ../inst/residmin_stat.m @anchor{XREFresidmin_stat} @deftypefn {Function File} {@var{info} =} residmin_stat (@var{f}, @var{p}, @var{settings}) Frontend for computation of statistics for a residual-based minimization. @var{settings} is a structure whose fields can be set by @code{optimset}. With @var{settings} the computation of certain statistics is requested by setting the fields @code{ret_<name>} to @code{true}. The respective statistics will be returned in a structure as fields with name @code{<name>}. Depending on the requested statistic and on the additional information provided in @var{settings}, @var{f} and @var{p} may be empty. Otherwise, @var{f} is the model function of an optimization (the interface of @var{f} is described e.g.
in @code{nonlin_residmin}, please see there), and @var{p} is a real column vector with parameters resulting from the same optimization. Currently, the following statistics (or general information) can be requested: @code{dfdp}: Jacobian of model function with respect to parameters. @code{covd}: Covariance matrix of data (typically guessed by applying a factor to the covariance matrix of the residuals). @code{covp}: Covariance matrix of final parameters. @code{corp}: Correlation matrix of final parameters. @c The following block will be cut out in the package info file. @seealso{@ref{XREFcurvefit_stat,,curvefit_stat}} @end deftypefn @subheading Further settings The fields of the @var{settings} structure can be set with @ref{XREFoptimset,,optimset,octave}. For settings common to all frontends @mysee @ref{Common frontend options}. @subsubheading Additional settings: @table @code @item objf_type Type of objective function of the optimization; must be specified in many cases. This determines which backends to use. Currently, there are only backends for the type "wls" (weighted least squares). @item residuals @item covd Optional information on the result of optimization, residuals and covariance matrix of data, respectively. @item weights Array of weights applied to the residuals in the previous optimization. Dimensions must match those of the residuals. @item dfdp Can be set in the same way and has the same default as in @code{nonlin_residmin} (@mysee @ref{nonlin_residmin}), but alternatively may already contain the computed Jacobian of the model function at the final parameters in matrix- or structure-form. @item complex_step_derivative_f Estimate Jacobian of model function with complex step derivative approximation. Use only if you know that your model function is suitable for this. In this case, no user function for the Jacobian (@code{dfdp}) may be specified. @end table @subheading Structure based parameter handling Please @mysee @ref{Parameter structures}. @subheading Backend information Please @mysee @ref{Residual optimization} and choose backend from menu under `Statistics backends'. @c ------------------------------------------------------------------ @node curvefit_stat @section Statistics for curve fitting @findex curvefit_stat As @code{nonlin_curvefit} can be used instead of @code{nonlin_residmin} for curve fitting (@mysee @ref{nonlin_curvefit}, @mysee @ref{nonlin_residmin}), @code{curvefit_stat} can be used instead of @code{residmin_stat} (@mysee @ref{residmin_stat}) for statistics on the results of curve fitting. @c include function help here @c curvefit_stat ../inst/curvefit_stat.m @anchor{XREFcurvefit_stat} @deftypefn {Function File} {@var{info} =} curvefit_stat (@var{f}, @var{p}, @var{x}, @var{y}, @var{settings}) Frontend for computation of statistics for fitting of values, computed by a model function, to observed values. Please refer to the description of @code{residmin_stat}. The only differences to @code{residmin_stat} are the additional arguments @var{x} (independent values) and @var{y} (observations), that the model function @var{f}, if provided, has a second obligatory argument which will be set to @var{x} and is supposed to return guesses for the observations (with the same dimensions), and that the possibly user-supplied function for the jacobian of the model function also has a second obligatory argument which will be set to @var{x}.
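As an illustration, a minimal sketch follows; it assumes @code{p} was previously returned by @code{nonlin_curvefit} with model function @code{f} and data @code{x}, @code{y}, and the requested fields follow the @code{ret_<name>} convention described for @code{residmin_stat}:

@example
@group
s = optimset ("objf_type", "wls", "ret_covp", true, "ret_corp", true);
info = curvefit_stat (f, p, x, y, s);
covp = info.covp; # covariance matrix of the final parameters
corp = info.corp; # correlation matrix of the final parameters
@end group
@end example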
@seealso{@ref{XREFresidmin_stat,,residmin_stat}} @end deftypefn @c ------------------------------------------------------------------ @node wls @section Statistics for weighted least squares @cindex wls @cindex statistics for weighted least squares The backends for @code{objf_type == "wls"} (currently the only supported type of objective function) compute @code{covd} (due to user request or as a prerequisite for @code{covp} and @code{corp}) as a diagonal matrix by assuming that the variances of data points are proportional to the reciprocal of the squared @code{weights} and guessing the factor of proportionality from the residuals. If @code{covp} is not defined (e.g. because the Jacobian does not have full rank), an attempt is made to still compute its uniquely defined elements, if any. In @code{corp}, interdependent parameters can cause elements of @code{1} or @code{-1}, which in this case are not the real coefficients of correlation, but rather indicate the direction of parameter interdependence. To be consistent with this, an attempt is made (often not successful) to identify parameter interdependence and mark it with elements of @code{1} or @code{-1} in @code{corp} even if the respective elements of @code{covp} cannot be computed. @c ------------------------------------------------------------------ @node leasqr @section An older function for curve fitting @findex leasqr This was a popular function for curve fitting and has been enhanced to honour constraints. @code{nonlin_curvefit} (@mysee @ref{nonlin_curvefit}) now does the same job if used with the default backend, and should be preferred due to its more powerful interface. The statistics returned by @code{leasqr} can also (and in part better) be computed with @code{curvefit_stat} (@mysee @ref{curvefit_stat}). There are currently two things which still only @code{leasqr} does: @itemize @item internally providing a function for plotting fits during optimization, @item returning a pre-computed matrix for determining confidence regions. @end itemize @c leasqr ../inst/leasqr.m @anchor{XREFleasqr} @deftypefn {Function File} {} leasqr (@var{x}, @var{y}, @var{pin}, @var{F}) @deftypefnx {Function File} {} leasqr (@var{x}, @var{y}, @var{pin}, @var{F}, @var{stol}) @deftypefnx {Function File} {} leasqr (@var{x}, @var{y}, @var{pin}, @var{F}, @var{stol}, @var{niter}) @deftypefnx {Function File} {} leasqr (@var{x}, @var{y}, @var{pin}, @var{F}, @var{stol}, @var{niter}, @var{wt}) @deftypefnx {Function File} {} leasqr (@var{x}, @var{y}, @var{pin}, @var{F}, @var{stol}, @var{niter}, @var{wt}, @var{dp}) @deftypefnx {Function File} {} leasqr (@var{x}, @var{y}, @var{pin}, @var{F}, @var{stol}, @var{niter}, @var{wt}, @var{dp}, @var{dFdp}) @deftypefnx {Function File} {} leasqr (@var{x}, @var{y}, @var{pin}, @var{F}, @var{stol}, @var{niter}, @var{wt}, @var{dp}, @var{dFdp}, @var{options}) @deftypefnx {Function File} {[@var{f}, @var{p}, @var{cvg}, @var{iter}, @var{corp}, @var{covp}, @var{covr}, @var{stdresid}, @var{Z}, @var{r2}] =} leasqr (@dots{}) Levenberg-Marquardt nonlinear regression. Input arguments: @table @var @item x Vector or matrix of independent variables. @item y Vector or matrix of observed values. @item pin Vector of initial parameters to be adjusted by leasqr. @item F Name of function or function handle. The function must be of the form @code{y = f(x, p)}, with y, x, p of the form @var{y}, @var{x}, @var{pin}. @item stol Scalar tolerance on fractional improvement in scalar sum of squares, i.e., @code{sum ((@var{wt} .* (@var{y}-@var{f}))^2)}.
Set to 0.0001 if empty or not given. @item niter Maximum number of iterations. Set to 20 if empty or not given. @item wt Statistical weights (same dimensions as @var{y}). These should be set to be proportional to @code{sqrt (@var{y}) ^-1}, i.e., the covariance matrix of the data is assumed to be proportional to diagonal with diagonal equal to @code{(@var{wt}.^2)^-1}. The constant of proportionality will be estimated. Set to @code{ones (size (@var{y}))} if empty or not given. @item dp Fractional increment of @var{p} for numerical partial derivatives. Set to @code{0.001 * ones (size (@var{pin}))} if empty or not given. @itemize @bullet @item dp(j) > 0 means central differences on j-th parameter p(j). @item dp(j) < 0 means one-sided differences on j-th parameter p(j). @item dp(j) = 0 holds p(j) fixed, i.e., leasqr won't change initial guess: pin(j) @end itemize @item dFdp Name of partial derivative function in quotes or function handle. If not given or empty, set to @code{dfdp}, a slow but general partial derivatives function. The function must be of the form @code{prt = dfdp (x, f, p, dp, F [,bounds])}. For backwards compatibility, the function will only be called with an extra 'bounds' argument if the 'bounds' option is explicitly specified to leasqr (see dfdp.m). @item options Structure with multiple options. The following fields are recognized: @table @asis @item @qcode{fract_prec} Column vector (same length as @var{pin}) of desired fractional precisions in parameter estimates. Iterations are terminated if change in parameter vector (chg) relative to current parameter estimate is less than their corresponding elements in 'fract_prec', i.e., @code{all (abs (chg) < abs (options.fract_prec .* current_parm_est))} on two consecutive iterations. Defaults to @code{zeros (size (@var{pin}))}. @item @qcode{max_fract_change} Column vector (same length as @var{pin}) of maximum fractional step changes in parameter vector. Fractional change in elements of parameter vector is constrained to be at most 'max_fract_change' between successive iterations, i.e., @code{abs (chg(i)) = abs (min([chg(i), options.max_fract_change(i) * current param estimate]))}. Defaults to @code{Inf * ones (size (@var{pin}))}. @item @qcode{inequc} Cell-array containing up to four entries, two entries for linear inequality constraints and/or one or two entries for general inequality constraints. Initial parameters must satisfy these constraints. Either linear or general constraints may be the first entries, but the two entries for linear constraints must be adjacent and, if two entries are given for general constraints, they also must be adjacent. The two entries for linear constraints are a matrix (say m) and a vector (say v), specifying linear inequality constraints of the form `m.' * parameters + v >= 0'. If the constraints are just bounds, it is suggested to specify them in 'options.bounds' instead, since then some sanity tests are performed, and since the function 'dfdp.m' is guaranteed not to violate constraints during determination of the numeric gradient only for those constraints specified as 'bounds' (possibly with violations due to a certain inaccuracy, however, except if no constraints except bounds are specified). The first entry for general constraints must be a differentiable vector valued function (say h), specifying general inequality constraints of the form `h (p[, idx]) >= 0'; p is the column vector of optimized parameters and the optional argument idx is a logical index.
h has to return the values of all constraints if idx is not given, and has to return only the indexed constraints if idx is given (so computation of the other constraints can be spared). If a second entry for general constraints is given, it must be a function (say dh) which returns a matrix whose rows contain the gradients of the constraint function h with respect to the optimized parameters. It has the form jac_h = dh (vh, p, dp, h, idx[, bounds]); p is the column vector of optimized parameters, and idx is a logical index --- only the rows indexed by idx must be returned (so computation of the others can be spared). The other arguments of dh are for the case that dh computes numerical gradients: vh is the column vector of the current values of the constraint function h, with idx already applied. h is a function h (p) to compute the values of the constraints for parameters p, it will return only the values indexed by idx. dp is a suggestion for relative step width, having the same value as the argument 'dp' of leasqr above. If bounds were specified to leasqr, they are provided in the argument bounds of dh, to enable their consideration in determination of numerical gradients. If dh is not specified to leasqr, numerical gradients are computed in the same way as with 'dfdp.m' (see above). If some constraints are linear, they should be specified as linear constraints (or bounds, if applicable) for reasons of performance, even if general constraints are also specified. @item @qcode{bounds} Two-column-matrix, one row for each parameter in @var{pin}. Each row contains a minimal and maximal value for each parameter. Default: [-Inf, Inf] in each row. If this field is used with an existing user-side function for 'dFdp' (see above) the function's interface might have to be changed. @item @qcode{equc} Equality constraints, specified the same way as inequality constraints (see field 'options.inequc'). Initial parameters must satisfy these constraints. Note that there is possibly a certain inaccuracy in honoring constraints, except if only bounds are specified. @emph{Warning}: If constraints (or bounds) are set, returned guesses of @var{corp}, @var{covp}, and @var{Z} are generally invalid, even if no constraints are active for the final parameters. If equality constraints are specified, @var{corp}, @var{covp}, and @var{Z} are not guessed at all. @item @qcode{cpiv} Function for complementary pivot algorithm for inequality constraints. Defaults to cpiv_bard. No alternative function is currently supplied. @end table For backwards compatibility, @var{options} can also be a matrix whose first and second column contains the values of @qcode{fract_prec} and @qcode{max_fract_change}, respectively. @end table Output: @table @var @item f Column vector of values computed: f = F(x,p). @item p Column vector of trial or final parameters, i.e., the solution. @item cvg Scalar: = 1 if convergence, = 0 otherwise. @item iter Scalar number of iterations used. @item corp Correlation matrix for parameters. @item covp Covariance matrix of the parameters. @item covr Diag(covariance matrix of the residuals). @item stdresid Standardized residuals. @item Z Matrix that defines confidence region (see comments in the source). @item r2 Coefficient of multiple determination, intercept form. @end table Not suitable for non-real residuals. References: Bard, Nonlinear Parameter Estimation, Academic Press, 1974. Draper and Smith, Applied Regression Analysis, John Wiley and Sons, 1981.
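A minimal usage sketch follows; the data and model here are made up for illustration and follow the @code{y = F (x, p)} form described above:

@example
@group
x = (1:10)';
y = 5 * exp (-0.2 * x) + 0.01 * randn (size (x));
F = @@(x, p) p(1) * exp (p(2) * x); # model of the form y = F (x, p)
pin = [1; -0.1];                    # initial parameter guess
[f, p, cvg, iter] = leasqr (x, y, pin, F);
@end group
@end example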
@end deftypefn @c ------------------------------------------------------------------ @node expfit @section Prony's method for non-linear exponential fitting @findex expfit @subheading Helptext: @anchor{XREFexpfit} @verbatim USAGE [alpha,c,rms] = expfit( deg, x1, h, y ) Prony's method for non-linear exponential fitting Fit function: \sum_1^{deg} c(i)*exp(alpha(i)*x) Elements of data vector y must correspond to equidistant x-values starting at x1 with stepsize h The method is fully compatible with complex linear coefficients c, complex nonlinear coefficients alpha and complex input arguments y, x1, non-zero h. Fit-order deg must be a real positive integer. Returns linear coefficients c, nonlinear coefficients alpha and root mean square error rms. This method is known to be more stable than 'brute-force' non-linear least squares fitting. Example x0 = 0; step = 0.05; xend = 5; x = x0:step:xend; y = 2*exp(1.3*x)-0.5*exp(2*x); error = (rand(1,length(y))-0.5)*1e-4; [alpha,c,rms] = expfit(2,x0,step,y+error) alpha = 2.0000 1.3000 c = -0.50000 2.00000 rms = 0.00028461 The fit is very sensitive to the number of data points. It doesn't perform very well for small data sets. Theoretically, you need at least 2*deg data points, but if there are errors on the data, you certainly need more. Be aware that this is a very (very,very) ill-posed problem. By the way, this algorithm relies heavily on computing the roots of a polynomial. I used 'roots.m'; if there is something better, please use that code. Demo for a complex fit-function: deg= 2; N= 20; x1= -(1+i), x= linspace(x1,1+i/2,N).'; h = x(2) - x(1) y= (2+i)*exp( (-1-2i)*x ) + (-1+3i)*exp( (2+3i)*x ); A= 5e-2; y+= A*(randn(N,1)+randn(N,1)*i); % add complex noise [alpha,c,rms]= expfit( deg, x1, h, y ) @end verbatim @c ------------------------------------------------------------------ @node polyfitinf @section Function polyfitinf for polynomial fitting. @findex polyfitinf @subheading Helptext: @anchor{XREFpolyfitinf} @verbatim function [A,REF,HMAX,H,R,EQUAL] = polyfitinf(M,N,K,X,Y,EPSH,MAXIT,REF0) Best polynomial approximation in discrete uniform norm INPUT VARIABLES: M : degree of the fitting polynomial N : number of data points X(N) : x-coordinates of data points Y(N) : y-coordinates of data points K : character of the polynomial: K = 0 : mixed parity polynomial K = 1 : odd polynomial ( X(1) must be > 0 ) K = 2 : even polynomial ( X(1) must be >= 0 ) EPSH : tolerance for leveling. A useful value for 24-bit mantissa is EPSH = 2.0E-7 MAXIT : upper limit for number of exchange steps REF0(M2): initial alternating set ( N-vector ). This is an OPTIONAL argument. The length M2 is given by: M2 = M + 2 , if K = 0 M2 = integer part of (M+3)/2 , if K = 1 M2 = 2 + M/2 (M must be even) , if K = 2 OUTPUT VARIABLES: A : polynomial coefficients of the best approximation in order of increasing powers: p*(x) = A(1) + A(2)*x + A(3)*x^2 + ... REF : selected alternating set of points HMAX : maximum deviation ( uniform norm of p* - f ) H : pointwise approximation errors R : total number of iterations EQUAL : success or failure of algorithm EQUAL=1 : successful EQUAL=0 : convergence not achieved EQUAL=-1: input error EQUAL=-2: algorithm failure Relies on function EXCH, provided below.
Example: M = 5; N = 10000; K = 0; EPSH = 10^-12; MAXIT = 10; X = linspace(-1,1,N); % uniformly spaced nodes on [-1,1] k=1; Y = abs(X).^k; % the function Y to approximate [A,REF,HMAX,H,R,EQUAL] = polyfitinf(M,N,K,X,Y,EPSH,MAXIT); p = polyval(A,X); plot(X,Y,X,p) % p is the best approximation Note: using an even value of M, e.g., M=2, in the example above, makes the algorithm fail with EQUAL=-2, because of collocation, which appears because both the approximating function and the polynomial are even functions. The way around it is to approximate only the right half of the function, setting K = 2 : even polynomial. For example: N = 10000; K = 2; EPSH = 10^-12; MAXIT = 10; X = linspace(0,1,N); for i = 1:2 k = 2*i-1; Y = abs(X).^k; for j = 1:4 M = 2^j; [~,~,HMAX] = polyfitinf(M,N,K,X,Y,EPSH,MAXIT); approxerror(i,j) = HMAX; end end disp('Table 3.1 from Approximation theory and methods, M.J.D.POWELL, p. 27'); disp(' '); disp(' n K=1 K=3'); disp(' '); format short g; disp([(2.^(1:4))' approxerror']); ALGORITHM: Computation of the polynomial that best approximates the data (X,Y) in the discrete uniform norm, i.e. the polynomial with the minimum value of max{ | p(x_i) - y_i | , x_i in X } . That polynomial, also known as minimax polynomial, is obtained by the exchange algorithm, a finite iterative process requiring, at most, C(n,p) (the binomial coefficient "n choose p") iterations (usually p = M + 2; see also function EXCH). Since this number can be very large, the routine may not converge within MAXIT iterations. The other possibility of failure occurs when there is insufficient floating point precision for the input data chosen. CREDITS: This routine was developed and modified as computer assignments in Approximation Theory courses by Prof. Andrew Knyazev, University of Colorado Denver, USA. Team Fall 98 (Revision 1.0): Chanchai Aniwathananon Crhistopher Mehl David A. Duran Saulo P. Oliveira Team Spring 11 (Revision 1.1): Manuchehr Aminian The algorithm and the comments are based on a FORTRAN code written by Joseph C. Simpson. The code is available on Netlib repository: http://www.netlib.org/toms/501 See also: Communications of the ACM, V14, pp.355-356(1971) NOTES: 1) A may contain the collocation polynomial 2) If MAXIT is exceeded, REF contains a new reference set 3) M, EPSH and REF can be altered during the execution 4) To keep consistency with the original code, EPSH can be negative. However, the use of REF0 is *NOT* determined by EPSH < 0, but only by its inclusion as an input parameter. Some parts of the code can still take advantage of vectorization. Revision 1.0 from 1998 is a direct human translation of the FORTRAN code http://www.netlib.org/toms/501 Revision 1.1 is a clean-up and technical update. Tested on MATLAB Version 7.11.0.584 (R2010b) and GNU Octave Version 3.2.4 @end verbatim @c ------------------------------------------------------------------ @node wpolyfit @section Polynomial fitting suitable for polyconf @findex wpolyfit @c wpolyfit ../inst/wpolyfit.m @anchor{XREFwpolyfit} @deftypefn {Function File} {[@var{p}, @var{s}] =} wpolyfit (@var{x}, @var{y}, @var{dy}, @var{n}) Return the coefficients of a polynomial @var{p}(@var{x}) of degree @var{n} that minimizes @c ######################################################## @c These lines must be written without space at start to work around @c a bug in html generation.
@iftex @tex $$ \sum_{i=1}^N (p(x_i) - y_i)^2 $$ @end tex @end iftex @ifnottex @code{sumsq (p(x(i)) - y(i))}, @end ifnottex @c ######################################################## to best fit the data in the least squares sense. The standard errors of the observations @var{y}, if present, are given in @var{dy}. The returned value @var{p} contains the polynomial coefficients suitable for use in the function polyval. The structure @var{s} returns information necessary to compute uncertainty in the model. To compute the predicted values of y with uncertainty use @example [y,dy] = polyconf(p,x,s,'ci'); @end example You can see the effects of different confidence intervals and prediction intervals by calling the wpolyfit internal plot function with your fit: @example feval('wpolyfit:plt',x,y,dy,p,s,0.05,'pi') @end example Use @var{dy}=[] if uncertainty is unknown. You can use a chi^2 test to reject the polynomial fit: @example p = 1-chi2cdf(s.normr^2,s.df); @end example p is the probability of seeing a chi^2 value higher than that which was observed assuming the data are normally distributed around the fit. If p < 0.01, you can reject the fit at the 1% level. You can use an F test to determine if a higher order polynomial improves the fit: @example [poly1,S1] = wpolyfit(x,y,dy,n); [poly2,S2] = wpolyfit(x,y,dy,n+1); F = (S1.normr^2 - S2.normr^2)/(S1.df-S2.df)/(S2.normr^2/S2.df); p = 1-f_cdf(F,S1.df-S2.df,S2.df); @end example p is the probability of observing the improvement in chi^2 obtained by adding the extra parameter to the fit. If p < 0.01, you can reject the lower order polynomial at the 1% level. You can estimate the uncertainty in the polynomial coefficients themselves using @example dp = sqrt(sumsq(inv(s.R'))'/s.df)*s.normr; @end example but the high degree of covariance amongst them makes this a questionable operation. @end deftypefn @deftypefn {Function File} {[@var{p}, @var{s}, @var{mu}] =} wpolyfit (...) If an additional output @code{mu = [mean(x),std(x)]} is requested then the @var{x} values are centered and normalized prior to computing the fit. This will give more stable numerical results. To compute a predicted @var{y} from the returned model use @code{y = polyval(p, (x-mu(1))/mu(2))} @end deftypefn @deftypefn {Function File} {} wpolyfit (...) If no output arguments are requested, then wpolyfit plots the data, the fitted line and polynomials defining the standard error range. Example @example x = linspace(0,4,20); dy = (1+rand(size(x)))/2; y = polyval([2,3,1],x) + dy.*randn(size(x)); wpolyfit(x,y,dy,2); @end example @end deftypefn @deftypefn {Function File} {} wpolyfit (..., 'origin') If 'origin' is specified, then the fitted polynomial will go through the origin. This is generally ill-advised. Use with caution. Hocking, RR (2003). Methods and Applications of Linear Models. New Jersey: John Wiley and Sons, Inc. @c Will be cut out in optims info file and replaced with the same @c refernces explicitely there, since references to core Octave @c functions are not automatically transformed from here to there. @seealso{@ref{XREFpolyconf,,polyconf}} @end deftypefn See also @ref{XREFpolyfit,,polyfit,octave}. @c ------------------------------------------------------------------ @node polyconf @section Confidence and prediction intervals for polynomial fitting @findex polyconf @subheading Helptext: @anchor{XREFpolyconf} @verbatim [y,dy] = polyconf(p,x,s) Produce prediction intervals for the fitted y. The vector p and structure s are returned from polyfit or wpolyfit.
The x values are where you want to compute the prediction interval. polyconf(...,['ci'|'pi']) Produce a confidence interval (range of likely values for the mean at x) or a prediction interval (range of likely values seen when measuring at x). The prediction interval tells you the width of the distribution at x. This should be the same regardless of the number of measurements you have for the value at x. The confidence interval tells you how well you know the mean at x. It should get smaller as you increase the number of measurements. Error bars in the physical sciences usually show a 1-alpha confidence value of erfc(1/sqrt(2)), representing one standard deviation of uncertainty in the mean. polyconf(...,1-alpha) Control the width of the interval. If asking for the prediction interval 'pi', the default is .05 for the 95% prediction interval. If asking for the confidence interval 'ci', the default is erfc(1/sqrt(2)) for a one standard deviation confidence interval. Example: [p,s] = polyfit(x,y,1); xf = linspace(x(1),x(end),150); [yf,dyf] = polyconf(p,xf,s,'ci'); plot(xf,yf,'g-;fit;',xf,yf+dyf,'g.;;',xf,yf-dyf,'g.;;',x,y,'xr;data;'); plot(x,y-polyval(p,x),';residuals;',xf,dyf,'g-;;',xf,-dyf,'g-;;'); @end verbatim @c ------------------------------------------------------------------ @node LinearRegression @section Function LinearRegression @findex LinearRegression @c LinearRegression ../inst/LinearRegression.m -*- texinfo -*- @deftypefn {Function File} {[@var{p},@var{e_var},@var{r},@var{p_var},@var{y_var}] =} LinearRegression (@var{F},@var{y}) @deftypefnx {Function File} {[@var{p},@var{e_var},@var{r},@var{p_var},@var{y_var}] =} LinearRegression (@var{F},@var{y},@var{w}) General linear regression: determine the parameters p_j (j=1,2,...,m) such that the function f(x) = sum_(j=1,...,m) p_j*f_j(x) is the best fit to the given values y_i = f(x_i) parameters: @itemize @item @var{F} is an n*m matrix with the values of the basis functions at the support points. In column j give the values of f_j at the points x_i (i=1,2,...,n) @item @var{y} is a column vector of length n with the given values @item @var{w} is a column vector of length n with the weights of the data points @end itemize return values: @itemize @item @var{p} is the vector of length m with the estimated values of the parameters @item @var{e_var} is the estimated variance of the difference between fitted and measured values @item @var{r} is the weighted norm of the residual @item @var{p_var} is the estimated variance of the parameters p_j @item @var{y_var} is the estimated variance of the dependent variables @end itemize Caution: do NOT request @var{y_var} for large data sets, as an n by n matrix is generated @c Will be cut out in optims info file and replaced with the same @c refernces explicitely there, since references to core Octave @c functions are not automatically transformed from here to there. @end deftypefn See also @ref{XREFregress,,regress,octave}, @ref{XREFleasqr,,leasqr}, @ref{XREFnonlin_curvefit,,nonlin_curvefit}, @ref{XREFpolyfit,,polyfit,octave}, @ref{XREFwpolyfit,,wpolyfit}, @ref{XREFexpfit,,expfit}. @c ------------------------------------------------------------------ @node wsolve @section Function wsolve, another linear solver @findex wsolve @subheading Helptext: @anchor{XREFwsolve} @verbatim [x,s] = wsolve(A,y,dy) Solve a potentially over-determined system with uncertainty in the values. A x = y +/- dy Use QR decomposition for increased accuracy.
Estimate the uncertainty for the solution from the scatter in the data. The returned structure s contains normr = sqrt( A x - y ), weighted by dy R such that R'R = A'A df = n-p, n = rows of A, p = columns of A See polyconf for details on how to use s to compute dy. The covariance matrix is inv(R'*R). If you know that the parameters are independent, then uncertainty is given by the diagonal of the covariance matrix, or dx = sqrt(N*sumsq(inv(s.R'))') where N = normr^2/df, or N = 1 if df = 0. Example 1: weighted system A=[1,2,3;2,1,3;1,1,1]; xin=[1;2;3]; dy=[0.2;0.01;0.1]; y=A*xin+randn(size(dy)).*dy; [x,s] = wsolve(A,y,dy); dx = sqrt(sumsq(inv(s.R'))'); res = [xin, x, dx] Example 2: weighted overdetermined system y = x1 + 2*x2 + 3*x3 + e A = fullfact([3,3,3]); xin=[1;2;3]; y = A*xin; dy = rand(size(y))/50; y+=dy.*randn(size(y)); [x,s] = wsolve(A,y,dy); dx = s.normr*sqrt(sumsq(inv(s.R'))'/s.df); res = [xin, x, dx] Note there is a counter-intuitive result that scaling the uncertainty in the data does not affect the uncertainty in the fit. Indeed, if you perform a Monte Carlo simulation with x,y datasets selected from a normal distribution centered on y with width 10*dy instead of dy, you will see that the variance in the parameters indeed increases by a factor of 100. However, if the error bars really do increase by a factor of 10, you should expect a corresponding increase in the scatter of the data, which will increase the variance computed by the fit. @end verbatim @c ------------------------------------------------------------------ @node linprog @section Linear programming @findex linprog @c linprog ../inst/linprog.m @anchor{XREFlinprog} @deftypefn{Function File} {@var{x} =} linprog (@var{f}, @var{A}, @var{b}) @deftypefnx{Function File} {@var{x} =} linprog (@var{f}, @var{A}, @var{b}, @var{Aeq}, @var{beq}) @deftypefnx{Function File} {@var{x} =} linprog (@var{f}, @var{A}, @var{b}, @var{Aeq}, @var{beq}, @var{lb}, @var{ub}) @deftypefnx{Function File} {[@var{x}, @var{fval}] =} linprog (@dots{}) Solve a linear problem. Finds @example min (f' * x) @end example (both f and x are column vectors) subject to @example @group A * x <= b Aeq * x = beq lb <= x <= ub @end group @end example If not specified, @var{Aeq} and @var{beq} default to empty matrices. If not specified, the lower bound @var{lb} defaults to minus infinity and the upper bound @var{ub} defaults to infinity. @c Will be cut out in optims info file and replaced with the same @c refernces explicitely there, since references to core Octave @c functions are not automatically transformed from here to there. @end deftypefn See also @ref{XREFglpk,,glpk,octave}. @c ------------------------------------------------------------------ @node Zero finders @chapter Functions for finding the zero of a nonlinear user function @cindex zero finders There is only one dedicated zero finder in the optim package, which is just a vectorized version of Octave's fzero (@mysee @ref{XREFfzero,,fzero,octave}). @menu * vfzero:: A vectorized version of fzero. @end menu @c ------------------------------------------------------------------ @node vfzero @section A vectorized version of fzero @findex vfzero @c vfzero ../inst/vfzero.m @anchor{XREFvfzero} @deftypefn {Function File} {} vfzero (@var{fun}, @var{x0}) @deftypefnx {Function File} {} vfzero (@var{fun}, @var{x0}, @var{options}) @deftypefnx {Function File} {[@var{x}, @var{fval}, @var{info}, @var{output}] =} vfzero (@dots{}) A variant of @code{fzero}.
Finds a zero of a vector-valued multivariate function where each
output element only depends on the input element with the same
index (so the Jacobian is diagonal).

@var{fun} should be a handle or name of a function returning a
column vector.  @var{x0} should be a two-column matrix, each row
specifying two points which bracket a zero of the respective output
element of @var{fun}.

If @var{x0} is a single-column matrix, then several nearby and
distant values are probed in an attempt to obtain a valid
bracketing.  If this is not successful, the function fails.

@var{options} is a structure specifying additional options.
Currently, @code{vfzero} recognizes these options:
@code{"FunValCheck"}, @code{"OutputFcn"}, @code{"TolX"},
@code{"MaxIter"}, @code{"MaxFunEvals"}.  For a description of these
options, see @ref{XREFoptimset,,optimset,octave}.

On exit, the function returns @var{x}, the approximate zero, and
@var{fval}, the function value thereof.

@var{info} is a column vector of exit flags that can have these
values:

@itemize
@item 1
The algorithm converged to a solution.
@item 0
Maximum number of iterations or function evaluations has been
reached.
@item -1
The algorithm has been terminated by the user output function.
@item -5
The algorithm may have converged to a singular point.
@end itemize

@var{output} is a structure containing runtime information about
the @code{fzero} algorithm.  Fields in the structure are:

@itemize
@item iterations
Number of iterations through the loop.
@item nfev
Number of function evaluations.
@item bracketx
A two-column matrix with the final bracketing of the zero along the
x-axis.
@item brackety
A two-column matrix with the final bracketing of the zero along the
y-axis.
@end itemize
@end deftypefn

@c ------------------------------------------------------------------
@node Gradient functions
@chapter Functions for numerical approximation of gradients and Hessians
@cindex gradient functions

You should not usually need to use these functions directly or pass
them as arguments.  They should be chosen and used by optimizer
functions, possibly subject to their configuration options.

@menu
* dfpdp:: Direct user interface to default numerical gradient method of new frontends.
* deriv:: Higher order numerical derivatives.
* numgradient:: Another numerical gradient function.
* numhessian:: Numerical Hessian function.
* cdiff:: A string, yielding the numerical gradient if evaluated.
* jacobs:: Complex step derivatives.
@end menu

@c ------------------------------------------------------------------
@node dfpdp
@section Direct user interface to default numerical gradient method of new frontends
@findex dfpdp

@subheading Helptext:

@anchor{XREFdfpdp}
@verbatim
function jac = dfpdp (p, func[, hook])

Returns the Jacobian of func (p) with respect to p, computed by
finite differencing.  The optional argument hook is a structure
which can contain the following fields at the moment:

hook.f: value of func(p) for p as given in the arguments

hook.diffp: positive vector of fractional steps from given p in
finite differencing (actual steps may be smaller if bounds are
given).  The default is .001 * ones (size (p)).

hook.diff_onesided: logical vector, indexing elements of p for
which only one-sided differences should be computed (faster); even
if not one-sided, differences might not be exactly central if
bounds are given.  The default is false (size (p)).
hook.fixed: logical vector, indexing elements of p for which zero
should be returned instead of the guessed partial derivatives
(useful in optimization if some parameters are not optimized, but
are 'fixed').

hook.lbound, hook.ubound: vectors of lower and upper parameter
bounds (or -Inf or +Inf, respectively) to be respected in finite
differencing.  The consistency of bounds is not checked.
@end verbatim

@c ------------------------------------------------------------------
@node deriv
@section Higher order numerical derivatives
@findex deriv

@c deriv ../inst/deriv.m
@anchor{XREFderiv}
@deftypefn {Function File} {@var{dx} =} deriv (@var{f}, @var{x0})
@deftypefnx {Function File} {@var{dx} =} deriv (@var{f}, @var{x0}, @var{h})
@deftypefnx {Function File} {@var{dx} =} deriv (@var{f}, @var{x0}, @var{h}, @var{O})
@deftypefnx {Function File} {@var{dx} =} deriv (@var{f}, @var{x0}, @var{h}, @var{O}, @var{N})
Calculate the derivative of the function @var{f}.

@var{f} must be a function handle or the name of a function that
takes @var{x0} and returns a variable of equal length and
orientation.  @var{x0} must be a numeric vector or scalar.

@var{h} defines the step taken for the derivative calculation.
Defaults to 1e-7.

@var{O} defines the order of the calculation.  Supported values are
2 (h^2 order) or 4 (h^4 order).  Defaults to 2.

@var{N} defines the derivative order.  Defaults to the 1st
derivative of the function.  Can be up to the 4th derivative.

Reference: Numerical Methods for Mathematics, Science, and
Engineering by John H. Mathews.
@end deftypefn

@c ------------------------------------------------------------------
@node numgradient
@section Another numerical gradient function
@findex numgradient

@subheading Helptext:

@anchor{XREFnumgradient}
@verbatim
numgradient(f, {args}, minarg)

Numeric central difference gradient of f with respect to argument
"minarg".

* first argument: function name (string)
* second argument: all arguments of the function (cell array)
* third argument: (optional) the argument to differentiate w.r.t.
  (scalar, default=1)

"f" may be vector-valued.  If "f" returns an n-vector and the
argument is a k-vector, the gradient will be an n x k matrix.

Example:

function a = f(x)
  a = [x'*x; 2*x];
endfunction
numgradient("f", {ones(2,1)})
ans =

   2.00000   2.00000
   2.00000   0.00000
   0.00000   2.00000
@end verbatim

@c ------------------------------------------------------------------
@node numhessian
@section Numerical Hessian function
@findex numhessian

@subheading Helptext:

@anchor{XREFnumhessian}
@verbatim
numhessian(f, {args}, minarg)

Numeric second derivative of f with respect to argument "minarg".

* first argument: function name (string)
* second argument: all arguments of the function (cell array)
* third argument: (optional) the argument to differentiate w.r.t.
  (scalar, default=1)

If the argument is a k-vector, the Hessian will be a k x k matrix.

function a = f(x, y)
  a = x'*x + log(y);
endfunction
numhessian("f", {ones(2,1), 1})
ans =

   2.0000e+00  -7.4507e-09
  -7.4507e-09   2.0000e+00

Now, w.r.t. the second argument:

numhessian("f", {ones(2,1), 1}, 2)
ans = -1.0000
@end verbatim

@c ------------------------------------------------------------------
@node cdiff
@section A string, yielding the numerical gradient if evaluated
@findex cdiff

@subheading Helptext:

@anchor{XREFcdiff}
@verbatim
c = cdiff (func,wrt,N,dfunc,stack,dx) - Code for num. differentiation
  = "function df = dfunc (var1,..,dvar,..,varN) ..
endfunction

Returns a string of Octave code that defines a function 'dfunc'
that returns the derivative of 'func' with respect to its 'wrt'th
argument.

The derivatives are obtained by symmetric finite differences.

dfunc()'s return value is in the same format as that of ndiff().

func  : string : name of the function to differentiate

wrt   : int    : position, in the argument list, of the
                 differentiation variable.          Default: 1

N     : int    : total number of arguments taken by 'func'.  If
                 N=inf, dfunc will take a variable argument list.
                                                    Default: wrt

dfunc : string : name of the Octave function that returns the
                 derivatives.                       Default: ['d',func]

stack : string : indicates whether 'func' accepts vertically
                 (stack="rstack") or horizontally (stack="cstack")
                 stacked arguments.  Any other string indicates
                 that 'func' does not allow stacking.  Default: ''

dx    : real   : step used in the symmetric difference scheme.
                                                    Default: 10*sqrt(eps)

See also : ndiff, eval, todisk
@end verbatim

@c ------------------------------------------------------------------
@node jacobs
@section Complex step derivatives
@findex jacobs

@c jacobs ../inst/jacobs.m
@anchor{XREFjacobs}
@deftypefn {Function File} {Df =} jacobs (@var{x}, @var{f})
@deftypefnx {Function File} {Df =} jacobs (@var{x}, @var{f}, @var{hook})
Calculate the Jacobian of a function using the complex step method.

Let @var{f} be a user-supplied function.  Given a point @var{x} at
which we seek the Jacobian, the function @command{jacobs} returns
the Jacobian matrix
@code{d(f(1), @dots{}, f(end))/d(x(1), @dots{}, x(n))}.  The
function uses the complex step method and thus can be applied to
real analytic functions.

The optional argument @var{hook} is a structure with additional
options.  @var{hook} can have the following fields:
@itemize @bullet
@item
@code{h} - can be used to define the magnitude of the complex step
and defaults to 1e-20; steps larger than 1e-3 are not allowed.
@item
@code{fixed} - is a logical vector internally usable by some
optimization functions; it indicates for which elements of @var{x}
no gradient should be computed, but zero should be returned.
@end itemize

For example:

@example
@group
f = @@(x) [x(1)^2 + x(2); x(2)*exp(x(1))];
Df = jacobs ([1, 2], f)
@end group
@end example
@end deftypefn

@c ------------------------------------------------------------------
@node Helper functions
@chapter Functions for algebraic tasks common to optimization problems
@cindex helper functions

@menu
* cpiv_bard:: A complementary pivoting algorithm.
* gjp:: Gauss-Jordan pivoting.
@end menu

@c ------------------------------------------------------------------
@node cpiv_bard
@section A complementary pivoting algorithm
@findex cpiv_bard

@subheading Helptext:

@anchor{XREFcpiv_bard}
@verbatim
[lb, idx, ridx, mv] = cpiv_bard (v, m[, incl])

v: column vector; m: matrix; incl (optional): index.  length (v)
must equal rows (m).  Finds column vectors w and l with
w == v + m * l, w >= 0, l >= 0, l.' * w == 0.  Chooses idx, w, and
l so that l(~idx) == 0, l(idx) == -inv (m(idx, idx)) * v(idx),
w(idx) roughly == 0, and
w(~idx) == v(~idx) + m(idx, ~idx).' * l(idx).  idx indexes at
least everything indexed by incl, but l(incl) may be < 0.  lb:
l(idx) (column vector); idx: logical index, defined above; ridx:
~idx & w roughly == 0; mv: [m, v] after performing a Gauss-Jordan
'sweep' (with gjp.m) on each diagonal element indexed by idx.
Except for the handling of incl (which enables handling of
equality constraints in the calling code), this is what is called
solving the 'complementary pivot problem' (Cottle, R. W. and
Dantzig, G.
B., 'Complementary pivot theory of mathematical programming',
Linear Algebra and Appl. 1, 102--125).  References for the current
algorithm: Bard, Y.: Nonlinear Parameter Estimation,
p. 147--149, Academic Press, New York and London 1974; Bard, Y.,
'An eclectic approach to nonlinear programming', Proc. ANU Sem.
Optimization, Canberra, Austral. Nat. Univ.).
@end verbatim

@c ------------------------------------------------------------------
@node gjp
@section Gauss-Jordan pivoting
@findex gjp

@subheading Helptext:

@anchor{XREFgjp}
@verbatim
m = gjp (m, k[, l])

m: matrix; k, l: row- and column-index of pivot, l defaults to k.

Gauss-Jordan pivot as defined in Bard, Y.: Nonlinear Parameter
Estimation, p. 296, Academic Press, New York and London 1974.  In
the pivot column, this seems not quite the same as the usual
Gauss-Jordan(-Clasen) pivot.  Bard gives Beaton, A. E., 'The use
of special matrix operators in statistical calculus' Research
Bulletin RB-64-51 (1964), Educational Testing Service, Princeton,
New Jersey as a reference, but this article is not easily
accessible.  Another reference, whose definition of gjp differs
from Bard's by some signs, is Clarke, R. B., 'Algorithm AS 178:
The Gauss-Jordan sweep operator with detection of collinearity',
Journal of the Royal Statistical Society, Series C (Applied
Statistics) (1982), 31(2), 166--168.
@end verbatim

@c ------------------------------------------------------------------
@node Documentation
@chapter Function optim_doc to view documentation
@findex optim_doc

@c optim_doc ../inst/optim_doc.m
@anchor{XREFoptim_doc}
@deftypefn {Function File} {} optim_doc ()
@deftypefnx {Function File} {} optim_doc (@var{keyword})
Show optim package documentation.

Runs the info viewer Octave is configured with on the info-format
documentation of the installed optim package.  Without an argument,
the top node of the documentation is displayed.  With an argument,
the respective index entry is searched for and its node displayed.
@end deftypefn

@c ------------------------------------------------------------------
@node Common frontend options
@chapter Options common to all frontends
@cindex common options

All frontends for optimization and for result statistics
(@ref{nonlin_min}, @ref{nonlin_residmin}, @ref{nonlin_curvefit},
@ref{residmin_stat}, @ref{curvefit_stat}) accept the following
options, settable with @ref{XREFoptimset,,optimset,octave}.  These
options are handled within the frontend.

@table @code
@item diffp
Column vector of fractional intervals (doubled for central
intervals) supposed to be used by gradient or Jacobian functions
performing finite differencing.  Default:
@code{.001 * ones (size (parameters))}.  The default function for
finite differencing will use these as absolute intervals for
parameters with value zero.
@item diff_onesided
Logical column vector indicating the parameters for which one-sided
intervals should be used by gradient or Jacobian functions
performing finite differencing.  Default:
@code{false (size (parameters))}.
@item cstep
Scalar step size for complex step derivative approximation of
gradients or Jacobians.  Default: 1e-20.
@item parallel_local
Logical scalar, default: @code{false}.  If the @code{parallel}
package, @code{version >= 2.0.5}, is loaded, estimate gradients of
the objective function and Jacobians of the model function and of
constraints in parallel processes.  Works for default (real) finite
differences and for complex step derivatives.
Due to overhead, a speed advantage can only be expected if the
objective function, model function, or constraint functions are
sufficiently time-consuming.  Additionally, this setting is also
passed to the individual optimization backends, which may also
consider this option (see the documentation of the backends).  If
this option is set to @code{true}, a warning (ID:
@code{optim:parallel_local}) will be issued if no @code{parallel}
package of a correct version is loaded.
@item fixed
Logical column vector indicating which parameters are not
optimized, but kept to their initial value.
@end table

@c ------------------------------------------------------------------
@node Common optimization options
@chapter Options common to all optimization frontends
@cindex common optimization options

All frontends for optimization (@ref{nonlin_min},
@ref{nonlin_residmin}, @ref{nonlin_curvefit}) accept the following
options, settable with @ref{XREFoptimset,,optimset,octave}.

@subheading Settings handled within the frontend

@table @code
@item Algorithm
String specifying the backend.
@item complex_step_derivative_inequc,
@item complex_step_derivative_equc
Logical scalars, default: @code{false}.  Estimate the Jacobian of
general inequality constraints and equality constraints,
respectively, with complex step derivative approximation.  Use only
if you know that your function of general inequality constraints or
function of general equality constraints, respectively, is suitable
for this.  In this case, no user function for the respective
Jacobian may be specified.
@end table

@subheading Settings passed to the backend

Which of these options are actually honored is noted in the
descriptions of the individual backends.

@table @code
@item lbound,
@item ubound
Column vectors of lower and upper bounds for parameters.  Default:
@code{-Inf} and @code{+Inf}, respectively.  The bounds are
non-strict, i.e. parameters are allowed to be exactly equal to a
bound.  The default function for gradients or Jacobians will
respect bounds (but no further inequality constraints) in finite
differencing if the backend respects bounds even during the course
of optimization.
@item inequc
Further inequality constraints.  Cell-array containing up to four
entries, two entries for linear inequality constraints and/or one
or two entries for general inequality constraints.  Either linear
or general constraints may be the first entries, but the two
entries for linear constraints must be adjacent and, if two entries
are given for general constraints, they also must be adjacent.  The
two entries for linear constraints are a matrix (say @code{m}) and
a vector (say @code{v}), specifying linear inequality constraints
of the form @code{m.' * parameters + v >= 0}.  The first entry for
general constraints must be a differentiable column-vector valued
function (say @code{h}), specifying general inequality constraints
of the form @code{h (p[, idx]) >= 0}; @code{p} is the column vector
of optimized parameters and the optional argument @code{idx} is a
logical index.  @code{h} has to return the values of all
constraints if @code{idx} is not given.  It may choose to return
only the indexed constraints if @code{idx} is given (so computation
of the other constraints can be spared); in this case, the
additional setting @code{f_inequc_idx} has to be set to
@code{true}.  In gradient determination, this function may be
called with an informational third argument, whose content depends
on the function for gradient determination.
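As an illustration, constraints of both kinds might be passed like
this (a minimal sketch; the objective function and the particular
constraint values are invented for this example, and
@code{nonlin_min} with its default backend is assumed to accept
them):

@example
@group
## minimize the distance to [3; 0], subject to the linear
## constraint m.' * p + v >= 0, i.e. p(1) + p(2) <= 4, and the
## general constraint h (p) >= 0, i.e. p(1) * p(2) >= 1
objf = @@(p) sumsq (p - [3; 0]);
m = [-1; -1];
v = 4;
h = @@(p) p(1) * p(2) - 1;
settings = optimset ("inequc", @{m, v, h@});
[p, objf_min] = nonlin_min (objf, [1; 1], settings);
@end group
@end example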
If a second entry for general inequality constraints is given, it
must be a function computing the Jacobian of the constraints with
respect to the parameters.  For this function, the description of
the setting @code{dfdp}, @mysee @ref{XREFoptiondfdp,,dfdp},
applies, with two exceptions: 1) it is called with 3 arguments
since it has an additional argument @code{idx}, a logical index, at
second position, indicating which rows of the Jacobian must be
returned (if the function chooses to return only indexed rows, the
additional setting @code{df_inequc_idx} has to be set to
@code{true}).  2) the default Jacobian function calls @code{h} with
3 arguments, since the argument @code{idx} is also supplied.  Note
that specifying linear constraints as general constraints will
generally waste performance, even if further, non-linear, general
constraints are also specified.
@item f_inequc_idx,
@item df_inequc_idx
Indicate that functions for general inequality constraints or their
Jacobian, respectively, return only the values or derivatives for
the indexed parameters.  See the description of the setting
@code{inequc} above.
@item equc
Equality constraints.  Specified the same way as inequality
constraints (see @code{inequc} above).
@item f_equc_idx,
@item df_equc_idx
As @code{f_inequc_idx} and @code{df_inequc_idx} above, but for
equality constraints.
@item cpiv
Function for complementary pivoting, usable in algorithms for
constraints.  Default: @code{cpiv_bard}.  Only the default function
is supplied with the package.
@item TolFun
Minimum fractional improvement in the objective function (e.g. sum
of squares) in an iteration (termination criterion).  Default:
.0001.
@item TolX
Minimum fractional change in a norm of the parameters in an
iteration (termination criterion).  Default: backend-specific.
@item MaxIter
Maximum number of iterations (termination criterion).  Default:
backend-specific.
@item fract_prec
Column vector, minimum fractional changes of corresponding
parameters in an iteration (termination criterion if violated in
two consecutive iterations).  Default: backend-specific.
@item max_fract_change
Column vector, enforced maximum fractional changes in corresponding
parameters in an iteration.  Default: backend-specific.
@item Display
String indicating the degree of verbosity.  Default:
@qcode{"off"}.  Possible values are currently @qcode{"off"} (no
messages) and @qcode{"iter"} (some messages after each iteration).
Support of this setting and its exact interpretation are
backend-specific.
@item debug
Logical scalar, default: @code{false}.  Will be passed to the
backend, which might print debugging information if @code{true}.
@item FunValCheck
If @qcode{"on"}, the output of user functions will be
sanity-checked.  Default: @qcode{"off"}.
@item user_interaction
@c This setting has deliberately not been named as its Matlab
@c equivalent `OutputFcn' since it differs from the latter by
@c requiring the functions to return _two_ outputs.  The rationale
@c for the difference is that information about the reason for a
@c user-stop should be possible to pass in the output.  The second
@c output can't be made optional without possibly altering the
@c warning state under which the user-function runs.
Handle to a user function or cell-array with a number of these.
Functions must have this interface:

@example
[@var{stop}, @var{info}] = some_user_function (@var{p}, @var{vals}, @var{state});
@end example

If @var{stop} is @code{true}, the algorithm stops.  In @var{info},
information about the reason for stopping can be returned in a free
format.
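For example, a callback that requests a stop after a fixed number
of iterations might look like the following (a minimal sketch,
using the @code{iteration} field of @var{vals} and the @var{state}
values described below):

@example
@group
function [stop, info] = stop_after_ten (p, vals, state)
  ## request a stop once ten iterations have been performed
  stop = strcmp (state, "iter") && vals.iteration >= 10;
  if (stop)
    info = "callback: iteration limit reached";
  else
    info = [];
  endif
endfunction

settings = optimset ("user_interaction", @@stop_after_ten);
@end group
@end example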
@var{info} can be set to be empty, but it must be set.  Note that
this is different from the otherwise similar Matlab setting
@code{OutputFcn}.  The functions will be called by the algorithms
at the start with @var{state} set to @qcode{init}, after each
iteration with @var{state} set to @qcode{iter}, and at the end with
@var{state} set to @qcode{done}.  @var{p} contains the current
parameters, and @var{vals} is a structure with other current
values; the possible fields are currently:

@table @code
@item iteration
number of the current iteration,
@item fval
value of the objective function (for scalar optimization),
@item residual
residuals (for residual-based optimization),
@item model_y
in @code{nonlin_curvefit}, the output of the model function,
@item observations
in @code{nonlin_curvefit}, the constant observations,
@item model_x
in @code{nonlin_curvefit}, the constant argument @var{x}.
@end table

Information about the output of these functions when they were
called the last time (possibly causing a stop) will be contained in
the output @var{outp} of the frontend in the field
@code{user_interaction}.  The subfield @code{stop} is a vector
containing the @var{stop} outputs of each function, and the
subfield @code{info} is a cell-array containing the output
@var{info} of each function.  In the case of a stop, the output
@var{cvg} of the frontend will be @code{-1}.
@end table

@c ------------------------------------------------------------------
@node Parameter structures
@chapter Handling of structures of optimized parameters
@cindex parameter structures

It can be convenient not to handle the optimized parameters as
elements of a vector, but as named fields of a structure.  The
frontends @code{nonlin_residmin}, @code{nonlin_curvefit},
@code{residmin_stat}, @code{curvefit_stat}, and @code{nonlin_min}
can accept parameter information in structure form and can pass the
parameters as a structure to user functions, although the backends
still handle the parameters as vectors.

To use this feature, the initial parameters must be given in
structure form, or the setting @code{param_order}, a cell-array
with the names of the parameters, must be given.  If both are done,
only the parameters in structure fields named in @code{param_order}
will be optimized.  If there are still some non-structure-based
configuration settings or user functions, specifying
@code{param_order} is mandatory even if the initial parameters are
given in structure form.  If the initial parameters are a
structure, the parameters resulting from the optimization will also
be returned as a structure.

@menu
* Structure-based user functions:: Specify which user functions accept parameter structures.
* Structure-based gradients and Hessians:: Format of returned values of structure-based gradient and Hessian functions.
* Structure-based linear constraints:: Specify structure-based linear constraints.
* Structure-based configuration settings:: Parameter-related configuration settings in structure form.
* Non-scalar parameters:: Handling named parameter arrays.
@end menu

@c ------------------------------------------------------------------
@node Structure-based user functions
@section Specify which user functions accept parameter structures
@cindex structure-based user functions

The frontend must be told which user functions accept parameter
structures by setting the corresponding settings to @code{true}.
The names of these settings depend on which user functions are
applicable to the respective frontend and are shown in the
following table.
@multitable {inequality constraints} {@code{hessian_objf_pstruct}} {@code{nonlin_residmin},}
@headitem User function @tab Setting @tab Frontends
@item Objective function @tab @code{objf_pstruct} @tab @code{nonlin_min}
@item Gradient @tab @code{grad_objf_pstruct} @tab @code{nonlin_min}
@item Hessian @tab @code{hessian_objf_pstruct} @tab @code{nonlin_min}
@item Model function @tab @code{f_pstruct} @tab @code{nonlin_residmin}, @code{nonlin_curvefit}, @code{residmin_stat}, @code{curvefit_stat}
@item Jacobian @tab @code{df_pstruct} @tab @code{nonlin_residmin}, @code{nonlin_curvefit}, @code{residmin_stat}, @code{curvefit_stat}
@item General inequality constraints @tab @code{f_inequc_pstruct} @tab @code{nonlin_min}, @code{nonlin_residmin}, @code{nonlin_curvefit}
@item Jacobian of general inequality constraints @tab @code{df_inequc_pstruct} @tab @code{nonlin_min}, @code{nonlin_residmin}, @code{nonlin_curvefit}
@item General equality constraints @tab @code{f_equc_pstruct} @tab @code{nonlin_min}, @code{nonlin_residmin}, @code{nonlin_curvefit}
@item Jacobian of general equality constraints @tab @code{df_equc_pstruct} @tab @code{nonlin_min}, @code{nonlin_residmin}, @code{nonlin_curvefit}
@end multitable

@c ------------------------------------------------------------------
@node Structure-based gradients and Hessians
@section Format of returned values of structure-based gradient and Hessian functions
@cindex structure-based gradients
@cindex structure-based Hessians

Structure-based gradient or Jacobian functions, including Jacobians
of general constraints, must return the partial derivatives as
fields of a structure under the respective parameter names.  For
gradients, the partial derivatives are scalar.  For Jacobians, the
partial derivatives must be column vectors.

Structure-based Hessian functions must return the 2nd derivatives
as subfields in a two-level structure of parameter names.  For
example, if the parameter names are @code{a} and @code{b}, the 2nd
derivative with respect to @code{a} and @code{b} must be in the
field @code{returned_structure.a.b} or
@code{returned_structure.b.a} (there is no need to specify both).

@c ------------------------------------------------------------------
@node Structure-based linear constraints
@section Specify structure-based linear constraints
@cindex structure-based linear constraints

Linear constraints, otherwise specified with a matrix and a vector,
can be adapted to structure-based parameter handling by specifying,
instead of a matrix, a structure containing the rows of the matrix
in fields under the respective parameter names.  In this case, rows
containing only zeros need not be given.

@c ------------------------------------------------------------------
@node Structure-based configuration settings
@section Parameter-related configuration settings in structure form
@cindex structure-based configuration settings

The vector-based settings @code{lbound}, @code{ubound},
@code{fixed}, @code{diffp}, @code{diff_onesided},
@code{fract_prec}, and @code{max_fract_change} can be replaced by
the setting @code{param_config}.  It is a structure that can
contain fields corresponding to parameter names.  For each such
field, there may be subfields with the same names as the above
vector-based settings, but containing a scalar value for the
respective parameter.
For example, if the parameters are named @code{a} and @code{b},
instead of specifying

@example
settings = optimset ("lbound", [-Inf; 0], "diff_onesided", [true; true]);
@end example

one can specify

@example
pconf.b.lbound = 0;
pconf.a.diff_onesided = true;
pconf.b.diff_onesided = true;
settings = optimset ("param_config", pconf);
@end example

If @code{param_config} is specified, none of the above vector-based
settings may be used.

@c ------------------------------------------------------------------
@node Non-scalar parameters
@section Handling named parameter arrays
@cindex non-scalar parameters
@cindex named parameter arrays

Parameters in named structure fields are allowed to be non-scalar
real arrays.  In this case, their dimensions must be given by the
setting @code{param_dims}, a cell-array of dimension vectors, each
containing at least two dimensions; if not given, dimensions are
taken from the initial parameters, if these are given in a
structure.

If there are any vector-based settings or non-structure-based
linear constraints, they must correspond to an order of parameters
defined as follows: all named parameter arrays are reshaped to
vectors, and then all parameters, scalars and vectors, are
concatenated in the order of the parameter names given by the user.

Structure-based settings or structure-based initial parameters must
contain arrays with dimensions reshapable to those of the
respective parameters.

@c ------------------------------------------------------------------
@node Additional parameters
@chapter Passing additional parameters to user functions
@cindex additional parameters

Optimizers often require the user to supply functions (@abbr{e.g@.}
objective functions, model functions, constraint functions).  The
interface of these functions --- arguments and returned values ---
is defined by the optimizer.  Often, a user function needs
additional arguments, not covered by the defined interface, which
are constant throughout the optimization.  These can be supplied by
wrapping the user function into an anonymous function.
@xref{Anonymous Functions,,,octave}, for further explanation and
examples.

There are also some older optimizers in the optim package, written
when anonymous functions were not available in Octave.  Some of
these offer an interface to user functions which is itself able to
pass additional constant variables in arbitrary argument positions.
Newer optimizers should not be written this way, since this is an
unnecessary complication.

Though it is possible to use global variables to pass additional
data to user functions, this is not recommended, since it
introduces the possibility of name conflicts within the pool of
global variables.

@c ------------------------------------------------------------------
@node Function index
@unnumbered Index of functions in optim
@printindex fn

@c ------------------------------------------------------------------
@node Concept index
@unnumbered Concept index
@printindex cp

@bye