All estimation functions in MixEst accept an optional options structure to let the user control the behavior of the estimation algorithm. Each field of the structure corresponds to an option. The default value is used for any missing option.
The estimation functions read only the fields that they understand from the structure and leave the other fields intact, possibly passing them to other functions that are called inside (such as Manopt solvers). The list of options available for each estimation function is mentioned in its documentation.
List of commonly-used options
Here is a list of common options used in estimation functions. For information about the available options for a specific function, refer to the documentation of that function. Note that the option names are NOT case sensitive and are all converted to lower case before processing.
Undefined function or variable 'mxe_options'. Error in estimation_options (line 22) options = mxe_options();
Solvers
In the main estimation function, we use Manopt solvers to solve the optimization problem involved in model parameter estimation. You can select the employed solver by setting options.solver to the handle of the solver function (e.g. options.solver = @trustregions;), or alternatively you can set options.solver to one of the string values from the following table.
Mini-batching options
To enable mini-batching, you can set options.miniBatch to true to use the default mini-batching options. Alternatively you may set special mini-batching options as fields in options.miniBatch (as a structure) to customize mini-batching. For example the following code sets the number of data points in each mini-batch to 400 and the number of iterations per mini-batch to 20. The other mini-batcing options will be set to their default values.
options.minibatch.size = 400; options.minibatch.iter = 20;
The following table lists the mini-batching options:
Plotting options
MixEst gives you the option to plot the optimization cost and/or gradient norm during estimations. You can set options.plotCost or options.plotGradNorm to true to enable this feature. Alternatively you may set predefined field(s) in these options (as structures) to customize plotting behavior. The following example shows how to set options to show the cost and gradient norm plot in a single figure during estimation.
figure options.plotcost.axes = subplot(2,1,1); options.plotgradnorm.axes = subplot(2,1,2);
The following table lists the plotting options:
Visualization options
We have implemented simple visualizations for some of the distributions that might be useful for you (They were really helpful for us during development). If you have 2-D or 3-D data, You can set options.visualization to true to give it a try. Alternatively you may set predefined field(s) in options.visualization (as a structure) to customize the visualization. This feature is experimental and is not fully documented yet but the following example illustrates how you can set some visualization options.
% main visualization options options.visualization.axes = gca; options.visualization.stopOnClose = true; % stop estimation when the figure is closed options.visualization.dataPlotType = 'patch'; % or 'scatter' % visualization options for the mixture distribution options.visualization.mixture.colorize = true; options.visualization.mixture.colorMap = hsv(3); options.visualization.mixture.showLabels = true; % visualization options for the MVN distribution options.visualization.mvn.showCenter = false; % load some data and perform estimation load data2d D = mixturefactory(mvnfactory(2), 3); D.estimate(data, options)
Cross-validation options
To enable cross-validation, you can set options.crossVal to true to use the default cross-validation options. Alternatively you may set special cross-validation options as fields in options.crossVal (as a structure) to customize cross-validation. For example the following code asks for half of the data to be used for cross-validation. The other cross-validation options will be set to their default values.
options.crossval.fraction = 0.5;
Note that your data should be shuffled before using cross-validation.
The following table lists the cross-validation options:
statsfun
If you add a statsfun field containing a function handle with prototype stats = statsfun(D, theta, stats) to the options structure, it will be called after each iteration completes with the current distribution structure D and its parameter values theta and the statistics structure stats that will be logged in the info structure array as described here, and gives you a chance to modify the stats structure. (This is based on how Manopt declares statsfun but omitting its problem structure).
Example
Use the following code to log all the points (parameter values) visited during the estimation process:
options.statsfun = @mystatsfun; function stats = mystatsfun(theta, stats) stats.theta = theta; end
This will log all the points (parameter values) visited during the estimation process in the info struct-array returned by the estimation function.
stopfun
If you add a stopfun field containing a function handle with prototype stopnow = stopfun(D, theta, info, last) to the options structure, it will be called after each iteration completes with the current distribution structure D and its parameter values theta, the whole info structure array built so far and an index last such that info(last) is the statistics structure corresponding to the current iteration (this is because info is pre-allocated, so that info(end) typically does not refer to the current iteration). The returned value stopnow should be a logical. If it is returned as true, the algorithm will terminate. (This is based on how Manopt declares stopfun but omitting its problem structure).
Example
options.stopfun = @mystopfun; function stopnow = mystopfun(D, theta, info, last) stopnow = (last >= 3 && info(last-2).cost - info(last).cost < 1e-3); end
This will tell the algorithm to exit as soon as two successive iterations combined have decreased the cost (negative log-likelihood) by less than 1e-3.
costgrad
If you add a costgrad field containing a function handle with prototype [cost, grad] = costgrad(theta) to the options structure, it will be used instead of the default cost/grad function (mxe_costgrad.m) to calculate the cost cost and/or its Riemannian gradient grad at every point theta during the estimation.
Example
The following example overrides the default cost/grad function and implements a preliminary one:
function test D = mvnfactory(1); data = randn(1, 1000); options.solver = 'cg'; options.theta0 = D.randparam(); options.costgrad = @mycostgrad; D.estimate(data, options); % % function [cost, grad] = mycostgrad(theta) [ll, store] = D.ll(theta, data); cost = -ll; if nargout > 1 dll = D.llgrad(theta, data, store); grad = D.M.egrad2rgrad(theta, dll); % convert the gradient to Riemannian grad = D.M.lincomb(theta, -1, grad); % negate the gradient end end end
Split-and-merge options
Options for split-and-merge algorithms can be set through fields in the options.sm structure. The following options are common. Additional options for each algorithm are described in the documentation of the respective function.
- splitCriterion (default 'kl') : Criterion for selecting a component to be split. Following values are available:
Value | Description |
---|---|
'kl' | Maximum local KL-divergence between component distribution and empirical distribution |
'mll' | Minimum mean local log-likelihood |
'entropy' | Maximum entropy |
'rand' | Random split |
- mergeCriterion (default 'kl') : Criterion for selecting two components to be merged. Following values are available:
Value | Description |
---|---|
'kl' | Minimum symmetric KL-divergence |
'overlap' | Maximum distribution overlap |
'rand' | Random merge |
- splitInit (default 'default') : Initialization method for the components resulting from a split. You can set this field to 'default' to use the default methods for each parameter, or you may give a function handle implementing your custom initialization. The referred function should have the following prototype:
function [newtheta, store] = splitinit(D, idx, theta, options, data, store)
where D is the mixture distribution structure, idx is the index of the component to be split, theta is the parameter structure for the mixture distribution, options is the full options structure, data is the given data, and store can be used for caching purposes. The output newtheta is the parameter structure of the new mixture after the split (containing the parameters for an additional component). You should put the initialized parameter values for the first splitted component at newtheta.D{idx} and for the second at newtheta.D{end}. The component weights in newtheta.p should also be updated accordingly.
- mergeInit (default 'default') : Initialization method for the component resulting from a merge. You can set this field to 'default' to use the default methods for each parameter, or you may give a function handle implementing your custom initialization. The referred function should have the following prototype:
function [newtheta, store] = mergeinit(D, idx1, idx2, theta, options, data, store)
where D is the mixture distribution structure, idx1 and idx2 are the indices of the components to be merged, theta is the parameter structure for the mixture distribution, options is the full options structure, data is the given data, and store can be used for caching purposes. The output newtheta is the parameter structure of the new mixture after the merge (with one less item than theta). You should put the initialized parameter values for the merged component at newtheta.D{idx1} and remove theta.D{idx2} in newtheta. The component weights in newtheta.p should also be updated accordingly.
Example
If you want the split-and-merge algorithm to select the component with maximum entropy when it requires to split a component, use the following option:
options.sm.splitCriterion = 'entropy';
Example
The following code shows how to use options.sm.splitInit to manually initialize the parameters of splitted components for a mixture of Gaussians. For the mean vectors, it uses k-means clustering (with k=2) on 10 data points that have the highest posterior probability under the original component. Each covariance matrix is initialized as a unit matrix with the same volume as the original component. The weight for each of the splitted components is set to half the weight of the original component.
options.sm.splitinit = @splitinit; function [newtheta, store] = splitinit(D, idx, theta, options, data, store) % % add parameters for an additional component newtheta = theta; newtheta.D{end+1} = theta.D{idx}; % initialize the weights newtheta.p(idx) = theta.p(idx) / 2; newtheta.p(end+1) = newtheta.p(idx); % initialize the means Didx = D.component(idx); pp = Didx.llvec(theta.D{idx}, data); [~, I] = sort(pp, 'descend'); I = I(1:10); [~, C] = kmeans(data(:,I).', 2); newtheta.D{idx}.mu = C(1,:).'; newtheta.D{end}.mu = C(2,:).'; % initialize the covariance matrices A = theta.D{idx}.sigma; d = size(A,1); newtheta.D{idx}.sigma = det(A)^(1/d) * eye(d); newtheta.D{end}.sigma = newtheta.D{idx}.sigma; end
Inner options
Some estimation functions contain inner estimations that might need to be customized separately from the main estimation function. The options for the inner estimations can be set in the special field options.inner. This field can contain usual options as its sub-fields (e.g. options.inner.maxIter), but some special options like prevInfo might be disregarded.
Some functions might be more specific about their inner estimation options and let you set specific options for each type of inner estimation they use. This capability is provided through one more level of sub-fields in options.inner. For example you can set the maximum iterations for the partial inner estimations of the split-and-merge algorithms through options.inner.partial.maxIter. This should be documented in the API reference for the respective function.