# cem

Competitive Expectation-Maximization (CEM) algorithm for estimation of mixture model structure and parameters, with generalized internal estimations.

Syntax

```
[theta, D] = cem(data)
[theta, D] = cem(data, options)
[theta, D, info] = cem(...)
[theta, D, info, options] = cem(...)
```

Description

This function implements the CEM algorithm proposed by Zhang et al. (2004), but uses generalized optimization methods for the internal estimations.

[theta, D] = cem(data) returns the estimated parameters theta and the mixture distribution D fitted to data.

[theta, D] = cem(data, options) uses the applicable options from the options structure in the estimation procedure.

[theta, D, info] = cem(...) also returns info, a structure array containing information about successive iterations performed by iterative estimation functions.

[theta, D, info, options] = cem(...) also returns the effective options used, so you can see which default values the function applied in addition to any options you specified.

For information about the output theta, see Distribution Parameters Structure. The input argument data is described in Data Input Argument to Functions. You may also want to read about the options and info arguments.
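As a minimal sketch of the four-output form, assuming the toolbox that provides cem is on the MATLAB path, the following fits a mixture with default options and inspects the effective options that come back. The data line is borrowed from the example on this page; the check on options.sm is only an assumption about how the returned structure is organized.

```matlab
% Synthetic 1-D data, as in the example on this page
data = randn(1, 1000) .* 2 + 1;

% Request all four outputs; no options are passed, so defaults apply
[theta, D, info, options] = cem(data);

% List the top-level fields of the effective options structure
disp(fieldnames(options));

% Assumption: the returned options carry the options.sm fields described
% on this page; guard the access so the sketch does not fail otherwise
if isfield(options, 'sm')
    disp(options.sm);
end
```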

Available Options

This function supports all the options described in estimation options. It also accepts the following additional fields in options.sm:

• numInit (default 1) : Initial number of mixture components.
• numMin (default 1) : Minimum number of mixture components.
• numMax (default 15) : Maximum number of mixture components.
• mergeMaxCands (default 5) : Maximum number of merge candidates.
• splitMaxCands (default 5) : Maximum number of split candidates.
• maxFail (default 2) : Maximum number of split/merge trials before exit.
• componentD (default MVN distribution) : Distribution structure defining the mixture component distribution type.

options.inner can be used to set specific options for the inner estimations. To set options only for the partial or full inner estimations, use options.inner.partial or options.inner.full, respectively.
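The options above could be combined as in the following sketch. The field names under options.sm come from the list on this page and the numeric values are purely illustrative. options.inner.partial.maxiter appears in the example on this page; using the same maxiter field under options.inner.full is an assumption made by symmetry.

```matlab
% Start from an empty options structure
options = struct();

% Split-and-merge settings (illustrative values, field names from the list above)
options.sm.numInit = 2;    % start with two components
options.sm.numMin  = 1;    % never go below one component
options.sm.numMax  = 10;   % never exceed ten components
options.sm.maxFail = 3;    % split/merge trials before exit

% Options for the partial inner estimations only
options.inner.partial.maxiter = 10;

% Options for the full inner estimations only
% (assumption: maxiter is accepted here as it is under options.inner.partial)
options.inner.full.maxiter = 100;
```

Any field left unset keeps its default, so only the values that differ from the defaults need to be listed.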

Returned info fields

The fields present in the returned info structure array depend on the solver used (options.solver). When a Manopt solver is specified, the info structure produced by that solver is returned directly. For the 'default' solver, see the documentation of the 'estimatedefault' function for the mixture distribution. You can read more in our documentation on the estimation statistics structure.
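Because the fields vary with the solver, it can help to inspect info at run time rather than rely on fixed field names. The sketch below assumes the toolbox is on the MATLAB path and selects the 'default' solver mentioned above; it makes no assumption about which fields are present.

```matlab
% Synthetic 1-D data, as in the example on this page
data = randn(1, 1000) .* 2 + 1;

% Select the 'default' solver named in the text above
options = struct();
options.solver = 'default';

[theta, D, info] = cem(data, options);

% Show which fields this solver reports
disp(fieldnames(info));

% info is a structure array; report how many elements it holds
fprintf('info has %d element(s)\n', numel(info));
```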

Example

```
% generate 1000 random data points
data = randn(1,1000) .* 2 + 1;
% set some options
options.solver = 'cg';
options.verbosity = 2;
options.sm.numMax = 7;
options.inner.partial.maxiter = 10;
% fit mixture model to data
[theta, D] = cem(data, options)
```

References

1. B. Zhang, C. Zhang, and X. Yi, “Competitive EM algorithm for finite mixture models,” Pattern Recognition, vol. 37, no. 1, pp. 131–144, Jan. 2004.