Mixture Models

A mixture model is a probability distribution that combines a set of components to represent an overall distribution. In general, its probability density/mass function is a convex combination of the pdfs/pmfs of the individual components:

\[f_{mix}(x; \Theta, \pi) = \sum_{k=1}^K \pi_k f(x; \theta_k)\]

A mixture model is characterized by a set of component parameters $\Theta=\{\theta_1, \ldots, \theta_K\}$ and a prior distribution $\pi$ over these components.
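To make the formula concrete, here is a minimal sketch that evaluates the convex combination by hand for three normal components. The weights, component parameters, and the helper name fmix are illustrative assumptions, not part of the package.

```julia
using Distributions

# Illustrative prior weights (nonnegative, summing to 1) and components.
weights = [0.2, 0.5, 0.3]
comps = [Normal(-2.0, 1.2), Normal(0.0, 1.0), Normal(3.0, 2.5)]

# f_mix(x) = sum over k of pi_k * f(x; theta_k), written out directly.
# `fmix` is a hypothetical helper defined only for this sketch.
fmix(x) = sum(w * pdf(c, x) for (w, c) in zip(weights, comps))

x = 0.5
density = fmix(x)
println("f_mix(", x, ") = ", density)
```

Because the weights sum to one and each component density integrates to one, the combined function is itself a valid density.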

Type Hierarchy

This package introduces a type MixtureModel, defined as follows, to represent a mixture model:

abstract type AbstractMixtureModel{VF<:VariateForm,VS<:ValueSupport} <: Distribution{VF, VS} end

struct MixtureModel{VF<:VariateForm,VS<:ValueSupport,Component<:Distribution} <: AbstractMixtureModel{VF,VS}
    components::Vector{Component}
    prior::Categorical
end

const UnivariateMixture    = AbstractMixtureModel{Univariate}
const MultivariateMixture  = AbstractMixtureModel{Multivariate}

Remarks:

With such a type system, the type for a mixture of univariate normal distributions can be written as

MixtureModel{Univariate,Continuous,Normal}
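As a rough illustration of how the hierarchy can be used, the sketch below checks where a constructed mixture sits in it. The exact type parameters of MixtureModel may differ between package versions, so the checks rely only on the abstract types; UnivariateMixture is written with a module prefix in case it is not exported in your version.

```julia
using Distributions

m = MixtureModel([Normal(-2.0, 1.2), Normal(0.0, 1.0), Normal(3.0, 2.5)])

# Checks against the abstract types described in this section.
is_mixture    = m isa AbstractMixtureModel
is_univariate = m isa Distributions.UnivariateMixture
is_dist       = m isa Distribution{Univariate,Continuous}

println((is_mixture, is_univariate, is_dist))
```

Dispatching on UnivariateMixture or MultivariateMixture rather than on a concrete MixtureModel type keeps user code valid for any other subtype of AbstractMixtureModel.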

Constructors

MixtureModel(components, [prior])

Construct a mixture model with a vector of components and a prior probability vector. If no prior is provided then all components will have the same prior probabilities.

MixtureModel(C, params, [prior])

Construct a mixture model with component type C, a vector of parameters params used to construct the components, and a prior probability vector. If no prior is provided, all components have the same prior probability.

Examples

# constructs a mixture of three normal distributions,
# with prior probabilities [0.2, 0.5, 0.3]
MixtureModel(Normal[
   Normal(-2.0, 1.2),
   Normal(0.0, 1.0),
   Normal(3.0, 2.5)], [0.2, 0.5, 0.3])

# if the components share the same prior, the prior vector can be omitted
MixtureModel(Normal[
   Normal(-2.0, 1.2),
   Normal(0.0, 1.0),
   Normal(3.0, 2.5)])

# Since all components have the same type, we can use a simplified syntax
MixtureModel(Normal, [(-2.0, 1.2), (0.0, 1.0), (3.0, 2.5)], [0.2, 0.5, 0.3])

# Again, one can omit the prior vector when all components share the same prior
MixtureModel(Normal, [(-2.0, 1.2), (0.0, 1.0), (3.0, 2.5)])

# The following example shows how one can make a Gaussian mixture
# where all components share the same unit variance
MixtureModel(map(u -> Normal(u, 1.0), [-2.0, 0.0, 3.0]))

Common Interface

All subtypes of AbstractMixtureModel (including MixtureModel) provide the following methods:

components(d::AbstractMixtureModel)

Get a list of components of the mixture model d.

probs(d::AbstractMixtureModel)

Get the vector of prior probabilities of all components of d.

component_type(d::AbstractMixtureModel)

The type of the components of d.
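The three accessors above can be exercised as in the following sketch. The mixture parameters are illustrative; component_type is called with a module prefix on the assumption that it may not be exported in every package version.

```julia
using Distributions

m = MixtureModel(Normal, [(-2.0, 1.2), (0.0, 1.0), (3.0, 2.5)], [0.2, 0.5, 0.3])

cs = components(m)                      # the three component distributions
ps = probs(m)                           # their prior probabilities
ct = Distributions.component_type(m)    # type shared by the components

# Print each component next to its prior weight.
for (p, c) in zip(ps, cs)
    println("weight = ", p, ", mean = ", mean(c), ", std = ", std(c))
end
```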

In addition, for all subtypes of UnivariateMixture and MultivariateMixture, the following generic methods are provided:

mean(d::Union{UnivariateMixture, MultivariateMixture})

Compute the overall mean (expectation).

var(d::UnivariateMixture)

Compute the overall variance (only for UnivariateMixture).
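For intuition, the overall mean and variance of a univariate mixture follow from the component moments: the mean is the weighted average of the component means, and the variance is the weighted average of the component second moments minus the squared overall mean. The sketch below computes both by hand for an illustrative mixture and can be compared with mean(m) and var(m); it is not meant to mirror the package's internal implementation.

```julia
using Distributions

m = MixtureModel(Normal, [(-2.0, 1.2), (0.0, 1.0), (3.0, 2.5)], [0.2, 0.5, 0.3])
ps, cs = probs(m), components(m)

# Overall mean: weighted average of the component means.
mix_mean = sum(p * mean(c) for (p, c) in zip(ps, cs))

# Overall variance: weighted second moments minus the squared overall mean.
mix_var = sum(p * (var(c) + mean(c)^2) for (p, c) in zip(ps, cs)) - mix_mean^2

println("mean: ", mix_mean, " vs ", mean(m))
println("var:  ", mix_var, " vs ", var(m))
```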

length(d::MultivariateMixture)

The length of each sample (only for MultivariateMixture).
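A small multivariate sketch, assuming two bivariate normal components with illustrative means, covariance matrices, and weights, shows length and mean together:

```julia
using Distributions

# Two bivariate Gaussian components; all numbers are illustrative.
mm = MixtureModel(
    [MvNormal([0.0, 0.0], [1.0 0.0; 0.0 1.0]),
     MvNormal([4.0, 2.0], [2.0 0.5; 0.5 1.0])],
    [0.25, 0.75])

d  = length(mm)   # dimension of each sample
mu = mean(mm)     # weighted average of the component mean vectors

println("dimension = ", d, ", mean = ", mu)
```

Here the mean works out to 0.25 * [0, 0] + 0.75 * [4, 2] = [3.0, 1.5].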

pdf(d::Union{UnivariateMixture, MultivariateMixture}, x)

Evaluate the (mixed) probability density function over x. Here, x can be a single sample or an array of multiple samples.

logpdf(d::Union{UnivariateMixture, MultivariateMixture}, x)

Evaluate the logarithm of the (mixed) probability density function over x. Here, x can be a single sample or an array of multiple samples.
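The following sketch evaluates the density and log-density of an illustrative univariate mixture. Whether an array can be passed directly as x may depend on the package version, so the sketch uses map over the sample points, which only relies on the single-sample method.

```julia
using Distributions

m = MixtureModel(Normal, [(-2.0, 1.2), (0.0, 1.0), (3.0, 2.5)], [0.2, 0.5, 0.3])

p  = pdf(m, 0.5)       # mixture density at a single point
lp = logpdf(m, 0.5)    # log-density at the same point

# Evaluate at several points, one sample at a time.
xs   = [-2.0, 0.0, 3.0]
dens = map(x -> pdf(m, x), xs)

println("pdf = ", p, ", logpdf = ", lp)
println("densities at ", xs, " = ", dens)
```

Working with logpdf is usually preferable when densities are multiplied across many samples, since sums of log-densities are less prone to underflow.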

rand(d::Union{UnivariateMixture, MultivariateMixture})

Draw a sample from the mixture model d.

rand(d::Union{UnivariateMixture, MultivariateMixture}, n)

Draw n samples from d.

rand!(d::Union{UnivariateMixture, MultivariateMixture}, r::AbstractArray)

Draw multiple samples from d and write them to r.
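The sampling methods can be tried out as in the sketch below, which uses an illustrative univariate mixture. The seed value and sample sizes are arbitrary choices made for reproducibility.

```julia
using Distributions
using Random

Random.seed!(1234)   # arbitrary seed so repeated runs give the same draws

m = MixtureModel(Normal, [(-2.0, 1.2), (0.0, 1.0), (3.0, 2.5)], [0.2, 0.5, 0.3])

x  = rand(m)             # a single draw
xs = rand(m, 100_000)    # a vector of draws

r = Vector{Float64}(undef, 1_000)
rand!(m, r)              # fill a preallocated array in place

# With this many draws the sample mean should be close to mean(m).
sample_mean = sum(xs) / length(xs)
println("sample mean = ", sample_mean, ", model mean = ", mean(m))
```

Using rand! with a preallocated array avoids a fresh allocation on every call, which can matter inside tight simulation loops.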

Estimation

There are a number of methods for estimating mixture models from data, and the problem remains an open research topic. This package does not provide facilities for estimating mixture models. For this purpose, one can use other packages, e.g. GaussianMixtures.jl.