pymix

The Python Mixture Package (PyMix, http://www.pymix.org/pymix/) is a freely available Python library implementing algorithms and data structures for a wide variety of data mining applications with basic and extended mixture models. Features include:

  • Finite mixture models of discrete and continuous features
  • Wide range of available distributions (Normal, Exponential, Discrete, Dirichlet, Normal-Gamma, Uniform, HMM)
  • Bayesian mixture models
  • ML and MAP parameter estimation
  • Context-specific independence structure learning
  • Partially supervised parameter learning
  • Parameter estimation for pairwise constrained samples

Installing

(tested on Ubuntu 12.10)

  • Edit setup.py
> vi setup.py

##
# 1. Replace the line
#   from distutils.core import setup, Extension,DistutilsExecError
# with the following two lines (DistutilsExecError lives in distutils.errors):

from distutils.core import setup, Extension
from distutils.errors import DistutilsExecError

##
# 2. Replace the line
#   numpypath =  prefix + '/lib/python' +pyvs + '/site-packages/numpy/core/include/numpy'  # path to arrayobject.h
# with the location of the NumPy headers on your system (on Ubuntu 12.10):
    numpypath = '/usr/share/pyshared/numpy/core/include/numpy' # path to arrayobject.h
  • Build and install
python setup.py build
sudo python setup.py install --prefix /usr/local/
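
The hard-coded header path in step 2 only fits that particular Ubuntu layout. As a hedged alternative, NumPy can report its own include directory through numpy.get_include(). The sketch below uses it to build the directory that should contain arrayobject.h. The helper name find_numpy_header_dir is made up for this note and is not part of PyMix or NumPy.

```python
import os


def find_numpy_header_dir():
    """Return the directory that should contain arrayobject.h, or None.

    Hypothetical helper for illustration; not part of PyMix's setup.py.
    """
    try:
        import numpy
    except ImportError:
        # NumPy is not installed, so there is nothing to point at.
        return None
    # numpy.get_include() returns the base include directory; the C headers
    # (including arrayobject.h) sit in its "numpy" subdirectory.
    return os.path.join(numpy.get_include(), "numpy")


numpypath = find_numpy_header_dir()
print(numpypath)
```

If the printed directory contains arrayobject.h, it can be used as the value of numpypath in setup.py in place of the fixed path.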

Clustering

The example below follows the tutorial at http://www.pymix.org/pymix/index.php?n=PyMix.Tutorial

import numpy
import mixture
 
# create dummy data: speeds of trucks (German: LKW) and cars (German: PKW)
raw_data = numpy.array([75 , 80 , 120, 83, 134, 150, 89, 160, 80, 160] )
data = mixture.DataSet()
data.fromArray(raw_data)
 
# create mixture model
n1 = mixture.NormalDistribution(80,3.0)
n2 = mixture.NormalDistribution(130,10.0)
m = mixture.MixtureModel(2,[0.5,0.5], [n1,n2])
 
# run the Expectation Maximization (EM) algorithm
m.EM(data, max_iter=40, delta=0.1)  # stops after at most 40 iterations, or earlier once the improvement per iteration falls below delta (0.1)
 
# show cluster assignment of data
clust = m.classify(data)
print clust
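
To make clearer what the EM and classify calls above do conceptually, here is a minimal pure-Python sketch of EM for a mixture of one-dimensional normal distributions, run on the same dummy speeds. It is not PyMix's implementation. The function names (normal_pdf, em_fit, classify) are made up for this note. It assumes that the second argument of NormalDistribution is a standard deviation and that delta is a threshold on the change in log-likelihood.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std. deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def em_fit(xs, mus, sigmas, weights, max_iter=40, delta=0.1):
    """Fit a 1-D normal mixture by EM, starting from the given parameters."""
    mus, sigmas, weights = list(mus), list(sigmas), list(weights)
    prev_ll = None
    for _ in range(max_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        ll = 0.0
        for x in xs:
            p = [w * normal_pdf(x, m, s)
                 for w, m, s in zip(weights, mus, sigmas)]
            total = sum(p)
            ll += math.log(total)
            resp.append([v / total for v in p])
        # M-step: re-estimate weight, mean and std. deviation per component.
        for k in range(len(mus)):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(xs)
            mus[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mus[k]) ** 2 for r, x in zip(resp, xs)) / nk
            # Floor on sigma (an assumption of this sketch) to avoid collapse.
            sigmas[k] = max(math.sqrt(var), 1e-3)
        # Stop once the log-likelihood changes by less than delta.
        if prev_ll is not None and abs(ll - prev_ll) < delta:
            break
        prev_ll = ll
    return mus, sigmas, weights


def classify(xs, mus, sigmas, weights):
    """Assign each point to the component with the highest posterior."""
    labels = []
    for x in xs:
        p = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas)]
        labels.append(p.index(max(p)))
    return labels


speeds = [75, 80, 120, 83, 134, 150, 89, 160, 80, 160]
mus, sigmas, weights = em_fit(speeds, [80.0, 130.0], [3.0, 10.0], [0.5, 0.5])
labels = classify(speeds, mus, sigmas, weights)
print(labels)
```

The starting means, standard deviations and weights mirror n1, n2 and [0.5,0.5] from the PyMix example, so label 0 corresponds to the slower component and label 1 to the faster one.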
pymix.txt · Last modified: 2013/09/09 13:07 by hkoller