git-svn-id: svn+ssh://lumo.ucsd.edu/projects/p1/svnroot/pdollar/toolb…
…ox@1268 52fe0c90-79fe-0310-8a18-a0b98ad248f8
pdollar committed Jun 12, 2006
1 parent 0467660 commit d593990
Showing 571 changed files with 37,682 additions and 0 deletions.
45 changes: 45 additions & 0 deletions classify/Contents.m
@@ -0,0 +1,45 @@
% CLASSIFY
% See also
%
% Clustering:
% democluster - Clustering demo.
% demogendata - Generate data drawn from a mixture of Gaussians.
% kmeans2 - Very fast version of kmeans clustering.
% meanshift - meanshift clustering algorithm.
% meanshiftim - Applies the meanshift algorithm to a joint spatial/range image.
% meanshiftim_explore - Visualization to help choose sigmas for meanshiftim.
%
% Calculating distances efficiently:
% dist_L1 - Calculates the L1 Distance between vectors (ie the City-Block distance).
% dist_chisquared - Calculates the Chi Squared Distance between vectors (usually histograms).
% dist_emd - Calculates Earth Mover's Distance (EMD) between positive vectors.
% dist_euclidean - Calculates the Euclidean distance between vectors [FAST].
% distmatrix_show - Useful visualization of a distance matrix of clustered points.
% softmin - Calculates the softmin of a vector.
%
% Principal components analysis:
% pca - principal components analysis (alternative to princomp).
% pca_apply - Companion function to pca.
% pca_apply_large - Wrapper for pca_apply that allows for application to large X.
% pca_randomvector - Generate random vectors in PCA subspace.
% pca_visualize - Visualization of quality of approximation of X given principal components.
% visualize_data - Project high dim. data onto principal components (PCA) for visualization.
%
% Classification methods with a common interface:
% democlassify - A demo used to test and demonstrate the usage of classifiers (clf_*)
% nfoldxval - Runs n-fold cross validation on data with a given classifier.
% confmatrix - Generates a confusion matrix according to true and predicted data labels.
% confmatrix_show - Used to display a confusion matrix.
% clf_dectree - Wrapper for treefit that makes decision trees compatible with nfoldxval.
% clf_dectree_fwd - Apply the decision tree to data X.
% clf_dectree_train - Train a decision tree classifier.
% clf_ecoc - Wrapper for ecoc that makes ecoc compatible with nfoldxval.
% clf_ecoc_code - Generates optimal ECOC codes when 3<=nclasses<=7.
% clf_knn - Create a k nearest neighbor classifier.
% clf_knn_dist - k-nearest neighbor classifier based on a distance matrix D.
% clf_knn_fwd - Apply a k-nearest neighbor classifier to X.
% clf_knn_train - Train a k nearest neighbor classifier (memorization).
% clf_lda - Create a Linear Discriminant Analysis (LDA) classifier.
% clf_lda_fwd - Apply the Linear Discriminant Analysis (LDA) classifier to data X.
% clf_lda_train - Train a Linear Discriminant Analysis (LDA) classifier.
% clf_svm - Wrapper for svm that makes svm compatible with nfoldxval.
Binary file added classify/clf_data.mat
Binary file not shown.
25 changes: 25 additions & 0 deletions classify/clf_dectree.m
@@ -0,0 +1,25 @@
% Wrapper for treefit that makes decision trees compatible with nfoldxval.
%
% INPUTS
% p - data dimension
% params - parameters for treefit, ex: 'splitmin',2,'priorprob',ones(1,n)/n
%
% OUTPUTS
% clf - model ready to be trained
%
% DATESTAMP
% 11-Oct-2005 2:45pm
%
% See also NFOLDXVAL, TREEFIT

% Piotr's Image&Video Toolbox Version 1.03
% Written and maintained by Piotr Dollar pdollar-at-cs.ucsd.edu
% Please email me if you find bugs, or have suggestions or questions!

function clf = clf_dectree( p, varargin )
clf.p = p;
clf.type = 'dectree';
clf.params = varargin;

clf.fun_train = @clf_dectree_train;
clf.fun_fwd = @clf_dectree_fwd;
25 changes: 25 additions & 0 deletions classify/clf_dectree_fwd.m
@@ -0,0 +1,25 @@
% Apply the decision tree to data X.
%
% INPUTS
% clf - trained model
% X - nxp data array
%
% OUTPUTS
% Y - nx1 vector of labels predicted according to the clf
%
% DATESTAMP
% 11-Oct-2005 2:45pm
%
% See also CLF_DECTREE

% Piotr's Image&Video Toolbox Version 1.03
% Written and maintained by Piotr Dollar pdollar-at-cs.ucsd.edu
% Please email me if you find bugs, or have suggestions or questions!

function Y = clf_dectree_fwd( clf, X )
if( ~strcmp( clf.type, 'dectree' ) ) error( ['incorrect type: ' clf.type] ); end;
if( size(X,2)~= clf.p ) error( 'Incorrect data dimension' ); end;
T = clf.T;

[Y,d,cnames] = treeval( T, X );
Y = str2double( cnames ); % convert Y back to an int format
34 changes: 34 additions & 0 deletions classify/clf_dectree_train.m
@@ -0,0 +1,34 @@
% Train a decision tree classifier.
%
% INPUTS
% clf - model to be trained
% X - nxp data array
% Y - nx1 array of labels
%
% OUTPUTS
% clf - a trained decision tree classifier
%
% DATESTAMP
% 11-Oct-2005 2:45pm
%
% See also CLF_DECTREE

% Piotr's Image&Video Toolbox Version 1.03
% Written and maintained by Piotr Dollar pdollar-at-cs.ucsd.edu
% Please email me if you find bugs, or have suggestions or questions!

function clf = clf_dectree_train( clf, X, Y )
if( ~strcmp( clf.type, 'dectree' ) ) error( ['incorrect type: ' clf.type] ); end;
if( size(X,2)~= clf.p ) error( 'Incorrect data dimension' ); end;

% apply treefit
Y = int2str2( Y ); % convert Y to string format for treefit.
params = clf.params;
T = treefit(X,Y,'method','classification',params{:});

% apply cross validation (on training data), and prune
[c,s,n,best] = treetest(T,'cross',X,Y);
T = treeprune(T,'level',best);

clf.T = T;

43 changes: 43 additions & 0 deletions classify/clf_ecoc.m
@@ -0,0 +1,43 @@
% Wrapper for ecoc that makes ecoc compatible with nfoldxval.
%
% Requires the SVM toolbox by Anton Schwaighofer.
%
% INPUTS
% p - data dimension
% clfinit - binary classifier init (see nfoldxval)
% clfparams - binary classifier parameters (see nfoldxval)
% nclasses - number of classes (currently only 3<=nclasses<=7 supported)
% use01targets - see ecoc
%
% OUTPUTS
% clf - see ecoc
%
% DATESTAMP
% 11-Oct-2005 2:45pm
%
% See also ECOC, NFOLDXVAL, CLF_ECOC_CODE

% Piotr's Image&Video Toolbox Version 1.03
% Written and maintained by Piotr Dollar pdollar-at-cs.ucsd.edu
% Please email me if you find bugs, or have suggestions or questions!

function clf = clf_ecoc(p,clfinit,clfparams,nclasses,use01targets)
if( nclasses<3 || nclasses>7 )
error( 'currently only works if 3<=nclasses<=7'); end;
if( nargin<5 || isempty(use01targets)) use01targets=0; end;

% create code (limited for now)
[C,nbits] = clf_ecoc_code( nclasses );
clf = ecoc(nclasses, nbits, C );
clf.verbosity = 0; % don't display output

% initialize and temporarily store binary learner
clf.templearner = feval( clfinit, p, clfparams{:} );

% clf_ecoctrain (defined below) wraps ecoctrain, passing in the stored binary learner
clf.fun_train = @clf_ecoctrain;
clf.fun_fwd = @ecocfwd;


function clf = clf_ecoctrain( clf, varargin )
clf = ecoctrain( clf, clf.templearner, varargin{:} );
32 changes: 32 additions & 0 deletions classify/clf_ecoc_code.m
@@ -0,0 +1,32 @@
% Generates optimal ECOC codes when 3<=nclasses<=7.
%
% INPUTS
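% k - number of classes (3<=k<=7)
%
% OUTPUTS
% C - [k x nbits] code matrix with entries in {-1,1}, one codeword per row
% nbits - number of bits per codeword, 2^(k-1)-1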
%
% DATESTAMP
% 29-Sep-2005 2:00pm
%
% See also CLF_ECOC

% Piotr's Image&Video Toolbox Version 1.03
% Written and maintained by Piotr Dollar pdollar-at-cs.ucsd.edu
% Please email me if you find bugs, or have suggestions or questions!

function [C,nbits] = clf_ecoc_code( k )
if( k<3 || k>7 )
error( 'method only works if k is small: 3<=k<=7'); end;

% create C
C = ones(k,2^(k-1));
for i=2:k
partw = 2^(k-i); nparts = 2^(i-2);
row = [zeros(1,partw) ones(1,partw)];
row = repmat( row, 1, nparts );
C(i,:) = row;
end
C = C(:,1:end-1);
nbits = size(C,2);

% alter C to have entries in {-1,1} instead of {0,1}
C(C==0)=-1;
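
As a worked illustration of the construction above, the following minimal sketch repeats the same loop for k=3 in a standalone script and also computes the pairwise Hamming distances between codewords. The matrix H and the loop variables a, b are introduced here for illustration only and are not part of the toolbox.

```matlab
% Sketch: rebuild the code matrix for k=3 with the same loop as clf_ecoc_code.
k = 3;
C = ones(k, 2^(k-1));               % row 1 is all ones
for i = 2:k
  partw  = 2^(k-i);                 % width of each run of zeros / ones
  nparts = 2^(i-2);                 % number of [zeros ones] blocks in the row
  C(i,:) = repmat([zeros(1,partw) ones(1,partw)], 1, nparts);
end
C = C(:, 1:end-1);                  % drop the last column, which is all ones
C(C==0) = -1;                       % map {0,1} to {-1,1}
nbits = size(C,2);                  % 2^(k-1)-1 = 3 bits for k=3

% H is illustrative only: pairwise Hamming distances between rows of C
H = zeros(k);
for a = 1:k
  for b = 1:k
    H(a,b) = sum(C(a,:) ~= C(b,:));
  end
end
disp(C); disp(H);
```

For k=3 this gives C = [1 1 1; -1 -1 1; -1 1 -1], and every pair of distinct codewords differs in 2 of the 3 bits. The dropped column is all ones for every class, so it would not separate any classes.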

28 changes: 28 additions & 0 deletions classify/clf_knn.m
@@ -0,0 +1,28 @@
% Create a k nearest neighbor classifier.
%
% INPUTS
% p - data dimension
% k - number of nearest neighbors to look at
% dist_fn - [optional] distance function, squared euclidean by default
%
% OUTPUTS
% clf - model ready to be trained
%
% DATESTAMP
% 11-Oct-2005 2:45pm
%
% See also NFOLDXVAL, CLF_KNN_TRAIN, CLF_KNN_FWD

% Piotr's Image&Video Toolbox Version 1.03
% Written and maintained by Piotr Dollar pdollar-at-cs.ucsd.edu
% Please email me if you find bugs, or have suggestions or questions!

function clf = clf_knn( p, k, dist_fn )
if( nargin<3 ) dist_fn = @dist_euclidean; end;

clf.p = p;
clf.type = 'knn';
clf.k = k;
clf.dist_fn = dist_fn;
clf.fun_train = @clf_knn_train;
clf.fun_fwd = @clf_knn_fwd;
59 changes: 59 additions & 0 deletions classify/clf_knn_dist.m
@@ -0,0 +1,59 @@
% k-nearest neighbor classifier based on a distance matrix D.
%
% k==1 is much faster than k>1. For k>1, ties are broken randomly.
%
% INPUTS
% D - MxN array of distances from M-TEST points to N-TRAIN points.
% IDX - ntrain length vector of class memberships
% if IDX(i)==IDX(j) then samples i and j are part of the same class
% k - [optional] number of nearest neighbors to use, 1 by default
%
% OUTPUTS
% IDXpred - length M vector of predicted classes for the test points
%
% EXAMPLE
% % [given IDX and a square D, ie test points are the same as the train points]
% for k=1:size(D,2) err(k)=sum(IDX~=clf_knn_dist(D,IDX,k)); end;
% figure(1); plot(err)
%
% DATESTAMP
% 11-Oct-2005 8:00pm
%
% See also CLF_KNN

% Piotr's Image&Video Toolbox Version 1.03
% Written and maintained by Piotr Dollar pdollar-at-cs.ucsd.edu
% Please email me if you find bugs, or have suggestions or questions!

function IDXpred = clf_knn_dist( D, IDX, k )
if( nargin<3 || isempty(k) ) k=1; end;

[n ntrain] = size(D);
if( ntrain ~= length(IDX) )
error('Distance matrix and IDX vector dimensions do not match.'); end;

%%% 1NN [fast and easy]
if( k==1 )
[dis,Dind]=min(D,[],2);
IDXpred=IDX(Dind);

%%% kNN
else
[IDXnames,dis,IDX]=unique(IDX);

%%% get closest k prototypes [n x k matrix]
[D,knns_inds] = sort(D,2);
knns_inds = knns_inds(:,1:k);
knns = IDX(knns_inds);
if( n==1 ) knns = knns'; end;

%%% get counts of each of the prototypes
nclasses = max(IDX);
counts = zeros(n,nclasses);
for i=1:nclasses counts(:,i)=sum(knns==i,2); end;
counts = counts + randn(size(counts))/1000; % hack to break ties randomly!
[ counts, classes ] = sort(counts,2,'descend');

%%% get IDXpred
IDXpred = IDXnames( classes(:,1) );
end
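
The two branches above can be illustrated on a toy distance matrix. This is a minimal standalone sketch, not the toolbox code: the values in D and IDX are made up, and the 3-NN vote uses mode, which breaks ties deterministically (smallest label) rather than randomly as clf_knn_dist does.

```matlab
% Toy distance matrix: 3 test points (rows) by 4 training points (columns).
D   = [0.1 0.9 0.8 0.7;
       0.9 0.2 0.3 0.1;
       0.5 0.4 0.6 0.9];
IDX = [10; 20; 20; 10];           % class label of each training point

% 1-NN: label of the single closest training point for each row
[dmin, nearest] = min(D, [], 2);
pred1 = IDX(nearest);

% 3-NN: majority vote among the 3 closest training points for each row
k = 3;
[Dsorted, order] = sort(D, 2);
votes = IDX(order(:,1:k));        % n x k matrix of neighbor labels
pred3 = mode(votes, 2);           % deterministic vote (assumption, see above)
disp([pred1 pred3]);
```

Here pred1 is [10; 10; 20] and pred3 is [10; 20; 20]: the second test point is closest to a class-10 sample, but two of its three nearest neighbors belong to class 20.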
31 changes: 31 additions & 0 deletions classify/clf_knn_fwd.m
@@ -0,0 +1,31 @@
% Apply a k-nearest neighbor classifier to X.
%
% INPUTS
% clf - trained model
% X - nxp data array
%
% OUTPUTS
% Y - nx1 vector of labels predicted according to the clf
%
% DATESTAMP
% 11-Oct-2005 2:45pm
%
% See also CLF_KNN, CLF_KNN_TRAIN

% Piotr's Image&Video Toolbox Version 1.03
% Written and maintained by Piotr Dollar pdollar-at-cs.ucsd.edu
% Please email me if you find bugs, or have suggestions or questions!

function Y = clf_knn_fwd( clf, X )
if( ~strcmp( clf.type, 'knn' ) ) error( ['incorrect type: ' clf.type] ); end;
if( size(X,2)~= clf.p ) error( 'Incorrect data dimension' ); end;

dist_fn = clf.dist_fn;
Xtrain = clf.Xtrain;
Ytrain = clf.Ytrain;
k = clf.k;
n = size(X,1);

% get nearest neighbors for each X point
D = feval( dist_fn, X, Xtrain );
Y = clf_knn_dist( D, Ytrain, k );
31 changes: 31 additions & 0 deletions classify/clf_knn_train.m
@@ -0,0 +1,31 @@
% Train a k nearest neighbor classifier (memorization).
%
% INPUTS
% clf - model to be trained
% X - nxp data array
% Y - nx1 array of labels
%
% OUTPUTS
% clf - a trained k-nearest neighbor classifier.
%
% DATESTAMP
% 11-Oct-2005 2:45pm
%
% See also CLF_KNN, CLF_KNN_FWD

% Piotr's Image&Video Toolbox Version 1.03
% Written and maintained by Piotr Dollar pdollar-at-cs.ucsd.edu
% Please email me if you find bugs, or have suggestions or questions!

function clf = clf_knn_train( clf, X, Y )
if( ~strcmp( clf.type, 'knn' ) ) error( ['incorrect type: ' clf.type] ); end;
if( size(X,2)~= clf.p ) error( 'Incorrect data dimension' ); end;

%%% error check
n=size(X,1); Y=double(Y);
[Y,er] = checknumericargs( Y, [n 1], 0, 0 ); error(er);

%%% training is memorization
clf.Xtrain = X;
clf.Ytrain = Y;
