Purpose: Behavioral analyses require users to pre-define what they are looking for in an animal. Unlike BSOID_MT.m, this algorithm takes an unsupervised, purely data-driven approach to segment statistically different behaviors. The core concept of finding trends from the data alone has been used in many artificial intelligence applications, including, but not limited to, speaker identification, handwriting analysis, brain-computer interfaces, and identification of stages in certain diseases, and has proven effective in uncovering latent variables.
function [f_10fps,tsne_feats,grp,llh,bsoid_fig] = bsoid_gmm(data,fps,comp,smth_hstry,smth_futr,gmmclass,it)
Run dlc_preprocess.md first.
-
DATA
An (x,y) position matrix for 6 body parts outlining the mouse, viewed from the bottom up. Rows represent frame numbers. Columns 1 & 2: snout; columns 3, 4, 5 & 6: two front paws (left-right order does not matter); columns 7, 8, 9 & 10: two hind paws (left-right order does not matter); columns 11 & 12: base of tail (place it where the tail extends out from the body).
Last tested on 071219, using data generated by DeepLabCut 2.0, with outlier (jump) frames at 0.3%.
-
FPS
Rounded video sampling frame rate. Use VideoReader in MATLAB or the ffmpeg command in bash to detect it.
    vidObj = VideoReader('videos/thisismyvideo.mp4'); fps = round(vidObj.FrameRate);
    cd videos/
    ffmpeg -i thisismyvideo.mp4
-
COMP
To build one classifier from multiple animals, set this parameter to 1; otherwise, an individual classifier/.csv file is built for each animal. Default is 1.
-
SMTH_HSTRY and SMTH_FUTR
Number of frames before (SMTH_HSTRY) and after (SMTH_FUTR) each frame used for boxcar smoothing, which reduces noise in the signal detected by DeepLabCut 2.0. The appropriate value depends on your frame rate. By default, the number of frames is scaled automatically to cover approximately 40 ms before and after. The higher the sampling rate, the more frames fall within that window and the more de-noising is performed.
-
GMMCLASS
Maximum number of randomly initialized Gaussian Mixture Models to fit. Default is 30. For naturalistic open field behavior, sub-sampling action groups may overfit the animal.
-
IT
The number of random initializations for the Gaussian Mixture Models. Repeating the fit from different starting points improves the chance of finding a global optimum rather than a local one. Default is 20.
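The input descriptions above can be sketched in a few lines of MATLAB. This is a minimal illustration, not the code inside bsoid_gmm: the synthetic `data` matrix, the example frame rate, the `round(0.04*fps)` rule for a roughly 40 ms window, and the use of `movmean` as the boxcar filter are all assumptions made for the example, and the function's actual default may be computed differently.

```matlab
% Hypothetical example values; replace with your own DeepLabCut output.
rng(0);                              % reproducible synthetic data
fps  = 60;                           % rounded frame rate (see FPS)
data = cumsum(randn(600,12));        % stand-in for a frames x 12 matrix

% DATA needs 12 columns: (x,y) for snout, two front paws,
% two hind paws, and base of tail, in that order.
assert(size(data,2) == 12, 'DATA must be an N x 12 matrix');
snout    = data(:,1:2);              % columns 1 & 2
tailbase = data(:,11:12);            % columns 11 & 12

% Assumed rule: number of frames spanning about 40 ms on each side.
smth_hstry = round(0.04*fps);        % frames before the current frame
smth_futr  = round(0.04*fps);        % frames after the current frame

% Boxcar (moving-average) smoothing along the frame dimension.
data_smth = movmean(data,[smth_hstry smth_futr],1);
```

With these example values, a 60 fps video gives a window of 2 frames on each side, while a 120 fps video would give 5, which is why higher sampling rates receive more de-noising.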
F_10FPS
Compiled features that were used to cluster, at 10 fps temporal resolution.
TSNE_FEATS
An N x 3 matrix that represents the action space. Based on the 7 informative features collected, we applied a dimensionality reduction algorithm, t-Distributed Stochastic Neighbor Embedding (t-SNE), which emphasizes preservation of local distances.
GRP
Statistically different groups of actions based on unsupervised Gaussian Mixture Models. If you are sampling above 10 frames per second (~100 ms/frame), the samples are reduced to improve the efficiency of clustering.
LLH
Log-likelihood, so you can verify for yourself that expectation-maximization indeed converged at a local/global optimum.
BSOID_FIG
A 3-dimensional scatter plot showing how the different groups are located in t-SNE action space.
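To make the relationship between these outputs concrete, here is a hedged sketch of the same kind of pipeline on synthetic data: a 3-D t-SNE embedding, a Gaussian Mixture Model fitted from several random initializations, and a 3-D scatter plot colored by group. It is not the implementation in bsoid_gmm. The synthetic `feats` matrix, the small component count `k`, the regularization value, and the variable names are assumptions for illustration, and the sketch reports only the final log-likelihood of the best replicate. It requires the Statistics and Machine Learning Toolbox (`tsne`, `fitgmdist`, `cluster`).

```matlab
rng(0);                                    % reproducible synthetic data

% Hypothetical stand-in for F_10FPS: 300 samples x 7 features, 3 blobs.
feats = [randn(100,7); randn(100,7)+4; randn(100,7)-4];

% 3-D t-SNE embedding, the same shape as TSNE_FEATS (N x 3).
embed = tsne(feats,'NumDimensions',3);

k    = 3;                                  % components for this toy data only
it   = 5;                                  % random initializations (see IT)
opts = statset('MaxIter',500);             % EM iteration cap

% Fit the mixture; the replicate with the highest likelihood is kept.
gm = fitgmdist(embed,k,'Replicates',it,'Options',opts, ...
               'RegularizationValue',1e-5);

grp = cluster(gm,embed);                   % hard group label per sample
llh = -gm.NegativeLogLikelihood;           % log-likelihood of the kept fit

% 3-D scatter of the embedding, colored by group.
fig = figure;
h = scatter3(embed(:,1),embed(:,2),embed(:,3),10,grp,'filled');
xlabel('t-SNE 1'); ylabel('t-SNE 2'); zlabel('t-SNE 3');
```

Using several replicates mirrors the role of IT: each replicate starts expectation-maximization from a different random point, and only the best fit is retained.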
Run bsoid_mdl.md next.