norbertbin/SpecClustPack


SpecClustPack

A collection of R functions for spectral clustering.

Installation

The SpecClustPack package can be installed in R directly from GitHub by using devtools.

library(devtools)
install_github("norbertbin/SpecClustPack")

Simulate from Stochastic Blockmodel

blockPMat = matrix(c(.6,.2,.2,.6), nrow=2)
nMembers = c(5,5)

adjMat = simSBM(blockPMat, nMembers)
adjMat
##  10 x 10 sparse Matrix of class "dsCMatrix"
##                         
##  [1,] . . 1 1 1 . . . . 1
##  [2,] . . 1 1 1 . . 1 . .
##  [3,] 1 1 . . . . . . . .
##  [4,] 1 1 . . 1 . . . 1 .
##  [5,] 1 1 . 1 . . . . . .
##  [6,] . . . . . . 1 1 . 1
##  [7,] . . . . . 1 . 1 . 1
##  [8,] . 1 . . . 1 1 . . 1
##  [9,] . . . 1 . . . . . .
## [10,] 1 . . . . 1 1 1 . .
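
For readers who want to see what sampling from a stochastic blockmodel involves, here is a minimal base-R sketch. The name simSBMSketch is hypothetical and this is not the package's implementation; it assumes nodes are ordered by block, an undirected graph, and no self-loops, and it returns a dense matrix rather than a sparse one.

```r
# Hypothetical helper, not part of SpecClustPack.
simSBMSketch <- function(blockPMat, nMembers) {
  n <- sum(nMembers)
  # Block label of each node; nodes are assumed to be ordered by block.
  z <- rep(seq_along(nMembers), times = nMembers)
  # n x n matrix of edge probabilities: P[i, j] = blockPMat[z[i], z[j]].
  P <- blockPMat[z, z, drop = FALSE]
  A <- matrix(0L, n, n)
  upper <- upper.tri(A)
  # Draw each potential edge above the diagonal independently.
  A[upper] <- rbinom(sum(upper), size = 1, prob = P[upper])
  # Mirror to make the graph undirected; the diagonal stays zero.
  A + t(A)
}

set.seed(1)
A <- simSBMSketch(matrix(c(.6, .2, .2, .6), nrow = 2), c(5, 5))
dim(A)
```

Only the upper triangle is sampled so that each pair of nodes gets exactly one Bernoulli draw.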

Plot the SBM Probability Matrix

plotSBM(blockPMat, nMembers)

Plot the Simulated Adjacency Matrix

plotAdj(adjMat)

Run Spectral Clustering

By default, the specClust function uses regularized spectral clustering (Qin and Rohe, 2013) with row normalization; this behavior can be changed through the method and rowNorm parameters.

(clusters = specClust(adjMat, nBlocks = 2))
## [1] 1 1 1 1 1 2 2 2 1 2
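
As a rough illustration of regularized spectral clustering with row normalization, the following base-R sketch may help. It is not the specClust source. The function name regSpecClustSketch, the choice of the average degree as the regularizer, and the selection of eigenvectors by absolute eigenvalue are all assumptions made for this example.

```r
# Hypothetical helper, not part of SpecClustPack.
regSpecClustSketch <- function(A, nBlocks, rowNorm = TRUE) {
  A <- as.matrix(A)
  deg <- rowSums(A)
  tau <- mean(deg)                      # assumed regularizer: average degree
  dInvSqrt <- 1 / sqrt(deg + tau)
  # Regularized Laplacian: L_tau = D_tau^{-1/2} A D_tau^{-1/2}, D_tau = D + tau * I.
  L <- A * outer(dInvSqrt, dInvSqrt)
  eig <- eigen(L, symmetric = TRUE)
  # Keep the nBlocks eigenvectors with the largest absolute eigenvalues (assumption).
  top <- order(abs(eig$values), decreasing = TRUE)[seq_len(nBlocks)]
  X <- eig$vectors[, top, drop = FALSE]
  if (rowNorm) {
    rowLen <- sqrt(rowSums(X^2))
    rowLen[rowLen == 0] <- 1            # avoid dividing by zero
    X <- X / rowLen                     # scale each row to unit length
  }
  kmeans(X, centers = nBlocks, nstart = 10)$cluster
}

# Toy graph: two 5-node cliques joined by a single edge.
A <- matrix(0, 10, 10)
A[1:5, 1:5] <- 1
A[6:10, 6:10] <- 1
A[5, 6] <- A[6, 5] <- 1
diag(A) <- 0
set.seed(1)
clusters <- regSpecClustSketch(A, nBlocks = 2)
```

On this toy graph the two cliques should end up in separate clusters; which clique receives label 1 depends on the k-means initialization.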

Compute the Mis-clustering Rate

The function misClustRate computes the proportion of mis-clustered nodes, up to a permutation of the cluster labels (identifiability), given the true cluster sizes.

misClustRate(clusters, nMembers)
## [1] 0.1
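
One way to read "up to identifiability" is that the error is minimized over all relabelings of the clusters. The sketch below implements that reading in base R. The names allPerms and misClustRateSketch are hypothetical, and the code assumes nodes are ordered by block and that cluster labels are integers in 1..K; the package may compute the rate differently.

```r
# All permutations of a vector, built recursively (only sensible for small K).
allPerms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in allPerms(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
  }
  out
}

# Hypothetical helper, not part of SpecClustPack.
misClustRateSketch <- function(clusters, nMembers) {
  # True labels, assuming nodes are ordered by block.
  truth <- rep(seq_along(nMembers), times = nMembers)
  K <- length(nMembers)
  # Error rate under each relabeling of the estimated clusters.
  rates <- vapply(allPerms(seq_len(K)), function(perm) {
    mean(perm[clusters] != truth)
  }, numeric(1))
  min(rates)
}

misClustRateSketch(c(1, 1, 1, 1, 1, 2, 2, 2, 1, 2), c(5, 5))
## [1] 0.1
```

Enumerating permutations costs K! evaluations, so this approach is only practical for a handful of blocks.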

Estimate SBM Probabilities

The function estSBM estimates the block probability matrix given the adjacency matrix and the cluster assignments.

estSBM(adjMat, clusters)
##            [,1]       [,2]
## [1,] 1.00000000 0.08333333
## [2,] 0.08333333 0.53333333
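
A simple plug-in estimator for the block probabilities is the observed edge density within and between clusters. The base-R sketch below shows that idea on a toy graph. The name estSBMSketch is hypothetical and estSBM may normalize differently, so its output need not match this sketch.

```r
# Hypothetical helper, not part of SpecClustPack.
estSBMSketch <- function(A, clusters) {
  A <- as.matrix(A)
  K <- max(clusters)
  # n x K membership matrix: Z[i, k] = 1 if node i is in cluster k.
  Z <- outer(clusters, seq_len(K), "==") * 1
  # Sum of adjacency entries per block pair (within-block edges are counted twice).
  edgeCounts <- t(Z) %*% A %*% Z
  sizes <- colSums(Z)
  # Number of ordered node pairs per block pair, excluding self-pairs.
  pairCounts <- outer(sizes, sizes)
  diag(pairCounts) <- sizes * (sizes - 1)
  # Note: a cluster of size 1 yields NaN on the diagonal.
  edgeCounts / pairCounts
}

# Toy graph on 4 nodes with edges 1-2, 3-4 and 1-3.
A <- matrix(0, 4, 4)
A[1, 2] <- A[2, 1] <- 1
A[3, 4] <- A[4, 3] <- 1
A[1, 3] <- A[3, 1] <- 1
est <- estSBMSketch(A, c(1, 1, 2, 2))
est
```

Here each cluster contains one pair of nodes that is connected, and one of the four possible between-cluster edges is present, so the estimate is 1 on the diagonal and 0.25 off the diagonal.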

Simulate Node Covariates

covProbMat = matrix(c(.8,.2,.2,.8), nrow=2)
nMembers = c(5,5)

covMat = simBernCovar(covProbMat, nMembers)
covMat
##  [1,] 1 .
##  [2,] 1 1
##  [3,] 1 .
##  [4,] . .
##  [5,] 1 1
##  [6,] . .
##  [7,] . 1
##  [8,] . 1
##  [9,] 1 1
## [10,] . 1
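
The following base-R sketch shows one way block-dependent Bernoulli covariates could be generated. The name simBernCovarSketch is hypothetical, and the orientation of covProbMat (rows index blocks, columns index covariates) is an assumption; the example matrix above is symmetric, so it does not settle the question.

```r
# Hypothetical helper, not part of SpecClustPack.
simBernCovarSketch <- function(covProbMat, nMembers) {
  # Block label of each node; nodes are assumed to be ordered by block.
  z <- rep(seq_along(nMembers), times = nMembers)
  # Row i holds the covariate probabilities of node i's block
  # (assumes covProbMat[k, j] = P(covariate j = 1 | block k)).
  P <- covProbMat[z, , drop = FALSE]
  # One independent Bernoulli draw per node and covariate.
  matrix(rbinom(length(P), size = 1, prob = P), nrow = nrow(P))
}

set.seed(1)
X <- simBernCovarSketch(matrix(c(.8, .2, .2, .8), nrow = 2), c(5, 5))
dim(X)
```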

Covariate-Assisted Spectral Clustering

The casc function requires an adjacency matrix, adjMat, a node covariate matrix, covMat, and the number of blocks to recover, nBlocks. For more details, see the documentation.

casc(adjMat, covMat, nBlocks=2)
## $cluster
## [1] 1 1 1 1 1 2 2 2 2 2
##
## $h
## [1] 0.08101691
##
## $wcss
## [1] 0.1789759
##
## $eigenGap
## [1] 0.06532486
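
To give a feel for how graph and covariate information can be combined, the base-R sketch below clusters the leading eigenvectors of a matrix of the form L_tau L_tau + h X X^T, one formulation found in the covariate-assisted spectral clustering literature. Whether casc uses exactly this form, and how it chooses the tuning parameter h, is not stated above, so the function cascSketch is hypothetical and h is passed in by hand.

```r
# Hypothetical helper, not part of SpecClustPack.
cascSketch <- function(A, X, nBlocks, h) {
  A <- as.matrix(A)
  X <- as.matrix(X)
  deg <- rowSums(A)
  dInvSqrt <- 1 / sqrt(deg + mean(deg))   # assumed regularizer: average degree
  L <- A * outer(dInvSqrt, dInvSqrt)      # regularized graph Laplacian
  # Assumed similarity matrix: graph term plus h times the covariate term.
  S <- L %*% L + h * tcrossprod(X)
  eig <- eigen(S, symmetric = TRUE)       # eigenvalues in decreasing order
  U <- eig$vectors[, seq_len(nBlocks), drop = FALSE]
  km <- kmeans(U, centers = nBlocks, nstart = 10)
  # wcss here is the k-means within-cluster sum of squares.
  list(cluster = km$cluster, h = h, wcss = km$tot.withinss)
}

# Toy data: two 5-node cliques joined by one edge, with one covariate per block.
A <- matrix(0, 10, 10)
A[1:5, 1:5] <- 1
A[6:10, 6:10] <- 1
A[5, 6] <- A[6, 5] <- 1
diag(A) <- 0
X <- cbind(rep(c(1, 0), each = 5), rep(c(0, 1), each = 5))
set.seed(1)
res <- cascSketch(A, X, nBlocks = 2, h = 0.1)
res$cluster
```

A larger h gives the covariates more weight relative to the graph; h = 0 reduces the sketch to clustering on the graph alone.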

Partial Spectral Clustering

The partSpecClust function runs an eigendecomposition only on the adjacency matrix of the highest-degree nodes in the network and uses the Nyström extension to approximate the full eigenvectors (Belabbas and Wolfe, 2009). The approximate eigenvectors are then used for spectral clustering. The parameter subSampleSize specifies how many of the highest-degree nodes to use.

(clusters = partSpecClust(adjMat, nBlocks = 2, subSampleSize = 8))
## [1] 1 1 1 1 1 2 2 2 1 2
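
The Nyström idea described above can be sketched in a few lines of base R. The function partSpecClustSketch is hypothetical and works on the raw adjacency matrix with no regularization or row normalization, which may differ from what partSpecClust does. It also assumes the retained eigenvalues of the subsampled matrix are nonzero.

```r
# Hypothetical helper, not part of SpecClustPack.
partSpecClustSketch <- function(A, nBlocks, subSampleSize) {
  A <- as.matrix(A)
  # Indices of the subSampleSize highest-degree nodes.
  top <- order(rowSums(A), decreasing = TRUE)[seq_len(subSampleSize)]
  # Eigendecomposition of the small submatrix only.
  eig <- eigen(A[top, top], symmetric = TRUE)
  keep <- order(abs(eig$values), decreasing = TRUE)[seq_len(nBlocks)]
  U <- eig$vectors[, keep, drop = FALSE]
  lambda <- eig$values[keep]
  # Nystrom extension: approximate eigenvectors for all n nodes
  # as A[, S] %*% U %*% Lambda^{-1}, where S is the subsample.
  Uapprox <- A[, top, drop = FALSE] %*% U %*% diag(1 / lambda, nrow = nBlocks)
  kmeans(Uapprox, centers = nBlocks, nstart = 10)$cluster
}

# Toy graph: two 5-node cliques joined by a single edge.
A <- matrix(0, 10, 10)
A[1:5, 1:5] <- 1
A[6:10, 6:10] <- 1
A[5, 6] <- A[6, 5] <- 1
diag(A) <- 0
set.seed(1)
clusters <- partSpecClustSketch(A, nBlocks = 2, subSampleSize = 8)
```

For the sampled nodes the extension reproduces the submatrix eigenvectors exactly; for the remaining nodes it gives an approximation based on their edges into the subsample.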
