pygkernels — Kernels on Graphs for Python

Framework for clustering graph nodes using various similarity/dissimilarity measures.

List of measures:

Adjacency matrix based kernels:
- Katz: Katz kernel (a.k.a. Walk, Von Neumann diffusion kernel)
- Comm: Communicability kernel (a.k.a. Exponential diffusion kernel)
- DFS: Double Factorial similarity
Laplacian based kernels:
- For: Forest kernel (a.k.a. Regularized Laplacian kernel)
- Heat: Heat kernel (a.k.a. Laplacian exponential diffusion kernel)
- NHeat: Normalized Heat kernel
- Abs: Absorption kernel
Markov matrix based kernels and measures:
- PPR: Personalized PageRank
- MPPR: Modified Personalized PageRank
- HPR: PageRank heat similarity measure
- RSP: Randomized Shortest Path distance
- FE: Free Energy distance
Sigmoid Commute Time:
- SCT: Sigmoid Commute Time
- CCT: Corrected Commute Time
- SCCT: Sigmoid Corrected Commute Time
SP-CT: Shortest Path, Commute Time, and their linear combination
Logarithmic version of every measure
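
To make the list above more concrete, here is a minimal NumPy sketch of two of the listed kernels, using their textbook definitions: the Katz kernel K = (I - tA)^{-1} and the Forest kernel K = (I + tL)^{-1}. The function names `katz_kernel` and `forest_kernel` are hypothetical and are not part of pygkernels; the library's own classes may scale the parameter differently.

```python
import numpy as np


def katz_kernel(A: np.ndarray, t: float) -> np.ndarray:
    """Katz (walk) kernel K = (I - tA)^{-1}, defined for 0 < t < 1/rho(A)."""
    rho = np.max(np.abs(np.linalg.eigvalsh(A)))  # spectral radius of symmetric A
    if not 0 < t < 1 / rho:
        raise ValueError("t must lie in (0, 1/rho(A))")
    return np.linalg.inv(np.eye(len(A)) - t * A)


def forest_kernel(A: np.ndarray, t: float) -> np.ndarray:
    """Forest (regularized Laplacian) kernel K = (I + tL)^{-1}, with L = D - A and t > 0."""
    L = np.diag(A.sum(axis=1)) - A  # combinatorial Laplacian
    return np.linalg.inv(np.eye(len(A)) + t * L)


# Toy graph: a path on three nodes, 0 - 1 - 2
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)

K_katz = katz_kernel(A, t=0.5)
K_forest = forest_kernel(A, t=1.0)
```

On this toy graph both kernels rate adjacent nodes as more similar than the two end nodes, which is the behaviour a clustering algorithm relies on.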

Every measure is available both as a dissimilarity (distance) and as a similarity (kernel/proximity) measure. All of them can be used in any classification, clustering, or community detection algorithm that relies on the kernel trick (e.g. kernel k-means).
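
The link between the two forms can be sketched with the standard transformation d(i, j) = K_ii + K_jj - 2 K_ij, and the logarithmic version with an element-wise logarithm of a positive kernel. The helper names below are hypothetical, and pygkernels may use a different scaling (for example an extra factor of 1/2), so treat this as an illustration under those assumptions rather than the library's code.

```python
import numpy as np


def kernel_to_distance(K: np.ndarray) -> np.ndarray:
    """D_ij = K_ii + K_jj - 2 K_ij (squared distance in the kernel's feature space)."""
    diag = np.diag(K)
    return diag[:, None] + diag[None, :] - 2 * K


def log_kernel(K: np.ndarray) -> np.ndarray:
    """Element-wise logarithm of a kernel with strictly positive entries."""
    if np.any(K <= 0):
        raise ValueError("all kernel entries must be positive")
    return np.log(K)


# Hand-written symmetric positive definite kernel on three nodes
K = np.array([[1.5, 1.0, 0.5],
              [1.0, 2.0, 1.0],
              [0.5, 1.0, 1.5]])

D = kernel_to_distance(K)
K_log = log_kernel(K)
```

The resulting `D` has a zero diagonal and is symmetric, so it can be passed to any algorithm that expects a distance matrix.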

List of clustering algorithms:

- Kernel k-means (GPU support, powered by PyTorch)
- Spectral clustering
- Ward clustering
- Wrappers for kernel k-means from kernlab and for k-means from sklearn
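
For readers unfamiliar with kernel k-means, the following is a minimal CPU-only sketch of the generic Lloyd-style algorithm on a precomputed kernel matrix. It is not the pygkernels `KKMeans` implementation (which has its own initialization strategies and GPU support); the function `kernel_kmeans` and its `init` argument are hypothetical names introduced for illustration.

```python
import numpy as np


def kernel_kmeans(K, n_clusters, init=None, n_iter=100, seed=0):
    """Lloyd-style kernel k-means on a precomputed kernel matrix K."""
    n = len(K)
    if init is not None:
        labels = np.asarray(init).copy()
    else:
        # Random initial assignment when no starting partition is given
        labels = np.random.default_rng(seed).integers(n_clusters, size=n)
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            members = labels == c
            if not members.any():
                continue  # an empty cluster stays infinitely far away
            # ||phi(x_i) - m_c||^2 = K_ii - 2 * mean_j K_ij + mean_{j,l} K_jl
            dist[:, c] = (np.diag(K)
                          - 2 * K[:, members].mean(axis=1)
                          + K[np.ix_(members, members)].mean())
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # converged: assignments no longer change
        labels = new_labels
    return labels


# Toy kernel with two obvious groups: nodes {0, 1, 2} and {3, 4, 5}
block = np.ones((3, 3)) + np.eye(3)
K = np.block([[block, np.zeros((3, 3))],
              [np.zeros((3, 3)), block]])

# Start from a partition in which node 5 is misassigned
labels = kernel_kmeans(K, n_clusters=2, init=[0, 0, 0, 1, 1, 0])
```

Starting from the deliberately wrong partition, the first iteration moves node 5 to the cluster of nodes 3 and 4, and the second iteration confirms convergence.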

Graph generators:

- Stochastic Block Model
- LFR (networkx wrapper)
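
As a rough illustration of what a Stochastic Block Model generator produces, here is a small NumPy sketch that samples an undirected graph with equal-sized blocks, connecting nodes within a block with probability `p_in` and across blocks with probability `p_out`. The function `sbm_adjacency` is a hypothetical stand-in and does not reproduce the pygkernels `StochasticBlockModel` class.

```python
import numpy as np


def sbm_adjacency(n, k, p_in, p_out, seed=0):
    """Sample an undirected SBM graph with k (nearly) equal-sized blocks."""
    rng = np.random.default_rng(seed)
    labels = np.arange(n) * k // n  # node i belongs to block floor(i * k / n)
    same_block = labels[:, None] == labels[None, :]
    probs = np.where(same_block, p_in, p_out)
    # Sample each unordered pair once (strict upper triangle), so no self-loops
    upper = np.triu(rng.random((n, n)) < probs, 1)
    A = (upper | upper.T).astype(int)  # mirror to get a symmetric matrix
    return A, labels


A, labels = sbm_adjacency(n=100, k=2, p_in=0.3, p_out=0.1)
same_block = labels[:, None] == labels[None, :]
```

The returned `labels` array is the ground-truth partition, which is what a score such as the adjusted Rand index compares a predicted clustering against.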

Graph datasets:

https://github.com/vlivashkin/community-graphs

Install

pip install git+https://github.com/vlivashkin/pygkernels.git

Usage

Simple clustering:

import networkx as nx
from pygkernels.cluster import KKMeans
from pygkernels.measure import logComm_H

G = nx.read_gml('news_2cl1.gml')
A = nx.adjacency_matrix(G).toarray()

estimator = KKMeans(n_clusters=2)
K = logComm_H(A).get_K(param=0.1)
partition = estimator.predict(K, A=A)

Grid search measure parameters:

import numpy as np
from sklearn.metrics import adjusted_rand_score
from pygkernels.cluster import KKMeans
from pygkernels.data import StochasticBlockModel
from pygkernels.measure import logComm_H
from pygkernels.scenario import ParallelByGraphs

n_graphs, n, k, p_in, p_out = 100, 100, 2, 0.3, 0.1  # params for SBM graph generator
n_params = 30  # number of grid points searched in the [0, 1] parameter space

graphs, _ = StochasticBlockModel(n, k, p_in=p_in, p_out=p_out).generate_graphs(n_graphs)
gridsearch = ParallelByGraphs(adjusted_rand_score, n_params, progressbar=True, ignore_errors=True)

params, ari, error = gridsearch.perform(KKMeans, logComm_H, graphs, k, n_jobs=-1, n_gpu=2)
best_param = params[np.argmax(ari)]

Generate LFR graphs that resemble existing datasets:

from pygkernels.data import Datasets, LFRGenerator

(A, partition), info = Datasets()['news_2cl1_0.1']
gen = LFRGenerator.params_from_adj_matrix(A, partition, info['k'])
print(gen.generate_info())
A, partition = gen.generate_graph()
gen = LFRGenerator.params_from_adj_matrix(A, partition, info['k'])
print(gen.generate_info())

All examples are located in ./examples. More usage examples are available in a separate repository: https://github.com/vlivashkin/pygkernels-experiments

Citation

@inproceedings{ivashkin2021dissecting,
  title={Dissecting graph measure performance for node clustering in LFR parameter space},
  author={Ivashkin, Vladimir and Chebotarev, Pavel},
  booktitle={International Conference on Complex Networks and Their Applications},
  pages={328--341},
  year={2021},
  organization={Springer}
}
