Merge branch 'release-0.5.2'
vtraag committed Mar 5, 2015
2 parents 9e9f80a + dd4296b commit 79ab87c
Showing 11 changed files with 160 additions and 53 deletions.
11 changes: 11 additions & 0 deletions CHANGELOG
@@ -0,0 +1,11 @@
0.5.2
- Ensured that random neighbour selection works in O(1) rather than O(k), with k the average number of neighbours.
- Optimized the calculation of weight from/to community.
- Included some missing references.

0.5.1
Corrected some mistakes which prevented it from being properly used on PyPi.
No serious changes were made.

0.5
Initial release
1 change: 1 addition & 0 deletions MANIFEST.in
@@ -1,4 +1,5 @@
include LICENSE
include INSTALL
include CHANGELOG
include README.md
recursive-include include *.h
40 changes: 22 additions & 18 deletions README.md
@@ -1,42 +1,44 @@
INTRODUCTION
============

This package implements the louvain algorithm in ``C++`` and exposes it to
This package implements the louvain algorithm [1] in ``C++`` and exposes it to
``python``. It relies on (python-)``igraph`` for it to function. Besides the
relative flexibility of the implementation, it also scales well, and can be run
on graphs of millions of nodes (as long as they can fit in memory). The core
function is ``find_partition`` which finds the optimal partition using the
louvain algorithm for a number of different methods. The methods currently
implemented are:
louvain algorithm for a number of different methods. The original implementation
is available from https://sites.google.com/site/findcommunities/. The methods
currently implemented are:

Modularity
This method compares the actual graph to the expected graph, taking into
account the degree of the nodes [1]. The expected graph is based on a
account the degree of the nodes [2]. The expected graph is based on a
configuration null-model.

RBConfiguration
This is an extension of modularity which includes a resolution parameter [2].
This is an extension of modularity which includes a resolution parameter [3].
In general, a higher resolution parameter will lead to smaller communities.

RBER
A variant of the previous method that instead of a configuration null-model
uses an Erdős-Rényi null-model in which each edge has the same probability of
appearing [2].
appearing [3].

CPM
This method compares to a fixed resolution parameter, so that it finds
communities that have an internal density higher than the resolution
parameter, and are separated from other communities with a density lower than
the resolution parameter [3].
the resolution parameter [4].

Significance
This is a probabilistic method based on the idea of assessing the probability
of finding such dense subgraphs in an (ER) random graph [4].
of finding such dense subgraphs in an (ER) random graph [5].

Surprise
Another probabilistic method, but rather than the probability of finding dense
subgraphs, it focuses on the probability of so many edges within communities
[5, 6].
[6, 7].
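
As a rough illustration of how the resolution parameter shifts the preferred partition, the sketch below scores a partition of an undirected, unweighted graph against the configuration null-model. It is not code from this package: the function name `rb_configuration_quality` and the edge-list input are assumptions made for the example, and a resolution of 1 corresponds to plain modularity.

```python
from collections import defaultdict


def rb_configuration_quality(edges, membership, resolution=1.0):
    """Score a partition of an undirected, unweighted graph.

    Hypothetical helper for illustration only. `edges` is a list of
    (u, v) pairs, `membership[v]` is the community of node v.
    Returns sum_c [ e_c / m - resolution * (K_c / 2m)^2 ], where e_c is
    the number of edges inside community c and K_c its total degree.
    """
    m = len(edges)
    internal = defaultdict(int)    # edges inside each community
    degree_sum = defaultdict(int)  # total degree of each community
    for u, v in edges:
        degree_sum[membership[u]] += 1
        degree_sum[membership[v]] += 1
        if membership[u] == membership[v]:
            internal[membership[u]] += 1
    return sum(
        internal[c] / m - resolution * (degree_sum[c] / (2.0 * m)) ** 2
        for c in degree_sum
    )


# Two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = [0, 0, 0, 1, 1, 1]   # one community per triangle
merged = [0, 0, 0, 0, 0, 0]  # everything in one community

q_split = rb_configuration_quality(edges, split, resolution=1.0)
q_merged = rb_configuration_quality(edges, merged, resolution=1.0)
q_split_low = rb_configuration_quality(edges, split, resolution=0.1)
q_merged_low = rb_configuration_quality(edges, merged, resolution=0.1)
```

At resolution 1 the two-triangle split scores higher than the single merged community, while at resolution 0.1 the merged community scores higher, which matches the statement that a higher resolution parameter favours smaller communities.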


INSTALLATION
============
@@ -115,7 +117,7 @@ sum over all layers, weighted by some weight. If we denote by ``q_k`` the qualit
of layer ``k`` and the weight by ``w_k``, the overall quality is then ``q = sum_k
w_k q_k``. This can also be useful in case you have negative links. In
principle, this could also be used to detect temporal communities in a dynamic
setting, cf. [7].
setting, cf. [8].
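
The weighted sum ``q = sum_k w_k q_k`` can be sketched in a few lines. This is only an illustration of the formula, not the package's API: `overall_quality` is a hypothetical helper, and the per-layer qualities are made-up numbers.

```python
def overall_quality(layer_qualities, layer_weights):
    """Combine per-layer qualities q_k with layer weights w_k.

    Hypothetical helper for illustration; returns sum_k w_k * q_k.
    """
    if len(layer_qualities) != len(layer_weights):
        raise ValueError("Need exactly one weight per layer.")
    return sum(w * q for w, q in zip(layer_weights, layer_qualities))


# A layer of positive links and a layer of negative links: giving the
# negative layer a weight of -1 penalises negative links inside communities.
q = overall_quality([0.5, 0.25], [1.0, -1.0])
```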

For example, assuming you have a graph with positive weights ``G_positive`` and
a graph with negative weights ``G_negative``, and you want to use Modularity for
@@ -135,7 +137,7 @@ the partition. Notice that this runs much slower than only considering
neighbouring communities (which is the default).

Various methods (such as Reichardt and Bornholdt's Potts model, or CPM) support
a (linear) resolution parameter, which can be effectively bisected, cf. [4]. You
a (linear) resolution parameter, which can be effectively bisected, cf. [5]. You
can do this by calling:
```python
res_parts = louvain.bisect(G, method='CPM', resolution_range=[0,1]);
@@ -164,18 +166,20 @@ REFERENCES

Please cite the references appropriately in case they are used.

1. Newman, M. & Girvan, M. Finding and evaluating community structure in networks.
1. Blondel, V. D., Guillaume, J.-L., Lambiotte, R. & Lefebvre, E. Fast unfolding
of communities in large networks. J. Stat. Mech. 2008, P10008 (2008).
2. Newman, M. & Girvan, M. Finding and evaluating community structure in networks.
Physical Review E 69, 026113 (2004).
2. Reichardt, J. & Bornholdt, S. Partitioning and modularity of graphs with arbitrary
3. Reichardt, J. & Bornholdt, S. Partitioning and modularity of graphs with arbitrary
degree distribution. Physical Review E 76, 015102 (2007).
3. Traag, V. A., Van Dooren, P. & Nesterov, Y. Narrow scope for resolution-limit-free
4. Traag, V. A., Van Dooren, P. & Nesterov, Y. Narrow scope for resolution-limit-free
community detection. Physical Review E 84, 016114 (2011).
4. Traag, V. A., Krings, G. & Van Dooren, P. Significant scales in community structure.
5. Traag, V. A., Krings, G. & Van Dooren, P. Significant scales in community structure.
Scientific Reports 3, 2930 (2013).
5. Aldecoa, R. & Marín, I. Surprise maximization reveals the community structure
6. Aldecoa, R. & Marín, I. Surprise maximization reveals the community structure
of complex networks. Scientific reports 3, 1060 (2013).
6. Traag, V.A., Aldecoa, R. & Delvenne, J.-C. Detecting communities using Asymptotical
7. Traag, V.A., Aldecoa, R. & Delvenne, J.-C. Detecting communities using Asymptotical
Surprise. Forthcoming (2015).
7. Mucha, P. J., Richardson, T., Macon, K., Porter, M. A. & Onnela, J.-P.
8. Mucha, P. J., Richardson, T., Macon, K., Porter, M. A. & Onnela, J.-P.
Community structure in time-dependent, multiscale, and multiplex networks.
Science 328, 876–8 (2010).
11 changes: 11 additions & 0 deletions include/GraphHelper.h
@@ -80,6 +80,16 @@ class Graph
get_neighbour_edges(size_t v, igraph_neimode_t mode);
vector< size_t >*
get_neighbours(size_t v, igraph_neimode_t mode);
size_t get_random_neighbour(size_t v, igraph_neimode_t mode);
inline size_t get_random_node()
{
return this->get_random_int(0, this->vcount() - 1);
};

inline size_t get_random_int(size_t from, size_t to)
{
return igraph_rng_get_integer(igraph_rng_default(), from, to);
};

inline size_t vcount() { return igraph_vcount(this->_graph); };
inline size_t ecount() { return igraph_ecount(this->_graph); };
@@ -161,6 +171,7 @@ class Graph
void set_default_edge_weight();
void set_default_node_size();
void set_self_weights();

};

// We need this ugly way to include the MutableVertexPartition
4 changes: 2 additions & 2 deletions include/Optimiser.h
@@ -30,14 +30,14 @@ class Optimiser
double optimize_partition(MutableVertexPartition* partition);
template <class T> T* find_partition(Graph* graph);
template <class T> T* find_partition(Graph* graph, double resolution_parameter);
double move_nodes(MutableVertexPartition* partition);
double move_nodes(MutableVertexPartition* partition, int consider_comms);

// The multiplex functions that simultaneously optimize multiple graphs and partitions (i.e. methods)
// Each node will be in the same community in all graphs, and the graphs are expected to have identical nodes
// Optionally we can loop over all possible communities instead of only the neighbours. In the case of negative
// layer weights this may be necessary.
double optimize_partition(vector<MutableVertexPartition*> partitions, vector<double> layer_weights);
double move_nodes(vector<MutableVertexPartition*> partitions, vector<double> layer_weights);
double move_nodes(vector<MutableVertexPartition*> partitions, vector<double> layer_weights, int consider_comms);

virtual ~Optimiser();

2 changes: 1 addition & 1 deletion setup.py
@@ -579,7 +579,7 @@ def read(fname):

options = dict(
name = 'louvain',
version = '0.5.1',
version = '0.5.2',
description = 'Louvain is a general algorithm for methods of community detection in large networks.',
long_description=read('README.md'),
license = 'GPLv3+',
2 changes: 1 addition & 1 deletion setup_ms.py
@@ -44,7 +44,7 @@ def read(fname):

options = dict(
name = 'louvain',
version = '0.5.1',
version = '0.5.2',
description = 'Louvain is a general algorithm for methods of community detection in large networks.',
long_description=read('README.md'),
license = 'GPLv3+',
89 changes: 82 additions & 7 deletions src/GraphHelper.cpp
@@ -232,6 +232,7 @@ void Graph::set_self_weights()

void Graph::init_admin()
{

size_t m = this->ecount();

// Determine total weight in the graph.
@@ -343,14 +344,8 @@ double Graph::weight_tofrom_community(size_t v, size_t comm, vector<size_t>* mem
igraph_neighbors(this->_graph, &neighbours, v, mode);
for (size_t i = 0; i < degree; i++)
{
size_t e = VECTOR(incident_edges)[i];
size_t u = VECTOR(neighbours)[i];

// Get the weight of the edge
double w = this->_edge_weights[e];
// Self loops appear twice here if the graph is undirected, so divide by 2.0 in that case.
if (u == v && !this->is_directed())
w /= 2.0;
// If it is an edge to the requested community
#ifdef DEBUG
size_t u_comm = (*membership)[u];
@@ -360,12 +355,19 @@ double Graph::weight_tofrom_community(size_t v, size_t comm, vector<size_t>* mem
#ifdef DEBUG
cerr << "\t" << "Sum edge (" << v << "-" << u << "), Comm (" << comm << "-" << u_comm << ") weight: " << w << "." << endl;
#endif
size_t e = VECTOR(incident_edges)[i];
// Get the weight of the edge
double w = this->_edge_weights[e];
// Self loops appear twice here if the graph is undirected, so divide by 2.0 in that case.
if (u == v && !this->is_directed())
w /= 2.0;

total_w += w;
}
#ifdef DEBUG
else
{
cerr << "\t" << "Ignore edge (" << v << "-" << u << "), Comm (" << comm << "-" << u_comm << ") weight: " << w << "." << endl;
cerr << "\t" << "Ignore edge (" << v << "-" << u << "), Comm (" << comm << ") weight: " << this->_edge_weights[VECTOR(incident_edges)[i]] << "." << endl;
}
#endif
}
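
Note on the hunks above: the change appears to move the edge-weight lookup inside the branch that matches the requested community, so edges to other communities no longer pay for it. A minimal Python sketch of that pattern follows. The name `weight_to_community` and the `incident` list of `(edge_id, neighbour)` pairs are assumptions for the example and stand in for the igraph vectors used in the C++ code.

```python
def weight_to_community(v, comm, incident, membership, edge_weights, directed=False):
    """Sum the weight of v's edges that lead into community `comm`.

    Hypothetical illustration. `incident` lists (edge_id, neighbour)
    pairs for v; an undirected self-loop is assumed to be listed twice,
    as described in the C++ comment.
    """
    total = 0.0
    for e, u in incident:
        if membership[u] != comm:
            continue  # skip before touching the weight array
        w = edge_weights[e]
        # An undirected self-loop shows up twice, so count half each time.
        if u == v and not directed:
            w /= 2.0
        total += w
    return total


membership = [0, 0, 1]
edge_weights = [2.0, 5.0, 3.0]  # e0: 0-1, e1: 0-2, e2: self-loop on 0
incident_of_0 = [(0, 1), (1, 2), (2, 0), (2, 0)]

w_comm0 = weight_to_community(0, 0, incident_of_0, membership, edge_weights)
w_comm1 = weight_to_community(0, 1, incident_of_0, membership, edge_weights)
```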
@@ -415,6 +417,79 @@ Graph::get_neighbours(size_t v, igraph_neimode_t mode)
return neighs;
}

/********************************************************************************
* This should return a random neighbour in O(1)
********************************************************************************/
size_t Graph::get_random_neighbour(size_t v, igraph_neimode_t mode)
{
size_t node=v;
size_t rand_neigh = -1;

if (this->degree(v, mode) <= 0)
throw Exception("Cannot select a random neighbour for an isolated node.");

if (igraph_is_directed(this->_graph) && mode != IGRAPH_ALL)
{
if (mode == IGRAPH_OUT)
{
// Get indices of where neighbours are
size_t cum_degree_this_node = (size_t) VECTOR(this->_graph->os)[node];
size_t cum_degree_next_node = (size_t) VECTOR(this->_graph->os)[node+1];
// Get a random index from them
size_t rand_neigh_idx = igraph_rng_get_integer(igraph_rng_default(), cum_degree_this_node, cum_degree_next_node - 1);
// Return the neighbour at that index
#ifdef DEBUG
cerr << "Degree: " << this->degree(node, mode) << " diff in cumulative: " << cum_degree_next_node - cum_degree_this_node << endl;
#endif
rand_neigh = VECTOR(this->_graph->to)[ (size_t)VECTOR(this->_graph->oi)[rand_neigh_idx] ];
}
else if (mode == IGRAPH_IN)
{
// Get indices of where neighbours are
size_t cum_degree_this_node = (size_t) VECTOR(this->_graph->is)[node];
size_t cum_degree_next_node = (size_t) VECTOR(this->_graph->is)[node+1];
// Get a random index from them
size_t rand_neigh_idx = igraph_rng_get_integer(igraph_rng_default(), cum_degree_this_node, cum_degree_next_node - 1);
#ifdef DEBUG
cerr << "Degree: " << this->degree(node, mode) << " diff in cumulative: " << cum_degree_next_node - cum_degree_this_node << endl;
#endif
// Return the neighbour at that index
rand_neigh = VECTOR(this->_graph->from)[ (size_t)VECTOR(this->_graph->ii)[rand_neigh_idx] ];
}
}
else
{
// both in- and out- neighbors in a directed graph.
size_t cum_outdegree_this_node = (size_t)VECTOR(this->_graph->os)[node];
size_t cum_indegree_this_node = (size_t)VECTOR(this->_graph->is)[node];

size_t cum_outdegree_next_node = (size_t)VECTOR(this->_graph->os)[node+1];
size_t cum_indegree_next_node = (size_t)VECTOR(this->_graph->is)[node+1];

size_t total_outdegree = cum_outdegree_next_node - cum_outdegree_this_node;
size_t total_indegree = cum_indegree_next_node - cum_indegree_this_node;

size_t rand_idx = igraph_rng_get_integer(igraph_rng_default(), 0, total_outdegree + total_indegree - 1);

#ifdef DEBUG
cerr << "Degree: " << this->degree(node, mode) << " diff in cumulative: " << total_outdegree + total_indegree << endl;
#endif
// From among in or out neighbours?
if (rand_idx < total_outdegree)
{ // From among outgoing neighbours
size_t rand_neigh_idx = cum_outdegree_this_node + rand_idx;
rand_neigh = VECTOR(this->_graph->to)[ (size_t)VECTOR(this->_graph->oi)[rand_neigh_idx] ];
}
else
{ // From among incoming neighbours
size_t rand_neigh_idx = cum_indegree_this_node + rand_idx - total_outdegree;
rand_neigh = VECTOR(this->_graph->from)[ (size_t)VECTOR(this->_graph->ii)[rand_neigh_idx] ];
}
}

return rand_neigh;
}

/****************************************************************************
Creates a graph with communities as node and links as weights between
communities.
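
Note on `get_random_neighbour`: the function reads igraph's internal index vectors (`os`, `oi`, `to` and their incoming counterparts) so that a neighbour can be picked with a single random draw instead of first building the neighbour list. The same idea can be sketched in Python with a cumulative-offset array. The names `build_out_index` and `random_out_neighbour` are hypothetical, and the sketch covers outgoing neighbours only.

```python
import random


def build_out_index(n, edges):
    """Build an index over outgoing edges (illustrative only).

    Returns (order, start): `order` holds edge ids sorted by source node,
    and the edges leaving node v are order[start[v]:start[v + 1]].
    """
    order = sorted(range(len(edges)), key=lambda e: edges[e][0])
    start = [0] * (n + 1)
    for src, _ in edges:
        start[src + 1] += 1
    for v in range(n):
        start[v + 1] += start[v]  # cumulative out-degree
    return order, start


def random_out_neighbour(v, edges, order, start, rng=random):
    """Pick a uniformly random out-neighbour of v in O(1) after indexing."""
    lo, hi = start[v], start[v + 1]
    if lo == hi:
        raise ValueError("Cannot select a random neighbour for an isolated node.")
    idx = rng.randint(lo, hi - 1)  # both bounds inclusive
    return edges[order[idx]][1]


edges = [(0, 1), (0, 2), (1, 2), (2, 0)]
order, start = build_out_index(4, edges)  # node 3 has no edges
picked = random_out_neighbour(0, edges, order, start)
```

Building the index costs one pass over the edges, but each draw afterwards is a constant number of array lookups, which is the O(1) behaviour the CHANGELOG entry describes.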
2 changes: 1 addition & 1 deletion src/MutableVertexPartition.cpp
@@ -246,7 +246,7 @@ void MutableVertexPartition::move_node(size_t v,size_t new_comm)
size_t node_size = this->graph->node_size(v);
size_t old_comm = this->_membership[v];

// Incidentally, this is indepentend of whether we take into account self-loops or not
// Incidentally, this is independent of whether we take into account self-loops or not
// (i.e. whether we count as n_c^2 or as n_c(n_c - 1). Be careful to do this before the
// adaptation of the community sizes, otherwise the calculations are incorrect.
_total_possible_edges_in_all_comms += 2.0*node_size*(this->_csize[new_comm] - this->_csize[old_comm] + node_size)/(2.0 - this->graph->is_directed());
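
Note on the comment above: the claim that the increment does not depend on the self-loop convention can be checked numerically. Expanding the squares shows that moving a node of size s changes the sum of n_c^2 by 2s(n_new - n_old + s), and the extra linear terms of n_c(n_c - 1) cancel. The sketch below compares the incremental formula from the diff with a full recomputation. The helper names are hypothetical, and the sizes are taken before the move, as the C++ comment requires.

```python
def possible_edges(sizes, directed=False, self_loops=False):
    """Total possible edges inside all communities (illustrative helper)."""
    total = sum(n * n if self_loops else n * (n - 1) for n in sizes)
    return total if directed else total / 2.0


def move_delta(size_old, size_new, node_size, directed=False):
    """Increment used in move_node, with community sizes taken before the move."""
    divisor = 1.0 if directed else 2.0  # mirrors 2.0 - is_directed()
    return 2.0 * node_size * (size_new - size_old + node_size) / divisor


# Move a node of size 1 from a community of size 4 to one of size 1.
before, after = [4, 1], [3, 2]
delta = move_delta(4, 1, 1)
full_without_loops = possible_edges(after) - possible_edges(before)
full_with_loops = possible_edges(after, self_loops=True) - possible_edges(before, self_loops=True)
```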