Skip to content

somisettynaresh/TargetedInfluenceMaximization

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

15 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

TargetedInfluenceMaximization

The objective of influence maximization problem is to find a set of highly influential nodes that maximizes the spread of influence in a social network. Such a set of nodes is called seed set. Targeted labeled influence maximization problem is an extension that attempts to find a seed set that maximizes influence among certain labeled nodes. However, in certain application areas such as market and political sciences, it is desirable to limit the spread of influence on certain set of nodes while maximizing the influence spread among different set of nodes. Motivated by this, in this work we formulate and study Constrained Targeted Influence Maximization problem where a network has two types of nodes -- targets and non-targets. For a given k and theta, the objective is to find a k size seed set which maximizes the influence over the targets and keeps the influence over the non-targets within the threshold theta.

Getting Started

This project consists of implementations of various algorithms for influence maximization problem.

Prerequisites

Java 1.8.

IDE IntelliJ/Eclipse

Table Of Contents

Modules

Our framework consists of four major modules.

1. Graph Generator Module:

This module is responsible for generating a graph from a file, generating labels for the nodes and converting undirected graph to a directed graph.

2. Non-Target Estimator Module:

This module is responsible for estimating the nontargets for every node in the graph (Phase 1). This module consists of two strategies (implementations) to estimate the non-targets: 1) Using Simulation 2) Reachability Tests on random DAG's.

3. Max Influential Node Module:

This module is responsible for finding the max influential node among the given set of nodes (crucial sub step in Phase 2). This module has three strategies : 1) Using Simulation 2) Reachability Tests on random DAG's 3) Degree Discount Heuristic.

4. IMTree Module:

This module is responsible for constructing the IMTree based on the strategy input and finding the seed set from constructed IMTree.This module has two variations for constructing the IMTree - one with pruning and other without pruning. This module also consists of a utility to read/write IMTree from/to a file.

Usage

Creating Graph :

Given a file (edges, vertices seperated by tab space) and propagation probability (0.0 - 1.0f), we can create a graph using the following

FileDataReader wikiVoteDataReader = new FileDataReader(filename, probability);
DirectedGraph graph = wikiVoteDataReader.createGraphFromData();

We used datasets from SNAP.

Generating Graph Labels :

Given a graph file and percentage (0.0f - 1.0f), we label the given percentage of vertices as A and other vertices as B at random.

GenerateGraphLabels generateGraphLabels = new GenerateGraphLabels();
generateGraphLabels.generateLabels("DBLP.txt",0.8f);

The generated file can be found in the main project directory with filename of the format DBLP_0.8_labels.txt

We can use the above generated label file and update the labels in the DirectedGraph using the following

FileDataReader wikiVoteDataReader = new FileDataReader(filename, probability);
DirectedGraph graph = wikiVoteDataReader.createGraphFromData();
ReadLabelsFromFile readLabels = new ReadLabelsFromFile();
readLabels.read(graph, label_file);

Independent Cascade Model :

This is the implementation of diffusion model, Independent Cascade Model, proposed by Kempe. et al in the paper Maximizing the Spread of Influence through a Social Network. Given a graph, seed set and number of monto carlo simulations we perform diffusion and outputs the activated nodes at the end of diffusion. The results are averaged among all the simulations. Our implementation supports caching, if you already have an activated set from previous experiments, you can pass the already activated set as the last paramter. If we pass an empty set, our algorithm assumes no cache and do the diffusion completely.

Set<Integer> activeSet = IndependentCascadeModel.performDiffusion(graph,seedSet,10000, new HashSet<>());

Algorithms for Influence Maximization problem

This sections consists of various implementations of influence maximization algorithms.

Greedy

This is the implemetation of Greedy algorithm for influence maximization problem described in the paper Maximizing the Spread of Influence through a Social Network. Given a graph, budget, target labels (make every label as target labels to ignore targets) and number of simulations, greedy algorithm outputs the seed set of the budget size.

Greedy greedy = new Greedy();
Set<Integer> seedSet = greedy.findSeedSet(graph, budget, targetLabels, 10000);

CELF Greedy

This is the implemetation of CELF Greedy algorithm for influence maximization problem described in the paper Cost-effective Outbreak Detection in Networks. Given a graph, budget, target labels (make every label as target labels to ignore targets) and number of simulations, celf greedy algorithm outputs the seed set of the budget size.

CELFGreedy celfGreedy = new CELFGreedy();
Set<Integer> seedSet = celfGreedy.findSeedSet(graph, budget, targetLabels, 10000);

Degree Discount

This is the implemetation of Degree Discount algorithm for influence maximization problem described in the paper Efficient Influence Maximization in Social Networks. Given a graph, budget, target labels (make every label as target labels to ignore targets), degree discount algorithm outputs the seed set of the budget size.

DegreeDiscount dd = new DegreeDiscount();
Set<Integer> seedSet = dd.findSeedSet(graph, budget, targetLabels);

Algorithms for CTIM problem :

This sections consists of various implementations of constrained targeted influence maximization algorithms.

Baseline Greedy :

Given a graph, budget, nonTargetThreshold, target labels , non target labels and number of simulations, baseline greedy algorithm outputs the seed set.

NaiveGreedy greedy = new NaiveGreedy();
Set<Integer> seedSet = greedy.findSeedSet(graph, budget, nonTargetThreshold, targetLabels, nonTargetLabels, noOfSimulations);

DAG Baseline :

Given a graph, budget, nonTargetThreshold, target labels , non target labels and number of simulations, dag baseline algorithm outputs the seed set.

DAGBaselineGreedy dagBaseline = new DAGBaselineGreedy();
Set<Integer> seedSet = dagBaseline.findSeedSet(graph, budget, nonTargetThreshold, targetLabels, nonTargetLabels, noOfSimulations);

Two Phase Implementation for CTIM :

Simlulator is the main class for simulating the two phase implementation for the ctim problem. When you execute the Simulator.java , it takes the following as the input

Enter Graph File Name
ca-GrQc.txt
Enter the propagation probability
0.05
Enter percentage of A's to be in Graph
80
Enter budget of seed set
20
Enter non target threshold
10
Enter the NonTargetsEstimate filename
results\ca-GrQc-data.dat
Enter the Influence Maximization Strategy (1-6)
1

Based on the input of Influence Maximization Strategy, simulator runs the corresponding implementations of the two phase implementation. The Simulator writes the constucted the Influence Maximization Tree in a file for which it prints the filename. For the two phase implementation, we can reuse the result of phase 1 (estimating non-targets) using the parameter NonTargetsEstimate filename in the simulator.

Phase 1- Estimating Non Targets:

This is the implementation of Phase 1. We can generate the estimate file using the following strategies.

Estimating Non-Targets Using Simulation:

EstimateNonTargets egreedy = new EstimateNonTargetsUsingGreedy();
egreedy.estimate(graph, nonTargetLabels, 10000);

Estimating Non-Targets Using DAG:

EstimateNonTargets edag = new EstimateNonTargetsUsingRandomDAG();
edag.estimate(graph, nonTargetLabels, 10000);

Phase 2 Constructing the IMTree :

The following are the various Influence Maximization Stratagies

GREEDY_ESTIMATOR_AND_GREEDY_INFLUENTIAL(1),
GREEDY_ESTIMATOR_AND_DEGREE_DISCOUNT_INFLUENTIAL(2),
GREEDY_ESTIMATOR_AND_RANDOM_DAG_INFLUENTIAL(3),
GREEDY_ESTIMATOR_AND_CELF_INFLUENTIAL(4),
RANDOM_DAG_ESTIMATOR_AND_GREEDY_INFLUENTIAL(5),
RANDOM_DAG_ESTIMATOR_AND_DEGREE_DISCOUNT_INFLUENTIAL(6),
RANDOM_DAG_ESTIMATOR_AND_RANDOM_DAG_INFLUENTIAL(7),
RANDOM_DAG_ESTIMATOR_AND_CELF_INFLUENTIAL(8);

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages