EPFL

Algo+LMA

- Run experiments to demonstrate that the probability of an error (in identification of mapping between two graphs) is small and the proposed function of error measure behave better then the other functions (e.g. degree distribution matching)
- Read the paper “Bootstrap percolation on G(n,p)” by S.Janson et al. Formulated the model for a propagation de-anonymization algorithm by analogy with the paper.

- Presented an introduction to a Graphical models in the Netdynx reading group
- Working on social networks de-anonymization problem. Establishing a result of feasibility of large scale attacks on the networks. Searching for function measuring an error in the graph matching. Finding this function would allow us to identify a correct mapping among the set of all the possible mappings and thus, to de-anonymize the network.
- Trying to identify common drawbacks of propagation in networks algorithms such us: local consistency vs global consistency (propagation of an error), an oscillation or a stop before finding a result.

- Working on creation of the new graph matching algorithm. The key idea is to adopt bootstrap percolation theory to the graphs matching problem. The algorithm starts with a predefined small set of correct matching pairs named “seeds” and it matches a pair nodes if it has several already mapped neighbors. Firstly, we work on the theoretical part. In particular we want to establish a threshold on the initial size of the seed set.
- Continue working on the feasibility of network de-anonymization. We assume that network is sampled from G(n,p) twice each node independently with probability t. Assuming that sizes of the networks are equal their expected sizes (nt), we prove that the “correct” matching is the maximal matching preserving isomorphism.

- Continue working on the propagation graph mapping algorithm. I show that with condition that G(n,p) network is dense enough (p*n is greater than log n) the probability to get a wrong matching is going to 0 a.a.s.
- Finalizing “De-anonymization of Social Networks. Node sampling model” result. With proposed assumption we show that the probability of existing non-correct mapping preserving isomorphic is 0 a.a.s.

- Working on the propagation graph mapping algorithm.
- Finishing the result on “De-anonymization of Social Networks. Node sampling model”. Establishing conditions on density and similarity of two networks to make de-anonymization feasible.

- Working on the propagation graph mapping algorithm. We prove that there is a phase transition on the size of the initial seed set. We show that if the size of the seed set S is greater than threshold S* than the algorithm propagates and outputs a correct mapping, if S<S* algorithm dies out on early phase. We made an explicit form expression for S*.
- Exploring applications of matching algorithms for Gene Networks. Reading “Gene regulatory network inference: Data integration in dynamic models” by M. Hecker et al.

- Experimenting on and implementing the basic version of the propagation graph mapping algorithm.

We implement and run algorithm on artificial G(n,p) graph to observe phase transitions proved in theoretical part.

- Exploring applications of matching algorithms for Gene and Protein Networks. We consider a problem of probabilistic inferring network. We explored main characteristic of gene networks, such as motifs and we try to build probabilistic framework to apply to large gene-regulatory-network (GRN) inferring problem

- Experimenting on the propagation graph mapping algorithm on real graphs. We took EPFL e-mail exchange dataset, build two overlapping version of it via sampling model and run the algorithm. The results are very promising in terms of low error rate and observed phase transitions

- Preparing journal paper with Pedram Pedarsani on de-anonymization social networks. Union of Node and Edge sampling models.
- Running experiments of the algorithm on the real and artificial (regular) graphs.

- Optimizing algorithm's accuracy. Instead of adding all the new matchings that have a number mapped neighbors over threshold, we add only one matching with maximal number mapped neighbors each time step.
- We meet the problem that the algorithm takes too much memory for large graphs. Working on memory optimization and running the algorithm on Condor cluster.

- Run experiments on graph matching algorithm on the large real datasets. We took a publicly available SlashDot social network (80K nodes) and run the propagation algorithm on it.
- Work on summarizing results.
- Combine the algorithm with another probabilistic algorithm to obtain a small initial seed set.

- Participating in “Data for Development”-D4D an open data challenge from Orange. Having datasets of user calls for several month in Ivory Coat, we are modeling epidemic spread in country

- Running simulation of the newer versions of the propagation algorithm on large scale artificial and real datasets using Condor cluster.

- Orange-d4d challenge: We introduced a mobility pattern based on place of antenas where users make calls. We use SIR model to build an epidemic process. We are developing a set of recommended measures to mitigate epidemics within a fixed budget.

- Finalizing Orange-d4d challenge project. We proposed three micromeasures which allow significantly slow-down epidemic. Prepare submission report.
- Preparing materials for a new Internet Analytics course

- Summarizing results of expirements on the propagation graph mapping algorithm
- Preparing materials for a new Internet Analytics course

Algorithms, Autumn-2012.

Internet Analytics, Spring-2013