Today’s paper review should interest anyone working at the intersection of Machine Learning, Statistical Learning and Complex Systems. These subjects come together in the study of dynamic multilayer networks, highly complex systems that can be used to predict events and developments in a wide range of fields: cognitive psychology, social interactions, sales prediction, marketing campaigns, infectious diseases and international relations, to mention just a few. The abstract reads:
A plethora of networks is being collected in a growing number of fields, including disease transmission, international relations, social interactions, and others. As data streams continue to grow, the complexity associated with these highly multidimensional connectivity data presents new challenges. In this paper, we focus on time-varying interconnections among a set of actors in multiple contexts, called layers. Current literature lacks flexible statistical models for dynamic multilayer networks, which can enhance quality in inference and prediction by efficiently borrowing information within each network, across time, and between layers. Motivated by this gap, we develop a Bayesian nonparametric model leveraging latent space representations. Our formulation characterizes the edge probabilities as a function of shared and layer-specific actors positions in a latent space, with these positions changing in time via Gaussian processes. This representation facilitates dimensionality reduction and incorporates different sources of information in the observed data. In addition, we obtain tractable procedures for posterior computation, inference, and prediction. We provide theoretical results on the flexibility of our model. Our methods are tested on simulations and infection studies monitoring dynamic face-to-face contacts among individuals in multiple days, where we perform better than current methods in inference and prediction.
Highly complex networks such as social networks continue to be a subject where there is a gap between the interest and attention they attract and the relatively little progress made so far. The reasons include insufficient understanding, the lack of suitable methodologies, and the tradeoff between these goals and the cost and time needed to achieve them:
In modeling these highly complex networks, it is of paramount interest to learn the wiring processes underlying the observed data and to infer differences in networks’ structures across layers and times. Improved estimation of the data’s generating mechanism can refine the understanding of social processes and enhance the quality in prediction of future networks. In order to successfully accomplish these goals, it is important to define statistical models which can incorporate the different sources of information in the observed data, without affecting flexibility. However, current literature lacks similar methods, to our knowledge.
The authors, a group of researchers from universities in Europe and the United States, propose a methodology to address these issues. Their focus is a Bayesian nonparametric model that preserves flexibility in dealing with the data (heterogeneity vs. homogeneity), so that the model setup imposes few rigid assumptions. They conclude that their approach outperforms current methods in inference and out-of-sample prediction:
Motivated by this gap, we develop a Bayesian nonparametric model for dynamic multilayer networks which efficiently incorporates dependence within each network, across time and between the different layers, while preserving flexibility. Our formulation borrows network information by defining the edge probabilities as a function of pairwise similarities between actors in a latent space. In order to share information among layers without affecting flexibility in modeling layer-specific structures, we force a subset of the coordinates of each actor to be common across layers and let the remaining coordinates vary between layers. Finally, we accommodate network dynamics by allowing the actors’ coordinates to change in time and incorporate time information by modeling the dynamic actors’ coordinates via Gaussian processes (e.g. Rasmussen and Williams, 2006). Our model is tractable and has a theoretical justification, while providing simple procedures for inference and prediction. In addition, we find that our procedures outperform current methods in inference and out-of-sample prediction on both simulated and real data.
Earlier work on statistical learning with multidimensional network data took an either-or approach, treating the networks as either multilayer or dynamic. The few formulations that handle both settings at once rely on restrictive assumptions, so the two had not been merged into a single flexible model. This work manages to overcome that:
Current literature for multidimensional network data considers settings in which the multiple networks are either dynamic or multilayer. Statistical modeling of dynamic networks has focused on developing stochastic processes which are designed to borrow information between edges and across time, whereas inference for multilayer networks has motivated formulations which can suitably induce dependence between edges and across the different types of relationships, which characterize the multiple layers. These contributions have generalized exponential random graph models and latent variables models for a single network to allow inference in multidimensional frameworks, when the multiple networks arise either from dynamic or multilayer studies. These methods are valuable building blocks for more flexible models, but fall far short of the goal of providing efficient procedures in more complex settings when the networks are both dynamic and multilayer.
The routine collection of dynamic multilayer networks is a recent development and statistical modeling of such data is still in its infancy. For example, Lee and Monge (2011) considered a generalization of exponential random graph models for multilayer networks, but performed a separate analysis for each time point. Oselio et al. (2014) focused instead on a dynamic stochastic block model which borrows information across time and within each network, but forces the underlying block structures to be shared between layers. Dynamic multilayer networks are complex objects combining homogeneous structures shared between actors, layers, and smoothly evolving across time with layer-specific patterns and across-actor heterogeneity. Due to this, any procedure that fails to incorporate the different sources of information in the observed data (e.g. Lee and Monge, 2011) is expected to lose efficiency, whereas models focusing on shared patterns (e.g. Oselio et al., 2014) may lack flexibility.
More general formulations are the multilayer stochastic actor-oriented model (Snijders et al., 2013) and the multilinear tensor regression (Hoff, 2015). Snijders et al. (2013) allowed for dynamic inference on network properties within and between layers, but failed to incorporate across-actor heterogeneity. This may lead to a lack of flexibility in prediction. Hoff (2015) considered autoregressive models with the vector of parameters having a tensor factorization representation. This formulation allows for across-actor heterogeneity, but forces the model parameters to be constant across time. In addition, the parameterization of the interdependence between layers relies on homogeneity assumptions. Consistent with these methods, our representation incorporates the different types of dependencies in the observed data, but crucially preserves flexibility to avoid restrictive homogeneity assumptions.
Our motivation is drawn from epidemiologic studies monitoring hourly face-to-face contacts among individuals in a rural area of Kenya during three consecutive days. Data are available from the human sensing platform SocioPatterns (http://www.sociopatterns.org) and have been collected using wearable devices that exchange low-power radio packets when two individuals are located within a sufficiently close distance to generate a potential occasion of contagion. Leveraging this technology it is possible to measure for each hour in the three consecutive days which pairs of actors had a face-to-face proximity contact. This information is fundamental to monitor the spread of diseases and learn future patterns.
Dynamic Multilayer Latent Space Model
Despite the added effort and underlying uncertainty it entails, the authors carry out the core modeling successfully. They start from the following objective:
One major modeling objective is to carefully incorporate dependence among edges, between layers and across time, without affecting flexibility. Recalling the motivating application in Section 1.2 and Figure 1, it is reasonable to expect three main sources of information in the dynamic multilayer face-to-face contact data, summarized below.
- Network information: For example, if individual v had a face-to-face contact with both actors u and w at time ti in day k, this information may be relevant to learn the face-to-face contact behavior between u and w at time ti in day k.
- Layer information: For example, if individuals v and u had a face-to-face contact at time ti in day k, this information may be relevant to learn the contact behavior between v and u at the same time ti in other days.
- Time information: For example, if individuals v and u had a face-to-face contact at time ti in day k, this information may be relevant to learn the contact behavior between v and u at the next time ti+1 in the same day.
Incorporating such information can substantially improve the quality of inference and prediction, while facilitating dimensionality reduction and scalability. However, in reducing dimensionality and enhancing borrowing of information, it is important to avoid restrictive formulations that lead to inadequate characterizations of dynamic patterns, layer-specific structures, and across-node heterogeneity.
We accomplish the aforementioned goals via a dynamic latent bilinear model combining shared and layer-specific actors coordinates which are allowed to change in time via Gaussian process priors.
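To make this construction more concrete, here is a minimal simulation sketch in Python (NumPy). It is not the authors' code. The logistic link, the squared-exponential kernel, the dimensions R and H, and all function names (se_kernel, draw_gp_paths, simulate_networks) are assumptions made for illustration. The idea is that each actor has R shared coordinates and H layer-specific coordinates, each evolving in time as a Gaussian process path, and that the log-odds of an edge is a time-varying baseline plus the two bilinear similarity terms.

```python
import numpy as np


def se_kernel(times, length_scale=2.0, jitter=1e-6):
    """Squared-exponential covariance on the time grid (an assumed kernel choice)."""
    diff = times[:, None] - times[None, :]
    return np.exp(-(diff / length_scale) ** 2) + jitter * np.eye(len(times))


def draw_gp_paths(rng, cov, shape):
    """Draw independent zero-mean GP paths; the result has shape (*shape, T)."""
    chol = np.linalg.cholesky(cov)
    white = rng.standard_normal(shape + (cov.shape[0],))
    return white @ chol.T  # each path now has covariance chol @ chol.T = cov


def simulate_networks(V=10, K=3, T=8, R=2, H=2, seed=0):
    """Simulate K layers of undirected binary networks on V actors at T times."""
    rng = np.random.default_rng(seed)
    cov = se_kernel(np.arange(T, dtype=float))

    baseline = draw_gp_paths(rng, cov, ())          # (T,)
    shared = draw_gp_paths(rng, cov, (V, R))        # (V, R, T), common to all layers
    specific = draw_gp_paths(rng, cov, (K, V, H))   # (K, V, H, T), one set per layer

    # Bilinear similarities between every pair of actors at every time.
    sim_shared = np.einsum("vrt,urt->tvu", shared, shared)         # (T, V, V)
    sim_layer = np.einsum("kvht,kuht->ktvu", specific, specific)   # (K, T, V, V)

    # Assumed logistic link from log-odds to edge probabilities.
    logits = baseline[None, :, None, None] + sim_shared[None] + sim_layer
    probs = 1.0 / (1.0 + np.exp(-logits))

    # Sample the strict upper triangle, then mirror it: no self-loops, symmetric.
    upper = np.triu(rng.random(probs.shape) < probs, k=1)
    edges = (upper | np.swapaxes(upper, -1, -2)).astype(int)
    return probs, edges


probs, Y = simulate_networks()
print(Y.shape)  # (layers, times, actors, actors)
```

The array is indexed as Y[layer, time, actor, actor], so the three sources of information listed above map onto its axes: network information lives in the last two axes, layer information in the first, and time information in the second.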
Discussion and Conclusion
After comparing the in-sample and out-of-sample results, the authors present their main topics for further discussion and the main conclusions of the research. These point in a good direction for improving the methods used for statistical learning in dynamic settings with multilayer networks (state-of-the-art information processing):
The increasing availability of multidimensional, complex, and dynamic information on social interaction processes, motivates a growing demand for novel statistical models. In order to successfully enhance quality in inference and prediction, these models need to efficiently incorporate the complex set of dependencies in the observed data, without affecting flexibility. Motivated by this consideration and by epidemiological studies monitoring disease transmission via dynamic face-to-face interactions, we have developed a Bayesian nonparametric model for dynamic multilayer networks leveraging latent space representations. In order to preserve flexibility and borrow information across layers and time, we modeled the edge probabilities as a function of shared and layer-specific latent coordinates which evolve in time via Gaussian process priors. We provided theoretical support for our model and developed simple procedures for posterior computation and formal prediction. Finally, we illustrated on both simulated data and on infection studies monitoring face-to-face contacts that our methods perform better than competitors in terms of inference and prediction.
Although we focus on face-to-face interaction networks collected at multiple times and days, our methodology has a broad range of potential applications. Notable examples include dynamic cooperations among countries with respect to different types of international relations, time-varying interactions between researchers according to multiple forms of academic collaborations and dynamic contacts between terrorists in relation to different types of dark interactions. In all these relevant applications, our flexible methodology can provide an appealing direction in accurately learning and predicting hidden wiring mechanisms and their implication in several environments and phenomena.
In addition, our contribution motivates further directions of research. An important one is to facilitate scaling to wider time grids by reducing the Gaussian process computational complexity of order O(n^3). A possible strategy to successfully address this issue is to consider more scalable processes such as the low-rank approximations to the Gaussian process or the nested Gaussian process. Another important generalization is accommodating the contact counts instead of just a binary variable indicating presence or absence of face-to-face interactions. These data contain potentially more information than binary networks and may provide more refined prevention policies. In accomplishing this goal one possibility is to adapt the methodology proposed in Canale and Dunson (2011) to our framework and assume the weighted edges are realizations from a rounded Gaussian whose mean is factorized as in equation .
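To illustrate the scaling remark, here is a small Python sketch of one common low-rank construction, a Nyström-style approximation built on m inducing time points. This is my own illustrative choice and not necessarily the low-rank approximation the authors have in mind. The kernel, the length scale, the number of inducing points and the function names are all assumptions. The n x n covariance K is replaced by B B^T with B of size n x m, so the expensive factorization costs O(m^3) instead of O(n^3).

```python
import numpy as np


def se_kernel(a, b, length_scale=4.0):
    """Squared-exponential covariance between two sets of time points (assumed kernel)."""
    return np.exp(-((a[:, None] - b[None, :]) / length_scale) ** 2)


def low_rank_factor(times, n_inducing=12, length_scale=4.0, jitter=1e-6):
    """Return B (n x m) such that B @ B.T = K_nm K_mm^{-1} K_mn, which approximates K."""
    inducing = np.linspace(times.min(), times.max(), n_inducing)
    k_mm = se_kernel(inducing, inducing, length_scale) + jitter * np.eye(n_inducing)
    k_nm = se_kernel(times, inducing, length_scale)
    chol = np.linalg.cholesky(k_mm)          # O(m^3), with m much smaller than n
    # B = K_nm L^{-T}, computed as (L^{-1} K_mn)^T.
    return np.linalg.solve(chol, k_nm.T).T


times = np.linspace(0.0, 24.0, 200)          # a fine time grid, n = 200
B = low_rank_factor(times)                   # m = 12 inducing points

# An approximate GP path: f = B z with z standard normal has covariance B B^T.
rng = np.random.default_rng(0)
path = B @ rng.standard_normal(B.shape[1])

# Worst-case entrywise gap between the exact and the low-rank covariance.
gap = np.max(np.abs(B @ B.T - se_kernel(times, times)))
print(path.shape, gap)
```

The approximation is only good when the inducing points are spaced more finely than the kernel's length scale, so in practice m has to grow with the roughness of the latent trajectories.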
Please refer to the full paper for further details. It has a nice appendix section with the full proofs of the theorems and the computational and experimental setup of the research.