
Timetable (SNAW02)

Network science and its applications

Monday 22nd August 2016 to Friday 26th August 2016

Thursday 25th August 2016
09:00 to 09:50 Registration
09:50 to 10:00 Welcome from John Toland (INI Director) INI 1
10:00 to 10:40 Yi Yu
Estimating whole brain dynamics using spectral clustering
The estimation of time-varying networks for functional Magnetic Resonance Imaging (fMRI) data sets is of increasing importance and interest. In this work, we formulate the problem in a high-dimensional time series framework and introduce a data-driven method, namely Network Change Points Detection (NCPD), which detects change points in the network structure of a multivariate time series, with each component of the time series represented by a node in the network. NCPD is applied to various simulated data and a resting-state fMRI data set. This new methodology also allows us to identify common functional states within and across subjects. Finally, NCPD promises to offer a deep insight into the large-scale characterisations and dynamics of the brain.  This is joint work with Ivor Cribben (Alberta School of Business).
10:40 to 11:00 Pariya Behrouzi
Detecting Epistatic Selection in the Genome of RILs via a latent Gaussian Copula Graphical Model
Recombinant Inbred Lines (RILs) derived from divergent parental lines can display extensive segregation distortion and long-range linkage disequilibrium (LD) between distant loci on the same or different chromosomes. These genomic signatures are consistent with epistatic selection having acted on entire networks of interacting parental alleles during inbreeding. The reconstruction of these interaction networks from observations of pair-wise marker-marker correlations or pair-wise genotype frequency distortions is challenging, as multiple testing approaches are under-powered and true long-range LD is difficult to distinguish from drift, particularly in small RIL panels. Here we develop an efficient method for reconstructing an underlying network of genomic signatures of high-dimensional epistatic selection from multi-locus genotype data. The network captures the conditionally dependent short- and long-range LD structure of RIL genomes and thus reveals "aberrant" marker-marker associations that are due to epistatic selection rather than gametic linkage. The network estimation relies on penalized Gaussian copula graphical models, which account for a large number of markers p and a small number of individuals n. A multi-core implementation of our algorithm makes it feasible to estimate the graph in high dimensions (max markers ~ 3000). We demonstrate the efficiency of the proposed method on simulated datasets as well as on genotyping data in A. thaliana and maize.
11:00 to 11:30 Morning Coffee
11:30 to 12:10 Ginestra Bianconi
Multiplex networks
Multiplex networks describe interacting systems where the same set of nodes is linked by different types of interactions. Multiplex networks include social networks, infrastructures and biological systems. Characterizing and modelling the structure of multiplex networks is fundamental for solving network inference problems. Here we will discuss recent results showing that multiplex networks encode more information than their single layers taken in isolation. In fact, they are characterized by strongly correlated structures that reveal important statistical properties of the complex system that they describe.
12:10 to 12:30 Mirko Signorelli
Modelling community structure in the Italian Parliament: a penalized inference approach
In many parliamentary systems, bills can be proposed by a single parliamentarian, or cosponsored by a group of parliamentarians. In the latter case, bill cosponsorship defines a symmetric relation that can be taken as a measure of ideological agreement between parliamentarians.  
Political scientists have often analysed bill cosponsorship networks in the US Congress, assessing its community structure and the behaviour of minorities therein. In this talk, I will consider data on bill cosponsorship in the Italian Chamber of Deputies over the last 15 years. If compared to the US Congress, a distinguishing feature of the Italian Chamber is the presence of a large number of political groups: the primary purpose of the analysis is thus to infer the pattern of collaborations between these groups.  

We consider a stochastic blockmodel for edge-valued graphs that views bill cosponsorship as the result of a Poisson process, which explicitly depends on membership of parliamentary groups. As the number of model parameters increases quickly with the number of groups, we pursue a penalized likelihood approach to model estimation that enables us to infer a sparse reduced graph, which summarizes relations between parliamentary groups.  

Besides showing the effects of gender and geographic proximity on bill cosponsorship, the analysis points out the evolution from a highly polarized political arena, in which Deputies base collaborations on their identification with left or right-wing values, towards an increasingly fragmented Parliament, where a rigid separation of political groups into coalitions does not seem to hold any more, and collaborations beyond the perimeter of coalitions become possible.

Joint work with Ernst Wit.

Related links: (arXiv preprint)
12:30 to 13:30 Lunch @ Wolfson Court
13:30 to 14:10 Matteo Barigozzi
Networks, Dynamic Factors, and the Volatility Analysis of High-Dimensional Financial Series
Co-author: Marc Hallin (ECARES-ULB)

We consider weighted directed networks for analysing large panels of financial volatilities. For a given horizon $h$, the weight associated with edge $(i,j)$ represents the $h$-step-ahead forecast error variance of variable $i$ accounted for by innovations in variable $j$. To challenge the curse of dimensionality, we decompose the panel into a factor (market) driven component and an idiosyncratic one modelled by means of a sparse VAR. Inversion of the VAR, together with suitable identification restrictions, produces the estimated network, by means of which we can assess how systemic each firm is. An analysis of the U.S. stock market demonstrates the prominent role of financial firms as a source of contagion during the 2007-2008 crisis.
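
For orientation, the edge weights described in this abstract have the form of a standard forecast error variance decomposition. The display below is a textbook sketch, not necessarily the normalisation used in the talk. It assumes the idiosyncratic component has a moving-average representation $\sum_{k \ge 0} B_k u_{t-k}$ with orthonormal innovations $u_t$; the symbols $B_k$, $u_t$, $n$ and $w_{ij}^{h}$ are introduced here for illustration only.

```latex
w_{ij}^{h}
  = \frac{\sum_{k=0}^{h-1} \bigl[(B_k)_{ij}\bigr]^{2}}
         {\sum_{k=0}^{h-1} \sum_{l=1}^{n} \bigl[(B_k)_{il}\bigr]^{2}},
  \qquad \sum_{j=1}^{n} w_{ij}^{h} = 1 .
```

Under these assumptions each row of weights sums to one, so $w_{ij}^{h}$ reads as the share of the $h$-step-ahead forecast error variance of variable $i$ attributed to shock $j$.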
14:10 to 14:50 Daniele Durante
Bayesian modeling of networks in complex business intelligence problems
Co-authors: Sally Paganin (University of Padova, Dept. of Statistical Sciences), Bruno Scarpa (University of Padova, Dept. of Statistical Sciences), David B. Dunson (Duke University, Dept. of Statistical Science)

Complex network data problems are increasingly common in many fields of application. Our motivation is drawn from strategic marketing studies monitoring customer choices of specific products, along with co-subscription networks encoding multiple purchasing behavior. Data are available for several agencies within the same insurance company, and our goal is to efficiently exploit co-subscription networks to inform targeted advertising of cross-sell strategies to currently mono-product customers. We address this goal by developing a Bayesian hierarchical model, which clusters agencies according to common mono-product customer choices and co-subscription networks. Within each cluster, we efficiently model customer behavior via a cluster-dependent mixture of latent eigenmodels. This formulation provides key information on mono-product customer choices and multiple purchasing behavior within each cluster, informing targeted cross-sell strategies. We develop simple algorithms for tractable inference, and assess performance in simulations and an application to business intelligence.

14:50 to 15:30 Luca De Benedictis
Implementing Propensity Score Matching with Network Data: The effect of GATT on bilateral trade
Co-authors: Bruno Arpino, Alessandra Mattei  

Motivated by the evaluation of the causal effect of the General Agreement on Tariffs and Trade on bilateral international trade flows, we investigate the role of network structure in propensity score matching under the assumption of strong ignorability. We study the sensitivity of causal inference with respect to the presence of characteristics of the network in the set of confounders conditional on which strong ignorability is assumed to hold. We find that estimates of the average causal effect are highly sensitive to the presence of node-level network statistics in the set of confounders. Therefore, we argue that estimates may suffer from omitted variable bias when the relational dimension of units is ignored, at least in our application.
15:30 to 16:00 Afternoon Tea
16:00 to 16:20 Nynke Niezink
Modeling the dynamics of social networks and continuous actor attributes
Co-author: Tom Snijders (University of Groningen)

Social networks and the characteristics of the actors who constitute these networks are not static; they evolve interdependently over time. People may befriend others with similar political opinions or change their own opinion based on that of their friends. The stochastic actor-oriented model is used to statistically model such dynamics. We will present an extension of this model for continuous dynamic actor characteristics. The method available until now assumed actor characteristics to be measured on an ordinal categorical scale, which yielded practical limitations for applied researchers. We now model the interdependent dynamics by a stochastic differential equation for the attribute evolution and a Markov chain model for the network evolution. Although the model is too complicated to calculate likelihoods or estimators in closed form, the stochastic evolution process can be easily simulated. Therefore, we estimate model parameters using the method of moments and the Robbins-Monro algorithm for stochastic approximation. We will illustrate the proposed method by a study of the relation between friendship and obesity, analyzing body mass index as a continuous dynamic actor attribute.
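
The estimation idea named in this abstract (match simulated statistics to observed ones by stochastic approximation) can be sketched on a toy problem. The code below is a minimal illustration, not the authors' implementation: the function names, the one-parameter "simulator" and the target value 2.0 are all hypothetical, and the real model simulates a network and attribute process rather than a single Gaussian draw.

```python
import random


def robbins_monro(noisy_g, theta0, n_iter=5000, a=1.0, seed=0):
    """Seek theta with E[noisy_g(theta)] = 0 using step sizes a / n."""
    rng = random.Random(seed)
    theta = theta0
    for n in range(1, n_iter + 1):
        # Move against the noisy moment discrepancy with a shrinking step.
        theta -= (a / n) * noisy_g(theta, rng)
    return theta


def simulated_minus_observed(theta, rng):
    """Toy moment equation (assumption): the simulated statistic has
    mean theta and unit variance; the observed statistic is 2.0."""
    simulated_statistic = rng.gauss(theta, 1.0)
    return simulated_statistic - 2.0


estimate = robbins_monro(simulated_minus_observed, theta0=0.0)
print(round(estimate, 2))
```

The step sizes a / n sum to infinity while their squares have a finite sum, which is the classical condition under which the iterates settle at the root despite the simulation noise.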
16:20 to 17:00 Katherine McLaughlin
Analysis of Networks with Missing Data with Application to the National Longitudinal Study of Adolescent Health
Co-authors: Krista J. Gile (University of Massachusetts at Amherst), Mark S. Handcock (University of California, Los Angeles)

It is common in the analysis of social network data to assume that it represents a census of the networked population of interest. Often the data result from sampling of the networked population via a known mechanism. However, most social network analysis ignores the problem of missing data by including only actors with complete observations. In this talk we address the modeling of networks with missing data, developing previous ideas in missing data, network modeling, and network sampling. We show the value of the mean value parametrization to study differences between modeling approaches. We also develop goodness-of-fit techniques to better understand model fit. The ideas are motivated by an analysis of a friendship network from the National Longitudinal Study of Adolescent Health. The work presented is by Krista J. Gile and Mark S. Handcock.
19:30 to 22:00 Formal Dinner at Corpus Christi College
Friday 26th August 2016
09:00 to 09:40 Alberto Roverato
The Networked Partial Correlation and its Application to the Analysis of Genetic Interactions
Genetic interactions confer robustness on cells in response to genetic perturbations. This often occurs through molecular buffering mechanisms that can be predicted using, among other features, the degree of coexpression between genes, commonly estimated through marginal measures of association such as Pearson or Spearman correlation coefficients. However, marginal correlations are sensitive to indirect effects, and partial correlations are often used instead. Yet, partial correlations convey no information about the (linear) influence of the coexpressed genes on the entire multivariate system, which may be crucial to discriminate functional associations from genetic interactions. To address these two shortcomings, here we propose to use the edge weight derived from the covariance decomposition over the paths of the associated gene network. We call this new quantity the networked partial correlation and use it to analyze genetic interactions in yeast. More concretely, we apply it to the well-characterized leucine biosynthesis pathway and to a previously published data set of genome-wide quantitative genetic interaction profiles. In both cases, networked partial correlations substantially improve the identification of genetic interactions over classical coexpression measures.
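
The contrast between marginal and partial correlation drawn in this abstract can be seen on a three-variable chain. The sketch below uses the standard identity that partial correlations are rescaled, sign-flipped entries of the inverse covariance matrix. It does not compute the networked partial correlation proposed in the talk, and the example matrix and function names are made up for illustration.

```python
import numpy as np


def marginal_correlations(cov):
    """Pearson correlations implied by a covariance matrix."""
    d = np.sqrt(np.diag(cov))
    return cov / np.outer(d, d)


def partial_correlations(cov):
    """Correlation of each pair given all remaining variables."""
    precision = np.linalg.inv(cov)
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor


# Hypothetical chain X1 - X2 - X3: the zero in the precision matrix
# means X1 and X3 are conditionally independent given X2.
precision = np.array([[1.0, -0.5, 0.0],
                      [-0.5, 1.25, -0.5],
                      [0.0, -0.5, 1.0]])
cov = np.linalg.inv(precision)

mcor = marginal_correlations(cov)
pcor = partial_correlations(cov)
print("marginal corr(X1, X3):", round(float(mcor[0, 2]), 3))
print("partial  corr(X1, X3):", round(float(abs(pcor[0, 2])), 3))
```

Here X1 and X3 are marginally correlated only through X2, so the marginal measure reports an association that the partial measure removes.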
09:40 to 10:20 Reza Mohammadi
Bayesian modelling of Dupuytren disease using Gaussian copula graphical models
Co-authors: Fentaw Abegaz (University of Liege, Belgium), Edwin van den Heuvel (Eindhoven University of Technology, The Netherlands), Ernst Wit (University of Groningen, The Netherlands)

Dupuytren disease is a fibroproliferative disorder with unknown etiology that often progresses and eventually can cause permanent contractures of the affected fingers. In this talk, we provide a computationally efficient Bayesian framework to discover potential risk factors and investigate which fingers are jointly affected. Our Bayesian approach is based on Gaussian copula graphical models, which provide a way to discover the underlying conditional independence structure of variables in multivariate mixed data. In particular, we combine the semiparametric Gaussian copula with extended rank likelihood to analyse multivariate mixed data with arbitrary marginal distributions. For the structural learning, we construct a computationally efficient search algorithm using a trans-dimensional MCMC algorithm based on a birth-death process. In addition, to make our statistical method easily accessible to other researchers, we have implemented our method in C++ and provide an interface with R software as an R package BDgraph, which is freely available online. 
10:20 to 10:40 Silvia Ioana Fierascu
Applying network science to political problems. A conceptual and analytical framework for understanding and predicting corruption risks in business-political networks
This short talk is a summary of my in-progress PhD dissertation, “The Network Phenomenon of State Capture. Network Dynamics, Unintended Consequences, and Political-Business Elite Relations in Hungary.” The thesis is a critique of the typical conceptual and methodological approaches in political science to studying institutionalized grand corruption. It proposes a novel conceptual and analytical framework, rooted in network theory and using network scientific research designs and methods, to better understand the complex case of a successful post-communist democracy turning hybrid regime - Hungary. To this end, I analyze the formation, evolution, and development of different types of business-political elite and organizational networks, and their effects on the quality of the state and the market, from the beginning of the democratic regime until today’s illiberal democracy. Using two large empirical datasets - interlocking directorates (business-political elite and organizational networks, 1990-2010) and corruption risks in issuer-winner public procurement networks (2009-2012) - I model these multi-mode, dynamic, and projected networks using statistical methods for network data (e.g., comparisons to random models, motif detection, ERGMs) and machine learning algorithms (e.g., regression trees, random forests). In this presentation, I will showcase some of the main findings and future research plans. The study is part of a broader research agenda - building a robust conceptual and (network) analytical framework for a large cross-country analysis of corruption risks in public procurement, with data-driven and evidence-based policy recommendations. I will end the talk by highlighting some of the successes, challenges, and questions I have encountered in applying network science to better understand political problems.
10:40 to 11:00 Gwenael Leday
Incorporating biological information into network inference using structured shrinkage
High-throughput biotechnologies such as microarrays provide the opportunity to study the interplay between molecular entities, which is central to the understanding of disease biology. The statistical description and analysis of this interplay is naturally carried out with Gaussian graphical models in which nodes represent molecular variables and edges between them represent interactions. Inferring the edge set is, however, a challenging task as the number of parameters to estimate easily is much larger than the sample size. A conventional remedy is to regularize or penalize the model likelihood. In network models, this is often done locally in the neighbourhood of each node. However, estimation of the many regularization parameters is often difficult and can result in large statistical uncertainties. We show how to combine local regularization with global shrinkage of the regularization parameters, via empirical Bayes (EB), to borrow strength between nodes and improve inference. Furthermore, we show how one can use EB so the level of regularization may differ across an arbitrary number of predefined groups of interactions. Such auxiliary information is often available in Biology. It is shown that accurate prior information can greatly improve the reconstruction of the network, but need not harm the reconstruction if wrong.
11:00 to 11:30 Morning Coffee
11:30 to 12:10 Juliane Manitz
Source Estimation for Propagation Processes on Complex Networks with an Application to Delays in Public Transportation Systems
Co-authors: Jonas Harbering (University of Göttingen), Marie Schmidt (Erasmus University Rotterdam), Thomas Kneib (University of Göttingen), Anita Schöbel (University of Göttingen)

The correct identification of the source of a propagation process is crucial in many research fields. As a specific application, we consider source estimation of delays in public transportation networks. We propose two approaches: an effective distance median and a backtracking method. The former is based on the structurally generic effective distance-based approach for the identification of infectious disease origins, and the latter is specifically designed for delay propagation. We examine the performance of both methods in simulation studies and in an application to the German railway system, and compare the results to a centrality-based approach for source detection.
12:10 to 12:30 Carsten Chong
Contagion in Financial Systems: A Bayesian Network Approach
We conduct a probabilistic analysis for a structural default model of interconnected financial institutions. For all possible network structures we characterize the joint default distribution of the system using Bayesian network methodologies. Particular emphasis is given to the treatment and consequences of cyclic financial linkages. We further demonstrate how Bayesian network theory can be applied to detect contagion channels within the financial network, to measure the systemic importance of selected entities on others, and to compute conditional or unconditional probabilities of default for single or multiple institutions.
12:30 to 13:30 Lunch @ Wolfson Court
13:30 to 14:10 Ben Parker
Optimal Design of Experiments on Connected Units with Application to Social Networks
Co-authors: Steven G. Gilmour and John Schormans  

When experiments are performed on social networks, it is difficult to justify the usual assumption of treatment-unit additivity, due to the connections between actors in the network. We investigate how connections between experimental units affect the design of experiments on those experimental units. Specifically, where we have unstructured treatments, whose effects propagate according to a linear network effects model which we introduce, we show that optimal designs are no longer necessarily balanced; we further demonstrate how experiments which do not take a network effect into account can lead to much higher variance than necessary and/or a large bias. We show the use of this methodology in a very wide range of experiments in agricultural trials, and crossover trials, as well as experiments on connected individuals in a social network.
14:10 to 14:50 Eric Laber
On-line estimation of an optimal treatment allocation strategy for the control of white-nose syndrome in ba
Co-authors: Nick J. Meyer (North Carolina State University), Brian J. Reich (North Carolina State University), Krishna Pacifici (North Carolina State University), John Drake (University of Georgia), Jaime Collazo (North Carolina State University)

Emerging infectious diseases are responsible for high morbidity and mortality, cause economic damage to affected countries, and are a major vulnerability for global stability. Technological advances have made it possible to collect, curate, and access large amounts of data on the progression of an infectious disease. We derive a framework for using this data in real time to inform disease management. We formalize a treatment allocation strategy as a sequence of functions, one per treatment period, that map up-to-date information on the spread of an infectious disease to a subset of locations for treatment. An optimal allocation strategy optimizes some cumulative outcome, e.g., the number of uninfected locations, the geographic footprint of the disease, or the cost of the epidemic. Estimation of an optimal allocation strategy for an emerging infectious disease is challenging because spatial proximity induces interference among locations, because the number of possible allocations is exponential in the number of locations, and because disease dynamics and intervention effectiveness are unknown at outbreak. We derive a Bayesian online estimator of the optimal allocation strategy that combines simulation-optimization with Thompson sampling. The proposed estimator performs favorably in simulation experiments. This work is motivated by and illustrated using data on the spread of white-nose syndrome, a highly fatal infectious disease devastating bat populations in North America.
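
Thompson sampling, one ingredient of the estimator described in this abstract, can be sketched on a deliberately simplified problem. The code below is a plain Beta-Bernoulli bandit that chooses one of a few hypothetical treatment options per period; it ignores spatial interference and the simulation-optimization step, and the success probabilities and function name are invented for illustration.

```python
import random


def thompson_sampling(true_success, n_rounds=2000, seed=0):
    """Beta-Bernoulli Thompson sampling; returns how often each option was chosen."""
    rng = random.Random(seed)
    k = len(true_success)
    successes = [1] * k  # Beta(1, 1) prior on each option's success rate
    failures = [1] * k
    pulls = [0] * k
    for _ in range(n_rounds):
        # Draw one plausible success rate per option from its posterior.
        draws = [rng.betavariate(successes[i], failures[i]) for i in range(k)]
        # Act as if the draw were the truth: pick the best-looking option.
        arm = max(range(k), key=lambda i: draws[i])
        # Observe a simulated outcome and update that option's posterior.
        if rng.random() < true_success[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
        pulls[arm] += 1
    return pulls


# Hypothetical effectiveness of three treatment options, unknown to the learner.
pulls = thompson_sampling([0.2, 0.5, 0.8])
print(pulls)
```

Sampling from the posterior, rather than always using its mean, keeps some exploration of uncertain options while concentrating effort on those that appear most effective.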

14:50 to 15:10 Steffen Lauritzen
What COSTNET can do for and with you!
Statistical Network Science is one of the hot topics of the moment, as evidenced by this Isaac Newton Programme. Network phenomena are widespread and a multitude of network data formats appear in a variety of applications. Often the applied scientist has only a limited awareness of the available modelling and inference techniques. We will describe infectious disease phenomena, social network models and genetic regulatory networks.

COSTNET is a European Cooperation in Science and Technology funded initiative to bring together quantitative network modellers in Europe and beyond to collaborate between 2016 and 2020 on joint, capacity-building projects to push the boundary of statistical network science. As part of the MC committee of this COST Action, I would like to invite interested parties to join and be part of this initiative.
15:10 to 15:30 Organisers wrap-up INI 1
15:30 to 16:00 Afternoon Tea