2013

For abstracts, see below.

31/1, Mikhail Lifshits, St. Petersburg State University and Linköping University, Small deviation probabilities and their interplay with operator theory and Bayesian statistics.
28/2, Peter Jagers, Extinction - a Never Ending Story. Abstract: All populations must die out - or do they? Glimpses from an ongoing controversy, from Galton (1873) to recent quarrels (2013).
14/3, Anders Sandberg, Future of Humanity Institute, Oxford, Probing the Improbable: Methodological Challenges for Risks with Low Probabilities and High Stakes.
4/4, Daniel Johansson, Physical Resource Theory, Chalmers, Climate sensitivity: Learning from observations.
11/4, Johan Johansson, Chalmers, On the BK inequality. 
16/4, Alexandra Jauhiainen, Inferring Regulatory Networks by Combining Perturbation Screens and Steady State Gene Expression Profiles.
2/5, Maryam Zolghadr and Sergei Zuyev, Chalmers, Optimal design of dilution experiments under volume constraints.
21/5, Johan Wallin, Lund, Spatial Matérn fields generated by non-Gaussian noise.
23/5, Youri Davydov, Université Lille-1, On convex hulls of sequences of stochastic processes.
28/5, Patrik Rydén, Department of Mathematics and Mathematical Statistics and Computational Life Science Cluster (CLiC), Umeå University, Analysis of high-dimensional genomics data - challenges and opportunities.
13/6, Fima Klebaner, Monash, Evaluations of expectations of functionals of diffusions by simulations.
29/8, Jesper Möller, Aalborg University, Determinantal point process models and statistical inference.
12/9, Christos Dimitrakakis, Chalmers, ABC Reinforcement Learning.
19/9, Jeff Steif, Strong noise sensitivity and Erdős–Rényi random graphs.
24/9, Ben Morris, University of California, Mixing time of the card-cyclic to random shuffle.
26/9, Ben Morris, University of California, Mixing time of the overlapping cycles shuffle and square lattice rotations shuffle.
3/10, Malwina Luczak, Queen Mary, University of London, The stochastic logistic epidemic.
10/10, Johan Tykesson, Chalmers, The Poisson cylinder model.
15/10, Manuel García Magariños, UDC, Spain, A new parametric approach to kinship testing.
17/10, Tanja Stadler, ETH, Phylogenetics in action: Uncovering macro-evolutionary and epidemiological dynamics based on molecular sequence data.
17/10, Gordon Slade, University of British Columbia, Weakly self-avoiding walk in dimension four.
22/10, Gennady Martynov, Inst. for Information Transmission Problems, RAS, Moscow, Cramér-von Mises Gaussianity test for random processes on [0,1].
31/10, Alexandre Proutiere, KTH, Bandit Optimisation with Large Strategy Sets and Applications.
7/11, Boualem Djehiche, KTH, On the subsolution approach to efficient importance sampling.
12/11, Ioannis Papastathopoulos, Bristol, Graphical structures in extreme multivariate events.
14/11, Yu. Belyaev, Umeå University, Consistent estimates of distribution characteristics and their accuracies based on interval data.
21/11, Annika Lang, How does one computationally solve a stochastic partial differential equation?
28/11, Arne Pommerening, Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf, Switzerland, What are the differences between competition kernels and traditional size-ratio based competition indices used in plant ecology?
5/12, Kaspar Stucki, University of Göttingen, Germany, Continuum percolation for Gibbs point processes.
12/12, Olle Häggström, Are all ravens black? The problem of induction.
19/12, Marianne Månsson, Astra Zeneca, Statistical paradoxes and curiosities in medical applications.
 

 
31/1, Mikhail Lifshits, St. Petersburg State University: Small deviation probabilities and their interplay with operator theory and Bayesian statistics
 
Abstract:
Small deviation, or small ball, probability means P(||X|| < r) as r tends to zero, where X is a random element of a Banach space. Typically X is the trajectory of a random process such as the Wiener process, fractional Brownian motion, or a Lévy process, while ||.|| is a norm on a function space. There is no general technique for evaluating small deviation probabilities, but in some important cases interesting links lead from small deviations to the entropy of linear operators, the eigenvalues of Sturm-Liouville problems, etc. We will discuss these links, supply examples, and review some applications to Bayesian statistics.
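
As a toy illustration of the quantity in question (ours, not part of the talk), a small ball probability for the Wiener process can be estimated by crude Monte Carlo on a discrete grid; all parameter choices below are invented.

import numpy as np

rng = np.random.default_rng(0)

def small_ball_estimate(r, n_paths=10_000, n_steps=500):
    # Crude Monte Carlo estimate of P(sup_{0<=t<=1} |W(t)| < r) for a
    # standard Wiener process, discretized on n_steps grid points.
    dt = 1.0 / n_steps
    increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    paths = np.cumsum(increments, axis=1)
    return (np.abs(paths).max(axis=1) < r).mean()

for r in [1.0, 0.5, 0.25]:
    print(f"P(||W||_sup < {r}) ~ {small_ball_estimate(r):.4f}")

The estimator degrades quickly as r decreases: the event becomes rare, which is exactly why the asymptotic tools discussed in the talk are needed.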
 
 
14/3, Anders Sandberg, Future of Humanity Institute, Oxford, Probing the Improbable: Methodological Challenges for Risks with Low Probabilities and High Stakes
 
Abstract:
Some risks have extremely high stakes. For example, a worldwide pandemic or asteroid impact could potentially kill more than a billion people. Comfortingly, scientific calculations often put very low probabilities on the occurrence of such catastrophes. In this paper, we argue that there are important new methodological problems which arise when assessing global catastrophic risks and we focus on a problem regarding probability estimation. When an expert provides a calculation of the probability of an outcome, they are really providing the probability of the outcome occurring, given that their argument is watertight. However, their argument may fail for a number of reasons such as a flaw in the underlying theory, a flaw in the modeling of the problem, or a mistake in the calculations. If the probability estimate given by an argument is dwarfed by the chance that the argument itself is flawed, then the estimate is suspect. We develop this idea formally, explaining how it differs from the related distinctions of model and parameter uncertainty. Using the risk estimates from the Large Hadron Collider as a test case, we show how serious the problem can be when it comes to catastrophic risks and how best to address it. This is joint work with Toby Ord and Rafaela Hillerbrand.
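
A back-of-the-envelope version of the argument, with invented numbers: by the law of total probability, the reported estimate only bounds the risk when the argument is sound, and the flaw term can dominate.

# All numbers below are hypothetical, purely for illustration.
p_given_sound  = 1e-12  # expert's computed probability, assuming a watertight argument
p_flawed       = 1e-4   # chance the argument itself is flawed
p_given_flawed = 1e-6   # a guess at the risk conditional on a flawed argument

p_total = (1 - p_flawed) * p_given_sound + p_flawed * p_given_flawed
print(f"{p_total:.2e}")  # ~1e-10: dominated by the flaw term, not by the expert's 1e-12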
 
 
4/4, Daniel Johansson, Physical Resource Theory, Chalmers: Climate sensitivity: Learning from observations
 
Abstract:
Although some features of climate change are known with relative certainty, many uncertainties in climate science remain. The most important pertains to the climate sensitivity (CS), i.e., the equilibrium increase in global mean surface temperature that follows from a doubling of the atmospheric CO2 concentration. A probability distribution for the CS can be estimated from the observational record of global mean surface temperatures and ocean heat uptake, together with estimates of anthropogenic and natural radiative forcings. However, since the CS is statistically dependent on other uncertain factors, such as the direct and indirect radiative forcing of aerosols, it is difficult to constrain this distribution from observations. The primary aim of this presentation is to analyse how the distribution of the climate sensitivity changes over time as the observational record becomes longer. We use a Bayesian Markov chain Monte Carlo approach together with an upwelling-diffusion energy balance model. We will also briefly discuss how sensitive the climate sensitivity estimate is to changes in the structure of the geophysical model and to changes in the observational time series.
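
To make the method concrete, here is a minimal sketch (ours, not the speaker's code) of random-walk Metropolis sampling of the climate sensitivity in a deliberately crude zero-dimensional stand-in for the energy balance model; the data, prior and parameter values are all invented.

import numpy as np

rng = np.random.default_rng(1)

F2x = 3.7                                # radiative forcing of doubled CO2 (W/m^2)
years = np.arange(100)
forcing = 0.03 * years                   # invented forcing trajectory (W/m^2)
S_true, sigma = 3.0, 0.1                 # "true" sensitivity (K) and obs noise
temp_obs = S_true / F2x * forcing + rng.normal(0, sigma, years.size)

def log_post(S):
    if S <= 0 or S > 20:                 # flat prior on (0, 20] K
        return -np.inf
    resid = temp_obs - S / F2x * forcing
    return -0.5 * np.sum(resid**2) / sigma**2

S, samples = 3.0, []
for _ in range(20_000):                  # random-walk Metropolis
    S_prop = S + rng.normal(0, 0.2)
    if np.log(rng.uniform()) < log_post(S_prop) - log_post(S):
        S = S_prop
    samples.append(S)

post = np.array(samples[5_000:])         # discard burn-in
print(f"mean {post.mean():.2f} K, 90% interval "
      f"({np.quantile(post, 0.05):.2f}, {np.quantile(post, 0.95):.2f})")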


11/4, Johan Johansson, Chalmers: On the BK inequality
 
Abstract:
A family of binary random variables is said to have the BK property if, loosely speaking, for any two events that are increasing in the random variables, the probability that they occur disjointly is at most the product of the probabilities of the two events. The classical BK inequality states that this holds if the random variables are independent. Since the BK property is stronger than negative association, it is a form of negative dependence, and one would expect other negatively dependent families to have the BK property. This has turned out to be quite a challenge, and until very recently no substantial examples besides the independent case were known. In this talk I will give two of these examples, the k-out-of-n measure and pivotal sampling, and sketch how to prove the BK inequality for them. I will also mention a few seemingly "simple" questions and how solutions to these would be profoundly important.
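
To make the disjoint-occurrence operation concrete, the brute-force check below (our illustration, feasible only for tiny n) verifies the classical BK inequality for independent bits and two increasing events, using the fact that for increasing events the witness sets can be taken among the coordinates equal to 1.

from itertools import combinations, product

n, p = 5, 0.5

def A(x): return sum(x) >= 2          # increasing event: at least two 1s
def B(x): return sum(x) >= 2

def indicator(S):
    return tuple(1 if i in S else 0 for i in range(n))

def disjointly(x):
    # x is in "A box B" iff there exist disjoint witness sets S, T among
    # the coordinates where x equals 1 (valid since A and B are increasing).
    ones = [i for i in range(n) if x[i] == 1]
    subs = [frozenset(c) for r in range(len(ones) + 1)
            for c in combinations(ones, r)]
    return any(A(indicator(S)) and B(indicator(T))
               for S in subs for T in subs if not S & T)

def prob(x): return p**sum(x) * (1 - p)**(n - sum(x))

P_box = sum(prob(x) for x in product((0, 1), repeat=n) if disjointly(x))
P_A = sum(prob(x) for x in product((0, 1), repeat=n) if A(x))
print(P_box, P_A**2, P_box <= P_A**2)   # 0.1875  0.66...  True (BK holds)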
 
 

16/4, Alexandra Jauhiainen: Inferring Regulatory Networks by Combining Perturbation Screens and Steady State Gene Expression Profiles

Abstract:
Reconstructing transcriptional regulatory networks is an important task in functional genomics. Data obtained from experiments that perturb genes by knockouts or RNA interference contain useful information for addressing this reconstruction problem. However, such data can be limited in size and/or expensive to acquire. On the other hand, observational data of the organism in steady state (e.g. wild-type) are more readily available, but their informational content is inadequate for the task at hand. We develop a computational approach to appropriately utilize both data sources for estimating a regulatory network.

The proposed approach is based on a three-step algorithm to estimate the underlying directed but cyclic network, which uses as input both perturbation screens and steady state gene expression data. In the first step, the algorithm determines causal orderings of the genes that are consistent with the perturbation data, by combining an exhaustive search method with a fast heuristic that in turn couples a Monte Carlo technique with a fast search algorithm. In the second step, for each ordering, a regulatory network is estimated using a penalized likelihood based method, while in the third step a consensus network is constructed from the highest-scoring ones. Extensive computational experiments show that the algorithm performs well in uncovering the underlying network and clearly outperforms competing approaches that rely only on a single data source. Further, it is established that the algorithm produces a consistent estimate of the regulatory network.
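
As a rough illustration of the flavour of the second step only (not the authors' implementation), one can regress each gene on its predecessors in a candidate ordering with an L1 penalty and read nonzero coefficients as candidate regulatory edges; the data and numbers below are synthetic.

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(2)
n_samples, n_genes = 100, 6
X = rng.normal(size=(n_samples, n_genes))    # toy steady-state expression data
X[:, 3] += 0.8 * X[:, 0] - 0.6 * X[:, 1]     # plant two regulatory effects

ordering = [0, 1, 2, 3, 4, 5]                # one candidate causal ordering
edges = []
for k in range(1, n_genes):
    gene, parents = ordering[k], ordering[:k]
    fit = Lasso(alpha=0.1).fit(X[:, parents], X[:, gene])
    edges += [(par, gene, round(c, 2))
              for par, c in zip(parents, fit.coef_) if abs(c) > 1e-6]

print(edges)   # should recover the edges into gene 3 from genes 0 and 1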


2/5, Maryam Zolghadr and Sergei Zuyev, Chalmers: Optimal design of dilution experiments under volume constraints

Abstract:
We develop methods to construct a one-stage design of dilution experiments under the total available volume constraint typical for bio-medical applications. We consider different optimality criteria based on the Fisher information, in both non-Bayesian and Bayesian settings. It turns out that the optimal design is typically one-atomic, meaning that all the dilutions should be of the same size. Our proposed approach to solving such optimization problems is variational analysis of functionals of a measure. The advantage of the measure optimization approach is that additional requirements, such as a total cost of the experiment, can easily be incorporated into the goal function.
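
The one-atomic conclusion can be checked numerically in a toy version of the problem (our formulation, not the authors'): tubes with a binary growth/no-growth readout, Poisson-distributed organism counts, and a fixed total volume. Starting from unequal volumes, a constrained optimizer typically drives the design back to equal ones.

import numpy as np
from scipy.optimize import minimize

lam, V, k = 1.0, 10.0, 8      # assumed concentration, total volume, number of tubes

def fisher_info(v):
    # Fisher information about lam from tubes of volumes v with a binary
    # growth / no-growth readout, where P(no growth) = exp(-lam * v).
    q = np.exp(-lam * v)
    return np.sum(v**2 * q / (1 - q))

rng = np.random.default_rng(3)
v0 = rng.uniform(0.5, 2.0, size=k)
v0 *= V / v0.sum()                            # random feasible starting design
res = minimize(lambda v: -fisher_info(v), v0, method="SLSQP",
               bounds=[(1e-6, V)] * k,
               constraints={"type": "eq", "fun": lambda v: v.sum() - V})
print(np.round(res.x, 3))                     # the volumes come out (nearly) equal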

 

21/5, Johan Wallin, Lund: Spatial Matérn fields generated by non-Gaussian noise

Abstract:
In this work, we study non-Gaussian extensions of a recently discovered link between certain Gaussian random fields, expressed as solutions to stochastic partial differential equations, and Gaussian Markov random fields. We show how to construct efficient representations of non-Gaussian random fields generated by generalized asymmetric Laplace noise and normal inverse Gaussian noise, and discuss parameter estimation and spatial prediction for these models. Finally, we look at an application to precipitation data from the US.
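
A schematic 1-D analogue (ours; the actual construction treats the driving noise much more carefully) replaces the Gaussian noise in the SPDE link for Matérn fields by i.i.d. Laplace noise on a grid.

import numpy as np

rng = np.random.default_rng(4)

n, kappa = 500, 10.0                 # grid points on [0,1], inverse range
h = 1.0 / n

# Second-difference (Laplacian) matrix with Dirichlet boundaries.
main, off = -2.0 * np.ones(n), np.ones(n - 1)
D = (np.diag(main) + np.diag(off, 1) + np.diag(off, -1)) / h**2

# Discretized SPDE (kappa^2 - Laplacian) X = noise, with Laplace noise
# in place of the Gaussian noise of the classical Matern construction.
A = kappa**2 * np.eye(n) - D
z = rng.laplace(0.0, 1.0, size=n) / np.sqrt(h)   # white-ish Laplace noise
x = np.linalg.solve(A, z)

print(x[:5])   # one realization of a (non-Gaussian) Matern-type field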
 
 
23/5, Youri Davydov, Université Lille-1: On convex hulls of sequences of stochastic processes

Abstract:
Let X_i = {X_i(t), t in T} be i.i.d. copies of a d-dimensional process X = {X(t), t in T}, where T is a general separable metric space. Assume that X has a.s. bounded paths and consider the convex hulls W_n constructed from the trajectories of the X_i's. We study the existence of a limit shape W for the sequence {W_n} normalised by appropriate constants b_n. We show that in the case of Gaussian processes, W_n/b_n converges a.s. to a nonrandom W, whereas for processes satisfying a regular variation condition the convergence is in law and the limit set W is in many cases a random polytope.

 

28/5, Patrik Rydén, Department of Mathematics and Mathematical Statistics and Computational Life Science Cluster (CLiC), Umeå University: Analysis of high-dimensional genomics data - challenges and opportunities

Abstract:
High throughput technologies in life science such as high-throughput DNA and RNA sequencing, gene expression arrays, mass spectrometry, ChIP-chip and methylation arrays have allowed genome-wide measurements of complex cellular responses for a broad range of treatments and diseases. The modern technologies are powerful, but in order for them to reach their full potential new statistical tools need to be developed.
I will discuss pre-processing of microarray data (the discussion will also be relevant for other techniques), how pre-processing affects downstream cluster analysis, and why cluster analysis of samples (e.g. tumour samples) often fails to cluster the samples in a relevant manner. Finally, I will give my view on the future of the field of genomics research and what role statisticians can play.

 

13/6, Fima Klebaner, Monash, Evaluations of expectations of functionals of diffusions by simulations

Abstract:
We consider the problem of evaluating expectations by simulations. After a brief introduction, we point out that there is a problem with the standard approach if the functional in question is not continuous. Evaluation of the probability of absorption (or ruin probability) by simulations is shown as an example. We give a modification of the standard Euler-Maruyama scheme to obtain convergence. Open problems still remain. This is joint work with Pavel Chigansky, Hebrew University.
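
The talk's modification is not reproduced here, but the sketch below (ours) shows the underlying issue and one standard remedy: a naive Euler-Maruyama grid check misses absorptions that happen between grid points, while a Brownian-bridge crossing correction recovers the exact value, which is known in closed form for Brownian motion.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)

x0, T, n_steps, n_paths = 0.3, 1.0, 50, 200_000
dt = T / n_steps

X = np.full(n_paths, x0)
hit_naive = np.zeros(n_paths, dtype=bool)
hit_bridge = np.zeros(n_paths, dtype=bool)
for _ in range(n_steps):
    X_new = X + np.sqrt(dt) * rng.normal(size=n_paths)
    hit_naive |= (X_new <= 0)                 # only checks the grid points
    # Brownian-bridge probability of dipping below 0 between grid points:
    p_cross = np.exp(-2 * np.clip(X, 0, None) * np.clip(X_new, 0, None) / dt)
    hit_bridge |= (X_new <= 0) | (rng.uniform(size=n_paths) < p_cross)
    X = X_new

exact = 2 * norm.cdf(-x0 / np.sqrt(T))        # P(BM from x0 hits 0 before T)
print(f"naive grid check : {hit_naive.mean():.4f}")
print(f"bridge-corrected : {hit_bridge.mean():.4f}")
print(f"exact            : {exact:.4f}")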

 

29/8, Jesper Möller, Aalborg University, Determinantal point process models and statistical inference

Abstract:
Statistical models and methods for determinantal point processes (DPPs) seem largely unexplored, though they possess a number of appealing properties and have been studied in mathematical physics, combinatorics, and random matrix theory. We demonstrate that DPPs provide useful models for the description of repulsive spatial point processes, particularly in the 'soft-core' case. Such data are usually modelled by Gibbs point processes, where the likelihood and moment expressions are intractable and simulations are time consuming. We exploit the appealing probabilistic properties of DPPs to develop parametric models, where the likelihood and moment expressions can be easily evaluated and realizations can be quickly simulated. We discuss how statistical inference is conducted using the likelihood or moment properties of DPP models, and we provide freely available software for simulation and statistical inference.
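
For intuition, here is a generic sampler for a DPP on a finite ground set via the spectral algorithm of Hough et al.; this is our sketch, not the authors' software, and the kernel is an arbitrary toy choice.

import numpy as np

rng = np.random.default_rng(6)

def sample_dpp(K):
    # Spectral sampler for a DPP on {0,...,N-1} with symmetric kernel K
    # whose eigenvalues lie in [0,1].
    eigvals, eigvecs = np.linalg.eigh(K)
    keep = rng.uniform(size=eigvals.size) < eigvals   # Bernoulli(lambda_i)
    V = eigvecs[:, keep]
    points = []
    while V.shape[1] > 0:
        p = (V**2).sum(axis=1)              # squared row norms
        p /= p.sum()
        i = rng.choice(len(p), p=p)         # next point of the sample
        points.append(i)
        j = np.argmax(np.abs(V[i]))         # a column with V[i, j] != 0
        V = V - np.outer(V[:, j], V[i] / V[i, j])   # zero out row i
        V = np.delete(V, j, axis=1)
        if V.shape[1]:
            V, _ = np.linalg.qr(V)          # re-orthonormalize columns
    return sorted(points)

x = np.linspace(0, 1, 60)
K = np.exp(-((x[:, None] - x[None, :]) / 0.05)**2)   # toy similarity kernel
K = 0.9 * K / np.linalg.eigvalsh(K).max()            # eigenvalues into [0,1)
print(sample_dpp(K))   # repulsive: sampled indices tend to be well spread out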
The work has been carried out in collaboration with Ege Rubak, Aalborg University, and Frederic Lavancier, University of Nantes. The paper is available at arXiv:1205.4818.

 

12/9, Christos Dimitrakakis, Chalmers, ABC Reinforcement Learning

Abstract:
We introduce a simple, general framework for likelihood-free Bayesian reinforcement learning, through Approximate Bayesian Computation (ABC). The main advantage is that we only require a prior distribution on a class of simulators. This is useful in domains where a probabilistic model of the underlying process is too complex to formulate, but where detailed simulation models are available. ABC-RL allows the use of any Bayesian reinforcement learning technique in this case. In fact, it can be seen as an extension of simulation methods to both planning and inference.
We experimentally demonstrate the potential of this approach in a comparison with LSPI. Finally, we introduce a theorem showing that ABC is sound.
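
The plain ABC rejection step at the heart of such methods is easy to state in code; the sketch below (our toy example, with an invented simulator and summary statistic) shows it outside the reinforcement learning setting.

import numpy as np

rng = np.random.default_rng(7)

theta_true = 2.5
data_obs = rng.normal(theta_true, 1.0, size=200)   # "observed" data
s_obs = data_obs.mean()                            # summary statistic

def simulator(theta):
    # Black box: we can sample from it but have no tractable likelihood.
    return rng.normal(theta, 1.0, size=200)

eps, accepted = 0.05, []
for _ in range(50_000):
    theta = rng.uniform(0, 10)                     # draw from the prior
    if abs(simulator(theta).mean() - s_obs) < eps: # keep near-matching draws
        accepted.append(theta)

post = np.array(accepted)
print(f"{post.size} accepted, posterior mean ~ {post.mean():.2f}")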

 

19/9, Jeff Steif, Strong noise sensitivity and Erdős–Rényi random graphs

Abstract: Noise sensitivity concerns the question of when complicated events involving many i.i.d. random variables are (or are not) sensitive to small perturbations in these variables.
The Erdős–Rényi random graph is the graph obtained by taking n vertices and connecting each pair of vertices independently with probability p_n.
This random graph displays very interesting behaviour. We will discuss some recent results concerning noise sensitivity for events involving the Erdős–Rényi random graph. This is joint work with Eyal Lubetzky.
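
The quantity under study can be estimated by simulation: sample the graph, resample each potential edge independently with probability eps, and measure the covariance of an event between the two correlated copies. The event and parameters below are our toy choices, not results from the paper.

import numpy as np

rng = np.random.default_rng(8)

n, eps, trials = 60, 0.2, 3000
p = 1.61 / n                 # tuned so that P(G(n,p) contains a triangle) ~ 1/2

def has_triangle(U):
    A = (U | U.T).astype(np.int64)
    return np.trace(A @ A @ A) > 0     # trace(A^3) = 6 * number of triangles

iu = np.triu_indices(n, k=1)
f, f_eps = np.zeros(trials, dtype=bool), np.zeros(trials, dtype=bool)
for t in range(trials):
    U = np.zeros((n, n), dtype=bool)
    U[iu] = rng.uniform(size=iu[0].size) < p
    W = U.copy()                       # eps-noise: resample each edge w.p. eps
    mask = rng.uniform(size=iu[0].size) < eps
    W[iu] = np.where(mask, rng.uniform(size=iu[0].size) < p, U[iu])
    f[t], f_eps[t] = has_triangle(U), has_triangle(W)

cov = (f & f_eps).mean() - f.mean() * f_eps.mean()
print(f"P(f) = {f.mean():.3f}, cov(f, f_eps) = {cov:.4f}")
# Noise sensitivity of a sequence of events means this covariance tends
# to 0 as n grows, for any fixed eps > 0.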

 

24/9, Ben Morris, University of California, Mixing time of the card-cyclic to random shuffle

Abstract: We analyse the following method for shuffling n cards. First, remove card 1 (i.e., the card with label 1) and then re-insert it randomly into the deck. Then repeat with cards 2, 3,..., n. Call this a round. R. Pinsky showed, somewhat surprisingly, that the mixing time is greater than one round. We show that in fact the mixing time is on the order of log n rounds. Joint work with Weiyang Ning and Yuval Peres.
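
One round of the shuffle is straightforward to simulate (our sketch):

import numpy as np

rng = np.random.default_rng(9)

def one_round(deck):
    # One round of the card-cyclic to random shuffle: remove the card
    # labelled 1 and reinsert it uniformly at random; then card 2; ...; card n.
    n = len(deck)
    deck = list(deck)
    for label in range(1, n + 1):
        deck.remove(label)
        deck.insert(rng.integers(0, n), label)   # n possible insertion slots
    return deck

deck = list(range(1, 11))
for r in range(3):
    deck = one_round(deck)
    print(f"after round {r + 1}: {deck}")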

 

26/9, Ben Morris, University of California, Mixing time of the overlapping cycles shuffle and square lattice rotations shuffle

Abstract: The overlapping cycles shuffle, invented by Johan Jonasson, mixes a deck of n cards by moving either the nth card or (n-k)th card to the top of the deck, with probability half each. Angel, Peres and Wilson determined the spectral gap for the location of a single card and found the following surprising behaviour. Suppose that k is the closest integer to cn for a fixed c in (0,1). Then for rational c, the spectral gap is on the order of n^{-2}, while for poorly approximable irrational numbers c, such as the reciprocal of the golden ratio, the spectral gap is on the order of n^{-3/2}. We show that the mixing time for all the cards exhibits the same behaviour (up to logarithmic factors), proving a conjecture of Jonasson.
   The square lattice rotations shuffle, invented by Diaconis, is defined as follows. The cards are arrayed in a square. At each step a row or column is chosen, uniformly at random, and then cyclically rotated by one unit. We find the mixing time of this shuffle to within logarithmic factors. Joint work with Olena Blumberg.

 

3/10, Malwina Luczak, Queen Mary, University of London, The stochastic logistic epidemic

Abstract: This talk concerns one of the simplest and most studied models of the spread of an epidemic within a population. The model has two parameters, the infection rate and the recovery rate, and the behaviour is very different depending on which is larger.  We focus on the case where the recovery rate is greater, when the epidemic is doomed to die out quickly.  But exactly how quickly turns out to be a thorny problem.

(joint work with Graham Brightwell)
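
A minimal Gillespie simulation of the stochastic logistic (SIS) epidemic in the subcritical case discussed above takes a few lines; the rates and parameters below are our illustrative choices.

import numpy as np

rng = np.random.default_rng(10)

def extinction_time(N=1000, I0=100, lam=0.8, mu=1.0):
    # Gillespie simulation of the stochastic logistic (SIS) epidemic:
    # infection rate lam*I*(N-I)/N, recovery rate mu*I; subcritical if lam < mu.
    t, I = 0.0, I0
    while I > 0:
        rate_inf = lam * I * (N - I) / N
        rate_rec = mu * I
        total = rate_inf + rate_rec
        t += rng.exponential(1.0 / total)
        I += 1 if rng.uniform() < rate_inf / total else -1
    return t

times = [extinction_time() for _ in range(200)]
print(f"mean extinction time ~ {np.mean(times):.2f} (subcritical case)")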

 

10/10, Johan Tykesson, The Poisson cylinder model

Abstract:
We consider a Poisson point process on the space of lines in R^d, where a multiplicative factor u>0 of the intensity measure determines the density of lines. Each line in the process is taken as the axis of a bi-infinite solid cylinder of radius 1. We show that there is a phase transition in the parameter u regarding the existence of infinite connected components in the complement of the union of the cylinders. We also show that given any two cylinders c_1 and c_2 in the process, one can find a sequence of d-2 other cylinders which creates a connection between c_1 and c_2. 

The talk is based on joint works with Erik Broman and David Windisch.

 

15/10, Manuel García Magariños, UDC, Spain, A new parametric approach to kinship testing

Abstract:
Determination of family relationships from DNA data goes back decades. Statistical inference of relationships has traditionally followed a likelihood-based approach. In forensic science, hypothesis testing is usually formulated verbally in order to be accessible to non-experts. Nonetheless, this formulation lacks a proper mathematical parameterization, leading to controversy in the field. We propose an alternative hypothesis testing framework based on the likelihood calculations for pairwise relationships of Thompson (1975). This is in turn based on the concept of identity-by-descent (IBD) genes shared between individuals. Pairwise relationships can be specified by (k0,k1,k2), the probabilities that two individuals share 0, 1 or 2 alleles IBD. The developed approach allows one to build a complete framework for statistical inference: point estimation, hypothesis testing and confidence regions for (k0,k1,k2). Theoretical properties have been studied. An extension to trios has been carried out in order to handle common problems in forensics. Results indicate that the hypothesis testing procedure is quite powerful, especially with trios. Accurate point estimates of (k0,k1,k2) are obtained. This holds even for a low number of markers and intricate relationships. Extensions to more than three individuals and inbreeding cases remain to be developed.
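
For a single biallelic marker, the pairwise likelihood as a function of (k0,k1,k2) is a short computation; the sketch below (our illustration, with invented genotypes and allele frequencies) compares two hypotheses by a likelihood ratio.

import numpy as np

def joint_prob(g1, g2, k, p):
    # P(G1=g1, G2=g2) for one biallelic marker (allele A has frequency p),
    # given IBD coefficients k = (k0, k1, k2); cf. Thompson (1975).
    # Genotypes are coded as the number of A alleles: 0, 1 or 2.
    q = 1 - p
    hw = {2: p**2, 1: 2*p*q, 0: q**2}                  # Hardy-Weinberg
    # P(G2 | G1) when exactly one allele is shared IBD:
    T = {2: {2: p,   1: q,           0: 0},
         1: {2: p/2, 1: (p + q)/2,   0: q/2},
         0: {2: 0,   1: p,           0: q}}
    k0, k1, k2 = k
    return hw[g1] * (k0*hw[g2] + k1*T[g1][g2] + k2*(g1 == g2))

# Likelihood ratio of "parent-child" (k=(0,1,0)) vs "unrelated" (k=(1,0,0))
# over a few invented markers: (genotype1, genotype2, frequency of allele A).
markers = [(2, 1, 0.3), (1, 1, 0.5), (2, 2, 0.2), (0, 1, 0.6)]
LR = np.prod([joint_prob(g1, g2, (0, 1, 0), p) /
              joint_prob(g1, g2, (1, 0, 0), p) for g1, g2, p in markers])
print(f"LR(parent-child : unrelated) = {LR:.2f}")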

 

17/10, 13.15-14.30, Tanja Stadler, ETH, Phylogenetics in action: Uncovering macro-evolutionary and epidemiological dynamics based on molecular sequence data

Abstract:
What factors determine speciation and extinction dynamics? How can we explain the spread of an infectious disease? In my talk, I will discuss computational advances in order to address these key questions in the field of macro-evolution and epidemiology. In particular, I will present phylogenetic methodology to infer (i) macro-evolutionary processes based on species phylogenies shedding new light on mammal and bird diversification, and (ii) epidemiological processes based on genetic sequence data from pathogens shedding new light on the spread of HCV and HIV.

 

17/10, Gordon Slade, University of British Columbia, Weakly self-avoiding walk in dimension four

Abstract: We report on recent and ongoing work on the continuous-time weakly self-avoiding walk on the 4-dimensional integer lattice, with focus on a proof that the susceptibility diverges at the critical point with a logarithmic correction to mean-field scaling.  The method of proof, which is of independent interest, is based on a rigorous renormalisation group analysis of a supersymmetric field theory representation of the weakly self-avoiding walk.  The talk is based on collaborations with David Brydges, and with Roland Bauerschmidt and David Brydges.

 

22/10, Prof. Gennady Martynov, Inst. for Information Transmission Problems, RAS, Moscow: Cramér-von Mises Gaussianity test for random processes on [0,1]
 
Abstract: We consider the problem of testing the hypothesis that a process observed on the interval (0,1) is a Gaussian random process. A representation of the process in Hilbert space is used. The proposed test is based on the classical Cramér-von Mises test. We also introduce a modification of the concept of the distribution function, which leads to an asymmetric Cramér-von Mises test. Methods for the exact calculation of tables of the limiting distributions of the proposed statistics are also considered.
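
As background, the classical Cramér-von Mises statistic for a scalar sample against a fully specified continuous distribution is computed as follows (our sketch; the talk concerns its functional extension to processes on [0,1]).

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(11)

def cramer_von_mises(x, cdf):
    # Classical Cramer-von Mises statistic W^2 for testing that the
    # sample x comes from the continuous distribution with the given cdf.
    n = len(x)
    u = np.sort(cdf(x))
    i = np.arange(1, n + 1)
    return 1.0 / (12 * n) + np.sum(((2 * i - 1) / (2 * n) - u) ** 2)

x = rng.normal(size=200)
print(f"W2 vs N(0,1): {cramer_von_mises(x, norm.cdf):.4f}")                   # small
print(f"W2 vs N(1,1): {cramer_von_mises(x, lambda t: norm.cdf(t - 1)):.4f}")  # large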

 

31/10, Alexandre Proutiere, KTH, Bandit Optimisation with Large Strategy Sets and Applications

Abstract: Bandit optimisation problems constitute the most fundamental and basic instances of sequential decision problems with an exploration-exploitation trade-off. They naturally arise in many contemporary applications found in communication networks, e-commerce and recommendation systems. In this lecture, we present recent results on bandit optimisation problems with large strategy sets. For such problems, the number of possible strategies may not be negligible compared to the time horizon. Results are applied to the design of protocols and resource sharing algorithms in wireless systems.
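
As background to the lecture, the classical UCB index policy for a small finite strategy set fits in a few lines (our sketch; the talk's interest is precisely the regime where the strategy set is too large for this to be efficient).

import numpy as np

rng = np.random.default_rng(12)

means = [0.3, 0.5, 0.7]               # unknown Bernoulli arm means (toy example)
T = 10_000
counts, sums = np.zeros(3), np.zeros(3)

for t in range(1, T + 1):
    if t <= 3:
        arm = t - 1                    # play each arm once to initialize
    else:
        ucb = sums / counts + np.sqrt(2 * np.log(t) / counts)
        arm = int(np.argmax(ucb))      # optimism in the face of uncertainty
    reward = rng.uniform() < means[arm]
    counts[arm] += 1
    sums[arm] += reward

regret = max(means) * T - sums.sum()
print(f"pulls per arm: {counts}, reward shortfall vs best arm ~ {regret:.0f}")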

 

7/11, Boualem Djehiche, KTH, On the subsolution approach to efficient importance sampling

Abstract: The widely used Monte Carlo simulation technique where all the particles are independent and statistically identical and their weights are constant is by no means universally applicable. The reason is that particles may wander off to irrelevant parts of the state space, leaving only a small fraction of relevant particles that contribute to the computational task at hand. It may therefore require a huge number of particles to obtain a desired precision, resulting in a computational cost that is too high for all practical purposes. A control mechanism is needed to force the particles to move to the relevant part of the space, thereby increasing the importance of each particle and reducing the computational cost. Importance sampling offers a way to choose a sampling dynamics (the main difficult part) to steer the particles towards the relevant part of the state space. In this talk I will review some recent results on the so-called subsolution approach to importance sampling, which is able to tune the sampling dynamics at hopefully lower cost.

This is joint work with Henrik Hult and Pierre Nyquist.
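
A minimal example of importance sampling for a rare event (ours, not from the talk): estimating P(mean of n standard normals >= a) with an exponentially tilted proposal, the change of measure suggested by large deviations analysis.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(13)

n, a, m = 100, 0.4, 50_000       # P(mean of n std normals >= a) ~ 3e-5

# Plain Monte Carlo: nearly all samples miss the rare event.
plain = (rng.normal(size=(m, n)).mean(axis=1) >= a).mean()

# Tilted sampling: draw from N(theta, 1) with theta = a and reweight
# by the likelihood ratio dP/dQ.
theta = a
X = rng.normal(theta, 1.0, size=(m, n))
logw = (-theta * X + theta**2 / 2).sum(axis=1)
est = np.mean(np.exp(logw) * (X.mean(axis=1) >= a))

exact = norm.sf(a * np.sqrt(n))
print(f"plain MC: {plain:.2e}   tilted IS: {est:.2e}   exact: {exact:.2e}")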

 

12/11, Ioannis Papastathopoulos, Bristol University, Graphical structures in extreme multivariate events

Abstract: Modelling and interpreting the behaviour of extremes is quite challenging, especially when the dimension of the problem under study is large. Initially, univariate extreme value models are used for marginal tail estimation, and then the inter-relationships between random variables are captured by modelling the dependence of the extremes. Here, we propose graphical structures in extreme multivariate events of a random vector given that one of its components is large. These structures aim to provide better estimates and predictions of extreme quantities of interest, as well as to reduce the problems associated with the curse of dimensionality. The imposition of graphical structures in the estimation of extremes is approached via a simplified parameter structure in a maximum likelihood setting and through Monte Carlo simulation from conditional kernel densities. The increase in efficiency of the estimators and the benefits of the proposed method are illustrated through simulation studies.

 

21/11, Annika Lang, How does one computationally solve a stochastic partial differential equation?
 
Abstract:
The solution of a stochastic partial differential equation can for example be seen as a Hilbert-space-valued stochastic process. In this talk I discuss discretizations in space, time, and probability to simulate the solution with a computer and I derive convergence rates for different types of approximation errors.
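
As a concrete instance of such discretizations, here is a sketch (ours) of a finite-difference, semi-implicit Euler-Maruyama scheme for the stochastic heat equation on (0,1) driven by space-time white noise.

import numpy as np

rng = np.random.default_rng(14)

# du = u_xx dt + dW(t, x) on (0,1) with u = 0 at the boundary:
# finite differences in space, semi-implicit Euler-Maruyama in time.
J, N, T = 100, 1000, 0.1
h, dt = 1.0 / J, T / N
x = np.linspace(0, 1, J + 1)

main, off = -2.0 * np.ones(J - 1), np.ones(J - 2)
L = (np.diag(main) + np.diag(off, 1) + np.diag(off, -1)) / h**2
A = np.eye(J - 1) - dt * L             # matrix of the implicit step

u = np.sin(np.pi * x[1:-1])            # initial condition
for _ in range(N):
    dW = rng.normal(0.0, np.sqrt(dt / h), size=J - 1)   # discretized noise
    u = np.linalg.solve(A, u + dW)

print(u[:5])   # one approximate sample of the solution at time T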

 

28/11, Arne Pommerening, Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf, Switzerland: What are the differences between competition kernels and traditional size-ratio based competition indices used in plant ecology?

Abstract: Both traditional competition indices and competition kernels are used in many studies to quantify competition between plants for resources. Yet it is not fully clear what the differences between these two concepts really are.

For a fair comparison of the two approaches, we selected two fundamental and widespread types of competition indices based on distance-weighted size ratios, together with an additional index without distance weighting as a control, and developed the corresponding competition kernels. In contrast to the latter, competition indices require individual influence zones derived from tree crown-radius measurements. We applied these competition measures to three spatial tree time series in forest ecosystems in Europe and North America. Stem diameter increment served as the response variable.

Contrary to our expectation, the results of both methods indicated similar performance; however, the use of competition kernels produced slightly better results, with only one exception out of six comparisons.
Although the performance of both competition measures is not too different, competition kernels are based on more solid mathematical and ecological grounds. This is why applications of this method are likely to increase. The trade-off of the use of competition kernels, however, is the need for more sophisticated spatial regression routines that researchers are required to program.


 

5/12, Kaspar Stucki, University of Göttingen, Germany, Continuum percolation for Gibbs point processes

Abstract:
We consider percolation properties of the Boolean model generated by a Gibbs point process and balls with deterministic radius. We show that for a large class of Gibbs point processes there exists a critical activity, such that percolation occurs a.s. above criticality. For locally stable Gibbs point processes we show a converse result, i.e. they do not percolate a.s. at low activity.

 

12/12, Olle Häggström, Are all ravens black? The problem of induction

Abstract: How do we draw conclusions about what we have not yet seen based on what we have seen? That is the age-old problem of induction. Statistical inference is meant to solve it, but does it? In this talk I will give a brief and selective review of the history of the problem, and discuss where we stand today.


19/12, Marianne Månsson, Astra Zeneca, Statistical paradoxes and curiosities in medical applications

Abstract:
Why do my friends have more friends than I have? This feeling, shared by most people, was formalized in the 1990s as the friendship paradox. Is it really true? Can it be used for prediction of epidemics? This is one of the paradoxes that will be discussed in this seminar.
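
The friendship paradox itself is easy to demonstrate by simulation (our sketch): in any graph, the mean degree of a randomly chosen friend, E[D^2]/E[D], is at least the mean degree E[D] of a randomly chosen person.

import numpy as np

rng = np.random.default_rng(15)

# High-degree people are over-represented among friends, which drives
# the paradox; illustrated here on an Erdos-Renyi random graph.
n, p = 2000, 0.005
A = np.triu(rng.uniform(size=(n, n)) < p, k=1)
A = A | A.T
deg = A.sum(axis=1)

person_mean = deg.mean()                     # degree of a random person
friend_mean = (deg**2).sum() / deg.sum()     # degree of a random friend
print(f"mean degree of a person: {person_mean:.2f}")
print(f"mean degree of a friend: {friend_mean:.2f}")   # systematically larger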
