19/1, Anna Dreber Almenberg, Stockholm School of Economics: (Predicting) replication outcome
Abstract: Why are there so many false positive results in the published scientific literature? And what is the actual share of results that do not replicate in different literatures in the experimental social sciences? I will discuss several large replication projects on direct and conceptual replications, as well as our studies on "wisdom-of-crowds" mechanisms like prediction markets and forecasting surveys where researchers attempt to predict replication outcomes as well as new outcomes.
2/2, Claudia Redenbach, Technische Universität Kaiserslautern: Using stochastic models for segmentation and characterization of spatial microstructures
Abstract: The performance of engineering materials such as foams, fibre composites or concrete is heavily influenced by the microstructure geometry. Quantitative analysis of 3D images, provided for instance by micro computed tomography (µCT), allows for a characterization of material samples. In this talk, we will illustrate how models from stochastic geometry may support the segmentation of image data and the statistical analysis of the microstructures. Our first example deals with the estimation of the fibre length distribution from µCT images of glass fibre reinforced composites. As examples of segmentation tasks we present the reconstruction of the solid component of a porous medium from focused ion beam scanning electron microscopy (FIB-SEM) image data and the segmentation of cracks in µCT images of concrete.
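To give a rough flavour of the kind of stochastic-geometry models involved, the following toy sketch (not code from the talk; all parameter values are invented) simulates a 2D Boolean model of discs and compares the empirical area fraction with the closed-form coverage probability:

```python
import numpy as np

# Toy sketch (not code from the talk; parameter values are invented):
# a 2D Boolean model of discs, one of the basic stochastic-geometry
# models for porous microstructures. The empirical area fraction is
# compared with the closed-form coverage p = 1 - exp(-lam * pi * r^2).
rng = np.random.default_rng(0)
lam, r, size = 0.5, 1.0, 20.0       # germ intensity, disc radius, window side

# Simulate germs on an enlarged window to avoid edge effects.
n = rng.poisson(lam * (size + 2 * r) ** 2)
germs = rng.uniform(-r, size + r, size=(n, 2))

# Rasterise the union of discs on a fine grid over the observation window.
xs = np.linspace(0.0, size, 400)
X, Y = np.meshgrid(xs, xs)
covered = np.zeros(X.shape, dtype=bool)
for gx, gy in germs:
    covered |= (X - gx) ** 2 + (Y - gy) ** 2 <= r ** 2

empirical = covered.mean()
theoretical = 1.0 - np.exp(-lam * np.pi * r ** 2)
print(empirical, theoretical)
```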
16/2, Fredrik Johansson, Chalmers: Making the most of observational data in causal estimation with machine learning
Abstract: Decision making is central to all aspects of society, private and public. Consequently, using data and statistics to improve decision-making has a rich history, perhaps best exemplified by the randomized experiment. In practice, however, experiments carry significant risk. For example, making an online recommendation system worse could result in millions in lost profit; selecting an inappropriate treatment for a patient could have devastating consequences. Luckily, organizations like hospitals and companies that serve recommendations routinely collect vast troves of observational data on decisions and outcomes. In this talk, I discuss how to make the best use of such data to improve policy, starting with an example of what can go wrong if we’re not careful. Then, I present two pieces of research on how to avoid such perils if we are willing to say more about less.
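To illustrate the basic difficulty with observational data, here is a minimal hypothetical sketch (not the speaker's method; the data-generating process is invented) of inverse propensity weighting on simulated confounded data:

```python
import numpy as np

# Hypothetical sketch (not the speaker's method): inverse propensity
# weighting on simulated observational data with one confounder x.
# The naive difference in means is biased by confounding; reweighting
# by the propensity score recovers the true treatment effect of 1.0.
rng = np.random.default_rng(1)
n = 200_000
x = rng.normal(size=n)                       # confounder
p = 1.0 / (1.0 + np.exp(-2.0 * x))           # propensity P(T=1 | x)
t = rng.uniform(size=n) < p                  # confounded treatment assignment
y = 1.0 * t + 2.0 * x + rng.normal(size=n)   # true treatment effect: 1.0

naive = y[t].mean() - y[~t].mean()           # biased by confounding
ipw = np.mean(t * y / p) - np.mean((~t) * y / (1.0 - p))
print(naive, ipw)
```

Here the true propensity is used for simplicity; in practice it would itself be estimated from the data.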
2/3, Andrea De Gaetano, IRIB CNR: Modelling haemorrhagic shock and statistical challenges for parameter estimation
Abstract: In the ongoing development of ways to mitigate the consequences of penetrating trauma in humans, particularly in the area of civil defence and military operations, possible strategies aimed at identifying the victim's physiological state and its likely evolution depend on mechanistic, quantitative understanding of the compensation mechanisms at play. In this presentation, time-honored and recent mathematical models of the dynamical response to hemorrhage are briefly discussed and their applicability to real-life situations is examined. Conclusions are drawn as to the necessary formalization of this problem, which however poses methodological challenges for parameter estimation.
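As a purely illustrative toy (not one of the models discussed in the talk, and with invented parameter values), one can write a one-compartment blood-volume model with a first-order compensatory refill and integrate it numerically:

```python
# Purely illustrative toy (not one of the talk's models; all parameter
# values are invented): blood volume V under a constant haemorrhage
# rate h with first-order compensatory refill towards baseline V0,
#   dV/dt = k * (V0 - V) - h,
# integrated with forward Euler.
V0, k, h = 5.0, 0.05, 0.1    # litres, 1/min, litres/min (hypothetical)
dt, V = 0.1, 5.0
trace = []
for _ in range(int(120 / dt)):   # simulate two hours
    V += dt * (k * (V0 - V) - h)
    trace.append(V)

print(trace[-1])   # settles near the equilibrium V0 - h/k = 3.0
```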
16/3, Fredrik Lindsten, Linköping University: Monte Carlo for Approximate Bayesian Inference
Abstract: Sequential Monte Carlo (SMC) is a powerful class of methods for approximate Bayesian inference. While originally used mainly for signal processing and inference in dynamical systems, these methods are in fact much more general and can be used to solve many challenging problems in Bayesian statistics and machine learning, even if they lack apparent sequential structure. In this talk I will first discuss the foundations of SMC from a machine learning perspective. We will see that there are two main design choices of SMC: the proposal distribution and the so-called intermediate target distributions, where the latter is often overlooked in practice. Focusing on graphical model inference, I will then show how deterministic approximations, such as variational inference and expectation propagation, can be used to approximate the optimal intermediate target distributions. The resulting algorithm can be viewed as a post-correction of the biases associated with these deterministic approximations. Numerical results show improvements over the baseline deterministic methods as well as over "plain" SMC.
The first part of the talk is an introduction to SMC inspired by our recent Foundations and Trends tutorial.
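A minimal tempered SMC sampler gives a feel for the two design choices mentioned above; the target, the geometric bridge of intermediate targets, and all settings below are illustrative assumptions, not the speaker's code:

```python
import numpy as np

# Minimal SMC sampler sketch (illustrative, not the speaker's code):
# temper from a broad Gaussian "prior" to a narrow Gaussian "target".
# The geometric bridge is the choice of intermediate targets; the
# random-walk Metropolis jitter is the choice of proposal.
rng = np.random.default_rng(2)

def log_prior(z):   # N(0, 5^2), up to a constant
    return -0.5 * z ** 2 / 25

def log_target(z):  # N(3, 1), up to a constant
    return -0.5 * (z - 3) ** 2

n = 2000
betas = np.linspace(0.0, 1.0, 21)
x = rng.normal(0, 5, size=n)                  # particles from the prior
for b0, b1 in zip(betas[:-1], betas[1:]):
    # Reweight from pi_{b0} to pi_{b1} along the geometric bridge.
    logw = (b1 - b0) * (log_target(x) - log_prior(x))
    w = np.exp(logw - logw.max())
    w /= w.sum()
    x = x[rng.choice(n, size=n, p=w)]         # multinomial resampling
    # One random-walk Metropolis move targeting pi_{b1}.
    lp = lambda z: (1 - b1) * log_prior(z) + b1 * log_target(z)
    prop = x + rng.normal(0, 1, size=n)
    accept = np.log(rng.uniform(size=n)) < lp(prop) - lp(x)
    x = np.where(accept, prop, x)

print(x.mean())   # particles should approximate the N(3, 1) target
```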
30/3, Manuela Zucknick, University of Oslo: Bayesian modelling of treatment response in ex vivo drug screens for precision cancer medicine
Abstract: Large-scale cancer pharmacogenomic screening experiments profile cancer cell lines or patient-derived cells against hundreds of drug compounds. The aim of these in vitro studies is to use the genomic profiles of the cell lines together with information about the drugs to predict the response to a particular combination therapy, and in particular to identify combinations of drugs that act synergistically. The field is advancing quickly, with sophisticated high-throughput miniaturised platforms for rapid large-scale screens, but the development of statistical methods for analysing the resulting data is lagging behind. I will discuss typical challenges for estimation and prediction of response to combination therapies, from large technical variation and experimental biases to modelling challenges for prediction of drug response using genomic data. I will present two Bayesian models that we have recently developed to address diverse problems relating to the estimation and prediction tasks, and show how they can improve the identification of promising drug combinations over standard non-statistical approaches.
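For context, a standard non-statistical baseline for synergy is the Bliss independence score; the sketch below (not the talk's Bayesian models, and with invented numbers) computes the excess effect over the independence expectation:

```python
# Illustrative sketch (not the talk's Bayesian models): the Bliss
# independence score, a common non-statistical baseline for synergy.
# If drugs A and B kill fractions fa and fb of cells independently,
# the expected combined kill is fa + fb - fa * fb; an observed kill
# above that expectation suggests synergy.
def bliss_excess(fa, fb, f_observed):
    expected = fa + fb - fa * fb
    return f_observed - expected   # > 0: synergy, < 0: antagonism

print(bliss_excess(0.3, 0.4, 0.7))   # expected 0.58, so excess ≈ 0.12
```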
6/4, Prashant Singh, Uppsala University: Likelihood-free parameter inference of stochastic time series models: exploring neural networks to enhance scalability, efficiency and performance
Abstract: Parameter inference for stochastic time series models, such as gene regulatory networks, is a challenging task in the likelihood-free setting, particularly when the number of parameters to be inferred is large. Recently, data-driven machine learning models (neural networks in particular) have delivered encouraging results towards addressing the scalability, efficiency and inference quality of the likelihood-free parameter inference pipeline. In particular, this talk will present a detailed discussion of neural networks as trainable, expressive and scalable summary statistics of high-dimensional time series for parameter inference tasks.
Preprint reference: Åkesson, M., Singh, P., Wrede, F., & Hellander, A. (2020). Convolutional neural networks as summary statistics for approximate Bayesian computation. arXiv preprint arXiv:2001.11760.
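A toy ABC rejection sampler shows where summary statistics enter the pipeline; here the summaries are hand-crafted (mean and variance), whereas the preprint's point is to learn them with a CNN. All distributions and thresholds below are illustrative assumptions:

```python
import numpy as np

# Toy ABC rejection sketch (illustrative assumptions throughout): the
# summaries are hand-crafted here, whereas the preprint replaces them
# with summaries learned by a convolutional neural network.
rng = np.random.default_rng(3)

def simulate(rate, n=100):
    return rng.poisson(rate, size=n)

def summary(ts):                 # hand-crafted summary statistics
    return np.array([ts.mean(), ts.var()])

observed = simulate(4.0)         # pretend this is the observed series
s_obs = summary(observed)

draws = []
for _ in range(5000):
    theta = rng.uniform(0.0, 10.0)        # prior draw for the rate
    if np.linalg.norm(summary(simulate(theta)) - s_obs) < 0.5:
        draws.append(theta)               # accept: summaries are close

posterior_mean = float(np.mean(draws))
print(len(draws), posterior_mean)
```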
11/5, Ilaria Prosdocimi, University of Venice: Statistical models for the detection of changes in peak river flow in the UK
Abstract: Several parts of the United Kingdom have experienced highly damaging flooding events in the recent decades, raising doubts on whether methods used to assess flood risk, and therefore design flood defences, are "fit for purpose". It has also been hypothesized that the high number of recent extreme events might be one of the impacts of the (anthropogenic) changes in the climate. Indeed, with the increasing evidence of a changing climate, there is much interest in investigating the potential impacts of these changes on the risks linked to natural hazards such as intense rainfall, extreme waves and flooding. This has resulted in several studies investigating changes in natural hazard extremes, including peak river flow extremes in the UK. This talk will review a selection of these studies, discussing some of the pitfalls of statistical models typically employed to assess whether any change can be detected in peak river flow extremes. Solutions to these pitfalls are outlined and discussed. In particular, the consequences of the functional forms assumed to describe change in extremes on the ability of describing changes in the risk profiles of natural hazards are discussed.
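As a sketch of the kind of model involved (not code from the talk; the data and all settings are invented), the following fits a Gumbel model for annual peaks with a linear trend in the location parameter by maximum likelihood, one common functional form assumed for change in extremes:

```python
import numpy as np
from scipy.optimize import minimize

# Sketch (not code from the talk; data and settings are invented):
# maximum-likelihood fit of a Gumbel model for annual peak flows with
# a linear trend in the location parameter, mu(t) = mu0 + mu1 * t.
rng = np.random.default_rng(4)
years = np.arange(50)
peaks = rng.gumbel(loc=100 + 0.8 * years, scale=10)  # simulated trend

def nll(params):
    mu0, mu1, log_sigma = params
    sigma = np.exp(log_sigma)             # keep the scale positive
    z = (peaks - (mu0 + mu1 * years)) / sigma
    return np.sum(np.log(sigma) + z + np.exp(-z))  # Gumbel neg. log-lik.

slope0 = np.polyfit(years, peaks, 1)[0]   # crude least-squares start
fit = minimize(nll, x0=[peaks.mean() - slope0 * years.mean(), slope0,
                        np.log(peaks.std())], method="Nelder-Mead")
mu1_hat = fit.x[1]
print(mu1_hat)   # the simulated trend is 0.8
```

Comparing this fit against the stationary model (mu1 fixed at 0) with a likelihood-ratio test is one common way to "detect change".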
25/5, Matteo Fasiolo, University of Bristol: Generalized additive models for ensemble electricity demand forecasting
Abstract: Future grid management systems will coordinate distributed production and storage resources to manage, in a cost-effective fashion, the increased load and variability brought by the electrification of transportation and by a higher share of weather-dependent production.
Electricity demand forecasts at a low level of aggregation will be key inputs for such systems. In this talk, I'll focus on forecasting demand at the individual household level, which is more challenging than forecasting aggregate demand, due to the lower signal-to-noise ratio and to the heterogeneity of consumption patterns across households.
I'll describe a new ensemble method for probabilistic forecasting, which borrows strength across the households while accommodating their individual idiosyncrasies.
The first step consists of designing a set of models or 'experts' which capture different demand dynamics and fitting each of them to the data from each household.
Then the idea is to construct an aggregation of experts where the ensemble weights are estimated on the whole data set, the main innovation being that we let the weights vary with the covariates by adopting an additive model structure. In particular, the proposed aggregation method is an extension of regression stacking (Breiman, 1996) where the mixture weights are modelled using linear combinations of parametric, smooth or random effects.
The methods for building and fitting additive stacking models are implemented by the gamFactory R package, available at https://github.com/mfasiolo/gamFactory
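A much-simplified sketch of plain regression stacking (Breiman, 1996), the starting point the talk extends, is shown below; the experts, data and grid search are illustrative assumptions, and the covariate-dependent weights that are the talk's contribution are not reproduced:

```python
import numpy as np

# Simplified sketch of plain regression stacking (Breiman, 1996):
# fit two 'experts', then pick a convex combination of their
# predictions minimising squared error. All data and experts invented.
rng = np.random.default_rng(5)
x = rng.uniform(0.0, 1.0, size=500)
y = np.sin(2 * np.pi * x) + 0.5 * x + rng.normal(0.0, 0.1, size=500)

# Two hypothetical experts: a linear fit and a sine-feature fit.
A = np.column_stack([np.ones_like(x), x])
expert1 = A @ np.linalg.lstsq(A, y, rcond=None)[0]
B = np.column_stack([np.ones_like(x), np.sin(2 * np.pi * x)])
expert2 = B @ np.linalg.lstsq(B, y, rcond=None)[0]

# Stacking: a convex weight on the experts chosen to minimise squared
# error (grid search stands in for the usual constrained least squares).
ws = np.linspace(0.0, 1.0, 101)
errs = [np.mean((w * expert1 + (1 - w) * expert2 - y) ** 2) for w in ws]
w_best = ws[int(np.argmin(errs))]
print(w_best, min(errs))
```

By construction the stacked error is no worse than either expert alone; the talk's extension lets the weight w itself depend on covariates through an additive model.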
8/6, Seyed Morteza Najibi, Lund University: Functional Singular Spectrum Analysis with application to remote sensing data
Abstract: A popular approach to time series decomposition partitions the observed series into an informative trend plus potential seasonal (cyclical) and noise (irregular) components. Aligned with this principle, Singular Spectrum Analysis (SSA) is a model-free procedure commonly used as a nonparametric technique for analysing time series. SSA does not require restrictive assumptions such as stationarity, linearity, or normality. It can be used for a wide range of purposes, such as trend and periodic component detection and extraction, smoothing, forecasting, change-point detection, gap filling, causality analysis, and so on.
In this talk, I will briefly overview SSA methodology and introduce a new extension called functional SSA to analyze functional time series. This is developed by integrating ideas from functional data analysis and univariate SSA. I will demonstrate this approach for tracking changes in vegetation over time by analysing the kernel density functions of Normalized Difference Vegetation Index (NDVI) images. At the end of the talk, I will also illustrate a simulated example in the interactive Shiny web application implemented in the Rfssa package.
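A minimal univariate SSA sketch (the talk's functional extension is not reproduced; the series and window length are illustrative) shows the embed-decompose-reconstruct steps:

```python
import numpy as np

# Minimal univariate SSA sketch (illustrative; the talk's functional
# extension is not reproduced): embed the series in a trajectory
# matrix, take an SVD, and reconstruct a smooth component from the
# leading singular pairs.
rng = np.random.default_rng(6)
t = np.arange(200)
clean = 0.02 * t + np.sin(2 * np.pi * t / 20)
series = clean + rng.normal(0.0, 0.3, size=200)

L = 50                                      # window length
K = len(series) - L + 1
X = np.column_stack([series[i:i + L] for i in range(K)])  # trajectory matrix
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Keep 4 components (a linear trend contributes rank <= 2, the sine 2),
# then diagonally average (Hankelise) back to a series.
X4 = (U[:, :4] * s[:4]) @ Vt[:4]
recon = np.zeros(len(series))
counts = np.zeros(len(series))
for i in range(K):
    recon[i:i + L] += X4[:, i]
    counts[i:i + L] += 1
recon /= counts

mse = np.mean((recon - clean) ** 2)
print(mse)   # should sit below the noise variance of 0.09
```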
25/8, Jonas Wallin, Lund University: Locally scale invariant proper scoring rules
Abstract: Averages of proper scoring rules are often used to rank probabilistic forecasts. In many cases, the variance of the individual observations and their predictive distributions vary in these averages. We show that some of the most popular proper scoring rules, such as the continuous ranked probability score (CRPS), the go-to score for continuous-observation ensemble forecasts, up-weight observations with large uncertainty, which can lead to unintuitive rankings.
To describe this issue, we define the concept of local scale invariance for scoring rules. A new class of generalized proper kernel scoring rules is derived, and as a member of this class, we propose the scaled CRPS (SCRPS). This new proper scoring rule is locally scale-invariant and therefore works in the case of varying uncertainty. Like CRPS it is computationally available for output from ensemble forecasts and does not require the ability to evaluate the density of the forecast. The theoretical findings are illustrated in a few different applications, where we in particular focus on models in spatial statistics.
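The standard ensemble estimator of CRPS makes the scale-dependence easy to see: two forecasts can be equally well calibrated, yet the wider one receives a larger (worse) score. A sketch with invented ensembles:

```python
import numpy as np

# Sketch of the standard ensemble estimator of CRPS (the score the
# talk generalises), with invented ensembles:
#   CRPS(F, y) ≈ mean_i |X_i - y| - 0.5 * mean_{i,j} |X_i - X_j|.
# Both forecasts below are centred on the truth, yet the wider one
# scores worse, illustrating the scale-dependence discussed above.
def crps_ensemble(ensemble, y):
    ensemble = np.asarray(ensemble, dtype=float)
    term1 = np.abs(ensemble - y).mean()
    term2 = np.abs(ensemble[:, None] - ensemble[None, :]).mean()
    return term1 - 0.5 * term2

rng = np.random.default_rng(7)
sharp = crps_ensemble(rng.normal(0.0, 1.0, 1000), 0.0)
wide = crps_ensemble(rng.normal(0.0, 10.0, 1000), 0.0)
print(sharp, wide)   # the wider forecast receives the larger score
```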
14/9, Moritz Schauer, Chalmers/GU: The sticky Zig-Zag sampler: an event chain Monte Carlo (PDMP) sampler for Bayesian variable selection
Abstract: During the talk, I will present sticky event chain Monte Carlo (piecewise deterministic Monte Carlo) samplers. This is a new class of efficient Monte Carlo methods based on continuous-time piecewise deterministic Markov processes (PDMPs), suitable for inference in high-dimensional sparse models, i.e. models for which there is prior knowledge that many coordinates are likely to be exactly 0. This is achieved with the fairly simple idea of endowing existing PDMP samplers with sticky coordinate axes, coordinate planes, etc. Upon hitting those subspaces, an event is triggered during which the process sticks to the subspace, this way spending some time in a sub-model. This introduces non-reversible jumps between different (sub-)models. During the talk, I will touch upon computational aspects of the algorithm and illustrate the method on a number of statistical models where both the sample size N and the dimensionality d of the parameter space are large.
J. Bierkens, S. Grazzi, F. van der Meulen, and M. Schauer. Sticky PDMP samplers for sparse and local inference problems. arXiv:2103.08478, 2021.
21/9, Johan Larsson, Lund University: The Hessian Screening Rule and Adaptive Paths for the Lasso
Abstract: Predictor screening rules, which discard predictors from the design matrix before fitting the model, have had sizable impacts on the speed at which sparse regression models, such as the lasso, can be solved in the high-dimensional regime. Current state-of-the-art methods, however, face difficulties when dealing with highly correlated predictors, often becoming too conservative.
In this talk we introduce a new screening rule that deals with this issue: the Hessian Screening Rule, which offers considerable improvements in computational performance when fitting the lasso. These benefits result both from the screening rule itself and from much-improved warm starts.
The Hessian Screening Rule also presents a welcome improvement to the construction of the lasso path: the set of lasso models produced by varying the strength of the penalization. The default approach, constructing a log-spaced penalty grid a priori, often fails to approximate the true (exact) lasso path well. Leaning on the information already used when computing the Hessian Screening Rule, however, we can improve upon the construction of this grid by adaptively picking penalty parameters along the path.
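For reference, the default a priori grid is typically built by log-spacing penalties downwards from lambda_max, the smallest penalty at which the lasso solution is entirely zero. A sketch with simulated data (the talk's adaptive grid is not reproduced):

```python
import numpy as np

# Sketch of the default a priori penalty grid (simulated data; the
# adaptive grid from the talk is not reproduced). For the objective
#   (1/(2n)) * ||y - X b||^2 + lambda * ||b||_1,
# the KKT conditions give lambda_max = max|X^T y| / n as the smallest
# penalty with an all-zero solution; the grid is then log-spaced
# downwards from it.
rng = np.random.default_rng(8)
X = rng.normal(size=(100, 20))
y = 2.0 * X[:, 0] + rng.normal(size=100)
n = X.shape[0]

lambda_max = np.max(np.abs(X.T @ y)) / n
grid = np.geomspace(lambda_max, 1e-3 * lambda_max, num=100)
print(lambda_max, grid[0], grid[-1])
```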
12/10, Konstantinos Konstantinou, Chalmers/GU: Spatial modeling of epidermal nerve fiber patterns
Abstract: Peripheral neuropathy is a condition associated with poor nerve functionality. Epidermal nerve fiber (ENF) counts per epidermal surface are dramatically reduced, and the two-dimensional spatial structure of ENFs tends to become more clustered as neuropathy progresses. Therefore, studying the spatial structure of ENFs is essential to fully understand the mechanisms that guide those morphological changes. In this work, we compare ENF patterns of healthy controls and subjects suffering from mild diabetic neuropathy by using suction skin blister specimens obtained from the right foot. Previous analysis of these data has focused on the analysis and modelling of the spatial ENF patterns consisting of the points where the nerves enter the epidermis, base points, and the points where the nerve fibers terminate, end points, projected on a two-dimensional plane, regarding the patterns as realisations of spatial point processes. Here, we include the first branching points, the points where the nerve trees branch for the first time, and model the three-dimensional patterns consisting of these three types of points. To analyze the patterns, spatial summary statistics are used, and a new epidermal active territory (EAT), which measures the volume in the epidermis that is covered by the individual nerve fibers, is constructed. We developed a model for both the two-dimensional and the three-dimensional patterns including the branching points. Also, possible competitive behavior between individual nerves is examined. Our results indicate that changes in the ENF spatial structure can more easily be detected in the later parts of the ENFs.
See Konstantinou, K., & Särkkä, A. (2021). Spatial modeling of epidermal nerve fiber patterns. Statistics in Medicine. https://doi.org/10.1002/sim.9194
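As a toy illustration of spatial summary statistics for point patterns (not the paper's analysis; the simulated patterns are invented), the mean nearest-neighbour distance separates a clustered pattern from complete spatial randomness:

```python
import numpy as np

# Toy illustration (not the paper's analysis; patterns are invented):
# the mean nearest-neighbour distance, a simple spatial summary
# statistic, separates a clustered point pattern from complete
# spatial randomness (CSR) in the unit square.
rng = np.random.default_rng(9)

def mean_nn(points):
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # exclude self-distances
    return d.min(axis=1).mean()

csr = rng.uniform(0.0, 1.0, size=(200, 2))
parents = rng.uniform(0.0, 1.0, size=(20, 2))          # cluster centres
clustered = (parents[rng.integers(0, 20, size=200)]
             + rng.normal(0.0, 0.02, size=(200, 2)))   # offspring

print(mean_nn(clustered), mean_nn(csr))   # clustered < random
```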
19/10, Alice Corbella, Warwick University: Introducing Zig-Zag Sampling and making it applicable
Abstract: Recent research showed that Piecewise Deterministic Markov Processes (PDMPs) may be exploited to design efficient MCMC algorithms (Fearnhead et al., 2018). The Zig-Zag sampler is an example of this: it is based on the simulation of a PDMP whose switching rate λ(t) is governed by the derivative of the negative log target density.
While many theoretical properties of this sampler have been derived, less has been done to explore the applicability of the Zig-Zag sampler to solve Bayesian inference problems. In particular, the computation of the derivative of the log-density in the rate λ(t) might be challenging. To expand the applicability of the Zig-Zag sampler, we incorporate Automatic Differentiation tools in the Zig-Zag algorithm, to evaluate λ(t) from the functional form of the log-target density. Moreover, to allow the simulation of a PDMP via Poisson thinning, we use univariate optimization routines to find local upper bounds.
In this talk we introduce PDMPs and the Zig-Zag sampler; we present our Automatic Zig-Zag sampler; we discuss the challenges that arise with simulation via thinning and the need for a new tuning parameter; and we comment on efficiencies and bottlenecks of AD for Zig-Zag. We present many examples to compare our method to HMC, another widely used gradient-based method.
This is joint work with Simon Spencer and Gareth Roberts.
Fearnhead, P., Bierkens, J., Pollock, M., and Roberts, G.O., 2018. Piecewise deterministic Markov processes for continuous-time Monte Carlo. Statistical Science, 33(3), pp. 386-412.
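A toy one-dimensional Zig-Zag sampler for a standard normal target, with events simulated by Poisson thinning under a local upper bound on a look-ahead window, gives a feel for the mechanics; here the gradient (simply x for this target) is hard-coded rather than obtained by automatic differentiation, and all settings are illustrative:

```python
import numpy as np

# Toy 1D Zig-Zag sampler for a standard normal target (illustrative;
# the gradient is hard-coded rather than obtained by AD). Events are
# simulated by Poisson thinning under a local upper bound valid on a
# look-ahead window of fixed length.
rng = np.random.default_rng(10)
x, theta, t = 0.0, 1.0, 0.0           # position, velocity (+/-1), time
T, horizon = 20000.0, 1.0
sum_x = sum_x2 = 0.0

while t < T:
    # For N(0,1) the switching rate along the path is
    # lambda(s) = max(0, theta * (x + theta * s)) = max(0, theta*x + s),
    # so max(0, theta*x) + horizon bounds it on [0, horizon].
    bound = max(0.0, theta * x) + horizon
    s = min(rng.exponential(1.0 / bound), horizon)
    # Exact time-integrals of x and x^2 along the linear segment.
    sum_x += x * s + theta * s * s / 2
    sum_x2 += x * x * s + x * theta * s * s + s ** 3 / 3
    x += theta * s
    t += s
    # Thinning: accept a proposed event with probability lambda / bound.
    if s < horizon and rng.uniform() < max(0.0, theta * x) / bound:
        theta = -theta                # accepted event: flip direction

print(sum_x / t, sum_x2 / t)   # ergodic averages, roughly 0 and 1
```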