**24/1 - Mats Gyllenberg, Helsingfors Universitet: On models of physiologically structured populations and their reduction to ordinary differential equations**

Abstract: Considering the environmental condition as a given function of time, we formulate a physiologically structured population model as a linear non-autonomous integral equation for the (in general distributed) population-level birth rate. We take this renewal equation as the starting point for addressing the following question: When does a physiologically structured population model allow reduction to an ODE without loss of relevant information? We formulate a precise condition for models in which the state of individuals changes deterministically, that is, according to an ODE. Specialising to a one-dimensional individual state, like size, we present various sufficient conditions in terms of individual growth, death, and reproduction rates, giving special attention to cell fission into two equal parts and to the catalogue derived in another paper of ours (submitted). We also show how to derive an ODE system describing the asymptotic large time behaviour of the population when growth, death and reproduction all depend on the environmental condition through a common factor (so for a very strict form of physiological age).

**31/1 - Christian A. Naesseth, Automatic Control, Linköping: Variational and Monte Carlo methods - Bridging the Gap**

**7/2 - Jonas Wallin, Lund University: Multivariate Type-G Matérn fields**

Abstract: I will present a class of non-Gaussian multivariate random fields formulated using systems of stochastic partial differential equations (SPDEs) with additive non-Gaussian noise. To facilitate computationally efficient likelihood-based inference, the noise is constructed using normal-variance mixtures (type-G noise). Similar, but simpler, constructions have been proposed earlier in the literature; however, they lack important properties such as ergodicity and flexibility of predictive distributions. I will show that, for a specific system of SPDEs, the marginals of the fields have Matérn covariance functions.

Further, I will present a parametrization of the system that one can use to separate the cross-covariance from the extra dependence coming from the non-Gaussian noise in the proposed model.

If time permits, I will discuss some recent results on proper scoring rules (PS). PS are the standard tool in spatial statistics for evaluating which model fits the data best (for example, Gaussian vs non-Gaussian models).

We have developed a new class of PS that I argue is better suited for evaluating models when observations are made at irregular locations.

**14/2 - Jes Frellsen, IT University of Copenhagen: Deep latent variable models: estimation and missing data imputation**

Abstract: Deep latent variable models (DLVMs) combine the approximation abilities of deep neural networks and the statistical foundations of generative models. In this talk, we first give a brief introduction to deep learning. Then we discuss how DLVMs are estimated: variational methods are commonly used for inference; however, the exact likelihood of these models has been largely overlooked. We show that most unconstrained models used for continuous data have an unbounded likelihood function and discuss how to ensure the existence of maximum likelihood estimates. Then we present a simple variational method, called MIWAE, for training DLVMs when the training set contains missing-at-random data. Finally, we present Monte Carlo algorithms for missing data imputation using the exact conditional likelihood of DLVMs: a Metropolis-within-Gibbs sampler for DLVMs trained on complete data sets and an importance sampler for DLVMs trained on incomplete data sets. For complete training sets, our algorithm consistently and significantly outperforms the usual imputation scheme used for DLVMs. For incomplete training sets, we show that MIWAE-trained models provide accurate single and multiple imputations, and are highly competitive with state-of-the-art methods.

This is joint work with Pierre-Alexandre Mattei.

**21/2 - Riccardo De Bin, University of Oslo: Detection of influential points as a byproduct of resampling-based variable selection procedures**

Abstract: Influential points can cause severe problems when deriving a multivariable regression model. A novel approach to check for such points is proposed, based on the variable inclusion matrix, a simple way to summarize results from resampling-based variable selection procedures. These procedures rely on the variable inclusion matrix, which reports whether a variable (column) is included in a regression model fitted on a pseudo-sample (row) generated from the original data (e.g., bootstrap sample or subsample). The variable inclusion matrix is used to study the stability of variable selection, to derive weights for model-averaged predictors, and in other investigations. Concentrating on variable selection, it also allows us to understand whether the presence of a specific observation has an influence on the selection of a variable.

Indeed, from the variable inclusion matrix, the inclusion frequency (I-frequency) of each variable can be computed using only the pseudo-samples (i.e., rows) that contain a specific observation. When the procedure is repeated for each observation, it is possible to check for influential points through the distribution of the I-frequencies, visualized in a boxplot, or through a Grubbs’ test. Outlying values in the former case and significant results in the latter point to observations having an influence on the selection of a specific variable and therefore on the finally selected model. This novel approach is illustrated in two real data examples.

**28/2 - Johan Henriksson: Single-cell perturbation analysis – the solution to systems biology?**

Abstract: The ideas behind systems biology have been around for ages. However, the field has been held back by the lack of data. In this talk I will cover new methods, by me and others, for generating the large amounts of data needed to fit realistic regulatory models. The focus will be on wet lab methods as well as equations, and how we can solve them in practice. I will try to cover, in particular, CRISPR, RNA-seq, ATAC-seq, STARR-seq, Bayesian models, ODEs and a bit of physics.

**7/3 - Larisa Beilina: Time-adaptive parameter identification in mathematical model of HIV infection with drug therapy**

Abstract: Parameter identification problems occur frequently in biomedical applications. These problems are often ill-posed, and thus challenging to solve numerically. In this talk, a time-adaptive optimization method for determining drug efficacy in a mathematical model of HIV infection will be presented. Time-adaptive means that we first determine the drug efficacy on a known coarse time partition using known values of the observed functions. We then locally refine the time mesh at points where an a posteriori error indicator is large and compute the drug efficacy on the new refined mesh, repeating until the error is reduced to the desired accuracy. The time-adaptive method can eventually be used by clinicians to determine the drug response of each treated individual. Exact knowledge of the personal drug efficacy can aid in determining the most suitable drug as well as the optimal dose for each person, in the long run resulting in a personalized treatment with maximum efficacy and minimum adverse drug reactions.

**14/3 - Umberto Picchini: Accelerating MCMC sampling via approximate delayed-acceptance**

Abstract: While Markov chain Monte Carlo (MCMC) is the ubiquitous tool for sampling from complex probability distributions, it does not scale well with increasing dataset size. Also, its structure is not naturally suited for parallelization.

When pursuing Bayesian inference for model parameters, MCMC can be computationally very expensive, either when the dataset is large, or when the likelihood function is unavailable in closed form and itself requires Monte Carlo approximations. In these cases each iteration of Metropolis-Hastings may be intolerably slow. The so-called "delayed acceptance" MCMC (DA-MCMC) was suggested by Christen and Fox in 2005 and allows the use of a computationally cheap surrogate of the likelihood function to rapidly screen (and possibly reject) parameter proposals, while using the expensive likelihood only when the proposal has survived the "scrutiny" of the cheap surrogate. They show that DA-MCMC samples from the exact posterior distribution and returns results much more rapidly than standard Metropolis-Hastings. Here we design a novel delayed-acceptance algorithm, which is between 2 and 4 times faster than the original DA-MCMC, though ours results in approximate inference. Despite this, we show empirically that our algorithm returns accurate inference. A computationally intensive case study is discussed, involving ~25,000 observations from a protein folding reaction coordinate, fit by an SDE model with an intractable likelihood approximated using sequential Monte Carlo (that is, particle MCMC).
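For readers unfamiliar with the two-stage screening idea, here is a minimal sketch of exact delayed-acceptance Metropolis-Hastings with a symmetric random-walk proposal. It is not the approximate algorithm presented in the talk; the function name `delayed_acceptance_mh`, the one-dimensional parameter, and the toy densities in the demo are assumptions made purely for illustration.

```python
import math
import random


def delayed_acceptance_mh(log_target, log_surrogate, theta0, n_iter, step=1.0, seed=0):
    """Two-stage Metropolis-Hastings with a symmetric Gaussian random walk.

    log_target    -- expensive log density (hypothetical stand-in for the posterior)
    log_surrogate -- cheap approximation of log_target
    Returns the chain and the number of expensive evaluations made after start-up.
    """
    rng = random.Random(seed)
    theta = theta0
    lt, ls = log_target(theta), log_surrogate(theta)
    samples, n_expensive = [], 0
    for _ in range(n_iter):
        prop = theta + rng.gauss(0.0, step)
        ls_prop = log_surrogate(prop)
        # Stage 1: screen the proposal with the cheap surrogate only.
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < ls_prop - ls:
            lt_prop = log_target(prop)
            n_expensive += 1
            # Stage 2: correct for the surrogate so the chain targets log_target.
            if math.log(1.0 - rng.random()) < (lt_prop - lt) - (ls_prop - ls):
                theta, lt, ls = prop, lt_prop, ls_prop
        samples.append(theta)
    return samples, n_expensive


if __name__ == "__main__":
    # Toy example: standard normal target, deliberately wider normal surrogate.
    chain, n_exp = delayed_acceptance_mh(
        lambda x: -0.5 * x * x,
        lambda x: -0.5 * x * x / 2.25,
        theta0=0.0, n_iter=5000, step=2.0,
    )
    print(f"expensive evaluations: {n_exp} of {len(chain)} iterations")
```

The saving comes from stage 1: proposals rejected by the surrogate never trigger an evaluation of the expensive density.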

This is joint work with Samuel Wiqvist, Julie Lyng Forman, Kresten Lindorff-Larsen and Wouter Boomsma.

keywords: Bayesian inference; Gaussian process; intractable likelihood; particle MCMC; protein folding; SDEs

**21/3 - Samuel Wiqvist, Lund University: Automatic learning of summary statistics for Approximate Bayesian Computation using Partially Exchangeable Networks**

Abstract: Likelihood-free methods enable statistical inference for the parameters of complex models when the likelihood function is analytically intractable. For these models, several tools are available that only require the ability to run a computer simulator of the mathematical model, and use the output to replace the unavailable likelihood function. The most famous of these methodologies is Approximate Bayesian Computation (ABC), which relies on access to low-dimensional summary statistics of the data. Learning these summary statistics is a fundamental problem in ABC, and selecting them is not trivial. It is in fact the main challenge when applying ABC in practice, and it affects the resulting inference considerably. Deep learning methods have previously been used to learn summary statistics for ABC.
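To make the role of summary statistics concrete, the following is a minimal sketch of the simplest ABC variant, rejection ABC, with a hand-picked scalar summary. Everything here is an illustrative assumption: the function name `abc_rejection`, the Gaussian toy model, the uniform prior and the tolerance are not taken from the talk, and no summary statistics are learned.

```python
import random
import statistics


def abc_rejection(observed, simulate, sample_prior, summary, eps, n_draws, seed=0):
    """Keep prior draws whose simulated summary is within eps of the observed one."""
    rng = random.Random(seed)
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = sample_prior(rng)                 # draw a parameter from the prior
        s_sim = summary(simulate(theta, rng))     # run the simulator, summarise output
        if abs(s_sim - s_obs) < eps:              # compare summaries, not raw data
            accepted.append(theta)
    return accepted


if __name__ == "__main__":
    data_rng = random.Random(42)
    observed = [data_rng.gauss(2.0, 1.0) for _ in range(50)]  # toy data, true mean 2
    posterior_draws = abc_rejection(
        observed,
        simulate=lambda theta, rng: [rng.gauss(theta, 1.0) for _ in range(50)],
        sample_prior=lambda rng: rng.uniform(-5.0, 5.0),
        summary=statistics.fmean,                 # hand-picked summary: sample mean
        eps=0.1,
        n_draws=5000,
    )
    print(len(posterior_draws), statistics.fmean(posterior_draws))
```

In this toy model the sample mean is a sufficient statistic, so the choice is easy; for realistic simulators no such obvious summary exists, which is the selection problem the abstract describes.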

Here we introduce a novel deep learning architecture (Partially Exchangeable Networks, PENs), with the purpose of automating the selection of summary statistics. We only need to provide our network with samples from the prior predictive distribution, and it returns summary statistics for use in ABC. PENs are designed to have the correct invariance property for Markovian data, and are therefore particularly useful when learning summary statistics for such data.

Case studies show that our methodology outperforms other popular methods, resulting in more accurate ABC inference for models with intractable likelihoods. Empirically, we show that for some case studies our approach seems to work well also with non-Markovian and non-exchangeable data.

**28/3 - Hans Fahlin (Chief Investment Officer, AP2, Andra AP-fonden) and Tomas Morsing (Head of Quantitative Strategies, AP2, Andra AP-fonden): A scientific approach to financial decision making in the context of managing Swedish pension assets**

**11/4 - Daniele Silvestro: Birth-death models to understand the evolution of (bio)diversity**

Abstract: Our planet and its long history are characterized by a stunning diversity of organisms, environments and, more recently, cultures and technologies. To understand what factors contribute to generating diversity and shaping its evolution we have to look beyond diversity patterns. Here I present a suite of Bayesian models to infer the dynamics of origination and extinction processes using fossil occurrence data and show how the models can be adapted to the study of cultural evolution. Through empirical examples, I will demonstrate the use of this probabilistic framework to test specific hypotheses and quantify the processes underlying (bio)diversity patterns and their evolution.

**12/4 - Erika B. Roldan Roa, Department of Mathematics, The Ohio State University: Evolution of the homology and related geometric properties of the Eden Growth Model**

Abstract: In this talk, we study the persistent homology and related geometric properties of the evolution in time of a discrete-time stochastic process defined on the 2-dimensional regular square lattice. This process corresponds to a cell growth model called the Eden Growth Model (EGM). It can be described as follows: start with the square cell of the 2-dimensional regular square lattice of the plane that contains the origin; then make the cell structure grow by adding, at each time step, one cell chosen uniformly at random from the perimeter. We give a characterization of the possible change in the rank of the first homology group of this process (the "number of holes"). Based on this result we have designed and implemented a new algorithm that computes the persistent homology associated with this stochastic process and that also keeps track of geometric features related to the homology. We also present results of computational experiments performed with this algorithm, and we establish conjectures about the asymptotic behaviour of the homology and other related geometric random variables. The EGM can be seen as a First Passage Percolation (FPP) model after a proper time-scaling. This is the first time that tools and techniques from stochastic topology and topological data analysis are used to measure the evolution of the topology of the EGM and, more generally, of FPP models.
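The growth rule described above can be sketched in a few lines. This is a naive simulation and not the persistent-homology algorithm from the talk. It assumes that "perimeter" means the set of empty cells sharing an edge with the cluster, and it counts holes as bounded 4-connected components of the complement (two empty cells touching only at a corner are treated as separate holes). The function names are hypothetical.

```python
import random
from collections import deque

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def eden_growth(n_cells, seed=0):
    """Grow a cluster from the origin cell by adding uniformly chosen perimeter cells."""
    rng = random.Random(seed)
    cluster = {(0, 0)}
    perimeter = set(NEIGHBOURS)  # empty cells sharing an edge with the origin cell
    while len(cluster) < n_cells:
        cell = rng.choice(sorted(perimeter))  # sorted only to make runs reproducible
        perimeter.discard(cell)
        cluster.add(cell)
        for dx, dy in NEIGHBOURS:
            nb = (cell[0] + dx, cell[1] + dy)
            if nb not in cluster:
                perimeter.add(nb)
    return cluster


def count_holes(cluster):
    """Count bounded 4-connected components of the complement of the cluster."""
    xs = [x for x, _ in cluster]
    ys = [y for _, y in cluster]
    # Bounding box with a one-cell empty margin, so the outside is connected.
    x0, x1, y0, y1 = min(xs) - 1, max(xs) + 1, min(ys) - 1, max(ys) + 1
    seen = set()

    def flood(start):
        seen.add(start)
        queue = deque([start])
        while queue:
            x, y = queue.popleft()
            for dx, dy in NEIGHBOURS:
                nb = (x + dx, y + dy)
                in_box = x0 <= nb[0] <= x1 and y0 <= nb[1] <= y1
                if in_box and nb not in cluster and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)

    flood((x0, y0))  # mark the unbounded (outer) component first
    holes = 0
    for x in range(x0, x1 + 1):
        for y in range(y0, y1 + 1):
            if (x, y) not in cluster and (x, y) not in seen:
                holes += 1  # every remaining empty component is enclosed
                flood((x, y))
    return holes


if __name__ == "__main__":
    cluster = eden_growth(500, seed=1)
    print(len(cluster), "cells,", count_holes(cluster), "holes")
```

Recomputing the hole count from scratch after every added cell would be wasteful; tracking how each added cell can change the count is what makes an incremental algorithm possible.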

**16/5 - Susanne Ditlevsen, University of Copenhagen: Inferring network structure from oscillating systems with cointegrated phase processes**

**23/5 - Chun-Biu Li, Stockholms Universitet: Information Theoretic Approaches to Statistical Learning**

Abstract: Since its introduction in the context of communication theory, information theory has extended to a wide range of disciplines in both the natural and social sciences. In this talk, I will explore information theory as a nonparametric probabilistic framework for unsupervised and supervised learning, free from a priori assumptions on the underlying statistical model. In particular, the soft (fuzzy) clustering problem in unsupervised learning can be viewed as a tradeoff between data compression and minimizing the distortion of the data. Similarly, modeling in supervised learning can be treated as a tradeoff between compression of the predictor variables and retaining the relevant information about the response variable. To illustrate the usage of these methods, some applications in biophysical problems and time series analysis will be briefly addressed in the talk.

**13/6 - Sara Hamis, Swansea University: DNA Damage Response Inhibition: Predicting in vivo treatment responses using an in vitro-calibrated mathematical model**

In this talk I will present an individual based mathematical cancer model in which one individual corresponds to one cancer cell. This model is governed by a few observable and well documented principles, or rules. To account for differences between the in vitro and in vivo scenarios, these rules can be appropriately adjusted. By only adjusting the rules (whilst keeping the fundamental framework intact), the mathematical model can first be calibrated by in vitro data and thereafter be used to successfully predict treatment responses in mouse xenografts in vivo. The model is used to investigate treatment responses to a drug that hinders tumour proliferation by targeting the cellular DNA damage response process.

**19/9 - Ronald Meester, Vrije University, Amsterdam: The DNA Database Controversy 2.0**

**26/9 - Valerie Monbet, Université de Rennes: Time-change models for asymmetric processes**

**3/10 - Peter Jagers, Chalmers: Populations - from few independently reproducing individuals to continuous and deterministic flows. Or: From branching processes to adaptive population dynamics**

**17/10 - Richard Davis, Columbia University and Chalmers Jubileum Professor 2019: Extreme Value Theory Without the Largest Values: What Can Be Done?**

**24/10 - Erica Metheney, Department of Political Sciences, University of Gothenburg: Modifying Non-Graphic Sequences to be Graphic**

**31/10 - Sofia Tapani, AstraZeneca: Early clinical trial design - Platform designs with the patient at its center**

This feature of clinical trial design can also add value to other therapy areas due to its potentially exploratory nature. The platform design allows multi-arm clinical trials to evaluate several experimental treatments, perhaps not all available at the same point in time. At the early clinical development stage, new drugs are rarely at the same stage of development. The alternative, several separate two-arm studies, is time-consuming and can be a bottleneck in development due to budget limitations, in comparison to the more efficient platform study where arms are added at several different time points after the start of enrolment.

Platform designs within the heart failure therapy area in early clinical development are exploratory in nature. Clear prognostic and predictive biomarker profiles for disease are not available and need to be explored and identified for each patient population. As an example, we’ll have a look at the HIDMASTER trial design for biomarker identification and compound graduation throughout the platform.

All platform trials need to be thoroughly simulated, and simulations should be used as a tool to decide among design options. Simulations of platform trials give the opportunity to investigate many scenarios, including the null scenario, to establish the overall type I error. We can evaluate bias estimation and sensitivity to patient withdrawals, missing data, enrolment rates/patterns, interim analysis timings, data access delays, data cleanliness, analysis delays, etc.

Simulations should also comprise decision operating characteristics, so that decisions on the design can be based on the objective of the trial: early stopping of underperforming arms, early go for active arms, prioritising arms based on emerging data, or drawing insights from whole-study data analysis.

Over time the trial learns about the disease, new endpoints, stratification biomarkers and prognostic vs predictive effects.

**6/11 - Richard Torkar, Software Engineering, Chalmers: Why do we encourage even more missingness when dealing with missing data?**

**7/11 - Krzysztof Bartoszek, Linköping University: Formulating adaptive hypotheses in multivariate phylogenetic comparative methods**

* after branching the traits evolve independently

* the distribution of the trait at time t, X(t), conditional on the ancestral value, X(s), at time s<t, is Gaussian with
  * E[X(t) | X(s)] = w(s,t) + F(s,t)X(s),
  * Var[X(t) | X(s)] = V(s,t),

where none of w(s,t), F(s,t), and V(s,t) can depend on X(.), but they may be further parametrized. Using the likelihood computational engine PCMBase [2, available on CRAN], the PCMFit [3, publicly available on GitHub] package allows for inference of models belonging to the GLInv family and furthermore allows for finding points of shifts between evolutionary regimes on the tree. What is particularly novel is that it allows not only for shifts between a model's parameters but also for switches between different types of models within the GLInv family (e.g. a shift from a Brownian motion (BM) to an Ornstein-Uhlenbeck (OU) process and vice versa). Interactions between traits can be understood as magnitudes and signs of off-diagonal entries of F(s,t) or V(s,t). What is particularly interesting is that in this family of models one may obtain changes in the direction of the relationship, i.e. the long and short term joint dynamics can be of a different nature. This is possible even if one simplifies the process to an OU one. Here, one is able to understand the dynamics of the process in fine detail and propose specific model parameterizations [PCMFit and current CRAN version of mvSLOUCH, 1, which is based on PCMBase]. In the talk I will discuss how one can set up different hypotheses concerning relationships between the traits in terms of model parameters and how one can view the long and short term evolutionary dynamics. The software's possibilities will be illustrated by considering the evolution of fruit in the Ferula genus. I will also discuss some limit results that are, among other things, useful for setting initial seeds of the numerical estimation procedures.
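As a concrete illustration of the conditional mean w(s,t) + F(s,t)X(s) and variance V(s,t), the sketch below writes out the standard transition moments of a univariate BM and a univariate OU process. It is a simplification made for illustration: the talk concerns multivariate traits, where F and V are matrices, and none of this is PCMBase, PCMFit or mvSLOUCH code. The parameter names `alpha`, `theta` and `sigma` are assumptions.

```python
import math


def bm_moments(s, t, sigma):
    """(w, F, V) for Brownian motion dX = sigma dW on the interval [s, t]."""
    return 0.0, 1.0, sigma ** 2 * (t - s)


def ou_moments(s, t, alpha, theta, sigma):
    """(w, F, V) for the OU process dX = -alpha (X - theta) dt + sigma dW, alpha > 0."""
    decay = math.exp(-alpha * (t - s))
    w = (1.0 - decay) * theta          # pull towards the optimum theta
    F = decay                          # remaining weight of the ancestral value
    V = sigma ** 2 * (1.0 - decay ** 2) / (2.0 * alpha)
    return w, F, V


def conditional_mean_var(x_s, w, F, V):
    """Mean and variance of X(t) given X(s) = x_s."""
    return w + F * x_s, V


if __name__ == "__main__":
    w, F, V = ou_moments(0.0, 1.0, alpha=0.5, theta=3.0, sigma=2.0)
    print(conditional_mean_var(1.0, w, F, V))
```

None of the three quantities depends on X(.), as required. As `alpha` tends to zero the OU moments approach the BM ones, and for large t - s the ancestral value is forgotten (F tends to 0).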

[1] A phylogenetic comparative method for studying multivariate adaptation. J. Theor. Biol. 314:204-215, 2012.

[2] V. Mitov, K. Bartoszek, G. Asimomitis, T. Stadler. Fast likelihood calculation for multivariate phylogenetic comparative methods: The PCMBase R package. arXiv:1809.09014, 2018.

[3] V. Mitov, K. Bartoszek, T. Stadler. Automatic generation of evolutionary hypotheses using mixed Gaussian phylogenetic models. PNAS, 201813823, 2019.

**20/11 - Paul-Christian Bürkner, Aalto University: Bayesflow: Software assisted Bayesian workflow**

A principled Bayesian workflow consists of several steps from the design of the study, gathering of the data, model building, estimation, and validation, to the final conclusions about the effects under study. I want to present a concept for a software package that assists users in following a principled Bayesian workflow for their data analysis by diagnosing problems and giving recommendations for sensible next steps. This concept gives rise to a lot of interesting research questions we want to investigate in the upcoming years.

**27/11 - Geir Storvik, Oslo University: Flexible Bayesian Nonlinear Model Configuration**

This is joint work with Aliaksandr Hubin (Norwegian Computing Center) and Florian Frommlet (CEMSIIS, Medical University of Vienna)

**4/12 - Moritz Schauer, Chalmers/GU: Smoothing and inference for high dimensional diffusions**

We apply this to the problem of tracking convective cloud systems from satellite data with low time resolution.

**11/12 - Johannes Borgqvist, Chalmers/GU: The polarising world of Cdc42: the derivation and analysis of a quantitative reaction diffusion model of cell polarisation**

In this project, we develop a quantifiable model of cell polarisation accounting for the morphology of the cell. The model consists of a coupled system of PDEs, more specifically reaction-diffusion equations, with two spatial domains: the cytosol and the cell membrane. In this setting, we prove sufficient conditions for pattern formation. Using a “Finite Element”-based numerical scheme, we simulate cell polarisation for these two domains. Further, we illustrate the impact of the parameters on the patterns that emerge and we estimate the time until polarisation. Using this work as a starting point, it is possible to integrate data into the theoretical description of the process to gain a deeper mechanistic understanding of cell polarisation.