# Bayes

## Hierarchical Compartmental Reserving Models for Business Planning

It’s been three years since the Casualty Actuarial Society published our research paper on Hierarchical Compartmental Reserving Models (Gesmann and Morris (2020)). It is time to revisit it, as the Stan language and its interfaces, such as cmdstanr and brms, have progressed and simplified the treatment of differential equations. We have updated the bookdown version of the paper to take advantage of these newer versions. This post gives another example of how to use hierarchical compartmental reserving models, but rather than working with historical claims data, we use the model to generate future data, as may be required for the business plan of a new product for which no historical data exists.

## Portfolio Allocation for Bayesian Dummies

This post is about the Black-Litterman (BL) model for asset allocation and is the basis of my talk at the Dublin Data Science Meet-up. The original BL paper (Black and Litterman (1991)) is over 30 years old and builds on the ideas of modern portfolio theory by Harry Markowitz (Markowitz (1952)). Good introductions to the BL model are Idzorek (2005) and Maggiar (2009). I am not sure how much the model is used by investment professionals, as many of the underlying assumptions may not hold true in the real world.

## Modelling incremental vs cumulative growth data - Does it matter?

Exactly one year ago the Casualty Actuarial Society published our research paper on Hierarchical Compartmental Reserving Models (Gesmann and Morris (2020)). One aspect we looked into was whether the choice of modelling cumulative or incremental payment data over time matters. Many traditional reserving methods (including the chain-ladder technique) take cumulative claims triangles as an input. Plotting cumulative claims data allows us to quickly understand key data features by eye.

## Prediction for the 100m final at the Tokyo Olympics

On Sunday the men’s 100m sprint final of the Tokyo Olympics will take place. Francesc Montané reminded me in his analysis that 9 years ago I used a simple regression model to predict the winning time for the men’s 100m sprint final of the 2012 Olympics in London. My model predicted a winning time of 9.68s, yet Usain Bolt finished in 9.63s. For this Sunday my prediction is 9.72s, with a 50% credible interval of [9.

## Fitting multivariate ODE models with brms

This article illustrates how ordinary differential equations and multivariate observations can be modelled and fitted with the brms package (Bürkner (2017)) in R. As an example I will use the well-known Lotka-Volterra model (Lotka (1925), Volterra (1926)) that describes the predator-prey behaviour of lynxes and hares. Bob Carpenter published a detailed tutorial on implementing and analysing this model in Stan, and so did Richard McElreath in Statistical Rethinking 2nd Edition (McElreath (2020)).

## Research on Hierarchical Compartmental Reserving Models published

Over the last year I worked with Jake Morris on a research paper for the Casualty Actuarial Society. We are delighted to see it published:

Gesmann, M., and Morris, J. “Hierarchical Compartmental Reserving Models.” Casualty Actuarial Society, CAS Research Papers, 19 Aug. 2020, https://www.casact.org/sites/default/files/2021-02/compartmental-reserving-models-gesmannmorris0820.pdf

The paper demonstrates how one can describe the dynamics of claims processes with differential equations and probability distributions. All of this is set into a Bayesian framework that allows us to combine judgement and historical data into a consistent framework.

## Fitting a distribution in Stan from scratch

Last week the French National Institute of Health and Medical Research (Inserm) organised with the Stan Group a training programme on Bayesian Inference with Stan for Pharmacometrics in Paris. Daniel Lee and Michael Betancourt, who ran the course over three days, are not only members of Stan’s development team, but also excellent teachers. Both were supported by Eric Novik, who also gave an Introduction to Stan at the Paris Dataiku User Group last week.

## Hierarchical Loss Reserving with Stan

I continue with the growth curve model for loss reserving from last week’s post. Today, following the ideas of James Guszcza [2], I will add a hierarchical component to the model by treating the ultimate loss cost of an accident year as a random effect. Initially, I will use the nlme R package, just as James did in his paper, and then move on to Stan/RStan [6], which will allow me to estimate the full distribution of future claims payments.

## Loss Developments via Growth Curves and Stan

Last week I posted a biological example of fitting a non-linear growth curve with Stan/RStan. Today, I want to apply a similar approach to insurance data using ideas by David Clark [1] and James Guszcza [2]. Instead of predicting the growth of dugongs (sea cows), I would like to predict the growth of cumulative insurance loss payments over time, originating from different origin years. Loss payments of younger accident years are just like a new generation of dugongs: they will be small in size initially and grow as they get older, until the losses are fully settled.

## Non-linear growth curves with Stan

I suppose the go-to tool for fitting non-linear models in R is nls of the stats package. In this post I will show an alternative approach with Stan/RStan, as illustrated in the example Dugongs: “nonlinear growth curve”, which is part of Stan’s documentation. The original example itself is taken from OpenBUGS. The data describe the length and age measurements for 27 captured dugongs (sea cows). Carlin and Gelfand (1991) model the data using a nonlinear growth curve with no inflection point and an asymptote as $$x_i$$ tends to infinity:

## Bayesian regression models using Stan in R

It seems the summer is coming to an end in London, so I shall take a final look at my ice cream data that I have been playing around with to predict sales statistics based on temperature for the last couple of weeks [1], [2], [3]. Here I will use the new brms (GitHub, CRAN) package by Paul-Christian Bürkner to derive the 95% prediction credible interval for the four models I introduced in my first post about generalised linear models.

## Posterior predictive output with Stan

I continue my Stan experiments with another insurance example. Here I am particularly interested in the posterior predictive distribution from only three data points. Or, to put it differently, I have a customer of three years and I’d like to predict the expected claims cost for the next year to set or adjust the premium. The example is taken from section 16.17 in Loss Models: From Data to Decisions [1].

## Hello Stan!

In my previous post I discussed how Longley-Cook, an actuary at an insurance company in the 1950s, used Bayesian reasoning to estimate the probability for a mid-air collision of two planes. Here I will use the same model to get started with Stan/RStan, a probabilistic programming language for Bayesian inference. Last week my prior was given as a Beta distribution with parameters $$\alpha=1, \beta=1$$ and the likelihood was assumed to be a Bernoulli distribution with parameter $$\theta$$: $$\begin{aligned} \theta & \sim \mbox{Beta}(1, 1)\\ y_i & \sim \mbox{Bernoulli}(\theta), \;\forall i \in N \end{aligned}$$ For the previous five years no mid-air collisions were observed, $$x=\{0, 0, 0, 0, 0\}$$.
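Because the Beta prior is conjugate to the Bernoulli likelihood, the posterior for this model can also be written down in closed form, which gives a handy check on any sampler output. The following is a minimal Python sketch of that conjugate update, not the Stan program from the post; the function name `beta_bernoulli_update` is made up for illustration.

```python
from fractions import Fraction


def beta_bernoulli_update(alpha, beta, observations):
    """Conjugate update: Beta(alpha, beta) prior with Bernoulli observations (0 or 1)."""
    successes = sum(observations)
    failures = len(observations) - successes
    return alpha + successes, beta + failures


# Prior Beta(1, 1) and five years without a collision, as stated in the text
prior_alpha, prior_beta = 1, 1
observations = [0, 0, 0, 0, 0]

post_alpha, post_beta = beta_bernoulli_update(prior_alpha, prior_beta, observations)

# Mean of a Beta(a, b) distribution is a / (a + b)
posterior_mean = Fraction(post_alpha, post_alpha + post_beta)

print(f"Posterior: Beta({post_alpha}, {post_beta}), mean = {posterior_mean}")
```

With the data above the posterior is Beta(1, 6), so the posterior mean of $$\theta$$ is 1/7.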

## Predicting events, when they haven't happened yet

Suppose you have to predict the probabilities of events which haven’t happened yet. How do you do this? Here is an example from the 1950s when Longley-Cook, an actuary at an insurance company, was asked to price the risk for a mid-air collision of two planes, an event which as far as he knew hadn’t happened before. The civilian airline industry was still very young, but rapidly growing, and all Longley-Cook knew was that there had been no collisions in the previous 5 years [1].

## Measuring temperature with my Arduino

It is really getting colder in London - it is now about 5°C outside. The heating is on and I have got better at measuring the temperature at home as well. Or, so I believe. Last week’s approach of me guessing/feeling the temperature combined with an old thermometer was perhaps too simplistic and too unreliable. This week’s attempt to measure the temperature with my Arduino might be a little OTT (over the top), but at least I am using the micro-controller again.

## How cold is it? A Bayesian attempt to measure temperature

It is getting colder in London, yet it is still quite mild considering that it is late November. Well, indoors it still feels like 20°C (68°F) to me, but I was told last week that I should switch on the heating. Luckily I found an old thermometer to check. The thermometer showed 18°C. Is it really below 20°C? The thermometer is quite old and I’m not sure that it works properly anymore.

## Hit and run. Think Bayes!

At the R in Insurance conference Arthur Charpentier gave a great keynote talk on Bayesian modelling in R. Bayes’ theorem on conditional probabilities is strikingly simple, yet incredibly thought-provoking. Here is an example from Daniel Kahneman to test your intuition. But first I have to start with Bayes’ theorem. Bayes’ theorem states that given two events $$D$$ and $$H$$, the probability of $$D$$ and $$H$$ happening at the same time is the same as the probability of $$D$$ occurring, given $$H$$, weighted by the probability that $$H$$ occurs; or the other way round.
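Written out, the statement reads as follows. The rearranged second line is the usual form of the theorem and assumes $$P(D) > 0$$:

$$\begin{aligned} P(D \cap H) &= P(D \mid H)\,P(H) = P(H \mid D)\,P(D) \\ P(H \mid D) &= \frac{P(D \mid H)\,P(H)}{P(D)} \end{aligned}$$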

## Not only verbs but also beliefs can be conjugated

Following on from last week, where I presented a simple example of a Bayesian network with discrete probabilities to predict the number of claims for a motor insurance customer, I will look at continuous probability distributions today. Here I follow example 16.17 in Loss Models: From Data to Decisions [1]. Suppose there is a class of risks that incurs random losses following an exponential distribution (density $$f(x) = \Theta {e}^{- \Theta x}$$) with mean $$1/\Theta$$.

## Predicting claims with a Bayesian network

Here is a little Bayesian Network to predict the claims for two different types of drivers over the next year, see also example 16.15 in [1]. Let’s assume there are good and bad drivers. The probabilities that a good driver will have 0, 1 or 2 claims in any given year are set to 70%, 20% and 10%, while for bad drivers the probabilities are 50%, 30% and 20% respectively.
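The reasoning behind such a network can be sketched in a few lines of Python. The claim probabilities below are the ones quoted above, but the share of good drivers (`prior_good = 0.75`) and the observed claim history (`observed_claims = [0, 1]`) are assumptions made purely for illustration; they are not given in the text.

```python
# Claim-count probabilities per driver type, as quoted in the text
claim_probs = {
    "good": {0: 0.7, 1: 0.2, 2: 0.1},
    "bad": {0: 0.5, 1: 0.3, 2: 0.2},
}

# ASSUMPTION: prior share of good drivers (not stated in the text)
prior_good = 0.75
prior = {"good": prior_good, "bad": 1 - prior_good}

# ASSUMPTION: hypothetical claim counts observed for one customer, one per year
observed_claims = [0, 1]


def likelihood(driver_type, claims):
    """Probability of the observed history, assuming independent years."""
    result = 1.0
    for n in claims:
        result *= claim_probs[driver_type][n]
    return result


# Bayes' theorem: posterior is proportional to prior times likelihood
unnormalised = {t: prior[t] * likelihood(t, observed_claims) for t in prior}
evidence = sum(unnormalised.values())
posterior = {t: w / evidence for t, w in unnormalised.items()}

# Predictive distribution of next year's claim count, mixing over driver types
predictive = {
    n: sum(posterior[t] * claim_probs[t][n] for t in posterior) for n in (0, 1, 2)
}

print("Posterior driver type:", posterior)
print("Predictive claim counts:", predictive)
```

Changing `prior_good` or `observed_claims` shows how quickly a short claims history shifts the belief about the driver type and, with it, the predicted claims for the next year.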