At the Insurance Data Science conference, both Eric Novik and Paul-Christian Bürkner emphasised in their talks the value of thinking about the data generating process when building Bayesian statistical models. It is also a key step in Michael Betancourt’s Principled Bayesian Workflow.
In this post, I will discuss in more detail how to set priors, and review the prior and posterior parameter distributions, but also the prior predictive distributions with brms (Bürkner (2017)).

Ahead of the Stan Workshop on Tuesday, here is another example of using brms (Bürkner (2017)) for claims reserving. This time I will use a model inspired by the 2012 paper A Bayesian Nonlinear Model for Forecasting Insurance Loss Payments (Zhang, Dukic, and Guszcza (2012)), which can be seen as a follow-up to Jim Guszcza’s Hierarchical Growth Curve Model (Guszcza (2008)).
I discussed Jim’s model in an earlier post using Stan.

This is a follow-up post on hierarchical compartmental reserving models using PK/PD models. It will show how differential equations can be used with Stan/ brms and how correlation for the same group level terms can be modelled.
PK/ PD is usually short for pharmacokinetic/ pharmacodynamic models, but as Eric Novik of Generable pointed out to me, it could also be short for Payment Kinetics/ Payment Dynamics Models in the insurance context.

Today, I will sketch out ideas from the Hierarchical Compartmental Models for Loss Reserving paper by Jake Morris, which was published in the summer of 2016 (Morris (2016)). Jake’s model is inspired by PK/PD models (pharmacokinetic/pharmacodynamic models) used in the pharmaceutical industry to describe the time course of effect intensity in response to administration of a drug dose.
The hierarchical compartmental model fits outstanding and paid claims simultaneously, combining ideas of Clark (2003), Quarg and Mack (2004), Miranda, Nielsen, and Verrall (2012), Guszcza (2008) and Zhang, Dukic, and Guszcza (2012).

Last week the French National Institute of Health and Medical Research (Inserm) organised with the Stan Group a training programme on Bayesian Inference with Stan for Pharmacometrics in Paris. Daniel Lee and Michael Betancourt, who run the course over three days, are not only members of Stan’s development team, but also excellent teachers. Both were supported by Eric Novik, who gave an Introduction to Stan at the Paris Dataiku User Group last week as well.

I continue my Stan experiments with another insurance example. Here I am particular interested in the posterior predictive distribution from only three data points. Or, to put it differently I have a customer of three years and I’d like to predict the expected claims cost for the next year to set or adjust the premium.
The example is taken from section 16.17 in Loss Models: From Data to Decisions [1]. Some time ago I used the same example to get my head around a Bayesian credibility model.

Taking the first step is often the hardest: getting data from Excel into R. Suppose you would like to use the ChainLadder package to forecast future claims payments for a run-off triangle that you have stored in Excel.

How do you get the triangle into R and execute a reserving function, such as MackChainLadder? Well, there are many ways to do this and the ChainLadder package vignette, as well as the R manual on Data Import/Export has all of the details, but here is a quick and dirty solution using a CSV-file.

Following on from last week, where I presented a simple example of a Bayesian network with discrete probabilities to predict the number of claims for a motor insurance customer, I will look at continuous probability distributions today. Here I follow example 16.17 in Loss Models: From Data to Decisions [1]. Suppose there is a class of risks that incurs random losses following an exponential distribution (density $f(x) = \Theta {e}^{- \Theta x}$) with mean $1/\Theta$.

Here is a little Bayesian Network to predict the claims for two different types of drivers over the next year, see also example 16.15 in [1]. Let’s assume there are good and bad drivers. The probabilities that a good driver will have 0, 1 or 2 claims in any given year are set to 70%, 20% and 10%, while for bad drivers the probabilities are 50%, 30% and 20% respectively. Further I assume that 75% of all drivers are good drivers and only 25% would be classified as bad drivers.

Over the last year I worked with two colleagues of mine on the subject of inflation and claims inflation in particular. I didn’t expect it to be such a challenging topic, but we ended up with more questions than answers. The key question and biggest challenge is to define what inflation, or indeed claims inflation actually is and how to measure it. We published a summary of our thoughts and findings in this month’s issue of The Actuary.

Last week we released version 0.1.5-6 of the ChainLadder package on CRAN. The ChainLadder package provides statistical models, which are typically used for the estimation of outstanding claims reserves in general insurance. The package vignette gives an overview of the package functionality.
Output of plot(MackChainLadder(GenIns))
Since the last CRAN release Dan Murphy added new features to the MackChainLadder function and we fixed a bug in BootChainLadder. Here are he details:

The registration for the first R in Insurance is open and there is still time to submit a talk / lightning talk.
The conference will take place at Cass Business School in London on Monday, 15 July 2013. This is the Monday following the useR! 2013 conference in Spain. Thus, if you come from overseas to Spain, why not stop in London on your way back? All further information and registration details are available on the Cass Business School conference site.

This is the third post about Christofides’ paper on Regression models based on log-incremental payments [1]. The first post covered the fundamentals of Christofides’ reserving model in sections A - F, the second focused on a more realistic example and model reduction of sections G - K. Today’s post will wrap up the paper with sections L - M and discuss data normalisation and claims inflation. I will use the same triangle of incremental claims data as introduced in my previous post.

Following on from last week’s post I will continue to go through the paper Regression models based on log-incremental payments by Stavros Christofides [1]. In the previous post I introduced the model from the first 15 pages up to section F. Today I will progress with sections G to K which illustrate the model with a more realistic incremental claims payments triangle from a UK Motor Non-Comprehensive account:# Page D5.17

A recent post on the PirateGrunt blog on claims reserving inspired me to look into the paper Regression models based on log-incremental payments by Stavros Christofides [1], published as part of the Claims Reserving Manual (Version 2) of the Institute of Actuaries.
The paper is available together with a spread sheet model, illustrating the calculations. It is very much based on ideas by Barnett and Zehnwirth, see [2] for a reference.

Last week we released version 0.1.5-4 of the ChainLadder package on CRAN. The R package provides methods which are typically used in insurance claims reserving. If you are new to R or insurance check out my recent talk on Using R in Insurance.
The chain-ladder method which is a popular method in the insurance industry to forecast future claims payments gave the package its name. However, the ChainLadder package has many other reserving methods and models implemented as well, such as the bootstrap model demonstrated below.

Every year the UK’s general insurance actuarial community organises a big conference, which they call GIRO, short for General Insurance Research Organising committee.
This year’s conference is in Brussels from 18 - 21 September 2012. Despite the fact that Brussels is actually in Belgium the UK actuaries will travel all the way to enjoy good beer and great talks. On Wednesday morning I will run a session on Using R in insurance.