mages' blog

Hierarchical Loss Reserving with Stan

I continue with the growth curve model for loss reserving from last week's post. Today, following the ideas of James Guszcza [2] I will add an hierarchical component to the model, by treating the ultimate loss cost of an accident year as a random effect. Initially, I will use the nlme R package, just as James did in his paper, and then move on to Stan/RStan [6], which will allow me to estimate the full distribution of future claims payments.

Last week's model assumed that cumulative claims payment could be described by a growth curve. I used the Weibull curve and will do so here again, but others should be considered as well, e.g. the log-logistic cumulative distribution function for long tail business, see [1].
The growth curve describes the proportion of claims paid up to a given development period compared to the ultimate claims cost at the end of time, hence often called development pattern. Cumulative distribution functions are often considered, as they increase monotonously from 0 to 100%. Multiplying the development pattern with the expected ultimate loss cost gives me then the expected cumulative paid to date value.

However, what I'd like to do is the opposite, I know the cumulative claims position to date and wish to estimate the ultimate claims cost instead. If the claims process is fairly stable over the years and say, once a claim has been notified the payment process is quite similar from year to year and claim to claim, then a growth curve model is not unreasonable. Yet, the number and the size of the yearly claims will be random, e.g. if a windstorm, fire, etc occurs or not. Hence, a random effect for the ultimate loss cost across accident years sounds very convincing to me.

Here is James' model as described in [2]:
\begin{align} CL_{AY, dev} & \sim \mathcal{N}(\mu_{AY, dev}, \sigma^2_{dev}) \\ \mu_{AY,dev} & = Ult_{AY} \cdot G(dev|\omega, \theta)\\ \sigma_{dev} & = \sigma \sqrt{\mu_{dev}}\\ Ult_{AY} & \sim \mathcal{N}(\mu_{ult}, \sigma^2_{ult})\\ G(dev|\omega, \theta) & = 1 - \exp\left(-\left(\frac{dev}{\theta}\right)^\omega\right) \end{align}The cumulative losses $$CL_{AY, dev}$$ for a given accident year $$AY$$ and development period $$dev$$ follow a Normal distribution with parameters $$\mu_{AY, dev}$$ and $$\sigma_{dev}$$.

The mean itself is modelled as the product of an accident year specific ultimate loss cost $$Ult_{AY}$$ and a development period specific parametric growth curve $$G(dev | \omega, \theta)$$. The variance is believed to increase in proportion with the mean. Finally, the ultimate loss cost is modelled with a Normal distribution as well.

Assuming a Gaussian distribution of losses doesn't sound quite intuitive to me, as loss are often skewed to the right, but I shall continue with this assumption here to make a comparison with [2] possible.

Using the example data set given in the paper I can reproduce the result in R with nlme:

The fit looks pretty good, with only 5 parameters. See James' paper for a more detailed discussion.

Let's move this model into Stan. Here is my attempt, which builds on last week's pooled model. With the generated quantities code block I go beyond the scope of the original paper, as I try to estimate the full posterior predictive distribution as well.
The 'trick' is the line  mu[i] <- ult[origin[i]] * weibull_cdf(dev[i], omega, theta); where I have an accident year (here labelled origin) specific ultimate loss.

The notation ult[origin[i]] illustrates the hierarchical nature in Stan's language nicely.

Let's run the model:
The estimated parameters look very similar to the nlme output above.

Let's take a look at the parameter traceplot and the densities of the estimated ultimate loss costs by origin year.
This looks all not too bad. The trace plots don't show any particular patterns, apart from $$\sigma_{ult}$$, which shows a little skewness.

The generated quantities code block in Stan allows me to get also the predictive distribution beyond the current data range. Here I forecast claims up to development year 12 and plot the predictions, including the 95% credibility interval of the posterior predictive distribution with the observations.

The model seems to work rather well, even with the Gaussian distribution assumptions. Yet, it has still only 5 parameters. Note, this model doesn't need an additional artificial tail factor either.

Conclusions

The Bayesian approach sounds to me a lot more natural than many classical techniques around the chain-ladder methods. Thanks to Stan, I can get the full posterior distributions on both, the parameters and predictive distribution. I find communicating credibility intervals much easier than talking about the parameter, process and mean squared error.

James Guszcza contributed to a follow-up paper with Y. Zhank and V. Dukic [3] that extends the model described in [2]. It deals with skewness in loss data sets and the autoregressive nature of the errors in a cumulative time series.

Frank Schmid offers a more complex Bayesian analysis of claims reserving in [4], while Jake Morris highlights the similarities between a compartmental model used in drug research and loss reserving [5].

Finally, Glenn Meyers published a monograph on Stochastic Loss Reserving Using Bayesian MCMC Models earlier this year [7] that is worth taking a look at.

References

[1] David R. Clark. LDF Curve-Fitting and Stochastic Reserving: A Maximum Likelihood Approach. Casualty Actuarial Society, 2003. CAS Fall Forum.

[2] James Guszcza. Hierarchical Growth Curve Models for Loss Reserving, 2008, CAS Fall Forum, pp. 146–173.

[3] Y. Zhang, V. Dukic, and James Guszcza. A Bayesian non-linear model for forecasting insurance loss payments. 2012. Journal of the Royal Statistical Society: Series A (Statistics in Society), 175: 637–656. doi: 10.1111/j.1467-985X.2011.01002.x

[4] Frank A. Schmid. Robust Loss Development Using MCMC. Available at SSRN. See also http://lossdev.r-forge.r-project.org/

[5] Jake Morris. Compartmental reserving in R. 2015. R in Insurance Conference.

[6] Stan Development Team. Stan: A C++ Library for Probability and Sampling, Version 2.8.0. 2015. http://mc-stan.org/.

[7] Glenn Meyers. Stochastic Loss Reserving Using Bayesian MCMC Models. Issue 1 of CAS Monograph Series. 2015.

Session Info

R version 3.2.2 (2015-08-14)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.1 (El Capitan)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats  graphics  grDevices utils datasets  methods   base

other attached packages:
[5] lattice_0.20-33

loaded via a namespace (and not attached):
[1] nloptr_1.0.4       plyr_1.8.3         tools_3.2.2
[4] digest_0.6.8       lme4_1.1-10        statmod_1.4.21
[7] gtable_0.1.2       nlme_3.1-122       mgcv_1.8-8
[10] Matrix_1.2-2       parallel_3.2.2     biglm_0.9-1
[13] SparseM_1.7        proto_0.3-10       coda_0.18-1
[16] gridExtra_2.0.0    stringr_1.0.0      MatrixModels_0.4-1
[19] lmtest_0.9-34      stats4_3.2.2       grid_3.2.2
[22] nnet_7.3-11        tweedie_2.2.1      inline_0.3.14
[25] cplm_0.7-4         minqa_1.2.4        actuar_1.1-10
[28] reshape2_1.4.1     car_2.1-0          magrittr_1.5
[31] scales_0.3.0       codetools_0.2-14   MASS_7.3-44
[34] splines_3.2.2      rsconnect_0.3.79   systemfit_1.1-18
[37] pbkrtest_0.4-2     colorspace_1.2-6   quantreg_5.19
[40] labeling_0.3       sandwich_2.3-4     stringi_1.0-1
[43] munsell_0.4.2      zoo_1.7-12

Loss Developments via Growth Curves and Stan

Last week I posted a biological example of fitting a non-linear growth curve with Stan/RStan. Today, I want to apply a similar approach to insurance data using ideas by David Clark [1] and James Guszcza [2].

Instead of predicting the growth of dugongs (sea cows), I would like to predict the growth of cumulative insurance loss payments over time, originated from different origin years. Loss payments of younger accident years are just like a new generation of dugongs, they will be small in size initially, grow as they get older, until the losses are fully settled.

Here is my example data set:

Following the articles cited above I will assume that the growth can be explained by a Weibull curve, with two parameters $$\theta$$ (scale) and $$\omega$$ (shape):
$G(dev|\omega, \theta) = 1 - \exp\left(-\left(\frac{dev}{\theta}\right)^\omega\right)$Inspired by the classical over-dispersed Poisson GLM in actuarial science, Guszcza [2] assumes a power variance structure for the process error as well. For the purpose of this post I will assume that claims cost follow a Normal distribution, to make a comparison with a the classical least square regression easier. With a prior estimate of the ultimate claims cost the cumulative loss can be modelled as:
\begin{align} CL_{AY, dev} & \sim \mathcal{N}(\mu_{dev}, \sigma^2_{dev}) \\ \mu_{dev} & = Ult \cdot G(dev|\omega, \theta)\\ \sigma_{dev} & = \sigma \sqrt{\mu_{dev}} \end{align}Perhaps, the above formula suggests a hierarchical modelling approach, with different ultimate loss costs by accident year. Indeed, this is the focus of [2] and I will endeavour to reproduce the results with Stan in my next post, but today I will stick to a pooled model that assumes a constant ultimate loss across all accident years, i.e. $$Ult_{AY} = Ult \;\forall\; AY$$.

To prepare my analysis I read in the data as a long data frame instead of the matrix structure above. Additionally, I compose another data frame that I will use later to predict payments two years beyond the first 10 years. Furthermore, to align the output with [2] I relabelled the development periods from years to months, so that year 1 becomes month 6, year 2 becomes month 18, etc. The accident years run from 1991 to 2000, while the variable origin maps those years from 1 to 10.

To get a reference point for Stan I start with a non-linear least square regression:

The output above doesn't look unreasonable, apart from the accident years 1991, 1992 and 1998. The output of gnls gives me an opportunity to provide my prior distributions with good starting points. I will use an inverse Gamma as a prior for $$\sigma$$, constrained Normals for the parameters $$\theta, \omega$$ and $$Ult$$ as well. If you have better ideas, then please get in touch.

The Stan code below is very similar to last week. Again, I am interested here in the posterior distributions, hence I add a block to generate quantities from those. Note, Stan comes with a build-in function for the cumulative Weibull distribution weibull_cdf.

I store the Stan code in a separate text file, which I read into R to compile the model. The compilation takes a little time. The sampling itself is done in a few seconds.

Let's take a look at the output:

The parameters are a little different to the output of gnls, but well within the standard error. From the plots I notice that the samples for the ultimate loss as well as for $$\theta$$ are a little skewed to the right. Well, assuming cumulative losses to follow a Normal distribution was a bit a of stretch to start with. Still, the samples seem to converge.

Finally, I can plot the 90% credible intervals of the posterior distributions.

The 90% prediction credible interval captures most of the data and although this model might not be suitable for reserving individual accident years, it could provide an initial starting point for further investigations. Additionally, thanks to the Bayesian model I have access to the full distribution, not just point estimators and standard errors.

My next post will continue with this data set and the ideas of James Guszcza by allowing the ultimate loss cost to vary by accident year, treating it as a random effect. Here is a teaser of what the output will look like:

References

[1] David R. Clark. LDF Curve-Fitting and Stochastic Reserving: A Maximum Likelihood Approach. Casualty Actuarial Society, 2003. CAS Fall Forum.

[2] Guszcza, J. Hierarchical Growth Curve Models for Loss Reserving, 2008, CAS Fall Forum,
pp. 146–173.

Session Info

R version 3.2.2 (2015-08-14)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.1 (El Capitan)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods
[7] base

other attached packages:
[1] rstan_2.8.0    ggplot2_1.0.1     Rcpp_0.12.1

loaded via a namespace (and not attached):
[1] nloptr_1.0.4       plyr_1.8.3         tools_3.2.2
[4] digest_0.6.8       lme4_1.1-10        statmod_1.4.21
[7] gtable_0.1.2       mgcv_1.8-8         Matrix_1.2-2
[10] parallel_3.2.2     biglm_0.9-1        SparseM_1.7
[13] proto_0.3-10       gridExtra_2.0.0    coda_0.18-1
[16] stringr_1.0.0      MatrixModels_0.4-1 stats4_3.2.2
[19] lmtest_0.9-34      grid_3.2.2         nnet_7.3-11
[22] inline_0.3.14      tweedie_2.2.1      cplm_0.7-4
[25] minqa_1.2.4        reshape2_1.4.1     car_2.1-0
[28] actuar_1.1-10      magrittr_1.5       codetools_0.2-14
[31] scales_0.3.0       MASS_7.3-44        splines_3.2.2
[34] systemfit_1.1-18   rsconnect_0.3.79   pbkrtest_0.4-2
[37] colorspace_1.2-6   labeling_0.3       quantreg_5.19
[40] sandwich_2.3-4     stringi_1.0-1      munsell_0.4.2
[43] zoo_1.7-12