At the Insurance Data Science conference, both Eric Novik and Paul-Christian Bürkner emphasised in their talks the value of thinking about the data generating process when building Bayesian statistical models. It is also a key step in Michael Betancourt’s Principled Bayesian Workflow.
In this post, I will discuss in more detail how to set priors, and review not only the prior and posterior parameter distributions but also the prior predictive distributions with brms (Bürkner 2017).
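To make the idea concrete, here is a minimal sketch of drawing from the prior predictive distribution with brms; the formula, toy data and priors below are purely illustrative and not taken from the post.

```r
# Minimal prior predictive sketch with brms; model, data and priors are
# illustrative assumptions, not the post's actual example.
library(brms)

toy <- data.frame(x = runif(50), y = rnorm(50))

priors <- c(
  set_prior("normal(0, 1)", class = "b"),
  set_prior("normal(0, 5)", class = "Intercept"),
  set_prior("exponential(1)", class = "sigma")
)

fit_prior <- brm(
  y ~ x, data = toy, family = gaussian(),
  prior = priors,
  sample_prior = "only"   # ignore the likelihood and sample from the priors
)

pp_check(fit_prior)       # visualise the implied prior predictive distribution
```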

How do you build a model from first principles? Here is a step-by-step guide.
Following on from last week's post on the Principled Bayesian Workflow, I want to reflect on how to motivate a model.
The purpose of most models is to understand change, and yet considering what doesn't change, and should be kept constant, can be equally important.
I will go through a couple of models in this post to illustrate this idea.

Last week I presented visualisations of theoretical distributions that predict ice cream sales statistics based on linear and generalised linear models, which I introduced in an earlier post. Today I will take a closer look at the log-transformed linear model and use Stan/rstan not only to model the sales statistics, but also to generate samples from the posterior predictive distribution.
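As a rough sketch of that workflow (not the post's actual Stan program), a log-transformed linear model with posterior predictive draws in the generated quantities block might look like this:

```r
# Log-transformed linear model for the ice cream data with posterior
# predictive draws; an illustrative sketch, not the original post's code.
library(rstan)

stan_code <- "
data {
  int<lower=1> N;
  vector[N] temp;
  vector[N] log_units;
}
parameters {
  real alpha;
  real beta;
  real<lower=0> sigma;
}
model {
  log_units ~ normal(alpha + beta * temp, sigma);
}
generated quantities {
  vector[N] units_pred;
  for (n in 1:N)
    units_pred[n] = exp(normal_rng(alpha + beta * temp[n], sigma));
}
"

icecream <- data.frame(
  temp  = c(11.9, 14.2, 15.2, 16.4, 17.2, 18.1, 18.5, 19.4, 22.1, 22.6, 23.4, 25.1),
  units = c(185L, 215L, 332L, 325L, 408L, 421L, 406L, 412L, 522L, 445L, 544L, 614L)
)

fit <- stan(model_code = stan_code,
            data = list(N = nrow(icecream),
                        temp = icecream$temp,
                        log_units = log(icecream$units)))

pp <- extract(fit, pars = "units_pred")$units_pred  # posterior predictive draws
```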

Two weeks ago I discussed various linear and generalised linear models in R using ice cream sales statistics. The data showed, not surprisingly, that more ice cream was sold at higher temperatures.

    icecream <- data.frame(
      temp  = c(11.9, 14.2, 15.2, 16.4, 17.2, 18.1, 18.5, 19.4, 22.1, 22.6, 23.4, 25.1),
      units = c(185L, 215L, 332L, 325L, 408L, 421L, 406L, 412L, 522L, 445L, 544L, 614L)
    )

I used a linear model, a log-transformed linear model, and Poisson and binomial generalised linear models to predict sales within and outside the range of the available data.
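As a quick sketch of those four models in base R (the binomial model needs an assumed market size; the 800 below is an illustrative choice, not necessarily the value used in the original post):

```r
# Sketch of the four models; market_size is an assumed upper bound on sales.
lm_lin   <- lm(units ~ temp, data = icecream)                        # linear
lm_log   <- lm(log(units) ~ temp, data = icecream)                   # log-transformed linear
glm_pois <- glm(units ~ temp, data = icecream,
                family = poisson(link = "log"))                      # Poisson GLM
market_size <- 800
glm_bin  <- glm(cbind(units, market_size - units) ~ temp,
                data = icecream, family = binomial(link = "logit"))  # binomial GLM
```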

Linear models are the bread and butter of statistics, but there is a lot more to them than taking a ruler and drawing a line through a couple of points.
Some time ago Rasmus Bååth published an insightful blog article about how such models could be described from a distribution centric point of view, instead of the classic error terms convention.
I think the distribution centric view makes generalised linear models (GLM) much easier to understand as well.
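To spell out the contrast for a simple linear regression, the two views can be written as follows (standard notation, not Rasmus' exact equations):
\[
\begin{aligned}
\text{Error-term view:} \quad & y_i = \alpha + \beta x_i + \epsilon_i, \quad \epsilon_i \sim N(0, \sigma^2)\\
\text{Distribution-centric view:} \quad & y_i \sim N(\mu_i, \sigma^2), \quad \mu_i = \alpha + \beta x_i
\end{aligned}
\]
In the second formulation, moving to a GLM is simply a matter of swapping the normal distribution and the identity link for another distribution and link function.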

Last week’s post about the Kalman filter focused on the derivation of the algorithm. Today I will continue with the extended Kalman filter (EKF), which can also deal with nonlinearities. According to Wikipedia, the EKF has been considered the de facto standard in the theory of nonlinear state estimation, navigation systems and GPS. Last week I had the following dynamic linear model for the Kalman filter:
\[
\begin{aligned}
x_{t+1} & = A x_t + w_t, \quad w_t \sim N(0, Q)\\
y_t & = G x_t + \nu_t, \quad \nu_t \sim N(0, R)\\
x_1 & \sim N(\hat{x}_0, \Sigma_0)
\end{aligned}
\]
Here \(x_t\) describes the state space evolution, \(y_t\) the observations, \(A, Q, G, R, \Sigma_0\) are matrices of appropriate dimensions, \(w_t\) is the evolution error and \(\nu_t\) the observation error.
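For reference, a bare-bones version of the linear Kalman filter recursion for this model might look like the following in R, with the matrices above taken as scalars; this is my own minimal sketch, not the code from either post.

```r
# Minimal scalar Kalman filter sketch (A, Q, G, R and the initial values are
# treated as scalars); illustrative only.
kalman_filter <- function(y, A, Q, G, R, x0, Sigma0) {
  n <- length(y)
  x_hat <- numeric(n)
  x <- x0; Sigma <- Sigma0
  for (t in seq_len(n)) {
    # Predict: propagate the state and its uncertainty
    x_pred     <- A * x
    Sigma_pred <- A * Sigma * A + Q
    # Update: weigh the prediction against the new observation
    K     <- Sigma_pred * G / (G * Sigma_pred * G + R)
    x     <- x_pred + K * (y[t] - G * x_pred)
    Sigma <- (1 - K * G) * Sigma_pred
    x_hat[t] <- x
  }
  x_hat
}

# Example: filter a noisy random walk
set.seed(42)
truth <- cumsum(rnorm(100, sd = 0.5))
obs   <- truth + rnorm(100, sd = 2)
est   <- kalman_filter(obs, A = 1, Q = 0.25, G = 1, R = 4, x0 = 0, Sigma0 = 10)
```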

At the last Cologne R user meeting Holger Zien gave a great introduction to dynamic linear models (dlm). One special case of a dlm is the Kalman filter, which I will discuss in this post in more detail. I kind of used it earlier when I measured the temperature with my Arduino at home.
Over the last week I came across the wonderful quantitative economic modelling site quant-econ.net, designed and written by Thomas J. Sargent and John Stachurski.

Rasmus’ post of last week on binomial testing made me think about p-values and testing again. In my head I was tossing coins, thinking about gender diversity and toast. The toast, and tossing a buttered toast in particular, was the most helpful thought experiment, as I didn’t have a fixed opinion on the probabilities for a toast to land on either side. I have yet to carry out some real experiments.
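If I ever run the toast experiment, a first frequentist look at the data could be as simple as a binomial test; the counts below are made up for illustration.

```r
# Hypothetical data: toast dropped 20 times, landing butter-side down 13 times.
binom.test(x = 13, n = 20, p = 0.5)  # test against a fair 50:50 toast
```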

A friend of mine asked me the other day how she could use the function optim in R to fit data. Of course, there are built-in functions for fitting data in R and I wrote about this earlier. However, she wanted to understand how to do this from scratch using optim.
The function optim provides algorithms for general-purpose optimisation, and the documentation is perfectly reasonable, but I remember that it took me a little while to get my head around how to pass data and parameters to optim.
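Here is a minimal sketch of what that looks like: the loss function takes the parameters as its first argument, and the data are passed through optim's `...` argument. The toy data and starting values are made up for illustration.

```r
# Fit a straight line by minimising the residual sum of squares with optim().
rss <- function(par, x, y) {
  sum((y - (par[1] + par[2] * x))^2)
}

set.seed(123)
x <- 1:20
y <- 2 + 3 * x + rnorm(20)

fit <- optim(par = c(0, 1), fn = rss, x = x, y = y)
fit$par  # estimated intercept and slope, close to lm(y ~ x)
```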

Of course, a picture on a computer monitor is a coloured plot of x and y coordinates or pixels. Still, I was smitten by David Sparks’ posts on is.r(), where he shows how easy it is to read images into R to analyse them. In two posts [1], [2] he replicates functionality of image manipulation programmes like GIMP.
I can’t resist writing about this here as well. David’s first post is about k-means cluster analysis.
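In the same spirit, a rough sketch of the k-means idea applied to an image's RGB values might look like this; the file name and number of clusters are assumptions, and this is not David's code.

```r
# Posterise an image by clustering its pixel colours with k-means.
# "image.jpg" is a placeholder file name; 5 clusters is an arbitrary choice.
library(jpeg)

img <- readJPEG("image.jpg")        # height x width x 3 array of RGB values
pixels <- data.frame(
  r = as.vector(img[, , 1]),
  g = as.vector(img[, , 2]),
  b = as.vector(img[, , 3])
)

km <- kmeans(pixels, centers = 5)        # group pixels into 5 colour clusters
posterised <- km$centers[km$cluster, ]   # replace each pixel by its cluster centre
writeJPEG(array(posterised, dim = dim(img)), "posterised.jpg")
```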
