# mages' blog

## Setting the initial view of a motion chart in R

Following on from my article about accessing and plotting World Bank data with R, I want to talk about how to change the initial view of a motion chart.

Over the last couple of weeks I have been asked a few times how to do this. For instance, Stephen O'Grady wanted to create a motion chart that initially shows a line chart rather than a bubble chart.

Changing the initial settings of a motion chart is actually quite easy once you know how. The trick is to use the state argument in the list of options of gvisMotionChart.

As a case study I will use the World Bank data set and try to do some homework given by Duncan Temple Lang in his introductory course on statistical computing. Duncan asked his students to query the World Bank database to create a line chart showing the number of internet users per 1000 in Africa over time. Further, he would like to see a legend next to the chart to identify which country is which, and tooltips for each curve to identify the country.

A motion chart, displayed as a line chart, would do the trick.

Okay, getting the data is easy, thanks to the WDI package or a direct download, and so is creating a motion chart with bubbles. Interactively I can change the bubble chart into a line chart, select some countries and change the y-axis to log scale. However, when I reload the page I am back to square one: a bubble chart. So the idea is to pass the changed chart settings on to the initial plot. I find the settings of the current view as a string in the advanced tab of the settings window. To access this window I click on the wrench symbol in the bottom right-hand corner of the motion chart.

Screenshot of the settings window of a motion chart

Next I copy this string and paste it into the state argument of the options list. Note the line break at the beginning and at the end of the state string in the example. Alternatively, I can add \n to both sides of the state string.

Here is an example, where I pre-selected Sierra Leone and Seychelles (countries with the lowest and highest number of internet users) together with Africa, North Africa and Sub-Saharan Africa (all income levels). You find the R code below to replicate the plot.

What does the data tell you? Play around with the graph, e.g. change it to a column graph, deselect all countries, change the y-axis to linear again, and hit the play button. How could we improve the plot?
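As a minimal sketch of how the state argument is passed on, the snippet below uses the Fruits example data that ships with googleVis rather than the World Bank data. The state string here is a shortened, illustrative one and is an assumption on my part; in practice you would paste the full string copied from the advanced tab of the settings window.

```r
# Illustrative, shortened state string (an assumption, not the full string
# you would copy from the advanced tab of the settings window).
state_json <- '{"iconType":"LINE","showTrails":false}'

# Add a line break at the beginning and at the end of the state string.
line_state <- paste0("\n", state_json, "\n")

# The state is passed as one element of the options list.
chart_options <- list(state = line_state)

if (requireNamespace("googleVis", quietly = TRUE)) {
  suppressPackageStartupMessages(library(googleVis))
  # Fruits is an example data set shipped with googleVis
  M <- gvisMotionChart(Fruits, idvar = "Fruit", timevar = "Year",
                       options = chart_options)
  # plot(M)  # uncomment to open the chart in a browser
}
```

Keeping the state in its own variable makes it easy to swap in a new string whenever the view is changed interactively.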

## Accessing and plotting World Bank data with R

Over the past couple of days I played around with the data sets of the World Bank, and I have to admit that I am blown away by them. It is amazing to see what is available on their web site, and it is worth visiting their Data Visualisation Tools page. It is fantastic that they provide an API to their data. They have used it to build an iPhone app, which is pretty cool. You can have the world's data in your pocket.

In this post I will show you how we can access data from the World Bank in R. As an example we create a motion chart in the Hans Rosling style, as you find it on the Google Public Data Explorer site, which also uses data from the World Bank. Doing this should give us confidence that we understand the World Bank's interface. You can find this example as demo WorldBank in the googleVis package from version 0.2.10 onwards.

So let's try to replicate the initial plot of the Google Public Data Explorer, which shows fertility rate against life expectancy for each country from 1960 to today, whereby the countries are represented as bubbles, with the size reflecting the population and the colour the region.
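The following is only a rough sketch of one way to get there, using the WDI package rather than the code of the WorldBank demo. The helper name get_wb_data and the chosen year range are my own assumptions; the three indicator codes are the World Bank codes for fertility rate, life expectancy and total population.

```r
# World Bank indicator codes, named with the column labels we want later
indicators <- c(fertility       = "SP.DYN.TFRT.IN",
                life.expectancy = "SP.DYN.LE00.IN",
                population      = "SP.POP.TOTL")

# Hypothetical helper: download the indicators and drop regional aggregates.
# Requires internet access and the WDI package.
get_wb_data <- function(start = 1960, end = 2010) {
  wb <- WDI::WDI(country = "all", indicator = unname(indicators),
                 start = start, end = end, extra = TRUE)
  # Rename the indicator columns from codes to readable labels
  names(wb)[match(indicators, names(wb))] <- names(indicators)
  # With extra = TRUE, aggregates such as "World" carry region "Aggregates"
  subset(wb, region != "Aggregates")
}

# Usage (not run here, as it needs a network connection):
# wb <- get_wb_data()
# M <- googleVis::gvisMotionChart(wb, idvar = "country", timevar = "year",
#                                 xvar = "life.expectancy", yvar = "fertility",
#                                 sizevar = "population", colorvar = "region")
# plot(M)
```

Wrapping the download in a function keeps the network call separate from the plotting step, so the chart can be rebuilt without fetching the data again.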

## R in the insurance industry

Let's talk about R in the insurance industry today.  David Smith's blog entry reminded me about our poster at the R user conference in Warwick in August 2011:
 Using R in Insurance
We presented examples of how R can be used in the insurance industry. We had a lot of fun presenting our poster. By accident we had printed the poster with quite a bit of excess white space to the right. So we asked everyone who came along to sign it, and by the end of the evening we had over 100 signatures!

For the historians among the readers, here is my five-year-old poster from GIRO in Vienna 2006.

 Poster session at useR! 2011 in Warwick, UK
Yesterday Wayne Zhang, with whom I collaborate on the ChainLadder package, released the first version of his new cplm package on CRAN. The name cplm is short for compound Poisson linear models. The cplm package fits Tweedie compound Poisson linear models using the Monte Carlo EM algorithm. The models handled by the package are generalized linear models, mixed-effect models and Bayesian models. For non-Bayesian models, maximum likelihood estimates are obtained for all parameters in the model, including the index parameter. Estimation for the Bayesian model is performed by Markov Chain Monte Carlo simulations. These models find their application in actuarial science; see also his paper.

Here are a few more insurance related packages:
• ChainLadder - Reserving methods in R. The package provides Mack-, Munich-, Bootstrap, and Multivariate-chain-ladder methods, as well as the LDF Curve Fitting methods of Dave Clark and GLM-based reserving models.
• cplm - Monte Carlo EM algorithms and Bayesian methods for fitting Tweedie compound Poisson linear models.
• lossDev - A Bayesian time series loss development model. Features include skewed-t distribution with time-varying scale parameter, Reversible Jump MCMC for determining the functional form of the consumption path, and a structural break in this path.
• actuar - Loss distributions modelling, risk theory (including ruin theory), simulation of compound hierarchical models and credibility theory.
• fitdistrplus - Helps to fit a parametric distribution to non-censored or censored data.
• favir - Formatted Actuarial Vignettes in R. FAViR lowers the learning curve of the R environment. It is a series of peer-reviewed Sweave papers that use a consistent style.
• mondate - R package to keep track of dates in terms of months.
• lifecontingencies - Package to perform actuarial evaluation of life contingencies
Other useful documents:
Help! There is a special interest group for R in insurance:

## LondonR, 7 September 2011

On 7 September 2011 I attended the London R user group meeting. It was a very good turnout, with about 50 attendees at the Shooting Star, a pub close to Liverpool Street Station. The session started at 18:00 with four presentations, followed by drinks sponsored by Mango Solutions. The slides of the presentations are available on londonr.org.

The first presentation was given by Lisa Wainer from the UCL Department of Security and Crime Science about crime data analysis using R. Lisa presented a project with Merseyside police, for which she had built software in R with the gWidgets package. The software, called the Hot Products Early Warning System, is used to help understand and characterise the acquisitive crime problem in Merseyside on an ongoing basis, detecting emerging trends in hot products.

Chris Wood gave an insightful talk about his research on sediment biogeochemical modelling in the North Sea. His model uses a set of differential equations with over 20 parameters. Chris is able to analyse and fit his model to data he gathered on an expedition in the North Sea, using R, the deSolve package and the super-computer at the University of Southampton. How cool is this?

Jean-Robert Avettand-Fenoel talked about the Rook package and how R and Rook have helped him to roll out new applications to his colleagues faster than using Excel, VBA and C++ or RExcel. Rook allows you to build web apps with R. The package is maintained by Jeffrey Horner, who also brought us the brew package. The brew package allows us, in combination with RApache, to mix HTML and R code in the same file. This is quite similar to the approach taken by Sweave for LaTeX and R. However, Rook provides a way to run R web applications on your desktop with the new internal R web server named Rhttpd.
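To give a flavour of what a Rook application looks like, here is a small sketch of my own (not taken from Jean-Robert's talk). A Rook app is essentially a function that receives the request environment and returns a list with a status, headers and a body; the name hello_app is just an example.

```r
# A Rook application: takes the request environment 'env' and returns
# a list with the HTTP status, the headers and the body of the response.
hello_app <- function(env) {
  list(
    status  = 200L,
    headers = list("Content-Type" = "text/html"),
    body    = paste0("<h1>Hello from R</h1><p>Served at ",
                     format(Sys.time()), "</p>")
  )
}

# Start the app on R's internal web server, but only in an interactive
# session and only if the Rook package is installed.
if (interactive() && requireNamespace("Rook", quietly = TRUE)) {
  server <- Rook::Rhttpd$new()
  server$add(app = hello_app, name = "hello")
  server$start(quiet = TRUE)
  server$browse("hello")  # opens the app in the default browser
}
```

Because the app is a plain R function, it can be called and checked directly, e.g. hello_app(list())$status, before it is handed over to the web server.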

The final presentation was actually given by myself talking about the googleVis package and the recent developments in version 0.2.9:

## Including googleVis output in a blogger post

It seems that you cannot include Google Visualisation Charts into a blog post directly.
So, I tried to include the output of a googleVis function as a gadget, but also unsuccessfully.
Although you can include gadgets into your site template, it doesn't seem to work with blog posts. So, here is the trick which works for me: the iframe tag.
The following geo map is included as:

```html
<iframe width="100%" height="400px" frameborder="0" src="http://dl.dropbox.com/u/7586336/blogger/AndrewGeoMap.html">
</iframe>
```


As you can see, the chart itself is actually displayed in a page hosted by Dropbox and only inserted into this post via the iframe-tag.

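For those of you who would like to replicate the plot of Hurricane Andrew, here is the R code:

```r
library(googleVis)

## Andrew is an example data set that ships with googleVis
AndrewGeoMap <- gvisGeoMap(Andrew, locationvar = 'LatLong', numvar = 'Speed_kt',
                           hovervar = 'Category',
                           options = list(width = 600, height = 300,
                                          region = 'US', dataMode = 'Markers'))
## Display the chart in a browser
plot(AndrewGeoMap)
## Write the complete HTML page to a file, here in the public Dropbox folder
print(AndrewGeoMap, file = "~/Dropbox/Public/AndrewGeoMap.html")
```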

## Correction (18 October 2011)

I just figured out that we can actually embed a chart into a blogger post directly. You can literally copy and paste the code into the post. However, it doesn't seem to be displayed with MS Internet Explorer.

Anyhow, here is the example from above again:
```r
print(AndrewGeoMap, "chart", file="~/Desktop/AndrewGeoMap.js")
```
Now I copied and pasted the content of that file below:

Today we published googleVis 0.2.9 on CRAN. The new version updates the package for the new features of the Google Visualisation API and brings a new in-page editor option.

Here is a simple example, displaying the participants of the R user Conference 2011 in Warwick by country. Notice the 'Edit me' button in the top left corner of the chart, which allows you to change and customise the graph.

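```r
library(XML)
library(googleVis)

url <- "http://www.warwick.ac.uk/statsdept/useR-2011/participant-list.html"
## Read the participant table; assumes it is the first table on the page
participants <- readHTMLTable(url, which = 1, stringsAsFactors = FALSE)
names(participants) <- c("Name", "Country", "Organisation")
## Correct typo and shortcut
participants$Country <- gsub("Kngdom", "Kingdom", participants$Country)
participants$Country <- gsub("USA", "United States", participants$Country)
participants$Country <- factor(participants$Country)
## Count the participants by country
partCountry <- as.data.frame(xtabs( ~ Country, data = participants))
## The gvis.editor option adds the 'Edit me' button (googleVis >= 0.2.9)
G <- gvisGeoChart(partCountry, "Country", "Freq",
                  options = list(gvis.editor = "Edit me"))
plot(G)
```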