Copy and paste small data sets into R

How can I embed a small data set into my R code? That was the question I came across today, when I prepared my talk about Dynamical Systems in R with simecol for the forthcoming Cologne R user group meeting.

I wanted to add all the R code of the talk to the last slide. That's easy, but the presentation uses a small data set of 3 columns and 21 rows. Surely there must be an elegant way to embed the data in the R code without writing x <- c(x1, x2, ...).

Of course there is a solution, but let's look at the data first. It shows the numbers of trapped lynx and snowshoe hares recorded by the Hudson's Bay Company in northern Canada from 1900 to 1920.


Data sourced from Joseph M. Mahaffy. Original data believed to be published
in E. P. Odum (1953), Fundamentals of Ecology, Philadelphia, W. B. Saunders.
Another source with data from 1845 to 1935 can be found on D. Hundley's page.


My first idea was to store the data in a character variable:

HaresLynxObservations <- "Year  Hares.x.1000  Lynx.x.1000
1900 30 4
1901 47.2 6.1
1902 70.2 9.8
1903 77.4 35.2
1904 36.3 59.4
1905 20.6 41.7
1906 18.1 19
1907 21.4 13
1908 22 8.3
1909 25.4 9.1
1910 27.1 7.4
1911 40.3 8
1912 57 12.3
1913 76.6 19.5
1914 52.3 45.7
1915 19.5 51.1
1916 11.2 29.7
1917 7.6 15.8
1918 14.6 9.7
1919 16.2 10.1
1920 24.7 8.6"

Now, how do I transform the string into a data frame? My initial thought was to write the string into a temporary file and to read it into R again:

fn <- tempfile()
cat(HaresLynxObservations, file=fn)
read.table(fn, header=TRUE)
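
A minimal sketch of the same temporary-file round trip with explicit cleanup. It assumes a shortened three-row copy of the data, and the names sample_txt and hares_lynx_tmp are made up for illustration. writeLines is swapped in for cat here so that the file ends with a newline, and unlink removes the file afterwards:

```r
# Shortened, illustrative copy of the data (first three years only)
sample_txt <- "Year Hares.x.1000 Lynx.x.1000
1900 30 4
1901 47.2 6.1
1902 70.2 9.8"

fn <- tempfile(fileext = ".txt")   # path for a temporary file
writeLines(sample_txt, fn)         # write the string, ending with a newline
hares_lynx_tmp <- read.table(fn, header = TRUE)
unlink(fn)                         # remove the temporary file again

str(hares_lynx_tmp)
```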

However, I was convinced that there was a better, more elegant way. And indeed there is, via connections. A good starting point to learn more about connections in R is Brian Ripley's article on that topic in R News and of course the R Data Import/Export manual:

Text connections are another source of input. They allow R character vectors to be read as if the lines were being read from a text file. A text connection is created and opened by a call to textConnection, which copies the current contents of the character vector to an internal buffer at the time of creation.

Wonderful: textConnection essentially provides a virtual file, so I can use read.table again to read the string into a data frame:

read.table(textConnection(HaresLynxObservations), header=TRUE)
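
One detail worth sketching: textConnection returns a connection that is already open, and as far as I can tell read.table only closes connections it opened itself. The one-liner therefore leaves the connection for R to clean up later, typically with a warning about closing an unused connection. A hedged variant that closes it explicitly, again using a shortened illustrative string (sample_txt, con and hares_lynx_con are names assumed here, not part of the talk's code):

```r
# Shortened, illustrative copy of the data (first three years only)
sample_txt <- "Year Hares.x.1000 Lynx.x.1000
1900 30 4
1901 47.2 6.1
1902 70.2 9.8"

con <- textConnection(sample_txt)                 # opened on creation
hares_lynx_con <- read.table(con, header = TRUE)  # read as if from a file
close(con)                                        # release the connection

hares_lynx_con
```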

That's all I need to copy and paste small data sets into R.

For moving data between R and Excel via the clipboard see John Cook's post.

6 comments:

  1. Hi Markus,
    there is no need for textConnection anymore; you can just use the 'text' argument of read.table:
    read.table(header = TRUE, text =
      "Year  Hares.x.1000  Lynx.x.1000
    1900 30 4
    1901 47.2 6.1
    1902 70.2 9.8
    1903 77.4 35.2
    1904 36.3 59.4
    1905 20.6 41.7
    1906 18.1 19
    1907 21.4 13
    1908 22 8.3
    1909 25.4 9.1
    1910 27.1 7.4
    1911 40.3 8
    1912 57 12.3
    1913 76.6 19.5
    1914 52.3 45.7
    1915 19.5 51.1
    1916 11.2 29.7
    1917 7.6 15.8
    1918 14.6 9.7
    1919 16.2 10.1
    1920 24.7 8.6")

    Best regards from Landau,

    Edi

  2. This is great info!  Anything that takes the hassle out of moving data from another source to R is beautiful!

  3. You can use dput to output your data.frame as an ASCII text representation and embed the resulting text in your presentation (e.g. using the data Edi Sz. provided):

    HaresLynxObservations <- structure(list(Year = 1900:1920, Hares.x.1000 = c(30, 47.2, 70.2, 77.4, 36.3, 20.6, 18.1, 21.4, 22, 25.4, 27.1, 40.3, 57, 76.6, 52.3, 19.5, 11.2, 7.6, 14.6, 16.2, 24.7), Lynx.x.1000 = c(4, 6.1, 9.8, 35.2, 59.4, 41.7, 19, 13, 8.3, 9.1, 7.4, 8, 12.3, 19.5, 45.7, 51.1, 29.7, 15.8, 9.7, 10.1, 8.6)), .Names = c("Year", "Hares.x.1000", "Lynx.x.1000"), class = "data.frame", row.names = c(NA, -21L))

  4. I like using:
    a <- read.table("clipboard", sep = "\t", header = TRUE)

  5. Here's a trick I use that's a little more than the "clipboard" trick and way less than the text or textConnection approaches. It also works in the console on a pure terminal connection:

    > test.df <- read.table(file = stdin(), header = TRUE)
    Then paste the data with your cursor at 0:
    0: N X_N
    1: 1  20
    2: 4  78
    3: 8 130
    4: 12 170
    5: 16 190
    6: 20 210
    7: 24 220
    8: 28 245
    9: 32 255
    10: 48 280
    11: 64 310
    12:
    Press Return on the last, empty entry, and you're set to go.

    > test.df
        N X_N
    1   1  20
    2   4  78
    3   8 130
    4  12 170
    5  16 190
    6  20 210
    7  24 220
    8  28 245
    9  32 255
    10 48 280
    11 64 310
    > names(test.df)
    [1] "N"   "X_N"

  6. In this specific case, there would have been a more straightforward alternative:

    > library( gnumeric )
    > read.gnumeric.sheet( "http://www-rohan.sdsu.edu/~jmahaffy/courses/f00/math122/labs/labj/lotvol.xls", head = TRUE )
       Year X.Hares..x1000. X.Lynx.x1000. X.  X X.1 X.2 H0  X30
    1  1900            30.0           4.0 NA NA  NA  NA L0 4.00
    2  1901            47.2           6.1 NA NA  NA  NA a1 0.50
    3  1902            70.2           9.8 NA NA  NA  NA a2 0.02
    4  1903            77.4          35.2 NA NA  NA  NA b1 0.90
    5  1904            36.3          59.4 NA NA  NA  NA b2 0.03

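
Two of the suggestions in the comments, the text argument of read.table and dput, can be sketched side by side. This is only an illustration on a shortened three-row copy of the data; the names sample_txt, from_text, dput_code and from_dput are made up here. capture.output is used only so the dput text can be parsed back within the same script, whereas in practice you would paste the printed text into your code:

```r
# Shortened, illustrative copy of the data (first three years only)
sample_txt <- "Year Hares.x.1000 Lynx.x.1000
1900 30 4
1901 47.2 6.1
1902 70.2 9.8"

# Comment 1: read.table parses the string directly via its 'text' argument
from_text <- read.table(text = sample_txt, header = TRUE)

# Comment 3: dput() prints R code that recreates the object
dput_code <- capture.output(dput(from_text))
from_dput <- eval(parse(text = dput_code))

# Compare the data frames produced by the two routes
all.equal(from_text, from_dput)
```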