R Fundamentals

About this chapter

  Questions:
  • How do I use R?
  Objectives:
  • Become familiar with R syntax
  • Understand the concepts of objects and assignment
  • Get exposed to a few functions
  Keypoints:
  • R’s capabilities are provided by functions
  • R users call functions and get results

Working with R

In this workshop we’ll use R inside RStudio, an extremely useful environment for working with R. For the most part we’ll work interactively, meaning we’ll type commands straight into the R console in RStudio (usually a window on the left or lower left) and get our results there too (usually in the console or in a window on the right). That’s what you see in panels like the ones below: first the thing to type into R and, below it, the result R calculates. Let’s look at how R works by using it for its most basic job, as a calculator:

 3 + 5
[1] 8
 12 * 2
[1] 24
 1 / 3
[1] 0.3333333
 3 / 0
[1] Inf

Fairly straightforward: we type in an expression and we get a result. That’s how this whole book will work: you type things in and get answers out. It’ll be easiest to learn if you work through the examples one by one, typing each in yourself. Try to resist the urge to copy and paste; typing longhand really encourages you to look at what you’re entering.

As far as the R output itself goes, it’s really straightforward: it’s just the answer with a [1] stuck on the front. This [1] is the position of the first value on that line of output, so it tells us how far through the output we are. Often R will return long lists of numbers, and it can be helpful to have this extra information.
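
To see where those bracketed numbers come from, it can help to print something longer than a single value. The following is a minimal sketch, assuming only base R; seq() is a base function that builds a regular sequence, and the name evens is just an illustration. The exact line breaks depend on how wide your console is, but each line of output starts with the position of its first value:

```r
# Build the even numbers from 2 to 60 (30 values in total)
evens <- seq(2, 60, by = 2)

# Printing the values wraps over several lines;
# each line starts with an index such as [1]
evens

# There are 30 values, so no index in the output can be larger than 30
length(evens)
```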

Variables

We can save the output of operations for later use by giving it a name using the assignment symbol <-. Read this symbol as ‘gets’, so x <- 5 reads as ‘x gets 5’. These names are called variables, because the value they are associated with can change.

Let’s give the value 5 a name, x, and then refer to the value by its name. We can then use the name in place of the value. In the jargon of computing, we say we are assigning a value to a variable.

 x <- 5
 x
[1] 5
 x * 2
[1] 10
 y <- 3
 x * y
[1] 15

This is of course of limited use with single numbers, but it becomes very valuable when we have large datasets, because the whole dataset can be referred to by one variable name.
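
Since variables are so named because their value can change, a short sketch of reassignment may help. The names price and doubled are hypothetical and chosen only for this illustration:

```r
# Assign a starting value to a variable
price <- 10
price * 2      # uses the current value, 10

# Assigning again replaces the old value
price <- 25
price * 2      # now uses the new value, 25

# The result of an expression can be stored under a new name
doubled <- price * 2
doubled
```

Note that doubled keeps the value it was given when it was assigned; changing price afterwards would not update it unless the assignment is run again.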

Using objects and functions

At the top level, R is a simple language with two types of thing: functions and objects. As a user you will use functions to do stuff and get back objects as an answer. Functions are easy to spot: they are a name followed by a pair of brackets. For example, mean() is the function for calculating a mean. The options (or arguments) for the function go inside the brackets:

 sqrt(16)
[1] 4
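
Many functions take more than one argument. As a rough sketch, the base R function log() (not otherwise used in this chapter) accepts a second argument giving the base of the logarithm, which can be supplied by position or by name:

```r
# One argument: the number to take the square root of
sqrt(25)

# Two arguments matched by position: the value, then the base
log(100, 10)

# The same call with the second argument named explicitly
log(100, base = 10)

# The result of one function can be passed straight to another
sqrt(abs(-16))
```

Naming arguments is optional here, but it tends to make a call easier to read once a function has several options.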

Often the result from a function will be more complicated than a single number. Often it will be a vector (a simple ordered collection of values), like the result of the rnorm() function, which returns random numbers drawn from a normal distribution:

 rnorm(100)
  [1] -1.832656852 -0.076432192  0.036759331  0.410265449 -0.493090351
  [6] -0.084234740 -0.324705959  0.959965455 -0.168583067  0.921982085
 [11]  0.455359089 -0.147664452  0.263027476 -0.555259038  0.411216176
 [16] -1.042216182  0.584450268  1.014339269  0.751947006  0.499688313
 [21]  1.093430368 -1.702188907 -1.000733593  0.056563390  0.004004616
 [26] -0.470522551  0.921856926 -1.006489672  1.282074544  0.023261591
 [31] -0.757902678  0.667764169  1.336792834  1.960852195  1.681728985
 [36] -0.606466323 -0.321954259  0.157467941 -0.299698783 -0.650271292
 [41]  0.176295381 -0.057932164 -0.705836445  2.184088426 -0.202688545
 [46]  0.744345742  0.264748435  0.192206511  1.427860404 -1.721860750
 [51] -0.146903423  2.320060345 -0.081928020  1.157114906  0.458476849
 [56] -0.008218946  1.696442731 -1.498551336 -1.702424975 -0.607017596
 [61] -1.036169864 -2.087672837  0.348946759  0.391514609 -1.813320278
 [66] -0.606403269 -0.635211517  1.180697019 -0.798405078  0.133294455
 [71]  0.678605452  0.241134618 -2.175046288 -0.176754411  1.049354850
 [76]  1.437495571 -0.308919515 -0.874861507 -0.843351981  0.794153001
 [81]  0.413118634  0.157159653  0.523625229 -0.202873346  2.717037076
 [86]  0.906717727  0.744050618  0.400293816  0.399241977  1.578141821
 [91]  1.734598604  0.632449677 -1.412919558 -0.976221634  1.898445413
 [96]  1.073545055  0.211595238 -0.066714140  0.847497633 -1.272020668

We can combine objects, variables and functions to do more complex stuff in R. Here’s how we get the mean of 100 random numbers:

 numbers <- rnorm(100)
 mean(numbers)
[1] 0.008718684

Here we created a vector object with rnorm(100) and assigned it to the variable numbers. We then used the mean() function, passing it the variable numbers. The mean() function returned the mean of the hundred random numbers.
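
Because rnorm() draws fresh random numbers on every call, the mean you get will almost certainly differ from the one printed above. The following is a sketch of one way to make a run repeatable, assuming base R's set.seed() function, which is not otherwise used in this chapter. The seed value 42 is arbitrary, and the names first_draw and second_draw are hypothetical:

```r
# Fix the random number generator's starting point so the draws are repeatable
set.seed(42)
first_draw <- rnorm(100)

# Resetting the same seed reproduces exactly the same numbers
set.seed(42)
second_draw <- rnorm(100)

# TRUE: both vectors hold the same 100 values
identical(first_draw, second_draw)

# The mean of 100 such draws is usually close to, but not exactly, 0
mean(first_draw)
```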

Quiz

  1. Create two variables, a and b, and add them. What happens if we change a and then re-add a and b?
  2. We can also assign a + b to a new variable, c. How would you do this?
  3. Try some R functions: round(), c(), range(), plot(). Hint: get help on a function by typing ?function_name, e.g. ?c.
  4. Use the mean() function to calculate the average age of everyone in your house (invent a housemate if you have to).