In this workshop we’ll use R in the extremely useful RStudio environment. For the most part we’ll work interactively, meaning we’ll type stuff straight into the R console in RStudio (usually a window on the left or lower left) and get our results there too (usually in the console or in a window on the right). That’s what you see in panels like the ones below: first the thing to type into R and, below it, the calculated result from R. Let’s look at how R works by using it for its most basic job, as a calculator:
3 + 5
[1] 8
12 * 2
[1] 24
1 / 3
[1] 0.3333333
3 / 0
[1] Inf
Fairly straightforward: we type in the expression and we get a result. That’s how this whole book will work: you type the stuff in and get answers out. It’ll be easiest to learn if you go ahead and type in the examples one by one. Try to resist the urge to use copy and paste. Typing longhand really encourages you to look at what you’re entering.
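Here are a few more expressions to type in, as a small sketch of how R follows the usual rules of arithmetic. The numbers are arbitrary and chosen only for illustration.

```r
# Multiplication is done before addition, as in ordinary arithmetic
2 + 3 * 4
# [1] 14

# Brackets change the order of evaluation
(2 + 3) * 4
# [1] 20

# ^ raises a number to a power
2 ^ 3
# [1] 8
```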
As far as the R output itself goes, it’s really straightforward: it’s just the answer with a [1] stuck on the front. This [1] tells us how far through the output we are: it is the position of the first value printed on that line. Often R will return long lists of numbers, and it can be helpful to have this extra information.
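To see the bracketed number change, it may help to print something that spills over more than one line. The sketch below uses the colon operator, which the text above does not introduce, to build the whole numbers from 101 to 130. The variable name is made up for this example. How many values fit on each line depends on the width of your console, so no output is shown here.

```r
# 101:130 builds the whole numbers from 101 to 130 (the : operator is
# an addition for this example, it is not covered in the text above)
long_output <- 101:130

# Printing the variable shows a bracketed number at the start of each
# line: the position of the first value on that line
long_output
```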
We can save the output of operations for later use by giving it a name using the assignment symbol <-. Read this symbol as ‘gets’, so x <- 5 reads as ‘x gets 5’. These names are called variables, because the value they are associated with can change.
Let’s give 5 a name, x, then refer to the value 5 by its name. We can then use the name in place of the value. In the jargon of computing we say we are assigning a value to a variable.
x <- 5
x
[1] 5
x * 2
[1] 10
y <- 3
x * y
[1] 15
Naming a single number is of course of limited value, but naming becomes very useful when we have large datasets, as the whole dataset can be referred to by one variable.
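As a small illustration of one name standing for many values, the sketch below stores three made-up heights in a single variable and converts them all at once. The name heights_m and the numbers are invented for this example. The c() function, which combines values into a vector, appears again in the exercises.

```r
# Hypothetical heights in metres; c() combines them into one vector
heights_m <- c(1.5, 1.75, 2)

# One expression now applies to every value the variable holds
heights_m * 100
# [1] 150 175 200
```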
At the top level, R is a simple language with two types of thing: functions and objects. As a user you will use functions to do stuff, and get back objects as an answer. Functions are easy to spot: they are a name followed by a pair of brackets. For example, mean() is the function for calculating a mean, and sqrt() is the function for calculating a square root. The options (or arguments) for the function go inside the brackets:
sqrt(16)
[1] 4
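Some functions take more than one argument, and arguments can be given by name. Here is a minimal sketch using round(), which also appears in the exercises. The numbers are arbitrary.

```r
# The first argument is the number to round, the second (digits) is
# the number of decimal places to keep
round(2.71828, digits = 2)
# [1] 2.72

# Leaving digits out uses its default of 0 decimal places
round(2.71828)
# [1] 3
```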
Often the result from a function will be more complicated than a single number. Frequently it will be a vector (a simple list of values), like the result of the rnorm() function, which returns a vector of random numbers drawn from a normal distribution:
rnorm(100)
[1] -1.832656852 -0.076432192 0.036759331 0.410265449 -0.493090351
[6] -0.084234740 -0.324705959 0.959965455 -0.168583067 0.921982085
[11] 0.455359089 -0.147664452 0.263027476 -0.555259038 0.411216176
[16] -1.042216182 0.584450268 1.014339269 0.751947006 0.499688313
[21] 1.093430368 -1.702188907 -1.000733593 0.056563390 0.004004616
[26] -0.470522551 0.921856926 -1.006489672 1.282074544 0.023261591
[31] -0.757902678 0.667764169 1.336792834 1.960852195 1.681728985
[36] -0.606466323 -0.321954259 0.157467941 -0.299698783 -0.650271292
[41] 0.176295381 -0.057932164 -0.705836445 2.184088426 -0.202688545
[46] 0.744345742 0.264748435 0.192206511 1.427860404 -1.721860750
[51] -0.146903423 2.320060345 -0.081928020 1.157114906 0.458476849
[56] -0.008218946 1.696442731 -1.498551336 -1.702424975 -0.607017596
[61] -1.036169864 -2.087672837 0.348946759 0.391514609 -1.813320278
[66] -0.606403269 -0.635211517 1.180697019 -0.798405078 0.133294455
[71] 0.678605452 0.241134618 -2.175046288 -0.176754411 1.049354850
[76] 1.437495571 -0.308919515 -0.874861507 -0.843351981 0.794153001
[81] 0.413118634 0.157159653 0.523625229 -0.202873346 2.717037076
[86] 0.906717727 0.744050618 0.400293816 0.399241977 1.578141821
[91] 1.734598604 0.632449677 -1.412919558 -0.976221634 1.898445413
[96] 1.073545055 0.211595238 -0.066714140 0.847497633 -1.272020668
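Because the numbers are random, your output will differ from the values printed above. rnorm() also accepts mean and sd arguments to change the distribution the numbers are drawn from. A short sketch, with arbitrary argument values and a made-up variable name:

```r
# Five random numbers from a normal distribution centred on 10 with a
# standard deviation of 2 (the defaults are mean = 0 and sd = 1)
draws <- rnorm(5, mean = 10, sd = 2)

# The values change every time this is run, so none are shown here
draws

# length() counts how many values the vector holds
length(draws)
# [1] 5
```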
We can combine objects, variables and functions to do more complex stuff in R. Here’s how we get the mean of 100 random numbers:
numbers <- rnorm(100)
mean(numbers)
[1] 0.008718684
Here we created a vector object with rnorm(100) and assigned it to the variable numbers. We then used the mean() function, passing it the variable numbers. The mean() function returned the mean of the hundred random numbers.
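The intermediate variable is optional. As a sketch of the same idea, the result of one function can be passed straight to another, and R evaluates the expression from the inside out. The answers will vary from run to run because the numbers are random, and the variable name below is made up for this example.

```r
# rnorm(100) is evaluated first, then its result is handed to mean()
mean(rnorm(100))

# Storing the numbers in a variable lets us reuse the same values in
# several functions
more_numbers <- rnorm(100)
mean(more_numbers)
range(more_numbers)
```

Keeping the numbers in a variable is the safer habit when you want several summaries of the same data, since each call to rnorm() draws a fresh set of values.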
Create two variables, a and b. Add them. What happens if we change a and then re-add a and b?
Assign a + b to a new variable, c. How would you do this?
Try out some R functions: round(), c(), range(), plot(). Hint: get help on a function by typing ?function_name, e.g. ?c()
Use the mean() function to calculate the average age of everyone in your house (invent a housemate if you have to).