1 Writing R code in Console

R code, on its own, is just text. We can write R code in a new script within R or RStudio, or in any text editor. However, just writing the code will not do the whole job – in order for our code to be executed (or, interpreted), we need to send it to the command-line interpreter. In RStudio, the command-line interpreter is called the Console.

In R, the command-line interpreter starts with the > symbol. This is called the prompt. It operates on the idea of a “Read, evaluate, print” loop: we type in commands, R tries to execute them, and then returns a result.

The fastest way to have R evaluate code is to type our R code directly into the command-line interpreter. For example,

1+1
## [1] 2

1.1 Using R as a calculator

The simplest thing we can do with R is do arithmetic:

1+100
## [1] 101

When using R as a calculator, the order of operations is the same as we learned in school. From highest to lowest precedence:

  • Parentheses: (, )
  • Exponents: ^ or **
  • Multiply and divide: *, / (equal precedence, evaluated left to right)
  • Add and subtract: +, - (equal precedence, evaluated left to right)
3+5*2
## [1] 13
# But
(3+5)*2
## [1] 16
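
A few more cases may help make the precedence rules concrete. The expressions below are illustrative choices of ours, not part of the original examples:

```r
# Exponentiation binds more tightly than multiplication
2 * 3^2
## [1] 18

# ** is accepted as a synonym for ^
2 ** 3
## [1] 8

# Division and multiplication are evaluated left to right
10 / 2 * 5
## [1] 25

# So are subtraction and addition
10 - 2 + 3
## [1] 11

# Unary minus has lower precedence than ^, so this is -(2^2)
-2^2
## [1] -4
```

When in doubt, adding parentheses makes the intended order explicit and costs nothing.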

1.2 Comparing things in R

We can also do comparison in R:

1==1 # equality (note two equals signs, read as "is equal to")
## [1] TRUE
1 != 2  # inequality (read as "is not equal to")
## [1] TRUE
1 < 2  # less than
## [1] TRUE
1 <= 1  # less than or equal to
## [1] TRUE
1 > 0  # greater than
## [1] TRUE
1 >= -9 # greater than or equal to
## [1] TRUE

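A word of warning about comparing numbers: we should never use == to compare two numbers unless they are integers (a data type which can specifically represent only whole numbers). Decimal numbers are stored with limited precision, so two values that look the same when printed may still differ by a tiny amount.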
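
As a small sketch of the problem, the comparison below fails even though the arithmetic looks exact. The base R function all.equal() compares within a small tolerance and is one common alternative; the specific numbers are our own illustration:

```r
# 0.1 + 0.2 is not stored as exactly 0.3
0.1 + 0.2 == 0.3
## [1] FALSE

# all.equal() compares within a small numerical tolerance
all.equal(0.1 + 0.2, 0.3)
## [1] TRUE

# Wrap it in isTRUE() when a single TRUE/FALSE is needed, e.g. inside if()
isTRUE(all.equal(0.1 + 0.2, 0.3))
## [1] TRUE
```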

2 Writing R code in Source

There are certainly many cases where it makes sense to type code directly into the console. For example, to open the help page for a new function with the ? command, to take a quick look at a dataset with the head() function, or to do simple calculations like 1+1, we should type directly into the console. However, the problem with writing all our code in the console is that nothing we write is saved. So in case of an error, or if we want to change some earlier code, we have to type it all over again.

For this reason (and many more), we should write any important code in the Source window and save it as an R script (an R script is a collection of R code in a single file).

We can write an R script in any text editor, but we should save it with the .R suffix to make it clear that it contains R code.

To start writing a new R script in RStudio, click File – New File – R Script. When we open a new script, we see a blank page in the Source window.

When we type code into an R script, we will notice that, unlike typing code into the Console, nothing happens. In order for R to interpret the code, we need to send it from the Source to the Console.

For example, we write the following code into the Source:

# Create variables
x<-23
y<-36
z<-89
#Do some calculations
x+y+z
(x-z)+y
log(x)

The first thing we do is save this piece of code by clicking File – Save As....

We can type as much code as we like in the Source. R will not execute any of it until we send it to the Console. The three most common ways to do this are:

  1. Copy the code from the Source, paste it into the Console, and press Enter.
  2. Highlight the code in the Source we want to run, and then use the Run button.
  3. Place the cursor on a single line we want to run, then use the Run button to run just that line.
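
For reference, this is roughly what we would expect to see in the Console after sending the whole example script above (spacing around the operators is added here for readability only):

```r
# Create variables
x <- 23
y <- 36
z <- 89

# Do some calculations
x + y + z
## [1] 148
(x - z) + y
## [1] -30
log(x)
## [1] 3.135494
```

The three assignments print nothing; only the three calculations produce output.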

3 Objects and Functions

The operation of R revolves around two things: objects and functions. Almost everything in R is either an object or a function.

3.1 What is an object?

An object is a thing – like a number, a dataset, a summary statistic like a mean or standard deviation, or a statistical test. Objects come in many different shapes and sizes in R. There are simple objects like scalars, which represent single numbers; vectors, which represent several numbers; more complex objects like dataframes, which represent tables of data; and even more complex objects like hypothesis tests or regression models, which contain all sorts of statistical information. Objects in R are things, and different objects have different attributes.

3.2 Creating new objects

By now we know that R can be used to do simple calculations. But to really take advantage of R, we need to know how to create and manipulate objects. All of the data, analyses, and even plots, we use and create are, or can be, saved as objects in R. Once an object is loaded, we can use it to calculate descriptive statistics, hypothesis tests, and to create plots.

To create new objects, we need to do object assignment. Object assignment is our way of storing information, such as a number or a statistical test, in something we can easily refer to later. To do an assignment, we use the <- operator, called the assignment operator. To assign something to a new object (or to change an existing object), we use the notation object <- ..., where object is the new (or updated) object, and ... is whatever we want to store in that object.

For example, we create two objects x and y, and store the values 1/40 and 1/50 in them:

x<-1/40
y<-1/50

# Notice that assignment does not print a value. Instead, 
# we stored it for later in something called a variable. 
# x now contains the value 0.025.
x
## [1] 0.025
y
## [1] 0.02

In the Environment tab, we will see that x and y, along with their values, have appeared.
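
The same information is available from the Console. A minimal sketch using the base R functions ls() and rm(), which are not mentioned in the text above:

```r
x <- 1/40
y <- 1/50

# ls() returns the names of the objects in the current environment
c("x", "y") %in% ls()
## [1] TRUE TRUE

# rm() deletes an object; it also disappears from the Environment tab
rm(y)
"y" %in% ls()
## [1] FALSE
```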

3.3 Properties of objects

Our variable x can be used in place of a number in any calculation that expects a number:

log(x)
## [1] -3.688879
sum(x,y)
## [1] 0.045

We can assign these objects to a new object. For example:

z<-x+y
z
## [1] 0.045

To change an object, we need to assign it again. For example:

# We have an object a with a value of 0.
# We would like to add 1 to a in order to make it 1.
a<-0
a+1 # let us try first
## [1] 1
a
## [1] 0
# The value of a is still 0! a+1 printed the result but did not store it.

a<-0
a<-a+1  # Now we are REALLY changing a
a
## [1] 1

We can also store strings in variables:

sentence <- "the cat sat on the mat"
# Note that we need to put strings of characters inside quotes.
sentence
## [1] "the cat sat on the mat"

But the type of data that is stored in a variable affects what we can do with it:

x+1
## [1] 1.025
sentence+1
## Error in sentence + 1 : non-numeric argument to binary operator
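
To check what kind of data an object holds before using it, we can ask R directly. This is a short sketch using the base R functions class(), is.numeric(), nchar() and toupper(), none of which appear in the text above:

```r
x <- 1/40
sentence <- "the cat sat on the mat"

class(x)
## [1] "numeric"
class(sentence)
## [1] "character"

is.numeric(sentence)
## [1] FALSE

# Functions written for text work on character objects
nchar(sentence)
## [1] 22
toupper(sentence)
## [1] "THE CAT SAT ON THE MAT"
```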

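Naming objects

We can create object names using any combination of letters, digits, and a few special characters (like . and _). A name must start with a letter (or with a dot that is not followed by a digit), and names are case-sensitive. Here are some valid object names: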

group.mean <- 10.21
my.age <- 10
FavoriteFood <- "Sweet"
sum.1.to.5 <- 1 + 2 + 3 + 4 + 5
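
One way to test whether a name is syntactically valid is to see whether the base R function make.names() leaves it unchanged. The helper is_valid_name() below is a hypothetical convenience function written for this illustration, not something built into R, and it also rejects reserved words such as TRUE:

```r
# Hypothetical helper: a name is valid if make.names() does not alter it
is_valid_name <- function(nm) make.names(nm) == nm

is_valid_name(c("group.mean", "sum.1.to.5", "FavoriteFood"))
## [1] TRUE TRUE TRUE

# Starts with a digit, contains a space, starts with an underscore
is_valid_name(c("2groups", "my age", "_total"))
## [1] FALSE FALSE FALSE
```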

3.4 Types of objects

3.4.1 Scalar

The simplest object type in R is a scalar. A scalar object is just a single value like a number or a name. For example:

# Examples of numeric scalars
m <- 100
n <- 3 / 100
o <- (m + n) / n
# Examples of character scalars
d <- "ship"
e <- "cannon"
f <- "Do any modern armies still use cannons?"

3.4.2 Vector

A vector object is just a combination of several scalars stored as a single object. For example, the numbers from one to ten could be a vector of length 10, and the characters in the English alphabet could be a vector of length 26. Like scalars, vectors can be either numeric or character (but not both!).
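
If we do try to mix numbers and text in one vector, R does not raise an error. It silently converts everything to character. A small sketch, where the object name mixed is our own:

```r
# Numbers and text together: every element becomes a character string
mixed <- c(1, "two", 3)
mixed
## [1] "1"   "two" "3"
class(mixed)
## [1] "character"
```

This is worth remembering when a column of numbers unexpectedly behaves like text.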

3.4.3 How to create a vector?

There are many ways to create vectors in R.

  • The simplest way to create a vector is with the c() function. The c here stands for concatenate, which means “bring them together”. The c() function takes several scalars as arguments, and returns a vector containing those objects. When using c(), we need to place a comma in between the objects (scalars or vectors) we want to combine:
# Create an object a with the integers from 1 to 5
a <- c(1, 2, 3, 4, 5)
# Print the result
a
## [1] 1 2 3 4 5

char.vec <- c("Ceci", "nest", "pas", "une", "pipe")
char.vec
## [1] "Ceci" "nest" "pas"  "une"  "pipe"
  • The a:b operator takes two numeric scalars a and b, and returns a vector of numbers from the starting point a to the ending point b in steps of 1.
a<-1:10
a
##  [1]  1  2  3  4  5  6  7  8  9 10
b<-2.5:8.5
b
## [1] 2.5 3.5 4.5 5.5 6.5 7.5 8.5
  • The seq() function is a more flexible version of a:b. Like a:b, seq() allows us to create a sequence from a starting number to an ending number. However, seq() has additional arguments that allow us to specify either the size of the steps between numbers, or the total length of the sequence: by and length.out. If we use the by argument, the sequence will be in steps of the input to the by argument:
# Create the numbers from 1 to 10 in steps of 1
seq(from = 1, to = 10, by = 1)
##  [1]  1  2  3  4  5  6  7  8  9 10

# Create the numbers from 0 to 100 in steps of 10
seq(from = 0, to = 100, by = 10)
##  [1]   0  10  20  30  40  50  60  70  80  90 100

If we use the length.out argument, the sequence will have length equal to length.out.

# Create 10 evenly spaced numbers from 1 to 5
# (rounded to 1 decimal place for display)
round(seq(from = 1, to = 5, length.out = 10), 1)
##  [1] 1.0 1.4 1.9 2.3 2.8 3.2 3.7 4.1 4.6 5.0

# 3 numbers from 0 to 100
seq(from = 0, to = 100, length.out = 3)
## [1]   0  50 100
  • Finally, the rep() function allows us to repeat a scalar (or vector) a specified number of times, or to a desired length:
rep(x = 3, times = 10) #with scalar
##  [1] 3 3 3 3 3 3 3 3 3 3
rep(x = c(1, 2), each = 3) #with vector
## [1] 1 1 1 2 2 2
rep(x = 1:3, length.out = 10) #with vector
##  [1] 1 2 3 1 2 3 1 2 3 1
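
These functions can be combined, because c() accepts vectors as well as scalars. A short sketch, where the object name v and the values are our own choices:

```r
# Combine a range, a repeated value, and a stepped sequence into one vector
v <- c(1:3, rep(x = 0, times = 2), seq(from = 10, to = 20, by = 5))
v
## [1]  1  2  3  0  0 10 15 20

# length() reports how many elements the vector holds
length(v)
## [1] 8
```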

3.5 What is a function?

A function is a procedure that typically takes one or more objects as arguments (or, inputs), does something with those objects, then returns a new object.

R has many built-in mathematical functions. To call a function, we simply type its name, followed by opening and closing parentheses. Anything we type inside the parentheses is called the function’s arguments:

log(1) # natural logarithm
## [1] 0
exp(0.5) # e^(1/2)
## [1] 1.648721
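
Many functions take more than one argument. Arguments can be matched by position or by name, as in this short sketch using log() and round() from base R (the values are our own illustration):

```r
# Arguments matched by position: x first, then base
log(100, 10)
## [1] 2

# The same call with named arguments, which is easier to read
log(x = 100, base = 10)
## [1] 2

# digits controls how many decimal places round() keeps
round(3.14159, digits = 2)
## [1] 3.14
```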

We do not need to remember every function in R. We can look them up on Google, or, if we remember the start of a function’s name, type it and press the Tab key. This shows a list of functions whose names match what we have typed so far; this is known as tab completion, and it can save a lot of typing (and reduce the risk of typing errors). Tab completion works both in R on its own (i.e. running outside RStudio) and in RStudio. In RStudio the feature is even more useful: an extract of the function’s help file is shown alongside the function name. This is one advantage that RStudio has over R on its own: its auto-completion makes it easier to look up functions, their arguments, and the values that they take.

When we use R, we do three basic things: 1) define objects, 2) apply functions to those objects, and 3) repeat. Take the following as an example:

# 1: Create a vector object called runs
runs <- c(4, 67, 23, 4, 10, 35)

# 2: Apply the mean() function to the runs object
mean(runs)
## [1] 23.83333

The mean() function we used above takes a vector object of numeric data, runs, as an argument, calculates the arithmetic mean of those data, then returns a single number (a scalar) as a result.
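
The "repeat" step usually means storing a result as a new object and applying further functions. A minimal sketch, assuming the same runs vector; the name runs.mean is our own:

```r
runs <- c(4, 67, 23, 4, 10, 35)

# Store the result of a function call as a new object
runs.mean <- mean(runs)

# Apply other built-in functions to the same vector
length(runs)
## [1] 6
sum(runs)
## [1] 143
median(runs)
## [1] 16.5

# Apply a function to the stored result
round(runs.mean, digits = 2)
## [1] 23.83
```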

We will talk more about vector functions in the next section.