Logic and if-else statements
This tutorial is designed to help you understand logic in R and use that logic to write conditional statements (i.e., if-else statements). These programming skills are essential for doing more complex analyses in R and are useful when working in other programming languages.
Disclaimer/clarification: Please note that in this tutorial I use vector to mean anything with length greater than 1, even though in R a single number or single character is technically also a vector albeit of length 1.
Logical values
Logical values can be either true or false. In practice, we generally use them to help filter data or run a specific analysis based on a specific condition. In R, logical values are coded as TRUE and FALSE. You may also see/use T or F; however, it is generally recommended not to use these two. This is because T and F are ordinary variables that can be overwritten, whereas TRUE and FALSE are reserved words that cannot. This can cause errors if, for whatever reason, T or F is overwritten but used elsewhere as a logical value. Logical values are returned whenever we do a comparison, i.e., use a relational operator.
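To make the point about T and F concrete, here is a minimal sketch. The value assigned to T is arbitrary and the variable names are chosen only for illustration. It overwrites T, checks that T no longer behaves like a logical TRUE, and then shows that the same kind of assignment to TRUE is rejected.

```r
# T is an ordinary variable, so this assignment is allowed
T <- 0
t_is_true <- isTRUE(T)   # FALSE: T no longer holds the logical value TRUE

# TRUE is a reserved word; trying to assign to it raises an error,
# which is caught here so the rest of the script keeps running
assign_result <- tryCatch({
  eval(parse(text = "TRUE <- 0"))
  "assignment succeeded"
}, error = function(e) "assignment failed")
assign_result            # "assignment failed"

# Remove our copy of T so the built-in value is visible again
rm(T)
t_restored <- isTRUE(T)  # TRUE
```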
Relational operators
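Here are the most common relational operators in R: less than (<), greater than (>), less than or equal to (<=), greater than or equal to (>=), equal to (==), and not equal to (!=).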
Let’s code up those examples:
2<3
## [1] TRUE
2>8
## [1] FALSE
3<=3
## [1] TRUE
"APPLE"=="APPLE"
## [1] TRUE
"Mayonnaise"!= "Instrument"
## [1] TRUE
It is important to note that comparing characters is case sensitive. For example, "Apple"=="APPLE" returns FALSE.
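If case should not matter for a particular comparison, one common approach is to convert both sides to the same case first. The sketch below uses the base R function tolower(); the variable name is only illustrative.

```r
# A direct comparison is case sensitive
strict_match <- "Apple" == "APPLE"                           # FALSE

# Converting both sides to lower case makes the comparison ignore case
same_ignoring_case <- tolower("Apple") == tolower("APPLE")   # TRUE
same_ignoring_case
```

toupper() would work equally well here; what matters is that both sides are converted the same way.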
NA and logic
Using comparison operators (and most other logical operations) on NA will return NA, even when comparing NA to NA. For example:
3>NA
## [1] NA
NA == NA
## [1] NA
If NA == NA is NA, how do I test if there is an NA value? Luckily there is a special R function for that: is.na():
is.na(NA)
## [1] TRUE
is.na(4)
## [1] FALSE
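As a small sketch of how this plays out with a stored value, the example below uses a hypothetical variable called reading that happens to be missing. It contrasts is.na() with a direct comparison, and uses the base R function isTRUE() to turn an NA result into a plain FALSE.

```r
reading <- NA                       # hypothetical measurement that is missing

is_missing <- is.na(reading)        # TRUE
comparison <- reading > 3           # NA, because the value is unknown
is_large <- isTRUE(reading > 3)     # FALSE: isTRUE() is only TRUE for a single TRUE
```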
Other is.x() functions
While is.na()
is probably the most commonly used, there are other similar functions to test for different data types. For example, is.character()
tests if the input is a character, is.numeric()
tests if the input is numerical. This is often useful when data wrangling. For example:
is.numeric("five")
## [1] FALSE
is.numeric(5)
## [1] TRUE
is.character("five")
## [1] TRUE
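A typical data wrangling situation is a number that was read in as text. The sketch below assumes a hypothetical value raw_value stored as the character "5" and uses as.numeric() to convert it before doing arithmetic.

```r
raw_value <- "5"                        # a number stored as text

is_num_before <- is.numeric(raw_value)  # FALSE
is_chr_before <- is.character(raw_value) # TRUE

converted <- as.numeric(raw_value)      # convert the text to a number
is_num_after <- is.numeric(converted)   # TRUE
converted + 1                           # 6
```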
Logical operators
The next important operators to cover are ones that work on logical values and then return a logical value. These are particularly important if there are multiple conditions that need to be met.
The most relevant ones are NOT (!), AND (&), and OR (|), which are described below.
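The truth table shows the result of either & or | given all possible combinations of inputs (x and y). In short, x & y is TRUE only when both x and y are TRUE, while x | y is TRUE when at least one of them is TRUE.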
An example of what that looks like in R:
!(1>2)#should return TRUE
## [1] TRUE
(1>2)|(3>2)#Should return TRUE
## [1] TRUE
(1>2)&(3>2)#Should return FALSE since 1 is not greater than 2
## [1] FALSE
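The truth table for these operators can also be generated in R itself. The following is one possible sketch, using expand.grid() to build every combination of x and y; the column names are my own choice for illustration.

```r
# Every combination of TRUE/FALSE for the two inputs
truth <- expand.grid(x = c(TRUE, FALSE), y = c(TRUE, FALSE))

# Apply each operator to the two columns
truth$not_x   <- !truth$x
truth$x_and_y <- truth$x & truth$y
truth$x_or_y  <- truth$x | truth$y

print(truth)
```

Each row of the printed data frame is one combination of inputs followed by the result of NOT, AND, and OR for that combination.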
Vector logic
All of the operations described above work on vectors of data, and there are also special functions that act on logical vectors.
Relational operators on vectors
The relational operators discussed above also work on vectors!
We can compare a whole vector to a single value. This will return a vector with logical values for the comparison being applied to each element of the vector. For example:
x<-c(1,3,5,6)
3<x
## [1] FALSE FALSE TRUE TRUE
If we compare vectors of the same length, it will return a logical vector doing pairwise comparisons:
y<-c(1,5,8)
z<-c(1,1,5)
z==y
## [1] TRUE FALSE FALSE
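The logical vector returned by a comparison is what makes filtering possible. As a short sketch, reusing the vector x from the comparison example above (the other names are illustrative), a logical vector placed inside square brackets keeps only the elements where it is TRUE.

```r
x <- c(1, 3, 5, 6)

keep <- x > 3            # FALSE FALSE TRUE TRUE
big_values <- x[keep]    # keeps only the elements where keep is TRUE: 5 6
n_big <- sum(keep)       # TRUE counts as 1, so this counts the matches: 2
```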
Logical operators on vectors
The logical operators discussed above also work on vectors! Like in the previous section, if we apply a logical operator to a vector, or between a vector and a single logical value, it will apply that operation to every element:
!c(TRUE,FALSE,TRUE)
## [1] FALSE TRUE FALSE
TRUE & c(TRUE,FALSE,TRUE)
## [1] TRUE FALSE TRUE
If the two vectors are the same length, it will do pairwise operations:
c(TRUE,FALSE,FALSE) & c(TRUE,FALSE,TRUE)
## [1] TRUE FALSE FALSE
c(TRUE,FALSE,FALSE) | c(TRUE,TRUE,TRUE)
## [1] TRUE TRUE TRUE
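Combining relational and logical operators lets us filter on several conditions at once. The sketch below assumes a small numeric vector x and arbitrary cut-off values chosen for illustration.

```r
x <- c(1, 3, 5, 6)

# Both conditions must hold: greater than 1 AND less than 6
in_range <- x > 1 & x < 6     # FALSE TRUE TRUE FALSE
middle_values <- x[in_range]  # 3 5

# Either condition may hold: less than 2 OR greater than 5
at_ends <- x < 2 | x > 5      # TRUE FALSE FALSE TRUE
end_values <- x[at_ends]      # 1 6
```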
Special vector operators
There are also special functions that act on logical vectors: any(), all(), which(), and %in%, described below:
any() returns TRUE if there is at least one TRUE value. Here are a few examples:
any(2>c(1,7,10)) #should be true cause of 1
## [1] TRUE
any(1>c(1,7,10))
## [1] FALSE
all() only returns TRUE if all elements are TRUE. Here are a few examples:
all(c(1,2,3,4)==c(1,2,3,5))
## [1] FALSE
all(c(1,2,3,4)==c(1,2,3,4))
## [1] TRUE
all(!is.na(c(NA,2,3,5,NA)))
## [1] FALSE
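Because comparisons involving NA return NA, it helps to know how any() and all() react to missing values. The sketch below shows the behaviour on small hand-made vectors, along with the na.rm argument that both functions accept for dropping NA values first.

```r
# A TRUE is enough for any(), even if other values are unknown
any_with_true <- any(c(NA, TRUE))                  # TRUE

# With no TRUE present, the unknown value could still be TRUE, so the answer is NA
any_unknown <- any(c(NA, FALSE))                   # NA

# Dropping the NA first gives a definite answer
any_dropped <- any(c(NA, FALSE), na.rm = TRUE)     # FALSE

# A FALSE is enough to make all() FALSE, but NA plus TRUE is unknown
all_with_false <- all(c(NA, FALSE))                # FALSE
all_unknown <- all(c(NA, TRUE))                    # NA
all_dropped <- all(c(NA, TRUE), na.rm = TRUE)      # TRUE
```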
which() returns the indexes (or positions) of TRUE values. Note that indexing starts at 1 in R, unlike many other programming languages, which start at 0. Here is an example of code that returns the locations of NA values.
which(is.na(c(NA,2,3,5,NA)))
## [1] 1 5
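The positions returned by which() can be used directly to pull out the matching elements. In this sketch the vector is given the hypothetical name scores; note that which() skips positions where the comparison is NA.

```r
scores <- c(NA, 2, 3, 5, NA)

# scores > 2 is NA FALSE TRUE TRUE NA; which() keeps only the TRUE positions
high_positions <- which(scores > 2)     # 3 4

# Use the positions to extract the values themselves
high_scores <- scores[high_positions]   # 3 5
```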
%in% tests which items in the vector left of %in% are within the vector right of %in%. For example, this will test whether each of the fruits listed in the vector on the left is in the vector on the right.
c("Strawberry","Apple","Lychee","Pear") %in% c("Apple","Pear")
## [1] FALSE TRUE FALSE TRUE
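Since %in% returns a logical vector, it combines naturally with filtering and with the NOT operator. A brief sketch, using the same fruits stored under hypothetical variable names:

```r
fruits <- c("Strawberry", "Apple", "Lychee", "Pear")
in_stock <- c("Apple", "Pear")

# Keep the fruits that appear in in_stock
available <- fruits[fruits %in% in_stock]        # "Apple" "Pear"

# Negate the test to keep the ones that do not appear
unavailable <- fruits[!(fruits %in% in_stock)]   # "Strawberry" "Lychee"
```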
If-Else statements
Now let’s get into conditionals! Often, we use these logical values or logical results to do something conditioned on them. Graphically this looks like:
An example would be: if it is raining, then I will bring an umbrella; otherwise I will not bring an umbrella. The condition is “presence/absence of rain”. The different actions are either “bring an umbrella” or “don’t bring an umbrella.” Coded up in R, this is what it looks like:
weather<-"rain" #example value so the code runs on its own
if(weather=="rain"){
  "Bring an umbrella"
}else{
  "Don't bring an umbrella"
}
A more useful example would be in a function or a for loop. Let’s make a function to test if something is odd or even.
To do this we will name our function even_odd and declare it as a function with one input num: function(num). Then we will use our if…else statements to test if num is even or odd and return a character value indicating the result of that test. To test if num is even we will use %%, which gives us the modulo (equivalent to the remainder for positive integers). A num will be even if num modulo 2 is 0; otherwise it is odd.
even_odd<-function(num){
  if(num%%2==0){
    return("Even")
  }else{
    return("Odd")
  }
}
even_odd(2)
## [1] "Even"
even_odd(1)
## [1] "Odd"
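The text mentions for loops as another place where if-else statements are useful. As a minimal sketch, the function can be called inside a loop over several numbers; the range 1:4 and the variable name labels are arbitrary choices for illustration.

```r
# Same function as above, repeated so this example runs on its own
even_odd <- function(num) {
  if (num %% 2 == 0) {
    return("Even")
  } else {
    return("Odd")
  }
}

# Classify each number from 1 to 4 and collect the results
labels <- character(0)
for (n in 1:4) {
  labels <- c(labels, even_odd(n))
}
labels   # "Odd" "Even" "Odd" "Even"
```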
However, if we run even_odd(NA) this will throw an error, because NA%%2==0 evaluates to NA and if() needs either TRUE or FALSE. We need to add another condition to catch NA values. But how do we add another condition?
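To see the failure without stopping a script, the call can be wrapped in tryCatch(). This is only a sketch for inspecting the error, not a fix; the variable names are illustrative, and the exact wording of the message may vary with the R version and language settings.

```r
# Same two-branch function as above, repeated so this example runs on its own
even_odd <- function(num) {
  if (num %% 2 == 0) {
    return("Even")
  } else {
    return("Odd")
  }
}

# The condition itself is NA, which if() cannot use
condition_value <- NA %% 2 == 0   # NA

# Catch the error and keep its message instead of halting
outcome <- tryCatch(
  even_odd(NA),
  error = function(e) paste("Error:", conditionMessage(e))
)
outcome
```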
If-Else If-Else statements
Luckily it’s super easy to add another condition with else if(){}, and we can add an arbitrary number of conditions by stringing together else if(){} blocks. NOTE: the order matters, as R will evaluate the condition in the if statement first, then the first else if, then the next else if, and so on, running only the block for the first condition that is TRUE and falling through to the else block if none are. Shown below is a diagram of what that looks like:
Let’s try our new and improved even_odd function. Because errors happen when we try to test NA values, let’s make that the first condition we check.
even_odd<-function(num){
  if(is.na(num)){
    return("NA!")
  }else if(num%%2==0){
    return("Even")
  }else{
    return("Odd")
  }
}
even_odd(2)
## [1] "Even"
even_odd(1)
## [1] "Odd"
even_odd(NA)
## [1] "NA!"
And it worked!
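The same pattern extends to longer chains. As a further sketch, here is a hypothetical function describe_number (not part of the tutorial's code) with two else if branches. The NA check again comes first, because every later comparison would produce NA and cause an error for a missing value.

```r
# Hypothetical helper: classify a single number by its sign
describe_number <- function(num) {
  if (is.na(num)) {
    return("Missing")      # checked first so later comparisons never see NA
  } else if (num < 0) {
    return("Negative")
  } else if (num == 0) {
    return("Zero")
  } else {
    return("Positive")     # reached only if no earlier condition was TRUE
  }
}

describe_number(-3)   # "Negative"
describe_number(0)    # "Zero"
describe_number(7)    # "Positive"
describe_number(NA)   # "Missing"
```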
Closing remarks
I hope you enjoyed this tutorial. Please shoot me an email if you have any tips for improvement or if you caught a bug! Please check out the other tutorials on my website.