R's basic data types include:
logical: TRUE and FALSE
character: "abc" or '123'. Note that the single or double quotes aren’t part of the value; they enclose character values so that they are interpreted as data rather than code, and as text rather than numbers.

A vector is an ordered collection of data values that are all the same data type.
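The built-in class() function reports a value's data type, which makes the role of the quotes easy to see. A minimal sketch (the values are illustrative; note that in R a single value is simply a vector of length one):

```r
class(TRUE)     # "logical"
class("abc")    # "character"
class('123')    # "character": the quotes make this text, not a number
class(123)      # "numeric"
length(123)     # 1: a single value is a vector of length one
```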
We can use the c() (“combine”) function in R to combine a sequence of values into a vector. For example:
vector1 <- c(5, 5, 9)
vector1
[1] 5 5 9
By default, the index of each item is its numbered position (i.e., 1, 2, 3). If we want, we can also label each position:
vector2 <- c(a = 5, b = 5, c = 9)
vector2
a b c
5 5 9
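Either kind of index can be used inside single square brackets to pull out items. A minimal sketch reusing vector2 from above; names() and unname() are base R functions:

```r
vector2 <- c(a = 5, b = 5, c = 9)

vector2[3]          # by position: the third item (value 9, labelled c)
vector2["c"]        # by label: the same item
names(vector2)      # the labels themselves: "a" "b" "c"
unname(vector2[3])  # 9, with the label dropped
```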
Remember that vectors contain values that are all the same type. What happens if we try to combine differently-typed values into a vector using c()?
A matrix is a 2-dimensional structure that contains data that are all the same data type.
We can use the matrix() function in R to create a matrix from a sequence of values. For example:
matrix1 <- matrix(c(5, 5, 9, 7, NA, 3), nrow = 3, ncol = 2)
matrix1
     [,1] [,2]
[1,]    5    7
[2,]    5   NA
[3,]    9    3
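Note that matrix() fills column by column by default, which is why the first three values form the first column. A short sketch of selecting by row and column, and of the byrow argument of matrix(); matrix1_byrow is a name introduced here only for illustration:

```r
matrix1 <- matrix(c(5, 5, 9, 7, NA, 3), nrow = 3, ncol = 2)

dim(matrix1)      # 3 2 (rows, columns)
matrix1[3, 1]     # 9: row 3, column 1
matrix1[, 2]      # the whole second column: 7 NA 3

# byrow = TRUE fills row by row instead of column by column
matrix1_byrow <- matrix(c(5, 5, 9, 7, NA, 3), nrow = 3, ncol = 2, byrow = TRUE)
matrix1_byrow[1, ]  # 5 5
```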
What happens if we try to create a vector or matrix with mixed types?
vector2 <- c('a', 1, TRUE)
vector2
[1] "a" "1" "TRUE"
We see here (by the quotes) that all of the values were converted to character (text) type. When you try to mix types, R will coerce all of the values to the “least restrictive” type that can accommodate all of them. The order of coercion is:
logical >> integer >> numeric >> complex >> character
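Each step in this ordering can be checked by combining two adjacent types with c() and asking for the class of the result. A minimal sketch with illustrative values (2L is R's notation for an integer, 1+2i for a complex number):

```r
class(c(TRUE, 2L))     # "integer":   logical is promoted to integer
class(c(2L, 2.5))      # "numeric":   integer is promoted to numeric
class(c(2.5, 1+2i))    # "complex":   numeric is promoted to complex
class(c(1+2i, "a"))    # "character": complex is promoted to character

c(TRUE, 2L)            # 1 2 (TRUE becomes 1)
```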
We’ve used the term “coercion” to describe when a value is forced to another data type. R has many functions beginning with as. that you can use to coerce data to other types.
For example, to coerce any value to text, we can use as.character():
as.character(32.5)
[1] "32.5"
When feasible, we can also convert data to more restrictive types. For example, we can convert 0s and 1s to logical using as.logical():
as.logical(c(1, 0, 0, NA, 1, 2))
[1] TRUE FALSE FALSE NA TRUE TRUE
Notice in this case that any number other than 0 becomes TRUE.
Another example:
as.numeric(c('32.5', '-5', 'some text'))
Warning: NAs introduced by coercion
[1] 32.5 -5.0 NA
Note the warning, caused by the fact that 'some text' cannot be converted to numeric.
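Because failed conversions come back as NA, is.na() is a convenient way to find which inputs could not be converted. A minimal sketch; the names raw and converted are illustrative, and suppressWarnings() is used only to keep the output quiet:

```r
raw <- c('32.5', '-5', 'some text')
converted <- suppressWarnings(as.numeric(raw))  # silence the coercion warning

is.na(converted)        # FALSE FALSE  TRUE
raw[is.na(converted)]   # "some text": the input that failed to convert

as.integer("7")         # 7
as.numeric(TRUE)        # 1
```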
Vectors and matrices allow us to perform mathematical operations on entire sets of values at once. For example:
v1 <- c(1, 2, 3)
v2 <- c(4, 5, 6)
v1 + v2
[1] 5 7 9
or for a matrix:
m1 <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3)
m1**2 # square each element
     [,1] [,2] [,3]
[1,]    1    9   25
[2,]    4   16   36
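Arithmetic operators in R work element by element, and a shorter operand is recycled to match the longer one. Matrix multiplication in the linear-algebra sense uses the separate %*% operator. A short sketch reusing v1, v2, and m1 from above:

```r
v1 <- c(1, 2, 3)
v2 <- c(4, 5, 6)
m1 <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2, ncol = 3)

v1 * v2        # 4 10 18: element-wise product
v1 * 10        # 10 20 30: the single value 10 is recycled
sum(v1 * v2)   # 32: the dot product of v1 and v2

m1 %*% t(m1)   # 2 x 2 matrix product (not element-wise)
```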
Unlike vectors and matrices, lists allow mixing of types. Not only that, lists may have any structure that you like. For example, a list may contain vectors, matrices, and even other lists.
Let’s make our own lists using the list() function:
apple_info <- list(company = "Apple", ticker_symbol = "AAPL",
                   stock_price = 170.33,
                   employees = c("Tim Cook", "Craig Federighi", "Jony Ive"),
                   stock_history = data.frame(
                     date = as.Date(c("2024-06-01", "2024-06-02", "2024-06-03")),
                     price = c(168.23, 169.45, 170.33)
                   ))
amazon_info <- list(company = "Amazon", ticker_symbol = "AMZN",
                    stock_price = 135.67,
                    employees = c("Andy Jassy", "Werner Vogels", "Adam Selipsky"),
                    stock_history = data.frame(
                      date = as.Date(c("2024-06-01", "2024-06-02", "2024-06-03")),
                      price = c(133.45, 134.56, 135.67)
                    ))
Now we can combine these lists into a larger list:
companies_info <- list(apple_info,
                       amazon_info)
How do we access elements of a list where the elements aren’t named? In this case, each element is assigned a number, and we use double square brackets [[ ]] to access the elements. For example, to access the first element of companies_info, which is apple_info, we would use:
companies_info[[1]]
$company
[1] "Apple"
$ticker_symbol
[1] "AAPL"
$stock_price
[1] 170.33
$employees
[1] "Tim Cook" "Craig Federighi" "Jony Ive"
$stock_history
        date  price
1 2024-06-01 168.23
2 2024-06-02 169.45
3 2024-06-03 170.33
Once we have the first element of the list (which in our case is itself a list), we can use the $ operator to access its elements. For example, to get the company name of the first element of companies_info, we would use:
companies_info[[1]]$company
[1] "Apple"
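The difference between single and double square brackets matters here: [[ ]] returns the element itself, while [ ] returns a shorter list that still wraps it. A minimal sketch using a cut-down stand-in for companies_info (the trimmed lists and the names first_company and wrapped are illustrative, not the full objects above):

```r
companies_info <- list(
  list(company = "Apple",  stock_price = 170.33),
  list(company = "Amazon", stock_price = 135.67)
)

first_company <- companies_info[[1]]  # the inner list itself
first_company$company                 # "Apple"

wrapped <- companies_info[1]          # a list of length 1 holding that inner list
length(wrapped)                       # 1
wrapped[[1]]$company                  # "Apple"

companies_info[[2]]$stock_price       # 135.67
```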
A better practice would be to uniquely name each element of the list. For example, we could use the ticker symbol for each company as its index:
companies_info <- list(AAPL = apple_info,
                       AMZN = amazon_info)
Now we can access the elements by name:
apple_stock_history <- companies_info$AAPL$stock_history
apple_stock_history
        date  price
1 2024-06-01 168.23
2 2024-06-02 169.45
3 2024-06-03 170.33
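Names also make it easy to work across the whole list. A sketch, again with trimmed stand-in lists rather than the full objects above, using base R's names() and sapply(); prices is a name introduced here for illustration:

```r
companies_info <- list(
  AAPL = list(company = "Apple",  stock_price = 170.33),
  AMZN = list(company = "Amazon", stock_price = 135.67)
)

names(companies_info)               # "AAPL" "AMZN"
companies_info[["AMZN"]]$company    # "Amazon": [[ ]] also accepts a name

# Pull one field out of every company; the result keeps the element names
prices <- sapply(companies_info, function(x) x$stock_price)
prices["AAPL"]                      # 170.33, labelled AAPL
```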