The str_split() function from the stringr package in R can be used to split a string into multiple pieces. This function uses the following syntax:
str_split(string, pattern)
where:
- string: Character vector
- pattern: Pattern to split on
Similarly, the str_split_fixed() function from the stringr package can be used to split a string into a fixed number of pieces. This function uses the following syntax:
str_split_fixed(string, pattern, n)
where:
- string: Character vector
- pattern: Pattern to split on
- n: Number of pieces to return
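As a small aside on the pattern argument: stringr treats patterns as regular expressions by default, and wrapping a pattern in fixed() asks for a literal match instead. The sketch below illustrates the difference. The teams vector is a hypothetical input made up for this illustration and is not part of the tutorial's data frame.

```r
library(stringr)

# Hypothetical input: the second string has no spaces around the ampersand
teams <- c("andy & bob", "carl&doug")

# Regex pattern: optional whitespace, an ampersand, optional whitespace
regex_pieces <- str_split(teams, "\\s*&\\s*")

# fixed() matches the literal text " & " only, so "carl&doug" is left unsplit
literal_pieces <- str_split(teams, fixed(" & "))

print(regex_pieces)
print(literal_pieces)
```

A regular expression is more forgiving when the separator is written inconsistently, while fixed() is the safer choice when the separator contains characters that have a special meaning in regular expressions.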
This tutorial provides examples of how to use each of these functions on the following data frame:
#create data frame
df <- data.frame(team=c('andy & bob', 'carl & doug', 'eric & frank'),
                 points=c(14, 17, 19))

#view data frame
df

          team points
1   andy & bob     14
2  carl & doug     17
3 eric & frank     19
Example 1: Split String Using str_split()
The following code shows how to split the string in the “team” column using the str_split() function:
library(stringr)

#split the string in the team column on " & "
str_split(df$team, " & ")

[[1]]
[1] "andy" "bob"

[[2]]
[1] "carl" "doug"

[[3]]
[1] "eric"  "frank"
The result is a list of three elements that show the individual player names on each team.
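Because the result is a list, individual pieces are usually pulled out with list tools. The following is a minimal sketch of one way to do that with base R. The names pieces, piece_counts and first_player are made up for this illustration, and the team vector is recreated so the snippet runs on its own.

```r
library(stringr)

# Same strings as the "team" column, recreated so the snippet is self-contained
team <- c("andy & bob", "carl & doug", "eric & frank")
pieces <- str_split(team, " & ")

# Number of pieces found in each string
piece_counts <- lengths(pieces)

# First piece from each list element
first_player <- sapply(pieces, `[`, 1)

print(piece_counts)
print(first_player)
```

This approach also works when the strings split into different numbers of pieces, which is the situation where a list is more natural than a matrix.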
Example 2: Split String Using str_split_fixed()
The following code shows how to split the string in the “team” column into two fixed pieces using the str_split_fixed() function:
library(stringr)
#split the string in the team column on " & "
str_split_fixed(df$team, " & ", 2)
     [,1]   [,2]
[1,] "andy" "bob"
[2,] "carl" "doug"
[3,] "eric" "frank"
The result is a matrix with two columns and three rows.
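The number of columns is always n, whatever the strings contain. The sketch below uses a hypothetical vector (not the tutorial's data frame) to show what happens when a string holds more or fewer pieces than n: extra text stays in the last column, and missing pieces are filled with empty strings.

```r
library(stringr)

# Hypothetical input with two, three and one name(s) per string
mixed <- c("andy & bob", "carl & doug & eric", "frank")

# Always returns a matrix with n = 2 columns
result <- str_split_fixed(mixed, " & ", 2)

print(result)
```

Here the second row keeps "doug & eric" together in the second column, and the third row has an empty string in the second column.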
One useful application of the str_split_fixed() function is to append the resulting matrix to the end of the data frame. For example:
library(stringr)
#split the string in the team column and append resulting matrix to data frame
df[ , 3:4] <- str_split_fixed(df$team, " & ", 2)
#view data frame
df
          team points   V3    V4
1   andy & bob     14 andy   bob
2  carl & doug     17 carl  doug
3 eric & frank     19 eric frank
The column titled ‘V3’ shows the name of the first player on the team and the column titled ‘V4’ shows the name of the second player on the team.
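If default names such as V3 and V4 are not descriptive enough, the pieces can be assigned to named columns instead. This is a minimal sketch of that variation. The column names player1 and player2 are made up for this illustration, and the data frame is recreated so the snippet runs on its own.

```r
library(stringr)

# Recreate the example data frame so the snippet is self-contained
df <- data.frame(team=c('andy & bob', 'carl & doug', 'eric & frank'),
                 points=c(14, 17, 19))

# Split once, then assign each matrix column to a named data frame column
parts <- str_split_fixed(df$team, " & ", 2)
df$player1 <- parts[, 1]  # hypothetical column name
df$player2 <- parts[, 2]  # hypothetical column name

print(df)
```

Storing the matrix in an intermediate variable avoids splitting the same column twice.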