Thursday, December 26, 2024
How to Merge Multiple Data Frames in R (With Examples)

You can use one of the following two methods to merge multiple data frames in R:

Method 1: Use Base R

#put all data frames into list
df_list <- list(df1, df2, df3)

#merge all data frames in list
Reduce(function(x, y) merge(x, y, all=TRUE), df_list)

Method 2: Use Tidyverse

library(tidyverse)

#put all data frames into list
df_list <- list(df1, df2, df3)

#merge all data frames in list
df_list %>% reduce(full_join, by='variable_name')

The following examples show how to use each method in practice.

Method 1: Merge Multiple Data Frames Using Base R

Suppose we have the following data frames in R:

#define data frames
df1 <- data.frame(id=c(1, 2, 3, 4, 5),
                  revenue=c(34, 36, 40, 49, 43))

df2 <- data.frame(id=c(1, 2, 5, 6, 7),
                  expenses=c(22, 26, 31, 40, 20))

df3 <- data.frame(id=c(1, 2, 4, 5, 7),
                  profit=c(12, 10, 14, 12, 9))

We can use the following syntax to merge all of the data frames using functions from base R:

#put all data frames into list
df_list <- list(df1, df2, df3)

#merge all data frames together
Reduce(function(x, y) merge(x, y, all=TRUE), df_list)

  id revenue expenses profit
1  1      34       22     12
2  2      36       26     10
3  3      40       NA     NA
4  4      49       NA     14
5  5      43       31     12
6  6      NA       40     NA
7  7      NA       20      9

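Notice that every “id” value from each original data frame appears in the final data frame. This is because all=TRUE performs a full outer join: rows that have no match in another data frame are kept, and their missing values are filled with NA.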
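
To see what all=TRUE contributes, the sketch below repeats the merge in two other ways. It first confirms that Reduce() simply nests the merge() calls pairwise, then drops all=TRUE to get an inner join. The names full_result, unrolled, and inner_result are illustrative and do not appear in the original example.

```r
# Same example data frames as above
df1 <- data.frame(id=c(1, 2, 3, 4, 5),
                  revenue=c(34, 36, 40, 49, 43))
df2 <- data.frame(id=c(1, 2, 5, 6, 7),
                  expenses=c(22, 26, 31, 40, 20))
df3 <- data.frame(id=c(1, 2, 4, 5, 7),
                  profit=c(12, 10, 14, 12, 9))
df_list <- list(df1, df2, df3)

# Reduce() folds the list from left to right, so the full merge
# is the same as nesting two merge() calls by hand
full_result <- Reduce(function(x, y) merge(x, y, all=TRUE), df_list)
unrolled <- merge(merge(df1, df2, all=TRUE), df3, all=TRUE)
identical(full_result, unrolled)

# Without all=TRUE, merge() performs an inner join:
# only ids found in every data frame remain
inner_result <- Reduce(function(x, y) merge(x, y), df_list)
inner_result
```

Here identical() returns TRUE, and inner_result keeps only ids 1, 2, and 5, the three values shared by all three data frames.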

Method 2: Merge Multiple Data Frames Using Tidyverse

Suppose we have the following data frames in R:

#define data frames
df1 <- data.frame(id=c(1, 2, 3, 4, 5),
                  revenue=c(34, 36, 40, 49, 43))

df2 <- data.frame(id=c(1, 2, 5, 6, 7),
                  expenses=c(22, 26, 31, 40, 20))

df3 <- data.frame(id=c(1, 2, 4, 5, 7),
                  profit=c(12, 10, 14, 12, 9))

We can use the following syntax to merge all of the data frames using functions from the tidyverse, a collection of packages designed for data science in R. Here reduce() comes from the purrr package and full_join() from the dplyr package:

library(tidyverse)

#put all data frames into list
df_list <- list(df1, df2, df3)

#merge all data frames together
df_list %>% reduce(full_join, by='id')

  id revenue expenses profit
1  1      34       22     12
2  2      36       26     10
3  3      40       NA     NA
4  4      49       NA     14
5  5      43       31     12
6  6      NA       40     NA
7  7      NA       20      9

Notice that the final data frame matches the data frame that we produced using the first method.
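
Rather than comparing the printed output by eye, the two results can be checked programmatically. The sketch below is one possible check, not part of the original example. It loads only dplyr and purrr, the two tidyverse packages these functions come from. Because merge() sorts by the key column by default while full_join() keeps the row order of its inputs, the tidyverse result is sorted with arrange() before comparing. The names base_result, tidy_result, and same are illustrative.

```r
library(dplyr)
library(purrr)

# Same example data frames as above
df1 <- data.frame(id=c(1, 2, 3, 4, 5),
                  revenue=c(34, 36, 40, 49, 43))
df2 <- data.frame(id=c(1, 2, 5, 6, 7),
                  expenses=c(22, 26, 31, 40, 20))
df3 <- data.frame(id=c(1, 2, 4, 5, 7),
                  profit=c(12, 10, 14, 12, 9))
df_list <- list(df1, df2, df3)

# Method 1: base R
base_result <- Reduce(function(x, y) merge(x, y, all=TRUE), df_list)

# Method 2: tidyverse, sorted by id so row order cannot cause a mismatch
tidy_result <- df_list %>% reduce(full_join, by='id') %>% arrange(id)

# Compare values only, ignoring attributes such as row names
same <- all.equal(base_result, tidy_result, check.attributes=FALSE)
same
```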

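Note: The tidyverse approach is often noticeably quicker when you’re working with very large data frames, although the size of the difference depends on the data, so it is worth timing both methods on your own data.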
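
One rough way to compare the two methods is to time them on synthetic data with system.time(). The sketch below is an assumption-laden illustration: the row count n, the helper make_df(), and the list big_list are all hypothetical, and the timings will vary by machine, package version, and data.

```r
library(dplyr)
library(purrr)

set.seed(1)
n <- 100000  # hypothetical number of rows per data frame

# Hypothetical helper: n unique ids drawn from 1..2n plus one numeric column
make_df <- function(col) {
  df <- data.frame(id=sample(n * 2, n))
  df[[col]] <- rnorm(n)
  df
}

big_list <- list(make_df("revenue"), make_df("expenses"), make_df("profit"))

# Time the base R approach
base_time <- system.time(
  base_big <- Reduce(function(x, y) merge(x, y, all=TRUE), big_list)
)

# Time the tidyverse approach
tidy_time <- system.time(
  tidy_big <- big_list %>% reduce(full_join, by='id')
)

# Elapsed seconds for each method
base_time["elapsed"]
tidy_time["elapsed"]
```

A single run like this is only indicative. Repeating the timing several times gives a more reliable comparison.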

Additional Resources

The following tutorials explain how to perform other common tasks in R:

How to Merge Data Frames Based on Multiple Columns in R
How to Stack Data Frame Columns in R
How to Use anti_join in R
