Monday, July 7, 2025

How to Perform an Outer Join in Pandas (With Example)

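An outer join is a type of join that returns all rows from both pandas DataFrames. Rows whose key appears in both DataFrames are combined into a single row, and rows whose key appears in only one DataFrame are kept, with NaN filled in for the columns that come from the other DataFrame.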

You can use the following basic syntax to perform an outer join in pandas:

import pandas as pd

df1.merge(df2, on='some_column', how='outer')
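The same join can also be written with the top-level pd.merge() function instead of the DataFrame method. The following is a minimal sketch, assuming two small hypothetical DataFrames named left and right (these names and values are illustrative and are not part of the example used in this tutorial):

```python
import pandas as pd

#create two small hypothetical DataFrames that share a 'key' column
left = pd.DataFrame({'key': [1, 2, 3], 'x': ['a', 'b', 'c']})
right = pd.DataFrame({'key': [2, 3, 4], 'y': ['d', 'e', 'f']})

#outer join using the DataFrame method
via_method = left.merge(right, on='key', how='outer')

#outer join using the top-level function
via_function = pd.merge(left, right, on='key', how='outer')

#check whether both forms produce the same result
print(via_method.equals(via_function))
```

Both forms take the same on and how arguments, so which one to use is mostly a matter of style.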

The following example shows how to use this syntax in practice.

Example: How to Perform an Outer Join in Pandas

Suppose we have the following two pandas DataFrames that contain information about various basketball teams:

import pandas as pd

#create DataFrames
df1 = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                    'points': [18, 22, 19, 14, 14, 11, 20, 28]})

df2 = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'J', 'K'],
                    'assists': [4, 9, 14, 13, 10, 8]})

#view DataFrames
print(df1)

  team  points
0    A      18
1    B      22
2    C      19
3    D      14
4    E      14
5    F      11
6    G      20
7    H      28

print(df2)

  team  assists
0    A        4
1    B        9
2    C       14
3    D       13
4    J       10
5    K        8

We can use the following code to perform an outer join, matching the rows between the DataFrames based on the values in the team column and keeping all rows from both DataFrames:

#perform outer join
df1.merge(df2, on='team', how='outer')


  team  points  assists
0    A    18.0      4.0
1    B    22.0      9.0
2    C    19.0     14.0
3    D    14.0     13.0
4    E    14.0      NaN
5    F    11.0      NaN
6    G    20.0      NaN
7    H    28.0      NaN
8    J     NaN     10.0
9    K     NaN      8.0

The result is a DataFrame that contains all rows from each DataFrame.

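Notice that NaN values have been filled in for each row where the value in the team column existed in only one of the two DataFrames. Teams E through H have no match in df2, so their assists value is NaN, while teams J and K have no match in df1, so their points value is NaN. Also notice that the points and assists columns are now displayed as floats (e.g. 18.0), because a standard integer column in pandas cannot hold NaN.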
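If the NaN values or the float columns are inconvenient, there are a couple of common ways to handle them after the join. The following is a minimal sketch using the same two DataFrames. Replacing missing values with 0 is only an assumption made for illustration, since a missing value does not necessarily mean zero points or zero assists in real data:

```python
import pandas as pd

#create DataFrames
df1 = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                    'points': [18, 22, 19, 14, 14, 11, 20, 28]})

df2 = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'J', 'K'],
                    'assists': [4, 9, 14, 13, 10, 8]})

#perform outer join
merged = df1.merge(df2, on='team', how='outer')

#option 1: replace NaN with 0 (an assumption) and convert back to integers
filled = (merged.fillna({'points': 0, 'assists': 0})
                .astype({'points': 'int64', 'assists': 'int64'}))

#option 2: keep the missing values but use the nullable integer dtype
nullable = merged.astype({'points': 'Int64', 'assists': 'Int64'})

print(filled)
print(nullable)
```

The first option discards the information that a value was missing, while the second keeps it and shows <NA> in place of NaN.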

Note: You can find the complete documentation for the merge function in the official pandas documentation.
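One merge argument that can be useful with outer joins is indicator. The following is a minimal sketch using the same two DataFrames, in which indicator=True adds a column named _merge that records whether each row came from the left DataFrame only, the right DataFrame only, or both:

```python
import pandas as pd

#create DataFrames
df1 = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                    'points': [18, 22, 19, 14, 14, 11, 20, 28]})

df2 = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'J', 'K'],
                    'assists': [4, 9, 14, 13, 10, 8]})

#perform outer join and record the source of each row
merged = df1.merge(df2, on='team', how='outer', indicator=True)

#count how many rows came from each source
counts = merged['_merge'].value_counts()

print(merged)
print(counts)
```

For these DataFrames, four teams (A through D) appear in both, four (E through H) appear only in df1, and two (J and K) appear only in df2.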

Additional Resources

The following tutorials explain how to perform other common operations in pandas:

How to Perform an Anti-Join in Pandas
How to Perform an Inner Join in Pandas
How to Perform a Cross Join in Pandas
