Friday, December 20, 2024

Pandas: How to Drop Column if it Exists

You can use the following basic syntax to drop one or more columns in a pandas DataFrame if they exist:

df = df.drop(['column1', 'column2'], axis=1, errors='ignore')

Note: If you don't pass the argument errors='ignore', pandas raises a KeyError when you attempt to drop a column that doesn't exist.
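The same call can also be written with the columns keyword instead of axis=1. The following is a minimal sketch on a hypothetical two-column DataFrame (df_demo and its column names are made up for illustration) showing that both spellings give the same result:

```python
import pandas as pd

# Hypothetical DataFrame used only for this illustration
df_demo = pd.DataFrame({'column1': [1, 2], 'column3': [3, 4]})

# Passing labels with axis=1 ...
by_axis = df_demo.drop(['column1', 'column2'], axis=1, errors='ignore')

# ... is equivalent to passing them through the columns keyword
by_keyword = df_demo.drop(columns=['column1', 'column2'], errors='ignore')

# Both calls drop column1 and silently skip the missing column2
print(by_axis.equals(by_keyword))
print(list(by_keyword.columns))
```

Which form to use is mostly a matter of taste; the columns keyword makes it explicit that the labels refer to columns rather than rows.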

The following example shows how to use this syntax in practice.

Example: Drop Column if it Exists in Pandas

Suppose we have the following pandas DataFrame that contains information about various basketball players:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'B', 'C', 'D', 'E', 'F'],
                   'points': [18, 22, 19, 14, 14, 11],
                   'assists': [5, 7, 7, 9, 12, 9],
                   'minutes': [10.1, 12.0, 9.0, 8.0, 8.4, 7.5],
                   'all_star': [True, False, False, True, True, True]})

#view DataFrame
print(df)

  team  points  assists  minutes  all_star
0    A      18        5     10.1      True
1    B      22        7     12.0     False
2    C      19        7      9.0     False
3    D      14        9      8.0      True
4    E      14       12      8.4      True
5    F      11        9      7.5      True

Now suppose we attempt to drop the columns with the names minutes_played and points:

#drop minutes_played and points columns
df = df.drop(['minutes_played', 'points'], axis=1)

KeyError: "['minutes_played'] not found in axis"

We receive a KeyError because minutes_played is not a column in the DataFrame (the actual column is named minutes).
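To see which of the requested names are actually missing, one option is to compare the list against df.columns before calling drop(). The sketch below uses a small hypothetical DataFrame (df_check) rather than the full example data, and captures the error message instead of letting the script stop:

```python
import pandas as pd

# Small hypothetical DataFrame for this illustration
df_check = pd.DataFrame({'team': ['A', 'B'],
                         'points': [18, 22],
                         'assists': [5, 7]})

to_drop = ['minutes_played', 'points']

# Names in to_drop that are not columns of df_check
missing = [col for col in to_drop if col not in df_check.columns]
print(missing)

# Without errors='ignore', drop() raises a KeyError for the missing name
error_message = None
try:
    df_check.drop(to_drop, axis=1)
except KeyError as exc:
    error_message = str(exc)

print(error_message)
```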

Instead, we need to call the drop() method with the errors='ignore' argument:

#drop minutes_played and points columns
df = df.drop(['minutes_played', 'points'], axis=1, errors='ignore')

#view updated DataFrame
print(df)

  team  assists  minutes  all_star
0    A        5     10.1      True
1    B        7     12.0     False
2    C        7      9.0     False
3    D        9      8.0      True
4    E       12      8.4      True
5    F        9      7.5      True

Notice that the points column has been dropped from the DataFrame.

Also notice that we don’t receive any error even though we attempted to drop a column called minutes_played, which does not exist.
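One caveat is that errors='ignore' also hides typos in column names. If you want to know which names were skipped, a possible alternative is a small helper that drops only the columns that exist and reports the rest. The function drop_if_exists below is a hypothetical helper written for this illustration, not part of pandas:

```python
import pandas as pd

def drop_if_exists(frame, names):
    """Drop the columns in names that exist; return the result and the skipped names."""
    present = [col for col in names if col in frame.columns]
    skipped = [col for col in names if col not in frame.columns]
    return frame.drop(columns=present), skipped

# Hypothetical DataFrame for this illustration
df_players = pd.DataFrame({'team': ['A', 'B'],
                           'points': [18, 22],
                           'assists': [5, 7]})

trimmed, skipped = drop_if_exists(df_players, ['minutes_played', 'points'])

print(list(trimmed.columns))  # columns that remain after the drop
print(skipped)                # requested names that were not found
```

For most cases errors='ignore' is simpler; the helper is only useful when a silently skipped name would indicate a mistake.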

Additional Resources

The following tutorials explain how to perform other common operations in pandas:

Pandas: How to Drop Unnamed Columns
Pandas: How to Drop All Columns Except Specific Ones
Pandas: How to Drop All Rows Except Specific Ones
