Friday, December 20, 2024

Pandas: How to Use loc to Select Multiple Columns


You can use the loc indexer in pandas to select multiple columns in a DataFrame by label.

Here are the most common ways to do so:

Method 1: Select Multiple Columns by Name

df.loc[:, ['col2', 'col4']]

Method 2: Select All Columns in Range

df.loc[:, 'col2':'col4']

The following examples show how to use each method in practice with the following pandas DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [5, 7, 7, 9, 12, 9, 9, 4],
                   'assists': [11, 8, 10, 6, 6, 5, 9, 12],
                   'rebounds': [6, 7, 7, 6, 10, 12, 10, 9]})

#view DataFrame
print(df)

  team  points  assists  rebounds
0    A       5       11         6
1    A       7        8         7
2    A       7       10         7
3    A       9        6         6
4    B      12        6        10
5    B       9        5        12
6    B       9        9        10
7    B       4       12         9

Example 1: Select Multiple Columns by Name

The following code shows how to use loc to select the ‘points’ and ‘rebounds’ columns from the DataFrame:

#select points and rebounds columns
df.loc[:, ['points', 'rebounds']]

   points  rebounds
0       5         6
1       7         7
2       7         7
3       9         6
4      12        10
5       9        12
6       9        10
7       4         9

Notice that every row of the ‘points’ and ‘rebounds’ columns is returned.

Also note that the columns are returned in the order you list them inside loc.

For example, we could return the ‘rebounds’ column first and then the ‘points’ column:

#select rebounds and points columns
df.loc[:, ['rebounds', 'points']]

   rebounds  points
0         6       5
1         7       7
2         7       7
3         6       9
4        10      12
5        12       9
6        10       9
7         9       4
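
As a quick check of the behavior described in this example, the sketch below rebuilds the same DataFrame and confirms that the returned columns follow the order of the list passed to loc. It also shows what happens when the list contains a label that is not in the DataFrame: in pandas 1.0 and later this should raise a KeyError. The ‘steals’ label is a hypothetical name used only to trigger that error; it is not a column in the original data.

```python
import pandas as pd

# Same DataFrame as in the tutorial
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [5, 7, 7, 9, 12, 9, 9, 4],
                   'assists': [11, 8, 10, 6, 6, 5, 9, 12],
                   'rebounds': [6, 7, 7, 6, 10, 12, 10, 9]})

# Columns come back in the order given in the list
subset = df.loc[:, ['rebounds', 'points']]
print(list(subset.columns))

# 'steals' is a hypothetical label that does not exist in df
try:
    df.loc[:, ['points', 'steals']]
    missing_raises = False
except KeyError:
    missing_raises = True

print(missing_raises)
```

If some labels may be absent, one option is to filter the list against df.columns before passing it to loc.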

Example 2: Select All Columns in Range

The following code shows how to use loc to select every column from ‘points’ through ‘rebounds’, inclusive, in the DataFrame:

#select all columns between points and rebounds columns
df.loc[:, 'points':'rebounds']

   points  assists  rebounds
0       5       11         6
1       7        8         7
2       7       10         7
3       9        6         6
4      12        6        10
5       9        5        12
6       9        9        10
7       4       12         9

Notice that every column from ‘points’ through ‘rebounds’ is returned, including both endpoints. Unlike ordinary Python slices, a label slice in loc includes the end label.
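
To make the range behavior concrete, the following minimal sketch compares the label slice with an explicit list of the same three columns, using the same DataFrame as above. The variable names by_slice and by_list are illustrative only.

```python
import pandas as pd

# Same DataFrame as in the tutorial
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [5, 7, 7, 9, 12, 9, 9, 4],
                   'assists': [11, 8, 10, 6, 6, 5, 9, 12],
                   'rebounds': [6, 7, 7, 6, 10, 12, 10, 9]})

# Label slice: both 'points' and 'rebounds' are included
by_slice = df.loc[:, 'points':'rebounds']

# Explicit list naming the same columns
by_list = df.loc[:, ['points', 'assists', 'rebounds']]

print(list(by_slice.columns))
print(by_slice.equals(by_list))
```

The slice follows the column order of the DataFrame itself, so the result depends on where ‘points’ and ‘rebounds’ sit in df.columns, not on alphabetical order.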

Note: To select columns by integer position instead of by label, use iloc.
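
For comparison, here is a short sketch of the position-based equivalents with iloc, assuming the same DataFrame and column order as above (‘team’ at position 0, ‘points’ at 1, ‘assists’ at 2, ‘rebounds’ at 3). Note that an iloc slice follows normal Python rules and excludes the end position.

```python
import pandas as pd

# Same DataFrame as in the tutorial
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [5, 7, 7, 9, 12, 9, 9, 4],
                   'assists': [11, 8, 10, 6, 6, 5, 9, 12],
                   'rebounds': [6, 7, 7, 6, 10, 12, 10, 9]})

# Positions 1 and 3 correspond to 'points' and 'rebounds'
list_by_position = df.iloc[:, [1, 3]]
list_by_label = df.loc[:, ['points', 'rebounds']]
print(list_by_position.equals(list_by_label))

# Positions 1 up to (but not including) 4: 'points', 'assists', 'rebounds'
range_by_position = df.iloc[:, 1:4]
range_by_label = df.loc[:, 'points':'rebounds']
print(range_by_position.equals(range_by_label))
```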

Additional Resources

The following tutorials explain how to perform other common operations in pandas:

How to Select Rows by Multiple Conditions Using Pandas loc
How to Select Rows Based on Column Values in Pandas
How to Select Rows by Index in Pandas
