20.2 C
London
Sunday, June 22, 2025
HomePandas in PythonDataFrame Functions in PythonHow to Perform a GroupBy Sum in Pandas (With Examples)

How to Perform a GroupBy Sum in Pandas (With Examples)

Related stories

Learn About Opening an Automobile Repair Shop in India

Starting a car repair shop is quite a good...

Unlocking the Power: Embracing the Benefits of Tax-Free Investing

  Unlocking the Power: Embracing the Benefits of Tax-Free Investing For...

Income Splitting in Canada for 2023

  Income Splitting in Canada for 2023 The federal government’s expanded...

Can I Deduct Home Office Expenses on my Tax Return 2023?

Can I Deduct Home Office Expenses on my Tax...

Canadian Tax – Personal Tax Deadline 2022

  Canadian Tax – Personal Tax Deadline 2022 Resources and Tools...

You can use the following basic syntax to find the sum of values by group in pandas:

df.groupby(['group1','group2'])['sum_col'].sum().reset_index()

The following examples show how to use this syntax in practice with the following pandas DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'position': ['G', 'G', 'F', 'C', 'G', 'F', 'F', 'C'],
                   'points': [25, 17, 14, 9, 12, 9, 6, 4],
                   'rebounds': [11, 8, 10, 6, 6, 5, 9, 12]})

#view DataFrame
df

	team	position points	rebounds
0	A	G	 25	11
1	A	G	 17	8
2	A	F	 14	10
3	A	C	 9	6
4	B	G	 12	6
5	B	F	 9	5
6	B	F	 6	9
7	B	C	 4	12

Example 1: Group by One Column, Sum One Column

The following code shows how to group by one column and sum the values in one column:

#group by team and sum the points
df.groupby(['team'])['points'].sum().reset_index()

	team	points
0	A	65
1	B	31

From the output we can see that:

  • The players on team A scored a sum of 65 points.
  • The players on team B scored a sum of 31 points.

Example 2: Group by Multiple Columns, Sum Multiple Columns

The following code shows how to group by multiple columns and sum multiple columns:

#group by team and position, sum points and rebounds
df.groupby(['team', 'position'])['points', 'rebounds'].sum().reset_index()

        team	position points	rebounds
0	A	C	 9	6
1	A	F	 14	10
2	A	G	 42	19
3	B	C	 4	12
4	B	F	 15	14
5	B	G	 12	6

From the output we can see that:

  • The players on team A in the ‘C’ position scored a sum of 9 points and 6 rebounds.
  • The players on team A in the ‘F’ position scored a sum of 14 points and 10 rebounds.
  • The players on team A in the ‘G’ position scored a sum of 42 points and 19 rebounds.

And so on.

Note that the reset_index() function prevents the grouping columns from becoming part of the index.

For example, here’s what the output looks like if we don’t use it:

#group by team and position, sum points and rebounds
df.groupby(['team', 'position'])['points', 'rebounds'].sum()

                 points	rebounds
team	position		
A	C	 9	6
F	14	 10
G	42	 19
B	C	 4	12
F	15	 14
G	12	 6

Depending on how you’d like the results to appear, you may or may not choose to use the reset_index() function.

Additional Resources

The following tutorials explain how to perform other common grouping operations in pandas:

How to Count Observations by Group in Pandas
How to Find the Max Value by Group in Pandas
How to Calculate Quantiles by Group in Pandas

Subscribe

- Never miss a story with notifications

- Gain full access to our premium content

- Browse free from up to 5 devices at once

Latest stories