Friday, December 20, 2024

How to Group by 5-Minute Intervals in Pandas

You can use the following basic syntax to group rows by 5-minute intervals in a pandas DataFrame:

df.resample('5min').sum()

This syntax assumes that the index of your DataFrame contains datetime values (a DatetimeIndex). It calculates the sum of every column in the DataFrame, grouped by 5-minute intervals.
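
If the datetime values sit in a regular column rather than in the index, one option is the on parameter of resample(). The sketch below assumes a hypothetical DataFrame named events with a 'timestamp' column and a 'clicks' column; none of these names come from the example in this tutorial.

```python
import pandas as pd

# hypothetical DataFrame with datetimes in a regular column (not the index)
events = pd.DataFrame({'timestamp': pd.date_range(start='1/1/2020', freq='min', periods=7),
                       'clicks': [1, 2, 3, 4, 5, 6, 7]})

# group by 5-minute intervals using the 'timestamp' column directly
by_column = events.resample('5min', on='timestamp').sum()

# alternative: move the column into the index first, then resample
by_index = events.set_index('timestamp').resample('5min').sum()

print(by_column)
```

Both approaches should give the same totals, so the choice mostly depends on whether the rest of your code needs the dates as an index.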

The following example shows how to use this syntax in practice.

Related: An Introduction to resample() in pandas

Example: How to Group by 5-Minute Intervals in Pandas

Suppose we have the following pandas DataFrame that shows the sales made by some company on various dates and times:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'date': pd.date_range(start='1/1/2020', freq='min', periods=12),
                   'sales': [6, 8, 9, 11, 13, 8, 8, 15, 22, 9, 8, 4],
                   'returns': [0, 3, 2, 2, 1, 3, 2, 4, 1, 5, 3, 2]})

#set 'date' column as index
df = df.set_index('date')

#view DataFrame
print(df)

                     sales  returns
date                               
2020-01-01 00:00:00      6        0
2020-01-01 00:01:00      8        3
2020-01-01 00:02:00      9        2
2020-01-01 00:03:00     11        2
2020-01-01 00:04:00     13        1
2020-01-01 00:05:00      8        3
2020-01-01 00:06:00      8        2
2020-01-01 00:07:00     15        4
2020-01-01 00:08:00     22        1
2020-01-01 00:09:00      9        5
2020-01-01 00:10:00      8        3
2020-01-01 00:11:00      4        2

Related: How to Create a Date Range in Pandas

We can use the following syntax to calculate the sum of sales and the sum of returns, grouped by 5-minute intervals:

#calculate sum of sales and returns grouped by 5-minute intervals
df.resample('5min').sum()

                     sales  returns
date                               
2020-01-01 00:00:00     47        8
2020-01-01 00:05:00     62       15
2020-01-01 00:10:00     12        5

Here’s how to interpret the output:

  • Total sales during minutes 0-4 were 47 and total returns were 8.
  • Total sales during minutes 5-9 were 62 and total returns were 15.
  • Total sales during minutes 10-14 were 12 and total returns were 5. Only minutes 10 and 11 appear in the data, so this interval is partial.
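
To check which interval each row lands in, one approach is to floor every timestamp to its 5-minute boundary. The sketch below rebuilds the same DataFrame and adds a helper column named 'bin' (a name chosen here only for illustration). By default each 5-minute interval includes its start time and excludes its end time, so the row at 00:05:00 belongs to the second interval.

```python
import pandas as pd

# same data as the example DataFrame
df = pd.DataFrame({'date': pd.date_range(start='1/1/2020', freq='min', periods=12),
                   'sales': [6, 8, 9, 11, 13, 8, 8, 15, 22, 9, 8, 4],
                   'returns': [0, 3, 2, 2, 1, 3, 2, 4, 1, 5, 3, 2]}).set_index('date')

# label each row with the start of its 5-minute interval
labeled = df.assign(bin=df.index.floor('5min'))

# grouping on that label reproduces the totals from resample('5min').sum()
totals = labeled.groupby('bin')[['sales', 'returns']].sum()

print(labeled)
print(totals)
```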

We can use similar syntax to calculate the max of the sales values and the max of the returns values, grouped by 5-minute intervals:

#calculate max of sales and max of returns grouped by 5-minute intervals
df.resample('5min').max()

                     sales  returns
date                               
2020-01-01 00:00:00     13        3
2020-01-01 00:05:00     22        5
2020-01-01 00:10:00      8        3

We can use similar syntax to calculate any other aggregate we’d like, such as the mean, min, or count, grouped by 5-minute intervals.
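
When each column needs a different aggregate, one possible approach is to pass a dictionary to agg(). The sketch below rebuilds the same DataFrame and computes the mean of sales alongside the max of returns; the variable name summary is chosen here only for illustration.

```python
import pandas as pd

# same data as the example DataFrame
df = pd.DataFrame({'date': pd.date_range(start='1/1/2020', freq='min', periods=12),
                   'sales': [6, 8, 9, 11, 13, 8, 8, 15, 22, 9, 8, 4],
                   'returns': [0, 3, 2, 2, 1, 3, 2, 4, 1, 5, 3, 2]}).set_index('date')

# apply a different aggregation to each column, grouped by 5-minute intervals
summary = df.resample('5min').agg({'sales': 'mean', 'returns': 'max'})

print(summary)
```

Note that the last interval holds only two rows, so its mean is taken over two values rather than five.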

Additional Resources

The following tutorials explain how to perform other common operations in pandas:

How to Group by Day in Pandas
How to Group by Week in Pandas
How to Group by Month in Pandas
How to Group by Quarter in Pandas
