17.8 C
London
Sunday, June 22, 2025
HomePandas in PythonGeneral Functions in PythonHow to Group Data by Hour in Pandas (With Example)

How to Group Data by Hour in Pandas (With Example)

Related stories

Learn About Opening an Automobile Repair Shop in India

Starting a car repair shop is quite a good...

Unlocking the Power: Embracing the Benefits of Tax-Free Investing

  Unlocking the Power: Embracing the Benefits of Tax-Free Investing For...

Income Splitting in Canada for 2023

  Income Splitting in Canada for 2023 The federal government’s expanded...

Can I Deduct Home Office Expenses on my Tax Return 2023?

Can I Deduct Home Office Expenses on my Tax...

Canadian Tax – Personal Tax Deadline 2022

  Canadian Tax – Personal Tax Deadline 2022 Resources and Tools...

You can use the following syntax to group data by hour and perform some aggregation in pandas:

df.groupby([df['time'].dt.hour]).sales.sum()

This particular example groups the values by hour in a column called time and then calculates the sum of values in the sales column for each hour.

The following example shows how to use this syntax in practice.

Example: Group Data by Hour in Pandas

Suppose we have the following pandas DataFrame that shows the number of sales made at various times throughout the day for some store:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'time': ['2022-01-01 01:14:00', '2022-01-01 01:24:15',
                            '2022-01-01 02:52:19', '2022-01-01 02:54:00',
                            '2022-01-01 04:05:10', '2022-01-01 05:35:09'],
                   'sales': [18, 20, 15, 14, 10, 9]})

#convert date column to datetime
df['time'] = pd.to_datetime(df['time'])

#view DataFrame
print(df)

                 time  sales
0 2022-01-01 01:14:00     18
1 2022-01-01 01:24:15     20
2 2022-01-01 02:52:19     15
3 2022-01-01 02:54:00     14
4 2022-01-01 04:05:10     10
5 2022-01-01 05:35:09      9

We can use the following syntax to group the time column by hours and calculate the sum of sales for each hour:

#group by hours in time column and calculate sum of sales
df.groupby([df['time'].dt.hour]).sales.sum()

time
1    38
2    29
4    10
5     9
Name: sales, dtype: int64

From the output we can see:

  • A total of 38 sales were made during the first hour.
  • A total of 29 sales were made during the second hour.
  • A total of 10sales were made during the fourth hour.
  • A total of 9 sales were made during the fifth hour.

Note that we can also perform some other aggregation.

For example, we could calculate the mean number of sales per hour:

#group by hours in time column and calculate mean of sales
df.groupby([df['time'].dt.hour]).sales.mean()

time
1    19.0
2    14.5
4    10.0
5     9.0
Name: sales, dtype: float64

We can also group by hours and minutes if we’d like.

For example, the following code shows how to calculate the sum of sales, grouped by hours and minutes:

#group by hours and minutes in time column and calculate mean of sales
df.groupby([df['time'].dt.hour, df['time'].dt.minute]).sales.mean()

time  time
1     14      18
      24      20
2     52      15
      54      14
4     5       10
5     35       9
Name: sales, dtype: int64

From the output we can see:

  • The mean number of sales during 1:14 was 18.
  • The mean number of sales during 1:23 was 20.
  • The mean number of sales during 2:52 was 15.

And so on.

Additional Resources

The following tutorials explain how to perform other common operations in pandas:

How to Create a Date Range in Pandas
How to Extract Month from Date in Pandas
How to Convert Timestamp to Datetime in Pandas

Subscribe

- Never miss a story with notifications

- Gain full access to our premium content

- Browse free from up to 5 devices at once

Latest stories