Friday, December 20, 2024

Pandas: How to Reshape DataFrame from Wide to Long

You can use the following basic syntax to convert a pandas DataFrame from a wide format to a long format:

df = pd.melt(df, id_vars='col1', value_vars=['col2', 'col3', ...])

Here, col1 is the identifier column, whose values are repeated on every output row, and col2, col3, etc. are the columns we unpivot into rows.

The following example shows how to use this syntax in practice.

Example: Reshape Pandas DataFrame from Wide to Long

Suppose we have the following pandas DataFrame:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'B', 'C', 'D'],
                   'points': [88, 91, 99, 94],
                   'assists': [12, 17, 24, 28],
                   'rebounds': [22, 28, 30, 31]})

#view DataFrame
df

	team	points	assists	rebounds
0	A	88	12	22
1	B	91	17	28
2	C	99	24	30
3	D	94	28	31

We can use the following syntax to reshape this DataFrame from a wide format to a long format:

#reshape DataFrame from wide format to long format
#(store the result under a new name so the original wide df stays available)
df_long = pd.melt(df, id_vars='team', value_vars=['points', 'assists', 'rebounds'])

#view long DataFrame
df_long

	team	variable	value
0	A	points	88
1	B	points	91
2	C	points	99
3	D	points	94
4	A	assists	12
5	B	assists	17
6	C	assists	24
7	D	assists	28
8	A	rebounds	22
9	B	rebounds	28
10	C	rebounds	30
11	D	rebounds	31

The DataFrame is now in a long format.

We used the ‘team’ column as the identifier column and we unpivoted the ‘points’, ‘assists’, and ‘rebounds’ columns.
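
As a quick sanity check on the result above, the sketch below rebuilds the same data and confirms that melting 4 rows across 3 value columns yields 4 × 3 = 12 rows. It also illustrates that when value_vars is omitted, melt() unpivots every column not listed in id_vars, so for this DataFrame the two calls should agree. The names wide, long_explicit, and long_default are illustrative only and do not appear in the original example.

```python
import pandas as pd

# same data as the example above ('wide' is an illustrative name)
wide = pd.DataFrame({'team': ['A', 'B', 'C', 'D'],
                     'points': [88, 91, 99, 94],
                     'assists': [12, 17, 24, 28],
                     'rebounds': [22, 28, 30, 31]})

# unpivot the three named columns
long_explicit = pd.melt(wide, id_vars='team',
                        value_vars=['points', 'assists', 'rebounds'])

# omit value_vars: every column other than 'team' is unpivoted
long_default = pd.melt(wide, id_vars='team')

# one output row per (team, unpivoted column) pair
print(long_explicit.shape)  # (12, 3)

# compare the two results
print(long_explicit.equals(long_default))
```

Listing value_vars explicitly is still the safer choice when the DataFrame has extra columns that should not be unpivoted.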

Note that we can also use the var_name and value_name arguments to specify the names of the columns in the new long DataFrame:

#reshape the original wide DataFrame and name the new columns
df_long = pd.melt(df, id_vars='team', value_vars=['points', 'assists', 'rebounds'],
                  var_name='metric', value_name='amount')

#view long DataFrame
df_long

	team	metric	amount
0	A	points	88
1	B	points	91
2	C	points	99
3	D	points	94
4	A	assists	12
5	B	assists	17
6	C	assists	24
7	D	assists	28
8	A	rebounds	22
9	B	rebounds	28
10	C	rebounds	30
11	D	rebounds	31
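
One way to confirm that no values were lost in the reshape is to pivot the long DataFrame back to wide and compare it with the input. The following is a minimal sketch under that assumption, not part of the original example. It uses DataFrame.pivot(), and the names wide, long_df, and restored are illustrative.

```python
import pandas as pd

# same data as the example above ('wide' is an illustrative name)
wide = pd.DataFrame({'team': ['A', 'B', 'C', 'D'],
                     'points': [88, 91, 99, 94],
                     'assists': [12, 17, 24, 28],
                     'rebounds': [22, 28, 30, 31]})

# wide -> long, with custom column names
long_df = pd.melt(wide, id_vars='team',
                  value_vars=['points', 'assists', 'rebounds'],
                  var_name='metric', value_name='amount')

# long -> wide: one row per team, one column per metric
restored = (long_df.pivot(index='team', columns='metric', values='amount')
                   .reset_index()
                   .rename_axis(columns=None))

# pivot() sorts the new columns alphabetically, so restore the original order
restored = restored[['team', 'points', 'assists', 'rebounds']]

print(restored)
```

This round trip relies on each (team, metric) pair appearing only once; pivot() raises an error if the long data contains duplicate pairs.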

Note: The complete documentation for the pandas melt() function is available in the official pandas documentation.

Additional Resources

The following tutorials explain how to perform other common operations in pandas:

How to Add Rows to a Pandas DataFrame
How to Add Columns to a Pandas DataFrame
How to Count Occurrences of Specific Values in Pandas DataFrame
