Pandas: How to Split a Column of Lists into Multiple Columns

You can use the following basic syntax to split a column of lists into multiple columns in a pandas DataFrame:

#split column of lists into two new columns
split = pd.DataFrame(df['my_column'].to_list(), columns = ['new1', 'new2'])

#join split columns back to original DataFrame
df = pd.concat([df, split], axis=1) 

The following example shows how to use this syntax in practice.

Example: Split Column of Lists into Multiple Columns in Pandas

Suppose we have the following pandas DataFrame in which the column called points contains lists of values:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['Mavs', 'Heat', 'Kings', 'Suns'],
                   'points': [[99, 105], [94, 113], [99, 97], [87, 95]]})

#view DataFrame
print(df)

    team     points
0   Mavs  [99, 105]
1   Heat  [94, 113]
2  Kings   [99, 97]
3   Suns   [87, 95]

We can use the following syntax to create a new DataFrame in which the points column is split into two new columns called game1 and game2:

#split points column into two new columns called game1 and game2
split = pd.DataFrame(df['points'].to_list(), columns = ['game1', 'game2'])

#view DataFrame
print(split)

   game1  game2
0     99    105
1     94    113
2     99     97
3     87     95

If we’d like, we can then join this split DataFrame back with the original DataFrame by using the concat() function:

#join split columns back to original DataFrame
df = pd.concat([df, split], axis=1) 

#view updated DataFrame
print(df)

    team     points  game1  game2
0   Mavs  [99, 105]     99    105
1   Heat  [94, 113]     94    113
2  Kings   [99, 97]     99     97
3   Suns   [87, 95]     87     95
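
One detail worth keeping in mind: concat() with axis=1 lines rows up by index label, not by position. The example above works because both DataFrames use the default 0, 1, 2, 3 index. The sketch below assumes a filtered DataFrame (the name subset is only illustrative) and passes index=subset.index when building the split DataFrame so the labels still match:

```python
import pandas as pd

df = pd.DataFrame({'team': ['Mavs', 'Heat', 'Kings', 'Suns'],
                   'points': [[99, 105], [94, 113], [99, 97], [87, 95]]})

# filtering leaves a non-default index (labels 1 and 3)
subset = df[df['team'].isin(['Heat', 'Suns'])]

# reuse the filtered index so the split rows carry the same labels
split = pd.DataFrame(subset['points'].to_list(),
                     columns=['game1', 'game2'],
                     index=subset.index)

# rows now align label by label, with no NaN-filled extras
result = pd.concat([subset, split], axis=1)
print(result)
```

Without the index argument, split would be labelled 0 and 1, and the concatenated result would contain extra rows padded with NaN.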

Lastly, we can drop the original points column from the DataFrame if we’d like:

#drop original points column
df = df.drop('points', axis=1)

#view updated DataFrame
print(df)

    team  game1  game2
0   Mavs     99    105
1   Heat     94    113
2  Kings     99     97
3   Suns     87     95

The end result is a DataFrame in which the original points column of lists is now split into two new columns called game1 and game2.
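
If this pattern comes up often, the three steps (split, join, drop) could be wrapped in a small function. The following is only a sketch: split_list_column is a hypothetical helper name, not a pandas function, and it assumes every list in the column has the same length as new_names.

```python
import pandas as pd

def split_list_column(df, column, new_names):
    """Hypothetical helper: replace a column of lists with one column per element."""
    # build the new columns, keeping the original index so rows stay aligned
    split = pd.DataFrame(df[column].to_list(), columns=new_names, index=df.index)
    # drop the list column and append the new columns on the right
    return pd.concat([df.drop(columns=column), split], axis=1)

df = pd.DataFrame({'team': ['Mavs', 'Heat', 'Kings', 'Suns'],
                   'points': [[99, 105], [94, 113], [99, 97], [87, 95]]})

result = split_list_column(df, 'points', ['game1', 'game2'])
print(result)
```

The function returns a new DataFrame and leaves the one passed in unchanged.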

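Note: If the lists in your column have different lengths, pandas fills the missing positions with NaN when splitting the lists into columns. In that case, the number of names passed to the columns argument must match the length of the longest list, otherwise pandas raises an error.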
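
To illustrate the note above, here is a minimal sketch with lists of different lengths. The data is made up for illustration. The columns argument is left out so that pandas sizes the result to the longest list, and the column names are generated afterwards:

```python
import pandas as pd

# made-up data: the lists have 3, 2, and 1 values
df = pd.DataFrame({'team': ['Mavs', 'Heat', 'Kings'],
                   'points': [[99, 105, 101], [94, 113], [99]]})

# omit columns= so the width follows the longest list
split = pd.DataFrame(df['points'].to_list(), index=df.index)

# name the columns game1, game2, ... based on the resulting width
split.columns = [f'game{i + 1}' for i in range(split.shape[1])]

print(split)
```

The positions that have no value show up as NaN, and any column containing NaN is stored as float rather than int.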

Additional Resources

The following tutorials explain how to perform other common operations in pandas:

How to Print Pandas DataFrame with No Index
How to Show All Rows of a Pandas DataFrame
How to Check dtype for All Columns in Pandas DataFrame
