
Pandas: How to Calculate Standard Deviation for Each Row

You can use the following basic syntax to calculate the standard deviation of values for each row in a pandas DataFrame:

df.std(axis=1, numeric_only=True)

The argument axis=1 tells pandas to perform the calculation across each row (instead of down each column), and numeric_only=True tells pandas to skip non-numeric columns, such as columns of strings, when performing the calculation.
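To make the effect of the axis argument concrete, here is a minimal sketch on a made-up two-row DataFrame. The names demo, label, x and y are hypothetical and chosen only for illustration; they are not part of the example that follows.

```python
import pandas as pd

# hypothetical toy data, used only to contrast the two axes
demo = pd.DataFrame({'label': ['a', 'b'],
                     'x': [1, 3],
                     'y': [5, 11]})

# default (axis=0): one standard deviation per numeric column
col_std = demo.std(numeric_only=True)

# axis=1: one standard deviation per row, across the numeric columns
row_std = demo.std(axis=1, numeric_only=True)

print(col_std)
print(row_std)
```

The first result is indexed by column name (x and y), while the second is indexed by row label (0 and 1). In both cases the string column label is left out of the calculation.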

The following example shows how to use this syntax in practice.

Example: Calculate Standard Deviation for Each Row in Pandas

Suppose we have the following pandas DataFrame that contains information about the points scored by various basketball players during four different games:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'player': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
                   'game1': [18, 22, 19, 14, 14, 11, 20, 28],
                   'game2': [5, 7, 7, 9, 12, 9, 9, 4],
                   'game3': [11, 8, 10, 6, 6, 5, 9, 12],
                   'game4': [9, 8, 8, 9, 14, 15, 10, 11]})
                   
#view DataFrame
print(df)

  player  game1  game2  game3  game4
0      A     18      5     11      9
1      B     22      7      8      8
2      C     19      7     10      8
3      D     14      9      6      9
4      E     14     12      6     14
5      F     11      9      5     15
6      G     20      9      9     10
7      H     28      4     12     11

We can use the following syntax to calculate the standard deviation of points scored by each player:

#calculate standard deviation for each row
df.std(axis=1, numeric_only=True)

0     5.439056
1     7.182154
2     5.477226
3     3.316625
4     3.785939
5     4.163332
6     5.354126
7    10.144785
dtype: float64

Here’s how to interpret the output:

  • The standard deviation of points scored by player A is 5.439.
  • The standard deviation of points scored by player B is 7.182.
  • The standard deviation of points scored by player C is 5.477.

And so on.

Note that the std() function calculates the sample standard deviation by default (ddof=1, which divides the sum of squared deviations by n - 1 instead of n).
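As a sanity check, the sample standard deviation for player A can be reproduced by hand with plain Python. This is only a sketch of the textbook formula, using the four scores from the first row of the DataFrame; the variable names are illustrative.

```python
import math

# points scored by player A across the four games
points = [18, 5, 11, 9]

n = len(points)
mean = sum(points) / n

# sum of squared deviations from the row mean
squared_dev = sum((p - mean) ** 2 for p in points)

# sample standard deviation: divide by n - 1 (the pandas default, ddof=1)
sample_std = math.sqrt(squared_dev / (n - 1))

print(round(sample_std, 6))  # 5.439056
```

The result matches the value pandas reports for row 0.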

If you would instead like to calculate the population standard deviation, you must use the argument ddof=0:

#calculate population standard deviation for each row
df.std(axis=1, ddof=0, numeric_only=True)

0    4.710361
1    6.219928
2    4.743416
3    2.872281
4    3.278719
5    3.605551
6    4.636809
7    8.785642
dtype: float64
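One practical reason to be explicit about ddof: NumPy's std() defaults to ddof=0, the opposite of pandas. The sketch below, which rebuilds the first three rows of the example data so it can run on its own, checks that the two libraries agree once ddof is set the same way. The name game_cols is an illustrative helper, not something pandas requires.

```python
import numpy as np
import pandas as pd

# first three players from the example data
df = pd.DataFrame({'player': ['A', 'B', 'C'],
                   'game1': [18, 22, 19],
                   'game2': [5, 7, 7],
                   'game3': [11, 8, 10],
                   'game4': [9, 8, 8]})

game_cols = ['game1', 'game2', 'game3', 'game4']
values = df[game_cols].to_numpy()

# population standard deviation: pandas needs ddof=0, NumPy uses it by default
pandas_pop = df[game_cols].std(axis=1, ddof=0)
numpy_pop = np.std(values, axis=1)

# sample standard deviation: pandas uses ddof=1 by default, NumPy needs it set
pandas_sample = df[game_cols].std(axis=1)
numpy_sample = np.std(values, axis=1, ddof=1)

print(np.allclose(pandas_pop, numpy_pop))        # True
print(np.allclose(pandas_sample, numpy_sample))  # True
```

For a row of n values, the population figure is always the sample figure multiplied by sqrt((n - 1) / n), so with more than one value per row it can never be larger than the sample figure.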


To assign the standard deviation values to a new column, you can use the following syntax:

#add new column to display standard deviation for each row
df['points_std'] = df.std(axis=1, numeric_only=True)

#view updated DataFrame
print(df)

  player  game1  game2  game3  game4  points_std
0      A     18      5     11      9    5.439056
1      B     22      7      8      8    7.182154
2      C     19      7     10      8    5.477226
3      D     14      9      6      9    3.316625
4      E     14     12      6     14    3.785939
5      F     11      9      5     15    4.163332
6      G     20      9      9     10    5.354126
7      H     28      4     12     11   10.144785

The standard deviation of values for each row in the game1, game2, game3 and game4 columns is now shown in the points_std column.
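One caveat: once points_std exists, it is itself a numeric column, so running df.std(axis=1, numeric_only=True) a second time would include it in the calculation. A possible way to avoid this, sketched below on the first two rows of the example data, is to select the game columns explicitly. The list name game_cols is an assumption made for illustration.

```python
import pandas as pd

# first two players from the example data
df = pd.DataFrame({'player': ['A', 'B'],
                   'game1': [18, 22],
                   'game2': [5, 7],
                   'game3': [11, 8],
                   'game4': [9, 8]})

# name the columns that should feed the calculation
game_cols = ['game1', 'game2', 'game3', 'game4']

# compute the row-wise standard deviation from those columns only
df['points_std'] = df[game_cols].std(axis=1)
first_run = df['points_std'].copy()

# re-running the same line gives the same result, because points_std
# is not part of game_cols
df['points_std'] = df[game_cols].std(axis=1)

print(df)
```

Selecting the columns by name also makes numeric_only=True unnecessary here, since the string column player is never passed to std().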
