You will often want to calculate the mean of one or more columns in a pandas DataFrame. You can do this with the mean() function, which is available on both a single column (a Series) and a whole DataFrame.
This tutorial shows several examples of how to use this function.
Example 1: Find the Mean of a Single Column
Suppose we have the following pandas DataFrame:
import pandas as pd
import numpy as np

#create DataFrame
df = pd.DataFrame({'player': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'],
                   'points': [25, 20, 14, 16, 27, 20, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 8, 5, 7, 6, 9, 9, 5],
                   'rebounds': [np.nan, 8, 10, 6, 6, 9, 6, 10, 10, 7]})

#view DataFrame
df

  player  points  assists  rebounds
0      A      25        5       NaN
1      B      20        7       8.0
2      C      14        7      10.0
3      D      16        8       6.0
4      E      27        5       6.0
5      F      20        7       9.0
6      G      12        6       6.0
7      H      15        9      10.0
8      I      14        9      10.0
9      J      19        5       7.0
We can find the mean of the column titled “points” by using the following syntax:
df['points'].mean()
18.2
The mean() function excludes missing values (NaN) by default, because its skipna argument defaults to True. For example, if we find the mean of the “rebounds” column, the first value of “NaN” is left out and the mean is calculated from the remaining nine values:
df['rebounds'].mean()
8.0
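To see the effect of the default, the short sketch below compares it with skipna=False on the same “rebounds” values. The standalone Series and the variable names mean_skip and mean_strict are only illustrative and are not part of the original example.

```python
import numpy as np
import pandas as pd

# Same 'rebounds' values as in the DataFrame above (illustrative standalone Series)
rebounds = pd.Series([np.nan, 8, 10, 6, 6, 9, 6, 10, 10, 7], name='rebounds')

# Default behavior (skipna=True): the NaN is ignored, 72 / 9 values
mean_skip = rebounds.mean()

# skipna=False: a single NaN makes the whole result NaN
mean_strict = rebounds.mean(skipna=False)

print(mean_skip)    # 8.0
print(mean_strict)  # nan
```

Passing skipna=False can be useful when a missing value should be flagged rather than silently ignored.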
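If you attempt to find the mean of a column that is not numeric, such as the “player” column of strings, you will receive an error (the exact wording of the message varies between pandas versions):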
df['player'].mean()
TypeError: Could not convert ABCDEFGHIJ to numeric
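One possible way to avoid this error is to check the column's dtype before calling mean(). The sketch below uses is_numeric_dtype from pandas.api.types; the helper name safe_mean is hypothetical and is introduced here only for illustration.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

df = pd.DataFrame({'player': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'],
                   'points': [25, 20, 14, 16, 27, 20, 12, 15, 14, 19]})

def safe_mean(frame, column):
    """Hypothetical helper: return the column mean, or None if the column is not numeric."""
    if is_numeric_dtype(frame[column]):
        return frame[column].mean()
    return None

points_mean = safe_mean(df, 'points')   # numeric column, mean is computed
player_mean = safe_mean(df, 'player')   # string column, returns None instead of raising

print(points_mean)
print(player_mean)
```

Returning None is just one design choice; raising a clearer error or skipping the column would work equally well, depending on the use case.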
Example 2: Find the Mean of Multiple Columns
We can find the mean of multiple columns by using the following syntax:
#find mean of points and rebounds columns
df[['rebounds', 'points']].mean()

rebounds     8.0
points      18.2
dtype: float64
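The result of calling mean() on several columns is itself a pandas Series indexed by column name, so individual means can be looked up by label. A minimal sketch, assuming the same “points” and “rebounds” data as above (the variable names col_means and points_mean are illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'points': [25, 20, 14, 16, 27, 20, 12, 15, 14, 19],
                   'rebounds': [np.nan, 8, 10, 6, 6, 9, 6, 10, 10, 7]})

# Mean of each selected column, returned as a Series in the order requested
col_means = df[['rebounds', 'points']].mean()

# Look up a single mean by its column name
points_mean = col_means['points']

print(list(col_means.index))  # ['rebounds', 'points']
print(points_mean)
```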
Example 3: Find the Mean of All Columns
We can also find the mean of all numeric columns by using the following syntax:
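#find mean of all numeric columns in DataFrame
df.mean(numeric_only=True)

points      18.2
assists      6.8
rebounds     8.0
dtype: float64

Note that numeric_only=True tells the mean() function to skip over the columns that are not numeric. In pandas 2.0 and later, calling df.mean() without it on a DataFrame that contains a non-numeric column such as “player” raises a TypeError; older versions skipped such columns automatically.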
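Another way to restrict the calculation to numeric columns is to select them explicitly with select_dtypes() before calling mean(). The sketch below assumes the same DataFrame as above; the variable name numeric_means is illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'player': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'],
                   'points': [25, 20, 14, 16, 27, 20, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 8, 5, 7, 6, 9, 9, 5],
                   'rebounds': [np.nan, 8, 10, 6, 6, 9, 6, 10, 10, 7]})

# Keep only the numeric columns, then take the mean of each one
numeric_means = df.select_dtypes(include='number').mean()

print(numeric_means)
```

Because the non-numeric “player” column is dropped before mean() is called, this approach does not depend on how a given pandas version treats non-numeric columns.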
Additional Resources
How to Calculate the Median in Pandas
How to Calculate the Sum of Columns in Pandas
How to Find the Max Value of Columns in Pandas