You can use the NumPy where() function to quickly update the values in a NumPy array using if-else logic.
For example, the following code shows how to update the values in a NumPy array that meet a certain condition:
import numpy as np

#create NumPy array of values
x = np.array([1, 3, 3, 6, 7, 9])

#update values in array based on condition
x = np.where((x < 5) | (x > 8), x/2, x)

#view updated array
x

array([0.5, 1.5, 1.5, 6. , 7. , 4.5])
If a given value in the array was less than 5 or greater than 8, we divided the value by 2.
Else, we left the value unchanged.
We can perform a similar operation in a pandas DataFrame by using the pandas where() function, but the syntax is slightly different.
Here’s the basic syntax using the NumPy where() function:
x = np.where(condition, value_if_true, value_if_false)
And here’s the basic syntax using the pandas where() function:
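df['col'] = (value_if_true).where(condition, value_if_false)
Unlike np.where(), the pandas where() function keeps the values of the object it is called on wherever the condition is True and substitutes the second argument wherever the condition is False.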
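To make the argument order concrete, here is a minimal sketch that applies both forms to the same data and checks that they agree. The Series name `s`, the helper name `cond`, and the sample values are illustrative assumptions chosen for this sketch only.

```python
import numpy as np
import pandas as pd

# illustrative data (hypothetical, used only in this sketch)
s = pd.Series([1, 3, 3, 6, 7, 9])

# condition: value is less than 5 or greater than 8
cond = (s < 5) | (s > 8)

# NumPy form: np.where(condition, value_if_true, value_if_false)
numpy_result = np.where(cond, s / 2, s)

# pandas form: the calling Series supplies the values kept where the
# condition is True; the second argument is used where it is False
pandas_result = (s / 2).where(cond, s)

print(numpy_result)
print(pandas_result.to_numpy())
```

Both print statements should show the same values. The practical difference is the return type: np.where() returns a NumPy array, while the pandas where() function returns a Series that keeps the original index.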
The following example shows how to use the pandas where() function in practice.
Example: The Equivalent of np.where() in Pandas
Suppose we have the following pandas DataFrame:
import pandas as pd
#create DataFrame
df = pd.DataFrame({'A': [18, 22, 19, 14, 14, 11, 20, 28],
                   'B': [5, 7, 7, 9, 12, 9, 9, 4]})
#view DataFrame
print(df)
    A   B
0  18   5
1  22   7
2  19   7
3  14   9
4  14  12
5  11   9
6  20   9
7  28   4
We can use the following pandas where() function to update the values in column A based on a specific condition:
#update values in column A based on condition
df['A'] = (df['A'] / 2).where(df['A'] < 20, df['A'] * 2)
#view updated DataFrame
print(df)
      A   B
0   9.0   5
1  44.0   7
2   9.5   7
3   7.0   9
4   7.0  12
5   5.5   9
6  40.0   9
7  56.0   4
If a given value in column A was less than 20, we divided the value by 2.
Else, we multiplied the value by 2.
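As a cross-check, the following sketch recomputes column A from the original DataFrame with both the pandas where() function and np.where() and compares the results. The variable names `with_pandas` and `with_numpy` are assumptions introduced for illustration.

```python
import numpy as np
import pandas as pd

# recreate the original DataFrame from the example
df = pd.DataFrame({'A': [18, 22, 19, 14, 14, 11, 20, 28],
                   'B': [5, 7, 7, 9, 12, 9, 9, 4]})

# pandas where(): keep A / 2 where A < 20, otherwise use A * 2
with_pandas = (df['A'] / 2).where(df['A'] < 20, df['A'] * 2)

# NumPy where(): same condition, same two branches
with_numpy = np.where(df['A'] < 20, df['A'] / 2, df['A'] * 2)

# compare the two results element by element
print(np.array_equal(with_pandas.to_numpy(), with_numpy))
```

If the two approaches are equivalent for this data, the comparison prints True. Either result can be assigned back to df['A'].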
Additional Resources
The following tutorials explain how to perform other common operations in pandas:
Pandas: How to Count Values in Column with Condition
Pandas: How to Drop Rows in DataFrame Based on Condition
Pandas: How to Replace Values in Column Based on Condition