Often you may want to insert a new column at a specific position in a pandas DataFrame. Fortunately this is easy to do using the DataFrame insert() method, which uses the following syntax:
insert(loc, column, value, allow_duplicates=False)
where:
- loc: Integer position at which to insert the column. The first column is position 0, and loc must be between 0 and the current number of columns.
- column: Name to give to new column.
- value: Values for the new column. Can be a scalar, a Series, or an array-like object.
- allow_duplicates: Whether or not to allow new column name to match existing column name. Default is False.
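Two behaviors of these parameters are easy to miss: insert() modifies the DataFrame in place and returns None, and reusing an existing column name raises a ValueError unless allow_duplicates=True is passed. The following is a minimal sketch of both, using a small made-up DataFrame; the variable names result, duplicate_error, and columns_after are only for illustration.

```python
import pandas as pd

# small illustrative DataFrame (values chosen only for this sketch)
df = pd.DataFrame({'points': [25, 12, 15],
                   'assists': [5, 7, 7]})

# insert() changes df in place, so the return value is None
result = df.insert(loc=0, column='player', value=['A', 'B', 'C'])

# by default, inserting a column name that already exists raises ValueError
try:
    df.insert(loc=1, column='points', value=[1, 2, 3])
    duplicate_error = None
except ValueError as err:
    duplicate_error = err

# allow_duplicates=True permits the repeated column name
df.insert(loc=1, column='points', value=[1, 2, 3], allow_duplicates=True)
columns_after = list(df.columns)
```

Because insert() returns None, writing df = df.insert(...) would replace the DataFrame with None, so the call is best left as a standalone statement.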
This tutorial shows several examples of how to use this function in practice.
Example 1: Insert New Column as First Column
The following code shows how to insert a new column as the first column of an existing DataFrame:
import pandas as pd

#create DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 9, 12],
                   'rebounds': [11, 8, 10, 6, 6]})

#view DataFrame
df

   points  assists  rebounds
0      25        5        11
1      12        7         8
2      15        7        10
3      14        9         6
4      19       12         6

#insert new column 'player' as first column
player_vals = ['A', 'B', 'C', 'D', 'E']
df.insert(loc=0, column='player', value=player_vals)
df

   player  points  assists  rebounds
0       A      25        5        11
1       B      12        7         8
2       C      15        7        10
3       D      14        9         6
4       E      19       12         6
Example 2: Insert New Column as a Middle Column
The following code shows how to insert a new column as the third column of an existing DataFrame:
import pandas as pd

#create DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 9, 12],
                   'rebounds': [11, 8, 10, 6, 6]})

#insert new column 'player' as third column
player_vals = ['A', 'B', 'C', 'D', 'E']
df.insert(loc=2, column='player', value=player_vals)
df

   points  assists  player  rebounds
0      25        5       A        11
1      12        7       B         8
2      15        7       C        10
3      14        9       D         6
4      19       12       E         6
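In practice you may know which existing column the new one should follow, rather than its numeric position. One possible approach, sketched here as an assumption rather than as part of the original example, is to look up the position with df.columns.get_loc(), which returns an integer when the column name is unique. The variable name assists_pos is only for illustration.

```python
import pandas as pd

#create DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 9, 12],
                   'rebounds': [11, 8, 10, 6, 6]})

#find the integer position of the existing 'assists' column
assists_pos = df.columns.get_loc('assists')

#insert new column 'player' immediately after 'assists'
player_vals = ['A', 'B', 'C', 'D', 'E']
df.insert(loc=assists_pos + 1, column='player', value=player_vals)
```

This keeps the code correct even if columns are later added or reordered ahead of 'assists'.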
Example 3: Insert New Column as Last Column
The following code shows how to insert a new column as the last column of an existing DataFrame:
import pandas as pd

#create DataFrame
df = pd.DataFrame({'points': [25, 12, 15, 14, 19],
                   'assists': [5, 7, 7, 9, 12],
                   'rebounds': [11, 8, 10, 6, 6]})

#insert new column 'player' as last column
player_vals = ['A', 'B', 'C', 'D', 'E']
df.insert(loc=len(df.columns), column='player', value=player_vals)
df

   points  assists  rebounds  player
0      25        5        11       A
1      12        7         8       B
2      15        7        10       C
3      14        9         6       D
4      19       12         6       E
Note that using len(df.columns) allows you to insert a new column as the last column in any DataFrame, no matter how many columns it has.
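To illustrate that len(df.columns) works regardless of width, the sketch below wraps the call in a small helper and applies it to two DataFrames with different numbers of columns. The helper name append_column, the 'steals' column, and the data values are hypothetical and not part of the examples above.

```python
import pandas as pd

def append_column(df, name, values):
    """Hypothetical helper: insert a column at the last position, in place."""
    df.insert(loc=len(df.columns), column=name, value=values)

#DataFrame with two columns
two_cols = pd.DataFrame({'points': [25, 12],
                         'assists': [5, 7]})

#DataFrame with four columns ('steals' is made up for this sketch)
four_cols = pd.DataFrame({'points': [25, 12],
                          'assists': [5, 7],
                          'rebounds': [11, 8],
                          'steals': [2, 1]})

#the same call places 'player' last in both DataFrames
append_column(two_cols, 'player', ['A', 'B'])
append_column(four_cols, 'player', ['A', 'B'])
```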
You can find the complete documentation for the insert() function in the official pandas documentation for DataFrame.insert.