CSV (comma-separated value) files are one of the most common ways to store data.
Fortunately, the pandas function read_csv() makes it easy to read CSV files into Python and lets you control which columns, rows, header line, and delimiter are used.
This tutorial explains several ways to read CSV files into Python using the following CSV file named ‘data.csv’:
playerID,team,points
1,Lakers,26
2,Mavs,19
3,Bucks,24
4,Spurs,22
Example 1: Read CSV File into pandas DataFrame
The following code shows how to read the CSV file into a pandas DataFrame:
import pandas as pd

#import CSV file as DataFrame
df = pd.read_csv('data.csv')

#view DataFrame
df

   playerID    team  points
0         1  Lakers      26
1         2    Mavs      19
2         3   Bucks      24
3         4   Spurs      22
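To try this without first creating a file, the sketch below passes the same contents to read_csv() from an in-memory string. The io.StringIO wrapper and the csv_text variable are assumptions added for illustration and are not part of the original example.

```python
import io

import pandas as pd

# Same contents as the tutorial's data.csv, held in memory (an assumption
# made here so the sketch runs without a file on disk)
csv_text = """playerID,team,points
1,Lakers,26
2,Mavs,19
3,Bucks,24
4,Spurs,22
"""

# read_csv() accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)           # (4, 3)
print(list(df.columns))   # ['playerID', 'team', 'points']
```

The first line of the file is used as the header by default, so the DataFrame has four data rows and three columns.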
Example 2: Read Specific Columns from CSV File
The following code shows how to read only the columns titled ‘playerID’ and ‘points’ in the CSV file into a pandas DataFrame:
#import only specific columns from CSV file
df = pd.read_csv('data.csv', usecols=['playerID', 'points'])

#view DataFrame
df

   playerID  points
0         1      26
1         2      19
2         3      24
3         4      22
Alternatively, you can specify zero-based column indices to read into a pandas DataFrame:
#import only specific columns from CSV file
df = pd.read_csv('data.csv', usecols=[0, 1])

#view DataFrame
df

   playerID    team
0         1  Lakers
1         2    Mavs
2         3   Bucks
3         4   Spurs
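One detail worth knowing about usecols is that the order of the list is ignored: the columns come back in the order they appear in the file. The sketch below illustrates this with the same data held in an in-memory string; csv_text and df_reordered are illustrative names, not part of the original example.

```python
import io

import pandas as pd

# Same contents as data.csv, kept in memory for a self-contained sketch
csv_text = """playerID,team,points
1,Lakers,26
2,Mavs,19
3,Bucks,24
4,Spurs,22
"""

# The list order ('points' first) does not change the resulting column order
df = pd.read_csv(io.StringIO(csv_text), usecols=['points', 'playerID'])
print(list(df.columns))            # ['playerID', 'points']

# To get a different order, reorder the columns after reading
df_reordered = df[['points', 'playerID']]
print(list(df_reordered.columns))  # ['points', 'playerID']
```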
Example 3: Specify Header Row when Importing CSV File
In some cases, the header row might not be the first row in a CSV file.
For example, consider the following CSV file in which the header row actually appears in the second row:
random,data,values
playerID,team,points
1,Lakers,26
2,Mavs,19
3,Bucks,24
4,Spurs,22
To read this CSV file into a pandas DataFrame, we can specify header=1 (row numbers are zero-based, so 1 refers to the second row) as follows:
#import from CSV file and specify that header starts on second row
df = pd.read_csv('data.csv', header=1)

#view DataFrame
df

   playerID    team  points
0         1  Lakers      26
1         2    Mavs      19
2         3   Bucks      24
3         4   Spurs      22
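As a rough check of how header=1 behaves, the sketch below reads the same contents from an in-memory string and compares the result with skiprows=1, which drops the first line and then infers the header from the next one. For this particular file the two calls should produce the same DataFrame. The variable names are illustrative only.

```python
import io

import pandas as pd

# File contents with an extra line above the real header row
csv_text = """random,data,values
playerID,team,points
1,Lakers,26
2,Mavs,19
3,Bucks,24
4,Spurs,22
"""

# header=1: use the second line as column names and discard the line above it
df_header = pd.read_csv(io.StringIO(csv_text), header=1)

# skiprows=1: skip the first line, then treat the next line as the header
df_skip = pd.read_csv(io.StringIO(csv_text), skiprows=1)

print(list(df_header.columns))    # ['playerID', 'team', 'points']
print(df_header.equals(df_skip))  # True
```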
Example 4: Skip Rows when Importing CSV File
You can also easily skip rows when importing a CSV file by using the skiprows argument.
For example, the following code shows how to skip the second row of the file (the first data row, since the header is row 0) when importing the CSV file:
#import from CSV file and skip second row
df = pd.read_csv('data.csv', skiprows=[1])

#view DataFrame
df

   playerID   team  points
0         2   Mavs      19
1         3  Bucks      24
2         4  Spurs      22
And the following code shows how to skip the second and third rows when importing the CSV file:
#import from CSV file and skip second and third rows
df = pd.read_csv('data.csv', skiprows=[1, 2])

#view DataFrame
df

   playerID   team  points
0         3  Bucks      24
1         4  Spurs      22
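A common source of confusion is the difference between passing a list and passing an integer to skiprows. A list such as [1] names the zero-based line numbers to skip, while the integer 1 means "skip the first line", which here is the header. The sketch below shows both with the same data in an in-memory string; the variable names are illustrative only.

```python
import io

import pandas as pd

# Same contents as data.csv, kept in memory for a self-contained sketch
csv_text = """playerID,team,points
1,Lakers,26
2,Mavs,19
3,Bucks,24
4,Spurs,22
"""

# List form: skip line 1 only (the first data row); the header is kept
df_list = pd.read_csv(io.StringIO(csv_text), skiprows=[1])
print(df_list['playerID'].tolist())  # [2, 3, 4]
print(list(df_list.index))           # [0, 1, 2]

# Integer form: skip the first line (the header), so the first data row
# is read as the column names instead
df_int = pd.read_csv(io.StringIO(csv_text), skiprows=1)
print(list(df_int.columns))          # ['1', 'Lakers', '26']
```

Note that the resulting index always starts again at 0, regardless of which rows were skipped.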
Example 5: Read CSV Files with Custom Delimiter
Occasionally you may have a CSV file with a delimiter that is different from a comma.
For example, suppose our CSV file has an underscore as a delimiter:
playerID_team_points
1_Lakers_26
2_Mavs_19
3_Bucks_24
4_Spurs_22
To read this CSV file into pandas, we can use the sep argument to specify the delimiter to use when reading the file:
#import from CSV file and specify delimiter to use
df = pd.read_csv('data.csv', sep='_')

#view DataFrame
df

   playerID    team  points
0         1  Lakers      26
1         2    Mavs      19
2         3   Bucks      24
3         4   Spurs      22
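The sketch below reproduces the custom-delimiter case with an in-memory string so it can be run without a file. The csv_text variable is an assumption for illustration. Keep in mind that this approach relies on the delimiter character not appearing inside any field value.

```python
import io

import pandas as pd

# Underscore-delimited contents, kept in memory for a self-contained sketch
csv_text = """playerID_team_points
1_Lakers_26
2_Mavs_19
3_Bucks_24
4_Spurs_22
"""

# sep tells read_csv() which character separates the fields
df = pd.read_csv(io.StringIO(csv_text), sep='_')

print(list(df.columns))      # ['playerID', 'team', 'points']
print(df['team'].tolist())   # ['Lakers', 'Mavs', 'Bucks', 'Spurs']
```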
Additional Resources
The following tutorials explain how to perform other common tasks in pandas:
How to Read a Text File with Pandas
How to Read Excel Files with Pandas
How to Read TSV Files with Pandas
How to Read HTML Tables with Pandas