How to Use csv.reader in Python: A Practical Guide

Learn to use csv.reader to read CSV data in Python with practical examples, delimiter handling, header control, encoding tips, and common pitfalls for analysts and developers.

MyDataTables Team · 5 min read
Quick Answer

csv.reader is a Python standard library helper that reads CSV data one row at a time, returning each row as a list. It supports common conventions like comma delimiters, custom delimiters, and quoted fields. To use it, import csv, open the CSV file in text mode, create a csv.reader(file), and loop over the resulting iterator to access each row as a Python list.

What csv.reader does and when to use it

csv.reader reads CSV data row by row, returning each row as a list of strings. It is part of the Python standard library and is ideal for straightforward, well-formed CSV files. This section introduces the core concept and shows a few simple patterns you can adapt to your data ingestion pipeline. According to MyDataTables, csv.reader is a reliable starting point for many data loading tasks, especially when you want low overhead and immediate access to fields.

Python
import csv

with open('data.csv', newline='') as f:
    reader = csv.reader(f)
    for row in reader:
        print(row)

A second pattern skips the header row when the file contains column names, keeping the data processing loop focused on values rather than metadata.

Python
import csv

with open('data.csv', newline='') as f:
    reader = csv.reader(f)
    header = next(reader)  # skip header row
    for row in reader:
        print(row)

Reading with different delimiters and quoting

CSV files come in variants that use different delimiters or complex quoting. csv.reader accepts a delimiter parameter to handle semicolons, tabs, or other separators. It also preserves embedded quotes inside fields when you use the default quoting rules. In practice you’ll often need to validate the number of columns per row or handle missing values as you parse. The following examples illustrate typical scenarios and how to adapt the reader to your data source.

Python
import csv

with open('data_semicolon.csv', newline='') as f:
    r = csv.reader(f, delimiter=';')
    for row in r:
        print(row)
Python
import csv
from io import StringIO

csv_data = 'name,comment\n"Alice, Q","Hello ""World"""'
f = StringIO(csv_data)
r = csv.reader(f)
for row in r:
    print(row)

Skipping headers and converting types

Often the first row of a CSV contains column names. You can discard it with next(reader), or keep it for later use with header = next(reader). Once you’ve isolated the data rows, you may convert strings to numbers or dates. This example shows parsing a small sales CSV and converting the amount field to float for calculations.

Python
import csv

with open('sales.csv', newline='') as f:
    r = csv.reader(f)
    header = next(r)
    for date, amount in r:
        amount_num = float(amount)
        print(date, amount_num)

You can also combine map for a compact conversion:

Python
import csv

with open('sales.csv', newline='') as f:
    r = csv.reader(f)
    header = next(r)
    for row in r:
        date, *nums = row
        nums = list(map(float, nums))
        print(date, nums)

Handling encodings and newlines

When dealing with non-ASCII data or files created on different platforms, specify the encoding and the newline handling to avoid extra blank lines on Windows. The standard approach is to open with newline='' and specify encoding if needed. csv.reader will then split rows correctly, even when the file contains characters such as é or ñ.

Python
import csv

with open('data-utf8.csv', newline='', encoding='utf-8') as f:
    for row in csv.reader(f):
        print(row)

If you’re unsure about encoding, use a try/except block to catch UnicodeDecodeError and fall back to a common encoding.

Python
import csv

try:
    with open('data-utf8.csv', newline='', encoding='utf-8') as f:
        for row in csv.reader(f):
            pass
except UnicodeDecodeError:
    with open('data-utf8.csv', newline='', encoding='latin-1') as f:
        for row in csv.reader(f):
            pass

Performance considerations: streaming vs loading

For large CSV files, streaming with csv.reader is preferable to loading the entire file into memory. Iterating rows one by one reduces peak memory usage and allows processing to begin immediately. You can compose a small generator to keep your code clean while performing transformations as you stream.

Python
import csv

def stream_rows(path):
    with open(path, newline='') as f:
        for row in csv.reader(f):
            yield row

# Example usage (process is your row-handling function)
for row in stream_rows('large.csv'):
    process(row)

Loading all rows at once with list(csv.reader(...)) can exhaust memory on very large files; streaming scales better in production pipelines.
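To make the contrast concrete, here is a small sketch (the function names and the path argument are illustrative) counting rows both ways. The results match, but the streaming version never holds more than one row in memory at a time:

```python
import csv

# Streaming: rows are consumed one at a time; memory stays flat.
def count_rows_streaming(path):
    with open(path, newline='') as f:
        return sum(1 for _ in csv.reader(f))

# Loading: the whole file is materialized as a list first.
def count_rows_loaded(path):
    with open(path, newline='') as f:
        return len(list(csv.reader(f)))
```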

Practical example: building records with headers

A common task is to map each row to a dictionary using the header row, which makes downstream processing more readable. You can manually zip the header with each row or switch to DictReader for a similar outcome. With csv.reader you create a header list and then zip to create a dict per row.

Python
import csv

with open('people.csv', newline='') as f:
    r = csv.reader(f)
    header = next(r)
    for row in r:
        rec = dict(zip(header, row))
        print(rec)

If you prefer a built-in convenience, see DictReader in the next section, which handles this mapping automatically.

Common pitfalls and debugging tips

Even small CSVs can cause surprises if the file uses a non-standard delimiter or contains embedded newlines. A few pragmatic checks help you catch problems early. Always open files with newline='' and validate row lengths before processing.

Python
import csv

with open('data.csv', newline='') as f:
    r = csv.reader(f)
    for i, row in enumerate(r, start=1):
        if len(row) < 3:
            raise ValueError(f"Row {i} has missing fields: {row}")

If you encounter blank rows, skip them explicitly or filter out empty strings as a quick fix in a pre-processing step.
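For instance, a minimal sketch of filtering out blank rows, using an in-memory string for illustration (csv.reader yields an empty list for a blank line):

```python
import csv
from io import StringIO

# A blank line in the input parses as an empty list ([]).
data = "a,b\n\n1,2\n\n3,4\n"
rows = [row for row in csv.reader(StringIO(data)) if row]
print(rows)  # blank rows are filtered out
```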

Beyond csv.reader: when to switch to DictReader and other options

When your data has named columns, DictReader can simplify access by providing dictionary-like row objects. It reduces index-based access and improves readability in code that references fields by name.

Python
import csv

with open('data.csv', newline='') as f:
    dict_reader = csv.DictReader(f)
    for row in dict_reader:
        print(row['name'], row['age'])

For complete control, you can also combine DictReader with custom fieldnames or handle missing fields gracefully using a try/except around row[...] lookups.
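As a sketch of both ideas, DictReader's fieldnames parameter supplies column names for a headerless file, and restval fills in missing fields without raising (the sample data here is illustrative):

```python
import csv
from io import StringIO

# Headerless data; the second row is missing the age field.
data = "Alice,30\nBob\n"
reader = csv.DictReader(StringIO(data), fieldnames=['name', 'age'], restval='N/A')
records = list(reader)
for rec in records:
    print(rec['name'], rec['age'])
```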

Quick recap and next steps

This section consolidates the essential patterns for using csv.reader in Python. You learned how to read rows, skip headers, handle different delimiters and encodings, and scale parsing to large files through streaming. With these patterns, you can integrate CSV ingestion into ETL scripts, data pipelines, or ad hoc analysis tasks.

Python
import csv

with open('data.csv', newline='') as f:
    r = csv.reader(f)
    for row in r:
        print(row)

To verify correctness, run small tests on sample files, incrementally add features, and compare results against a known-good dataset.
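One way to run such a small test, sketched here with a temporary file and sample values of our own choosing:

```python
import csv
import os
import tempfile

# Write a tiny known-good sample, then check parsing against it.
sample = "name,age\nAlice,30\nBob,25\n"
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False, newline='') as tmp:
    tmp.write(sample)
    path = tmp.name

with open(path, newline='') as f:
    rows = list(csv.reader(f))
os.remove(path)

assert rows == [['name', 'age'], ['Alice', '30'], ['Bob', '25']]
print('parsing verified')
```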

Steps

Estimated time: 30-45 minutes

  1. Prepare environment and sample CSV

    Install Python, set up a text editor, and create a small sample data.csv to practice reading with csv.reader. Start with a simple file that has a header and a few data rows.

    Tip: Use a dedicated test directory to keep sample data organized.
  2. Open the file and create a reader

    Use open with newline='' to avoid extra blank lines on Windows and create a csv.reader instance to begin streaming rows.

    Tip: Prefer a with statement to ensure the file is closed automatically.
  3. Iterate rows and access fields

    Loop over the reader and treat each row as a list of strings. Access columns by index or zip with a header for readability.

    Tip: Print a few rows to verify parsing before integrating into pipelines.
  4. Skip headers when needed

    If the first row contains column names, skip it with next(reader) or header = next(reader). This keeps downstream logic clean.

    Tip: Store the header separately if you plan to map values by name later.
  5. Convert strings to numbers safely

    Convert numeric fields with int() or float() and handle ValueError with try/except blocks when data quality is uncertain.

    Tip: Consider a helper function to centralize type conversions.
  6. Handle alternate delimiters and quotes

    Pass delimiter and appropriate quote handling to csv.reader to parse nonstandard CSV formats.

    Tip: Test with edge cases such as embedded delimiters inside quoted fields.
  7. Validate and guard against bad rows

    Add checks for row length and unexpected field counts to avoid downstream errors.

    Tip: Fail fast with descriptive errors to simplify debugging.
  8. Optional: compare with DictReader

    If you need field-name access, consider DictReader for convenience and readability.

    Tip: Use DictReader when you know the schema; it reduces index-based bugs.
Pro Tip: Open files with newline='' to avoid extra blank lines on Windows during parsing.
Warning: Always validate row lengths before processing to prevent index errors.
Note: Use a sample dataset that includes edge cases like quoted fields and embedded delimiters for robust tests.
Pro Tip: Prefer streaming for large files to keep memory usage low.
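The conversion helper suggested in step 5 might look like the following sketch (the name to_float and the default handling are illustrative choices):

```python
def to_float(value, default=None):
    """Convert a CSV field to float, returning `default` on bad input."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return default

print(to_float('3.14'))   # 3.14
print(to_float('n/a'))    # None
print(to_float('', 0.0))  # 0.0
```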

Prerequisites

Optional

  • Knowledge of DictReader for comparison


People Also Ask

What is csv.reader and how is it different from csv.DictReader?

csv.reader yields each row as a list of strings, accessed by zero-based index. DictReader returns each row as a dictionary with keys taken from the header row, which improves readability when you know the field names.

csv.reader gives you lists, which you access by position. DictReader gives you dictionaries keyed by column names, which is often easier to read.

How do I skip the header row in a CSV file?

Skip the first row by calling next(reader) or by assigning header = next(reader). This allows you to process only the data rows without repeating column names.

Skip the header with next(reader) so your loop handles just the data rows.

Can I read CSVs with different delimiters?

Yes. Pass a delimiter parameter to csv.reader, for example delimiter=',' for comma-delimited files or delimiter=';' for semicolon-delimited files.

You can specify the delimiter to parse various CSV formats correctly.

How do I handle quoted fields that contain delimiters?

csv.reader handles quoted fields by default. If you encounter edge cases, adjust quotechar and quoting options to ensure embedded delimiters are treated as data, not separators.

Quoting lets fields contain commas or other separators inside the data.

What encodings does csv.reader support?

csv.reader itself works on any text stream; the encoding is controlled by the encoding parameter of open() (e.g., encoding='utf-8'). If you encounter decoding errors, try alternative encodings such as 'latin-1' or 'utf-8-sig' (which strips a UTF-8 byte order mark).

Choose encoding to match your CSV bytes, or fall back gracefully if decoding fails.

What are common mistakes when using csv.reader?

Common mistakes include forgetting newline='' on Windows, assuming all rows have the same length, and overreliance on index-based access without handling missing fields.

Watch out for newline handling and inconsistent row lengths.

Main Points

  • Read CSVs row by row with csv.reader for memory efficiency
  • Skip headers using next(reader) to focus on data
  • Specify delimiter to support nonstandard CSV formats
  • Open files with newline='' to avoid blank rows on Windows
  • Use DictReader for named-field access when appropriate
