Apache Commons CSV + Maven: A Practical Guide for Java

A comprehensive, code-rich guide to using Apache Commons CSV with Maven for reading and writing CSV data in Java, including headers, delimiters, and best practices.

MyDataTables Team
Quick Answer

To use Apache Commons CSV with Maven, declare the commons-csv dependency in your pom.xml, then read or write CSV data with CSVParser and CSVPrinter. Start by choosing a delimiter (comma by default), enabling header support if present, and handling quotes and escapes correctly. This approach keeps parsing robust, testable, and easy to extend with custom formats.

What is Apache Commons CSV and why use it with Maven

Apache Commons CSV provides a simple, robust API for reading and writing CSV data in Java. It handles common edge cases like quoted fields, embedded newlines, and escaped delimiters so you don’t manually parse lines. When you combine it with Maven, you gain repeatable builds and centralized dependency management across teams. The canonical approach is to declare a dependency in pom.xml and then use a few concise classes to parse or generate CSV. In this section, we review core concepts and show a minimal end-to-end example that emphasizes readability and maintainability.

Java
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ReadCsvExample {
    public static void main(String[] args) throws IOException {
        Path path = Paths.get("data/products.csv");
        try (CSVParser parser = CSVFormat.DEFAULT
                .withFirstRecordAsHeader()
                .withIgnoreSurroundingSpaces()
                .parse(Files.newBufferedReader(path))) {
            for (CSVRecord record : parser) {
                String id = record.get("id");
                String name = record.get("name");
                String price = record.get("price");
                System.out.printf("Product %s: %s costs %s%n", id, name, price);
            }
        }
    }
}
  • Core concepts: CSVFormat configuration, header handling, and safe resource management.
  • Variations: use withFirstRecordAsHeader() for header rows, or parse without headers and access by index.

Maven setup: dependency management

Before you can parse or write CSVs, you must add the Apache Commons CSV dependency to your Maven project. The dependency coordinates keep your project aligned with the library across environments and teams. In many teams, a property-driven version is preferred to simplify upgrades:

XML
<dependencies>
  <dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-csv</artifactId>
    <version>${commons.csv.version}</version>
  </dependency>
</dependencies>
  • The version uses a property so you can upgrade in one place. Check your organization’s policy for version management and pinning.
  • If you prefer explicit versions, replace the property with a specific version like 1.x.y, ensuring compatibility with your Java runtime.

In addition, verify that your Maven project is using a compatible Java version and that your IDE refreshes dependencies when pom.xml changes.
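For completeness, the version property referenced above lives in the `<properties>` section of pom.xml. A minimal sketch; the version number shown is illustrative, so check Maven Central for the current release:

```xml
<properties>
  <!-- Single place to bump the library version across the project; 1.10.0 is illustrative -->
  <commons.csv.version>1.10.0</commons.csv.version>
</properties>
```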

Reading CSV data with headers and without headers

Reading CSV data is straightforward once you configure the format. If your file includes a header row, you should enable header mapping to access fields by name. If there is no header, access fields by index. The following examples show both approaches:

Java
// With headers
try (CSVParser parser = CSVFormat.DEFAULT
        .withFirstRecordAsHeader()
        .parse(Files.newBufferedReader(Paths.get("data/products.csv")))) {
    for (CSVRecord record : parser) {
        String id = record.get("id");
        String name = record.get("name");
        System.out.printf("%s - %s%n", id, name);
    }
}

// Without headers: use the plain format and access fields by index
try (CSVParser parser = CSVFormat.DEFAULT
        .parse(Files.newBufferedReader(Paths.get("data/products_no_header.csv")))) {
    for (CSVRecord record : parser) {
        String a = record.get(0);
        String b = record.get(1);
        System.out.println(a + ":" + b);
    }
}
  • Notes: withFirstRecordAsHeader() binds column names so you can access fields by name; without any header configuration, every row, including the first, is treated as data and accessed by index. By contrast, withSkipHeaderRecord() tells the parser to skip an existing header row during iteration. If your data mixes quoted fields and embedded newlines, CSVFormat handles these correctly.

Writing CSV with CSVPrinter and headers

Writing CSV data is equally ergonomic with CSVPrinter. You can write headers once and then stream rows, which is ideal for log-like outputs or exporting large datasets. The example below prints to a file; the same approach works with an in-memory Appendable such as StringWriter when you need a small report as a String.

Java
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVPrinter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class WriteCsvExample {
    public static void main(String[] args) throws IOException {
        Path path = Paths.get("data/output.csv");
        try (CSVPrinter printer = new CSVPrinter(Files.newBufferedWriter(path),
                CSVFormat.DEFAULT.withHeader("id", "name", "price"))) {
            printer.printRecord(101, "Widget", 9.99);
            printer.printRecord(102, "Gadget", 14.5);
        }
    }
}
  • The withHeader() variant writes the header row; you can omit it if you already have headers in your source. CSVPrinter supports printing complex records with quotes and escapes automatically.
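For the in-memory case, a StringWriter can stand in for the file writer. A minimal sketch using the same withHeader configuration:

```java
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVPrinter;
import java.io.IOException;
import java.io.StringWriter;

public class WriteCsvToStringExample {
    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        // The header row is emitted automatically when the format declares one
        try (CSVPrinter printer = new CSVPrinter(out,
                CSVFormat.DEFAULT.withHeader("id", "name"))) {
            printer.printRecord(101, "Widget");
        }
        // The CSV text is now available as a plain String
        System.out.print(out.toString());
    }
}
```

Because CSVPrinter accepts any Appendable, the same printing code serves files, sockets, and in-memory buffers unchanged.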

Custom formats: delimiters, quotes, and escaping

Apache Commons CSV supports customizing delimiters, quote characters, and escape characters to handle unusual data formats. If you consume CSV from systems that use semicolons or pipes, you can configure the formatter accordingly. This flexibility also helps when exporting data to external tools that expect specific formats.

Java
CSVFormat format = CSVFormat.DEFAULT
        .withDelimiter(';')
        .withQuote('"')
        .withEscape('\\')
        .withRecordSeparator("\n");
try (CSVPrinter printer = new CSVPrinter(
        Files.newBufferedWriter(Paths.get("data/delimited.csv")), format)) {
    printer.printRecord(1, "Alice", 12.3);
    printer.printRecord(2, "Bob", 4.56);
}
  • Delimiter control is essential for interoperability. Depending on your data source, you may also need to trim whitespace or ignore surrounding spaces.
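Reading a semicolon-delimited file back uses a matching format object. A minimal sketch, assuming the file written above:

```java
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;
import java.io.IOException;
import java.io.Reader;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ReadDelimitedExample {
    public static void main(String[] args) throws IOException {
        // The reader must be configured with the same delimiter the writer used
        CSVFormat semicolon = CSVFormat.DEFAULT.withDelimiter(';');
        try (Reader reader = Files.newBufferedReader(Paths.get("data/delimited.csv"));
             CSVParser parser = semicolon.parse(reader)) {
            for (CSVRecord record : parser) {
                System.out.println(record.get(1)); // second column by index
            }
        }
    }
}
```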

Streaming large CSV files to avoid OOM and improve performance

When CSV files are large, loading the entire file into memory is impractical. Streaming parsing with a buffered reader and a streaming API is the preferred approach. You can process each record as it arrives, perform transformations, and write results incrementally. This approach reduces peak memory usage and makes the pipeline more resilient to data size.

Java
try (CSVParser parser = CSVFormat.DEFAULT
        .withFirstRecordAsHeader()
        .parse(Files.newBufferedReader(Paths.get("data/large.csv")))) {
    for (CSVRecord record : parser) {
        // Process each row on the fly
        String status = record.get("status");
        // ... business logic
    }
}
  • For writing, consider streaming builders or flushing intermittently to a destination rather than buffering all rows in memory.
  • If you must accumulate results, use a bounded collection with backpressure to avoid memory pressure.
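The read-transform-write pattern can be sketched as a single streaming pass; the file paths and the "status" column are illustrative assumptions:

```java
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVPrinter;
import org.apache.commons.csv.CSVRecord;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class StreamCopyExample {
    // Copies only rows whose "status" column is "active", one record at a time,
    // so peak memory stays constant regardless of input size
    public static long copyActive(Path in, Path out) throws IOException {
        long written = 0;
        try (CSVParser parser = CSVFormat.DEFAULT
                .withFirstRecordAsHeader()
                .parse(Files.newBufferedReader(in));
             CSVPrinter printer = new CSVPrinter(
                     Files.newBufferedWriter(out),
                     CSVFormat.DEFAULT.withHeader("id", "status"))) {
            for (CSVRecord record : parser) {
                if ("active".equals(record.get("status"))) {
                    printer.printRecord(record.get("id"), record.get("status"));
                    written++;
                }
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        long rows = copyActive(Paths.get("data/large.csv"), Paths.get("data/filtered.csv"));
        System.out.println("Wrote " + rows + " rows");
    }
}
```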

Testing and validation patterns for CSV parsing and writing

Tests are crucial to ensure your CSV handling remains correct across schema changes and format variants. Use a mix of unit tests and property-based tests to cover edge cases: quoted fields, embedded newlines, empty rows, and mixed delimited formats. Validate both parsed values and the generated CSV string against expected outputs.

Java
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.*;
import java.nio.file.Files;
import java.nio.file.Paths;

public class CsvTest {
    @Test
    public void testParseWithHeader() throws Exception {
        // parse(...) expects a Reader, so open one explicitly
        try (CSVParser p = CSVFormat.DEFAULT
                .withFirstRecordAsHeader()
                .parse(Files.newBufferedReader(Paths.get("data/products.csv")))) {
            assertTrue(p.iterator().hasNext());
        }
    }
}
  • Tests verify behavior under header presence, escaping, and large rows.
  • Consider snapshot tests for complex records to detect regressions.

Common pitfalls and debugging tips for Apache Commons CSV usage

Even experienced developers encounter pitfalls when parsing CSV data. Common issues include assuming a fixed number of columns, mishandling quotes, or ignoring missing headers. A robust approach is to enable header mapping and validate records against a schema. When debugging, print representative samples and the full header map to confirm field names.

Java
try (CSVParser parser = CSVFormat.DEFAULT
        .withFirstRecordAsHeader()
        .withIgnoreEmptyLines()
        .parse(Files.newBufferedReader(Paths.get("data/sample.csv")))) {
    for (CSVRecord record : parser) {
        // Quick validation: record.get throws on unmapped columns,
        // so check isSet first, then check for empty values
        if (!record.isSet("id") || record.get("id").isEmpty()) {
            System.err.println("Missing id on line " + record.getRecordNumber());
        }
    }
}
  • If you encounter parsing errors, enable verbose logging for the CSV library and inspect the failing lines to determine whether the issue is a format mismatch or corrupted data.
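The header map mentioned above is available directly from the parser. A small helper for confirming field names, with the column names in the comment purely illustrative:

```java
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import java.io.IOException;
import java.io.Reader;
import java.util.Map;

public class HeaderMapDebug {
    // Returns the parsed header map, e.g. {id=0, name=1, price=2}
    public static Map<String, Integer> headerMap(Reader reader) throws IOException {
        try (CSVParser parser = CSVFormat.DEFAULT
                .withFirstRecordAsHeader()
                .parse(reader)) {
            return parser.getHeaderMap();
        }
    }
}
```

Printing this map next to a failing record usually reveals typos or unexpected column orders quickly.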

Practical tips for production use and maintenance

In production, prefer a single source of truth for your CSV format: standardized delimiters, consistent header names, and explicit quoting rules. Keep your code resilient by handling exceptions gracefully, and document the expected CSV schema in your repository. Regularly refresh dependencies and test against representative data samples to catch regressions early.
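One lightweight way to establish that single source of truth is a shared CSVFormat constant that every reader and writer references; the class name and column names here are illustrative:

```java
import org.apache.commons.csv.CSVFormat;

public final class CsvFormats {
    // One canonical format for the whole codebase; change it here, not at call sites
    public static final CSVFormat PRODUCTS = CSVFormat.DEFAULT
            .withHeader("id", "name", "price")
            .withIgnoreSurroundingSpaces();

    private CsvFormats() {
        // Utility holder; not meant to be instantiated
    }
}
```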

How to integrate Apache Commons CSV into a larger data pipeline

CSV processing often sits at the edge of data pipelines. A common pattern is to place the CSV parsing component behind a simple interface that accepts a path or stream and returns a list of domain objects or a streaming iterator. This isolation makes it easier to swap the underlying CSV library or add additional data transformations later, without touching downstream components.

Java
public interface CsvReader<T> {
    Iterable<T> read(Path path) throws IOException;
}

public class Product {
    String id;
    String name;
    double price;
    // constructor, getters, setters
}

public class ProductReader implements CsvReader<Product> {
    @Override
    public Iterable<Product> read(Path path) throws IOException {
        // Map each CSVRecord to a Product by header name
        List<Product> products = new ArrayList<>();
        try (CSVParser parser = CSVFormat.DEFAULT
                .withFirstRecordAsHeader()
                .parse(Files.newBufferedReader(path))) {
            for (CSVRecord r : parser) {
                Product p = new Product();
                p.id = r.get("id");
                p.name = r.get("name");
                p.price = Double.parseDouble(r.get("price"));
                products.add(p);
            }
        }
        return products;
    }
}
  • This approach supports testability and clean separation of concerns in data workflows.

Steps

Estimated time: 60-90 minutes

  1. Create a new Maven project

     Generate a small Maven project to host CSV parsing code. Initialize a basic package structure and a main class you will expand.

     Tip: Use the quickstart archetype to save boilerplate.

  2. Add the commons-csv dependency

     Add the Apache Commons CSV dependency to pom.xml using a version property to simplify upgrades across teams.

     Tip: Coordinate with your build team to align the version policy.

  3. Prepare sample CSV data

     Create a sample CSV under data/ with headers to exercise read/write paths and test edge cases.

     Tip: Include quotes and embedded newlines to test robustness.

  4. Implement CSV reading

     Write a small class that uses CSVFormat.DEFAULT.withFirstRecordAsHeader() to parse records by header name.

     Tip: Wrap IO in try-with-resources to guarantee closure.

  5. Implement CSV writing

     Add a CSVPrinter-based writer that outputs headers and rows with proper quoting.

     Tip: Use withHeader to ensure the output schema matches the input.

  6. Run and verify

     Build, run, and compare parsed results against the expected values in your test data.

     Tip: Run mvn dependency:tree first to confirm dependency resolution.
Pro Tip: Always close CSVParser and CSVPrinter resources with try-with-resources.
Warning: For large files, stream data row-by-row instead of loading entire content into memory.
Note: Prefer explicit headers in CSVFormat to avoid misaligned fields after edits.

Prerequisites

Required

  • Basic knowledge of Java and Maven
  • An IDE or code editor (IntelliJ IDEA, Eclipse, VS Code)
  • Access to the internet to fetch dependencies

Optional

  • JUnit or similar test framework for validation

Commands

  • Create new Maven project: mvn archetype:generate -DarchetypeArtifactId=maven-archetype-quickstart (generates a basic Java project structure suitable for CSV work)
  • Build and package: mvn clean package (compiles sources and creates a runnable JAR in target/)
  • List dependencies: mvn dependency:tree (verify transitive dependencies and ensure compatibility)
  • Run tests: mvn test (execute unit tests for CSV parsing logic)
  • Run a Java class from the built artifact: java -cp target/<your-jar>.jar <MainClass> (manual testing of CSV processing in a running app)

People Also Ask

What is Apache Commons CSV in a sentence?

Apache Commons CSV is a Java library that simplifies reading and writing CSV data with robust handling for headers, quotes, and edge cases. It integrates nicely with Maven for dependency management and project builds.


How do I add Apache Commons CSV to a Maven project?

Include the commons-csv dependency in pom.xml, preferably using a version property to simplify upgrades across environments. Ensure Maven can reach the repository to download the artifact.


Can I parse CSV files with or without a header row?

Yes. Use withFirstRecordAsHeader() when a header exists; otherwise parse by column index. This makes the code resilient to format variations.


How do I customize delimiters or quotes?

Configure CSVFormat withDelimiter(char) and withQuote(char) as needed. This allows interoperability with non-standard CSV formats.


Is Apache Commons CSV suitable for large datasets?

Yes, with streaming parsing and writing. Avoid loading all data into memory; process records incrementally to conserve memory.


Can I map CSV rows to POJOs directly?

You can map rows to POJOs by reading each CSVRecord and constructing objects, or use a helper library to bind fields to object properties.


Main Points

  • Add the commons-csv dependency via Maven
  • Configure CSVFormat for headers, delimiters, and quoting
  • Parse with CSVParser and access by header name
  • Write with CSVPrinter using a header row
  • Test edge cases like embedded newlines and quotes
