R Import CSV File: Practical Guide
Learn how to import CSV files in R using read.csv and read_csv, with encoding, separators, NA handling, and column types. Practical examples, performance tips, and best practices for reproducible data work.

In R, importing CSV files means reading tabular data from comma-separated values into R objects like data.frame or tibble. The classic base R approach uses read.csv(), while tidyverse users prefer read_csv() for speed and consistent parsing. You can control encoding, separators, NA values, and column types during import in scripts and projects.
Import basics in R: read.csv vs readr::read_csv
Whether you are a data analyst or a developer, importing CSV files is a daily task in R. In practice you will choose between base R's read.csv and the faster, tidyverse-oriented readr::read_csv. This section demonstrates pros, cons, and basic usage for both approaches. According to MyDataTables, CSV import forms the backbone of reproducible data pipelines, so understanding each option helps you pick the right tool for the job. We'll walk through a straightforward example using a sample CSV path, then compare results and performance in typical scenarios.
# Base R
df <- read.csv('data/sample.csv', header = TRUE, sep = ',', na.strings = c('', 'NA'))
# Quick check
head(df)

# Tidyverse approach (readr)
library(readr)
df2 <- read_csv('data/sample.csv')
# Quick check
head(df2)

Notes:
- read_csv returns a tibble and tends to parse types more aggressively; read.csv returns a data.frame.
- For large datasets, read_csv is generally faster and more consistent with tidyverse workflows.
- If your CSV uses a non-standard delimiter, adjust sep in read.csv, or use readr::read_delim() with the delim argument (read_csv itself always expects commas).
In practice, you’ll often choose read_csv when building pipelines with dplyr and tidyr, and read.csv for quick ad-hoc analyses in base R. The MyDataTables team emphasizes aligning the import method with your project’s tooling to maximize reproducibility.
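When read_csv feeds a dplyr/tidyr pipeline, declaring column types up front keeps parsing reproducible across runs. A minimal sketch, assuming a hypothetical data/sample.csv with id, name, date, and value columns:

```r
library(readr)

# Explicit col_types prevents silent type drift between runs:
# the same file always parses to the same column classes.
df <- read_csv(
  "data/sample.csv",
  col_types = cols(
    id    = col_integer(),
    name  = col_character(),
    date  = col_date(format = "%Y-%m-%d"),
    value = col_double()
  )
)

# Verify the declared types took effect
str(df)
```

If a column fails to parse under the declared type, readr records the failures, which you can inspect with problems(df).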
Steps
Estimated time: 45-60 minutes
1. Prepare your environment
   Install R and an IDE if you don't have them. Set up a sample CSV path and ensure read.csv and read_csv are available. Create a small example file to test imports.
   Tip: Use a dedicated folder for test CSVs to avoid path issues.
2. Choose your import method
   Decide between base R's read.csv and tidyverse read_csv based on your workflow. Consider the size of the data and the downstream packages you plan to use.
   Tip: If using tidyverse downstream, lean toward read_csv for consistency.
3. Import with base R
   Import a test CSV using read.csv with header and na.strings set. Inspect the result with head() and str().
   Tip: Check for the correct row count and column types after import.
4. Import with readr
   Use read_csv from readr. Compare performance and inferred types with the base approach.
   Tip: Enable or disable show_col_types as needed for readability.
5. Tune encoding and separators
   If you encounter encoding issues, specify fileEncoding (base R) or locale(encoding = ...) (readr). Adjust the delimiter if your file isn't comma-delimited.
   Tip: UTF-8 is a common default; ensure your terminal uses a compatible encoding.
6. Validate and clean
   Run quick checks: summary(), str(), and head(). Clean up data types if necessary (e.g., convert to factors or dates).
   Tip: Explicitly set types to avoid automatic conversions later in your pipeline.
7. Persist or reuse
   Save your imported data to R objects or write to a new file. Consider caching results for repeated runs.
   Tip: Use file.path() for portable paths across operating systems.
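Steps 6 and 7 above can be sketched together: validate the import, then cache it so repeated runs skip re-parsing. The paths and object names are hypothetical placeholders for your own project.

```r
# Step 6: quick validation after import
df <- read.csv(file.path("data", "sample.csv"))
str(df)        # column types and dimensions
summary(df)    # value ranges and NA counts
head(df)

# Step 7: persist for reuse. saveRDS preserves column types
# exactly, unlike re-reading the CSV on every run.
cache_path <- file.path("data", "sample_cache.rds")
saveRDS(df, cache_path)

# Later sessions load the cached object directly
df_cached <- readRDS(cache_path)
```

saveRDS/readRDS round-trips a single R object; for sharing data outside R, write.csv or readr::write_csv is the better fit.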
Prerequisites
Required
- Understanding file paths and CSV basics
Keyboard Shortcuts
| Action | Shortcut |
|---|---|
| Run current line (RStudio or compatible IDE) | Ctrl+↵ |
| Comment/uncomment lines (toggle line comments) | Ctrl+⇧+C |
| New script (create a new R script) | Ctrl+⇧+N |
People Also Ask
What is the difference between read.csv and read_csv?
read.csv is part of base R and reads into a data.frame, with arguments such as na.strings and sep to control parsing. read_csv is from the tidyverse (readr package) and returns a tibble with faster parsing and stricter typing by default. The choice depends on your overall toolkit and performance needs.
read.csv is base R’s classic import, while read_csv is the faster tidyverse option. Choose based on your workflow.
How do I handle UTF-8 encoding when importing?
For base R, use fileEncoding = 'UTF-8' in read.csv. For read_csv, set locale(encoding = 'UTF-8'). This helps prevent garbled characters when data contains non-ASCII text.
Handle encoding explicitly with fileEncoding or locale to avoid garbled text.
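A minimal sketch of both encoding options, assuming a hypothetical data/utf8_file.csv containing non-ASCII text:

```r
# Base R: declare the file's encoding explicitly
df <- read.csv("data/utf8_file.csv", fileEncoding = "UTF-8")

# readr: pass the encoding through locale()
library(readr)
df2 <- read_csv("data/utf8_file.csv",
                locale = locale(encoding = "UTF-8"))
```

If characters still look garbled, the file may actually be in another encoding (e.g. Latin-1); readr::guess_encoding() can help identify it before you import.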
Can I use non-standard delimiters like semicolons?
Yes. In base R, set sep = ';' in read.csv. In readr, use read_csv2(), which expects semicolon separators (and comma decimals), or read_delim() with delim = ';' for any other delimiter. Always verify with str() after import.
Yes, you can use semicolons by adjusting the delimiter and then checking the data structure.
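Both delimiter approaches side by side, assuming a hypothetical semicolon-separated data/semicolon.csv:

```r
# Base R: change the separator
df <- read.csv("data/semicolon.csv", sep = ";")

# readr: read_csv2() is preconfigured for ';' separators
# with ',' as the decimal mark (common in European locales)
library(readr)
df2 <- read_csv2("data/semicolon.csv")

# read_delim() handles any delimiter explicitly
df3 <- read_delim("data/semicolon.csv", delim = ";")

# Verify the columns split correctly
str(df2)
```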
What if I get file not found errors during import?
Check the file path, working directory, and file name. Use here::here() or setwd() to confirm location. Use file.path() to construct OS-agnostic paths.
Double-check the path and directory structure; use portable path helpers.
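A short diagnostic sketch for path problems; data/sample.csv is a hypothetical target file:

```r
# Confirm where R is looking
getwd()
list.files("data")

# Build OS-agnostic paths instead of hard-coding '/' or '\\'
path <- file.path("data", "sample.csv")
file.exists(path)   # check before importing

# In an RStudio project, here::here() resolves paths from the
# project root regardless of the current working directory
# (requires the 'here' package):
# path <- here::here("data", "sample.csv")
```

Preferring file.path() and here::here() over setwd() keeps scripts runnable on other machines without edits.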
Is it possible to read only specific columns?
In base R, you can use colClasses to skip columns (with "NULL") or coerce them to specific types. In read_csv, use the col_select argument to read only the columns you name, or col_types with col_skip() to drop specific ones. This saves memory when loading large files.
Yes—define which columns to read to save memory.
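A sketch of both column-selection approaches; the column names (id, name, extra) are hypothetical and must match your file's header:

```r
# Base R: "NULL" in colClasses drops a column entirely
df <- read.csv("data/sample.csv",
               colClasses = c(id    = "integer",
                              name  = "character",
                              extra = "NULL"))   # skipped

# readr: col_select keeps only the named columns
library(readr)
df2 <- read_csv("data/sample.csv", col_select = c(id, name))
```

With col_select, readr still parses only the requested columns, so the memory savings apply at read time, not just afterward.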
Which approach is best for very large CSV files?
For very large files, consider data.table::fread for speed, or chunked reads with readr::read_csv_chunked combined with selective columns. Streaming imports reduce memory pressure and speed up processing.
fread is often fastest for huge files; you can also read in chunks.
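Both large-file strategies in brief, assuming a hypothetical data/big.csv with id and value columns (fread requires the data.table package):

```r
# data.table::fread: typically the fastest reader, and
# 'select' loads only the columns you need
library(data.table)
dt <- fread("data/big.csv", select = c("id", "value"))

# readr: process the file in chunks so only one chunk is
# in memory at a time; each callback result is kept
library(readr)
row_counts <- read_csv_chunked(
  "data/big.csv",
  callback = DataFrameCallback$new(function(chunk, pos) {
    # summarise each chunk; only the summaries accumulate
    data.frame(start_row = pos, n = nrow(chunk))
  }),
  chunk_size = 100000
)
```

Chunked reading trades a little speed for a flat memory profile, which matters when the file is larger than available RAM.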
Main Points
- Choose base R or tidyverse based on workflow
- Explicitly set encoding and NA handling
- Define column types to prevent misreads
- For large CSVs, consider fread or streaming
- Validate data with quick checks after import