5 Reading in data
5.1 Before you start
1. Use a Project, not setwd(). Open your .Rproj file before doing anything else. R will automatically look in that project folder for your data files, so there is no need to tell it where to look. Just make sure your data file is saved in the same folder as the .Rproj.
2. Don’t use File > Import. Always use code to read in data, not the menu. A menu import leaves no code in your document, so the data will be missing when you knit.
5.2 Reading in your data
Find your file type below and copy the command into a code chunk.
5.2.1 Spreadsheets and tables
5.2.1.1 CSV files (.csv) — read.csv()
No packages needed. It assumes that your data has headers/column names.
mydata <- read.csv("filename.csv")
If your data has no column names, add header = FALSE:
mydata <- read.csv("filename.csv", header = FALSE)
This will name each column V1, V2, etc. You can then rename your columns using the names() command.
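As a minimal sketch of the renaming step, the example below writes a tiny header-less CSV to a temporary file and then replaces the default V1, V2, V3 names. The file contents and the column names (id, score, group) are made up for illustration; substitute your own.

```r
# Write a small header-less CSV to a temporary file (made-up example data)
tmp <- tempfile(fileext = ".csv")
writeLines(c("1,4.2,control", "2,5.1,treatment"), tmp)

# header = FALSE stops R from treating the first row as column names
mydata <- read.csv(tmp, header = FALSE)
names(mydata)   # default names: "V1" "V2" "V3"

# Replace the default names with descriptive ones (one per column, in order)
names(mydata) <- c("id", "score", "group")
head(mydata)
```

The replacement vector needs exactly one name per column, in the same order as the columns appear in the file.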
Note: If you are working with very large files (>100MB or over 100,000 rows), look up the fread() function from the data.table package. It’s more complex but MUCH faster.
5.2.1.2 Excel files (.xlsx, .xls) — read_excel()
Requires the readxl package. Install it once via the Packages tab, then add library(readxl) to your library code chunk.
mydata <- read_excel("filename.xlsx")
If you want to read in a specific sheet from your Excel file, add the sheet argument:
mydata <- read_excel("filename.xlsx", sheet = "SheetName")
5.2.1.3 TXT files (.txt) — read.table()
No packages needed, but you have to tell R explicitly that the data has headers/column names and that it’s “tab separated” (a tab character between entries). If your file uses a different separator, set sep to that character instead.
mydata <- read.table("filename.txt", header=TRUE, sep="\t")
Note: If you are working with very large files (>100MB or over 100,000 rows), look up the fread() function from the data.table package. It’s more complex but MUCH faster.
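To illustrate changing sep, here is a small sketch that reads the same made-up data twice: once from a tab-separated file and once from a file that uses | between entries. The temporary files and their contents are assumptions for the example only.

```r
# Two temporary files holding the same made-up data with different separators
tab_file  <- tempfile(fileext = ".txt")
pipe_file <- tempfile(fileext = ".txt")
writeLines(c("site\tcount", "north\t12", "south\t7"), tab_file)
writeLines(c("site|count", "north|12", "south|7"), pipe_file)

# sep must match the character that actually separates the entries
from_tab  <- read.table(tab_file,  header = TRUE, sep = "\t")
from_pipe <- read.table(pipe_file, header = TRUE, sep = "|")

# Both calls should produce the same two-column data frame
str(from_tab)
str(from_pipe)
```

If you are unsure which separator a file uses, opening it in a plain text editor is usually the quickest way to find out.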
5.2.2 R-specific data
R has its own file formats for saving and reloading R objects. These are useful when you want to save your work mid-analysis and pick it up later, or share processed data with someone else who uses R.
5.2.2.1 Built-in package datasets — data()
Many R packages come with example datasets ready to load, so you don’t need a separate file containing the data. Instead, add the relevant library() call to your library code chunk, then use data() to load the dataset. At first it will show up in your Environment pane as a “promise”; R loads it fully the first time you use it in any other command. Sometimes you won’t see its name in the Environment pane.
Important: Just running library(somepackage) makes its datasets accessible to R, but they live in the package environment, not your global environment. This means you can accidentally use them without them showing up in your Environment pane or in ls(). Always use data() explicitly to load a dataset into your global environment, where you can see and work with it properly.
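The difference can be seen with a quick sketch using mtcars, which ships with base R’s datasets package (so no library() call is needed for this particular example). The variable names before and after are just for illustration.

```r
# Is mtcars listed in the global environment yet?
# (FALSE in a fresh session, even though typing mtcars would still work)
before <- "mtcars" %in% ls(globalenv())

# Explicitly load the dataset into the global environment
data(mtcars)

# Now it is listed, and it appears in the Environment pane
after <- "mtcars" %in% ls(globalenv())
c(before = before, after = after)

head(mtcars)
```

For a dataset from another package, the pattern would be the same: library(somepackage) first, then data(datasetname).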
5.2.2.2 RDS files (.rds) — readRDS()
This is R’s own format to store a single variable as a file. RDS saves a single R object (a data frame, model output, list, etc.) and reads it back in exactly as you left it. You can assign it any name on import. No packages needed.
mydata <- readRDS("filename.rds")
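A short round-trip sketch, assuming a made-up data frame called results: saveRDS() writes one object to a file, and readRDS() brings it back under whatever name you choose.

```r
# A made-up object to save
results <- data.frame(id = 1:3, score = c(4.2, 5.1, 3.8))

# Save the single object to a temporary .rds file
tmp <- tempfile(fileext = ".rds")
saveRDS(results, tmp)

# Read it back in under a different name
restored <- readRDS(tmp)

# Check that the restored copy matches the original
all.equal(results, restored)
```

In your own project you would replace tmp with a filename such as "filename.rds" so the file is saved in your project folder.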
5.2.2.3 RData files (.RData, .rda) — load()
This is R’s own format to store many variables as a single file. RData files can contain multiple R objects at once. Loading them is different from other formats: you use load() without <-, and the objects reappear in your environment automatically with their original names.
load("filename.RData")
Note: Because the object names are baked into the file, you cannot rename them on import the way you can with readRDS(). After loading, check your Environment pane to see what objects have appeared.
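If you would rather check in code than in the Environment pane, the sketch below may help. It saves two made-up objects (survey and cutoff), then shows two ways to see what a file contains: capturing the names that load() returns, and loading into a separate environment (here called scratch, an arbitrary name) so nothing in your workspace is overwritten.

```r
# Two made-up objects saved together in one temporary .RData file
survey <- data.frame(id = 1:3, score = c(4.2, 5.1, 3.8))
cutoff <- 4
tmp <- tempfile(fileext = ".RData")
save(survey, cutoff, file = tmp)

# Remove them so we can see load() bring them back
rm(survey, cutoff)

# load() invisibly returns the names of the objects it restored
loaded_names <- load(tmp)
loaded_names

# To peek inside a file without touching your workspace,
# load it into a separate environment and list its contents
scratch <- new.env()
load(tmp, envir = scratch)
ls(scratch)
```

Loading into a separate environment is a cautious choice when a file comes from someone else, since load() silently replaces any existing objects that share a name.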
5.2.3 Spatial data
5.2.3.1 Vector files (.shp, .geojson, .gpkg) — st_read()
Vector spatial data includes points, lines, and polygons (e.g. country borders, survey locations, river networks). Requires the sf package. Install it once via the Packages tab, then add library(sf) to your library code chunk.
mydata <- st_read("filename.shp") # shapefile
mydata <- st_read("filename.geojson") # GeoJSON
mydata <- st_read("filename.gpkg") # GeoPackage
st_read() works the same way regardless of format; just change the filename and extension.
Note for shapefiles: A shapefile is actually several files sharing the same name (e.g. .shp, .dbf, .shx, .prj). All of them need to be in the same folder. You only type the .shp filename in your code; R finds the rest automatically.
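One way to check that the companion files are present is sketched below using base R only. The helper missing_shp_parts() is hypothetical (not part of sf), and the demonstration uses empty placeholder files in a temporary folder rather than a real shapefile.

```r
# Hypothetical helper: return the component files that are missing
# for a given .shp path (checks the four extensions named above)
missing_shp_parts <- function(shp_path, exts = c("shp", "dbf", "shx", "prj")) {
  base  <- tools::file_path_sans_ext(shp_path)   # path without ".shp"
  parts <- paste0(base, ".", exts)               # expected sibling files
  parts[!file.exists(parts)]                     # keep only the missing ones
}

# Demonstration: a temporary folder containing only two of the four files
demo_dir <- tempfile("shpdemo")
dir.create(demo_dir)
file.create(file.path(demo_dir, c("sites.shp", "sites.dbf")))

# Lists the .shx and .prj paths, which were never created
missing_shp_parts(file.path(demo_dir, "sites.shp"))
```

An empty result means all four files were found. Some shapefiles come with additional optional files, which this sketch does not check.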
5.2.3.2 Raster files (.tif, .nc, .img) — rast()
Raster data is grid-based (e.g. satellite imagery, elevation models, climate surfaces). Requires the terra package. Install it once via the Packages tab, then add library(terra) to your library code chunk.
mydata <- rast("filename.tif") # GeoTIFF (most common)
mydata <- rast("filename.nc") # NetCDF (common for climate data)
If your raster file contains multiple layers (e.g. monthly temperature across 12 months), rast() will load them all as a multi-layer object. You can check how many layers you have with nlyr(mydata).
5.3 Troubleshooting
If you get an error when reading in data, work through this checklist before asking for help.
Is the filename exactly right? R is case-sensitive. Mydata.csv and mydata.csv are different files. Always include the file extension (.csv, .xlsx, .shp, etc.).
Is the file in the right place? The data file needs to be in the same folder as your .Rproj file, not in Downloads or on your Desktop.
Are you running your project? Check the top-right corner of RStudio — it should show your project name, not “Project: (None)”. If it says None, go to File > Open Project and open your .Rproj file.
Have you installed and loaded the right package? read.csv(), read.table(), readRDS(), and load() need no package. The functions below each require one:
| Function | Package |
|---|---|
| read_excel() | readxl |
| st_read() | sf |
| rast() / vect() | terra |
Installing a package (via the Packages tab) only needs to happen once. Loading it with library() needs to happen every session — put your library() calls in your setup code chunk at the top of your script.
For shapefiles: are ALL the component files present? A shapefile is made up of multiple files (.shp, .dbf, .shx, and usually .prj). If any are missing — for example if you only copied the .shp — R will throw an error. Make sure the entire set of files with the same base name are all in your project folder.
Your data loaded but looks wrong? Run head(mydata) or glimpse(mydata) straight after reading in your data (glimpse() comes from the dplyr package; str(mydata) is a base R alternative). This helps you catch problems early. For example, all your data ending up in one column is common with CSVs that use ; instead of , as a separator. If that happens, try:
mydata <- read.csv("filename.csv", sep = ";")
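Several of the checks above can also be run from the console. The sketch below first asks R where it is looking and whether it can see a file ("filename.csv" is a placeholder), then reproduces the one-column problem with a made-up semicolon-separated file so you can see what it looks like and how sep = ";" fixes it.

```r
# Where is R looking, and what can it see there?
getwd()                            # the folder R is currently working in
list.files(pattern = "\\.csv$")    # CSV files R can see in that folder
file.exists("filename.csv")        # TRUE if R can find this file from there

# Reproduce the separator problem with a made-up semicolon-separated file
tmp <- tempfile(fileext = ".csv")
writeLines(c("id;score", "1;4.2", "2;5.1"), tmp)

# Default separator is a comma, so everything lands in one column
wrong <- read.csv(tmp)
ncol(wrong)

# Telling R the real separator gives the expected two columns
right <- read.csv(tmp, sep = ";")
ncol(right)
head(right)
```

If file.exists() returns FALSE, the problem is the filename or the file’s location, not the function you are using to read it.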