Load eBird data files
read.ebd.Rd
Reads a .txt eBird data file and creates a data frame from it, with cases corresponding to lines (rows) and variables to fields (columns) in the file.
The most commonly used types of eBird data files are the eBird Basic Dataset (EBD; which may contain three subtypes of files) and the My Data download (which contains all data associated with a specific eBird account). The two differ in their download file type, column naming format, available columns, etc.
read.ebd and read.mydata import the EBD and My Data files, respectively. Since the EBD contains many columns, not all of which may be required for a given use case, cols_sel can be used to import only a subset of the columns. To see the list of all column names to choose from, run read.ebd(ebd_path, cols_print_only = TRUE).
This function is a wrapper around utils::read.delim(), which is considerably faster than the readr::read_delim() used in auk::read_ebd(). Moreover, unlike the latter, which uses snake case for column names, this function uses uppercase with period separators.
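The column-subsetting idea can be sketched in base R. This is only an illustration under stated assumptions, not the package's actual implementation: read_subset() is a hypothetical helper, and the sample file assumes raw headers in uppercase separated by spaces. It relies on two documented behaviours of utils::read.delim(): check.names = TRUE (the default) converts spaces in headers to periods, and colClasses = "NULL" skips a column while reading.

```r
# Hypothetical helper (not part of the package): import only selected columns
# from a tab-separated file using utils::read.delim().
read_subset <- function(path, cols_sel = "all") {
  # Read the header plus one row to learn the column names; check.names = TRUE
  # (the default) turns "COMMON NAME" into "COMMON.NAME".
  header <- names(utils::read.delim(path, nrows = 1, quote = ""))

  if (identical(cols_sel, "all")) {
    return(utils::read.delim(path, quote = "", stringsAsFactors = FALSE))
  }

  missing_cols <- setdiff(cols_sel, header)
  if (length(missing_cols) > 0) {
    stop("Columns not found: ", paste(missing_cols, collapse = ", "))
  }

  # "NULL" skips a column entirely; NA lets read.delim guess the type.
  col_classes <- ifelse(header %in% cols_sel, NA, "NULL")
  utils::read.delim(path, quote = "", colClasses = col_classes,
                    stringsAsFactors = FALSE)
}

# Small sample file with assumed space-separated uppercase headers
tf <- tempfile(fileext = ".txt")
writeLines(c("SAMPLING EVENT IDENTIFIER\tCOMMON NAME\tOBSERVATION COUNT",
             "S0000001\tIndian Peafowl\t2"), tf)

out <- read_subset(tf, cols_sel = c("SAMPLING.EVENT.IDENTIFIER", "COMMON.NAME"))
names(out)
#> [1] "SAMPLING.EVENT.IDENTIFIER" "COMMON.NAME"
```

Skipping columns at read time, rather than dropping them afterwards, avoids parsing and storing fields that are never used, which matters for files with many columns.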
Usage
read.ebd(path, cols_sel = "all", cols_print_only = FALSE)
read.mydata(
path = "MyEBirdData.csv",
cols_sel = "all",
cols_print_only = FALSE,
cols_style_ebd = FALSE
)
Arguments
- path
character; the path to the downloaded data file (the EBD .txt file for read.ebd(), the My Data .csv file for read.mydata())
- cols_sel
character; vector of column names to be imported from the dataset
- cols_print_only
logical; whether or not to only print the full set of column names
- cols_style_ebd
logical; if TRUE, change column names in My Data to uppercase separated by periods (COLUMN.STYLE), as in read.ebd(). Defaults to FALSE.
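As a rough illustration of the COLUMN.STYLE convention, the renaming can be approximated in base R. The sample column names are assumed for illustration only, and the package's own conversion may handle special characters differently.

```r
# Assumed My Data-style column names, for illustration only
mydata_names <- c("Submission ID", "Common Name", "Scientific Name")

# make.names() replaces spaces with periods; toupper() converts to uppercase
ebd_style <- toupper(make.names(mydata_names))
ebd_style
#> [1] "SUBMISSION.ID"   "COMMON.NAME"     "SCIENTIFIC.NAME"
```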
Value
A data frame (cols_print_only == FALSE), or a character vector of column names (cols_print_only == TRUE)
Examples
# to see list of column names before choosing
test1 <- data.frame(SAMPLING.EVENT.IDENTIFIER = "S0000001", COMMON.NAME = "Indian Peafowl")
tf <- tempfile()
write.table(test1, file = tf, col.names = TRUE, row.names = FALSE, sep = "\t",
            quote = FALSE) # quote = TRUE would surround column names with quotes
read.ebd(tf, cols_print_only = TRUE)
#> [1] "SAMPLING.EVENT.IDENTIFIER" "COMMON.NAME"
# select columns and import data
read.ebd(tf, cols_sel = c("SAMPLING.EVENT.IDENTIFIER", "COMMON.NAME"))
#>   SAMPLING.EVENT.IDENTIFIER    COMMON.NAME
#> 1                  S0000001 Indian Peafowl