Functions for downloading BLS flat files into R
BLSloadR is a packages designed to streamline access to the time
series database downloads from the U.S. Bureau of Labor Statistics, made
available at https://download.bls.gov/pub/time.series/. It is focused on
accessing series that are frequently used by states to get state-level
estimates, but includes the load_bls_dataset() and
bls_overview() functions to provide generalized access to
the other databases at this website within an R environment.
The primary functions in this package all begin with get_ and are listed below:
-get_ces() - This accesses data from the Current
Employment Statistics (CES) program at the state and metropolitan area
levels. This provides employer-based estimates of employment, wages, and
hours worked. This is the “SM” database.
-get_national_ces() - This accesses national data from
the CES program at the national level, which does not include
state-level breakouts. This is the “CE” database.
-get_laus() - This accesses data from the Local Area
Unemployment Statistics (LAUS) program at a regional, state, and several
sub-state levels. This is a localized version of the Current Population
Survey (CPS) which is used to drive household-based estiamtes of
employment and unemployment. This is the “LA” database. Note that
because of the volume of data here, there are several different
geographies that may be specified to pull the appropriate data file from
BLS.
-get_oews() - This access the Occupational Employment
and Wage Statistics (OEWS) data. This data provides survey-based
estimates of employmen and wages by occupation at state and sub-state
levels. This is the “OE” database. Note that only current-year data is
available for OEWS in this database, as it is not built as a time
series.
-get_salt() - This data is not actually loated within
the time.series folder, but instead is sourced from
https://www.bls.gov/lau/stalt.htm. These Alternative Measures of Labor
Underutilization for States are 12-month averages built from CPS data
which provide more expansive or restrictive definitions of unemployment
to measure the labor force, known as U1 through U6. This function also
includes the optional geometry argument. If set to TRUE, this will use
tigris::states() and tigris::shift_geometry()
to provide state polygons for convenient mapping of the output.
These optional helper functions can aid the user of this package by providing ways to summarize and explore all the time.series databases. These functions are a bit different than the specific functions above, as they implement a general way to merge and import BLS time.series databases, but do not manually specify the data, series, and lookup files to be joined. As such, they return a bls_data_collection object which includes the joined data as well as diagnostic results including dropped columns, unexpected join results, and other tools to help review the data before use. Further, when multiple data or series files are present, the user is prompted to choose one, so these tools are not suitable for a typical piped script.
bls_overview() - this function utilizes the standard
structure of the time.series databases, which has a simple text file
explaining the database structure that always follows the structure
id.txt where id is the two-character database identification code.
load_bls_dataset() - this function attempts to read and
join all the relevant files in a BLS database, and will sometimes prompt
the user for additional input. For example, many databases have multiple
data files available (such as “AllItems” and “Current”) and may have old
series files as well (to manage historical coding changes). Because
these joins are performed automatically, the object returned by this
function is a more robust diagnostic object included the joined data
table as well as information about the joins. Use Caution! BLS data
structures are not always consistent. There may be anomalies in the
structure of individual databases, such as missing column headers, that
will degrade the ability of this function to read the data.