Package 'MassWateR' reference manual

Title:	Quality Control and Analysis of Massachusetts Water Quality Data
Description:	Methods for quality control and exploratory analysis of surface water quality data collected in Massachusetts, USA. Functions are developed to facilitate data formatting for the Water Quality Exchange Network <https://www.epa.gov/waterdata/water-quality-data-upload-wqx> and reporting of data quality objectives to state agencies. Quality control methods are from Massachusetts Department of Environmental Protection (2020) <https://www.mass.gov/orgs/massachusetts-department-of-environmental-protection>.
Authors:	Marcus Beck [aut, cre], Jill Carr [aut], Ben Wetherill [aut]
Maintainer:	Marcus Beck
License:	CC0
Version:	2.1.5
Built:	2025-03-21 12:40:21 UTC
Source:	https://github.com/massbays-tech/MassWateR

Analyze trends by date in results file

Description

Analyze trends by date in results file

Usage

anlzMWRdate(
  res = NULL,
  param,
  acc = NULL,
  sit = NULL,
  fset = NULL,
  thresh,
  group = c("site", "locgroup", "all"),
  threshlab = NULL,
  threshcol = "tan",
  site = NULL,
  resultatt = NULL,
  locgroup = NULL,
  dtrng = NULL,
  ptsize = 2,
  repel = FALSE,
  labsize = 3,
  expand = c(0.05, 0.1),
  confint = FALSE,
  palcol = "Set2",
  yscl = "auto",
  sumfun = yscl,
  colleg = FALSE,
  ttlsize = 1.2,
  bssize = 11,
  runchk = TRUE,
  warn = TRUE
)

Arguments

`res`	character string of path to the results file or `data.frame` for results returned by `readMWRresults`
`param`	character string of the parameter to plot, must conform to entries in the `"Simple Parameter"` column of `paramsMWR`
`acc`	character string of path to the data quality objectives file for accuracy or `data.frame` returned by `readMWRacc`
`sit`	optional character string of path to the site metadata file or `data.frame` of site metadata returned by `readMWRsites`, required if `locgroup` is not `NULL`
`fset`	optional list of inputs with elements named `res`, `acc`, `frecom`, `sit`, or `wqx`; if provided, it overrides the other arguments
`thresh`	character indicating if relevant freshwater or marine threshold lines are included, one of `"fresh"`, `"marine"`, or `"none"`, or a single numeric value to override the values included with the package
`group`	character indicating whether the results are grouped by site (default), combined across location groups, or combined across sites, see details
`threshlab`	optional character string indicating legend label for the threshold, required only if `thresh` is numeric
`threshcol`	character indicating color of threshold lines if available
`site`	character string of sites to include, default all
`resultatt`	character string of result attributes to plot, default all
`locgroup`	character string of location groups to plot from the `"Location Group"` column in the site metadata file, optional and only if `sit` is not `NULL`
`dtrng`	character string of length two for the date ranges as YYYY-MM-DD, default all
`ptsize`	numeric indicating size of the points
`repel`	logical indicating if overlapping site labels are offset, default `FALSE`
`labsize`	numeric indicating font size for the site labels, only if `group = "site"` or `group = "locgroup"`
`expand`	numeric of length two indicating expansion proportions on the x-axis to include labels outside of the plot range if `repel = FALSE` and `group = "site"` or `group = "locgroup"`
`confint`	logical indicating if confidence intervals are shown, only applies if data are summarized using `group` as `"locgroup"` or `"all"`
`palcol`	character string indicating the color palette for points and lines from RColorBrewer, see details
`yscl`	character indicating one of `"auto"` (default), `"log"`, or `"linear"`, see details
`sumfun`	character indicating one of `"auto"`, `"mean"`, `"geomean"`, `"median"`, `"min"`, or `"max"`, see details
`colleg`	logical indicating if a color legend for sites or location groups is included if `group = "site"` or `group = "locgroup"`
`ttlsize`	numeric value indicating font size of the title relative to other text in the plot
`bssize`	numeric for overall plot text scaling, passed to `theme_minimal`
`runchk`	logical to run data checks with `checkMWRresults` or `checkMWRacc`, applies only if `res` or `acc` are file paths
`warn`	logical to return warnings to the console (default)

Details

Results are shown for the selected parameter as continuous line plots over time. Specifying group = "site" will plot a separate line for each site. Specifying group = "locgroup" will summarize results across sites in the locgroup argument based on the value passed to sumfun or yscl if no value is passed to sumfun. The site metadata file must be passed to the `sit` argument to use this option. Specifying group = "all" will summarize results across sites for each date based on the value passed to sumfun or yscl if no value is passed to sumfun. Summarized results will include confidence intervals if confint = TRUE and they can be calculated (i.e., more than one point is used in the summary and data are summarized using group as "locgroup" or "all").

Threshold lines applicable to marine or freshwater environments can be included in the plot by using the thresh argument. These thresholds are specific to each parameter and can be found in the thresholdMWR file. Threshold lines are plotted only for those parameters with entries in thresholdMWR and only if the value in `Result Unit` matches those in thresholdMWR. The threshold lines can be suppressed by setting thresh = 'none'. A user-supplied numeric value can also be used for the thresh argument to override the default values. An appropriate label must also be supplied to threshlab if thresh is numeric.

Any acceptable color palette from RColorBrewer for the points and lines can be used for palcol, which is passed to the palette argument in scale_color_brewer. These could include any of the qualitative color palettes, e.g., "Set1", "Set2", etc. The continuous and diverging palettes will also work, but may return color scales for points and lines that are difficult to distinguish. The palcol argument does not apply if group = "all".

The y-axis scaling as arithmetic (linear) or logarithmic can be set with the yscl argument. If yscl = "auto" (default), the scaling is determined automatically from the data quality objective file for accuracy, i.e., parameters with "log" in any of the columns are plotted on log10-scale, otherwise arithmetic. Setting yscl = "linear" or yscl = "log" will set the axis as linear or log10-scale, respectively, regardless of the information in the data quality objective file for accuracy.
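
As a rough illustration of the yscl = "auto" rule described above, the base R sketch below checks whether any entry in a parameter's row of the accuracy file mentions "log". The data frame `acc_row`, its column names and values, and the helper `guess_yscl` are all hypothetical and are not part of the package; the package's internal logic may differ.

```r
# Hypothetical single-parameter row from a DQO accuracy table;
# column names and entries are illustrative only.
acc_row <- data.frame(
  Parameter = "E.coli",
  FieldDup = "<= 30% log",
  LabDup = "<= 30% log",
  stringsAsFactors = FALSE
)

# Return "log" if any entry in the row mentions "log", otherwise "linear".
guess_yscl <- function(row) {
  entries <- as.character(unlist(row, use.names = FALSE))
  if (any(grepl("log", entries, fixed = TRUE))) "log" else "linear"
}

guess_yscl(acc_row)
```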

Similarly, the data will be summarized appropriately for group (only applies if group is not "site") based on the value passed to sumfun. The default if no value is provided to sumfun is to use the appropriate summary based on the value provided to yscl. If yscl = "auto" (default), then sumfun = "auto", and the mean or geometric mean is used for the summary based on information in the data quality objective file for accuracy. Using yscl = "linear" or yscl = "log" will default to the mean or geometric mean summary, respectively, if no value is provided to sumfun. Any other appropriate value passed to sumfun will override the value passed to yscl. Valid summary functions for sumfun include "auto", "mean", "geomean", "median", "min", or "max".
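
To make the difference between the "mean" and "geomean" summaries concrete, the sketch below compares the two on made-up values in base R. It assumes the conventional definition of the geometric mean (the exponentiated mean of the logged values); the helper name `geomean` is introduced here for illustration and the package may compute the summary differently.

```r
# Geometric mean: exponentiate the mean of the log-transformed values.
# Assumes all values are positive.
geomean <- function(x) exp(mean(log(x)))

x <- c(10, 100, 1000)  # made-up result values

mean(x)     # arithmetic mean, pulled upward by the largest value
geomean(x)  # geometric mean, closer to the middle of log-spaced values
```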

Any entries in resdat in the "Result Value" column as "BDL" or "AQL" are replaced with appropriate values in the "Quantitation Limit" column, if present, otherwise the "MDL" or "UQL" columns from the data quality objectives file for accuracy are used. Values as "BDL" use one half of the appropriate limit.
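
The substitution described above can be sketched in base R as follows. The data are made up, and the sketch assumes that "AQL" entries take the limit unchanged while "BDL" entries take one half of it; it uses only the "Quantitation Limit" column and omits the fallback to the "MDL" or "UQL" columns of the accuracy file.

```r
# Made-up results using the column names mentioned in the text.
res <- data.frame(
  `Result Value` = c("4.2", "BDL", "AQL"),
  `Quantitation Limit` = c(NA, 0.5, 20),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

val <- res[["Result Value"]]
lim <- res[["Quantitation Limit"]]

# Numeric entries convert directly; "BDL" and "AQL" become NA here
# and are filled in from the limit column below.
num <- suppressWarnings(as.numeric(val))
num[val == "BDL"] <- lim[val == "BDL"] / 2  # half the limit
num[val == "AQL"] <- lim[val == "AQL"]      # the limit itself (assumed)
num
```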

Value

A ggplot object that can be further modified.

Examples

# results data path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# results data
resdat <- readMWRresults(respth)

# accuracy path
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', 
     package = 'MassWateR')

# accuracy data
accdat <- readMWRacc(accpth)

# site data path
sitpth <- system.file('extdata/ExampleSites.xlsx', package = 'MassWateR')

# site data
sitdat <- readMWRsites(sitpth)

# select sites
anlzMWRdate(res = resdat, param = 'DO', acc = accdat, group = 'site', thresh = 'fresh',
     site = c("ABT-026", "ABT-077"))

Analyze results with maps

Description

Analyze results with maps

Usage

anlzMWRmap(
  res = NULL,
  param,
  acc = NULL,
  sit = NULL,
  fset = NULL,
  site = NULL,
  resultatt = NULL,
  locgroup = NULL,
  dtrng = NULL,
  ptsize = 4,
  repel = TRUE,
  labsize = 3,
  palcol = "Greens",
  palcolrev = FALSE,
  sumfun = "auto",
  crs = 4326,
  zoom = 11,
  addwater = "medium",
  watercol = "lightblue",
  maptype = NULL,
  buffdist = 2,
  scaledist = "km",
  northloc = "tl",
  scaleloc = "br",
  latlon = TRUE,
  ttlsize = 1.2,
  bssize = 11,
  runchk = TRUE,
  warn = TRUE
)

Arguments

`res`	character string of path to the results file or `data.frame` for results returned by `readMWRresults`
`param`	character string of the parameter to plot, must conform to entries in the `"Simple Parameter"` column of `paramsMWR`
`acc`	character string of path to the data quality objectives file for accuracy or `data.frame` returned by `readMWRacc`
`sit`	character string of path to the site metadata file or `data.frame` of site metadata returned by `readMWRsites`
`fset`	optional list of inputs with elements named `res`, `acc`, `frecom`, `sit`, or `wqx`; if provided, it overrides the other arguments
`site`	character string of sites to include, default all
`resultatt`	character string of result attributes to plot, default all
`locgroup`	character string of location groups to plot from the `"Location Group"` column in the site metadata file, default all
`dtrng`	character string of length two for the date ranges as YYYY-MM-DD, default all
`ptsize`	numeric for size of the points, use a negative value to omit the points
`repel`	logical indicating if overlapping site labels are offset
`labsize`	numeric for size of the site labels
`palcol`	character string indicating the color palette to be used from RColorBrewer, see details
`palcolrev`	logical indicating if color palette in `palcol` is reversed
`sumfun`	character indicating one of `"auto"` (default), `"mean"`, `"geomean"`, `"median"`, `"min"`, or `"max"`, see details
`crs`	numeric as a four-digit EPSG number for the coordinate reference system, see details
`zoom`	numeric indicating resolution of the base map, see details
`addwater`	character string as `"low"`, `"medium"` (default), `"high"`, or `NULL` (to suppress) to include water features with varying detail from the National Hydrography dataset, see details
`watercol`	character string of color for water objects if `addwater` is not `NULL`
`maptype`	character string indicating the basemap type, see details
`buffdist`	numeric for buffer around the bounding box for the selected sites in kilometers, see details
`scaledist`	character string indicating distance unit for the scale bar, `"km"` or `"mi"`
`northloc`	character string indicating location of the north arrow, see details
`scaleloc`	character string indicating location of the scale bar, see details
`latlon`	logical to include latitude and longitude labels on the plot, default `TRUE`
`ttlsize`	numeric value indicating font size of the title relative to other text in the plot
`bssize`	numeric for overall plot text scaling, passed to `theme_gray`
`runchk`	logical to run data checks with `checkMWRresults`, `checkMWRacc`, or `checkMWRsites`, applies only if `res`, `acc`, or `sit` are file paths
`warn`	logical to return warnings to the console (default)

Details

This function creates a map of summarized results for a selected parameter at each monitoring site. By default, all dates for the parameter are averaged. Options to filter by site, date range, and result attribute are provided. Only sites with spatial information in the site metadata file are plotted and a warning is returned for those that do not have this information. The site labels are also plotted next to each point. The labels can be suppressed by setting labsize = NULL.

Any acceptable color palette from RColorBrewer can be used for palcol, which is passed to the palette argument in scale_fill_distiller. These could include any of the sequential color palettes, e.g., "Greens", "Blues", etc. The diverging and qualitative palettes will also work, but may return uninterpretable color scales. The palette can be reversed by setting palcolrev = TRUE.

The default value for crs is EPSG 4326 for the WGS 84 projection in decimal degrees. The crs argument is passed to st_as_sf and any acceptable CRS appropriate for the data can be used.

The results shown on the map represent the parameter summary for each site within the date range provided by dtrng. If sumfun = "auto" (default), the type of mean is determined automatically from the data quality objective file for accuracy, i.e., parameters with "log" in any of the columns are summarized with the geometric mean, otherwise the arithmetic mean is used. Any other valid summary function will be applied if passed to sumfun ("mean", "geomean", "median", "min", "max"), regardless of the information in the data quality objective file for accuracy.

Using addwater = "medium" (default) will include lines and polygons of natural water bodies defined using the National Hydrography Dataset (NHD). The level of detail can be changed to low or high using addwater = "low" or addwater = "high", respectively. Use addwater = NULL to not show any water features.

A base map can be plotted using the maptype argument. The zoom value specifies the resolution of the map. Use higher values to download map tiles with greater resolution, although this increases the download time. The maptype argument describes the type of base map to download. Acceptable options include "OpenStreetMap", "OpenStreetMap.DE", "OpenStreetMap.France", "OpenStreetMap.HOT", "OpenTopoMap", "Esri.WorldStreetMap", "Esri.DeLorme", "Esri.WorldTopoMap", "Esri.WorldImagery", "Esri.WorldTerrain", "Esri.WorldShadedRelief", "Esri.OceanBasemap", "Esri.NatGeoWorldMap", "Esri.WorldGrayCanvas", "CartoDB.Positron", "CartoDB.PositronNoLabels", "CartoDB.PositronOnlyLabels", "CartoDB.DarkMatter", "CartoDB.DarkMatterNoLabels", "CartoDB.DarkMatterOnlyLabels", "CartoDB.Voyager", "CartoDB.VoyagerNoLabels", or "CartoDB.VoyagerOnlyLabels". Use maptype = NULL to suppress the base map.

The area around the summarized points can be increased or decreased using the buffdist argument. This creates a buffered area around the bounding box for the points, where the units are kilometers.

A north arrow and scale bar are also placed on the map as defined by the northloc and scaleloc arguments. The placement for both can be chosen as "tl", "tr", "bl", or "br" for top-left, top-right, bottom-left, or bottom-right respectively. Setting either of the arguments to NULL will suppress the placement on the map.

Value

A ggplot object that can be further modified.

Examples

# results data path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# results data
resdat <- readMWRresults(respth)

# accuracy path
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', package = 'MassWateR')

# accuracy data
accdat <- readMWRacc(accpth)

# site data path
sitpth <- system.file('extdata/ExampleSites.xlsx', package = 'MassWateR')

# site data
sitdat <- readMWRsites(sitpth)


# map with NHD water bodies
anlzMWRmap(res = resdat, param = 'DO', acc = accdat, sit = sitdat, addwater = 'medium')

Analyze outliers in results file

Description

Analyze outliers in results file

Usage

anlzMWRoutlier(
  res = NULL,
  param,
  acc = NULL,
  fset = NULL,
  type = c("box", "jitterbox", "jitter"),
  group,
  dtrng = NULL,
  repel = TRUE,
  outliers = FALSE,
  labsize = 3,
  fill = "lightgrey",
  alpha = 0.8,
  width = 0.8,
  yscl = "auto",
  ttlsize = 1.2,
  bssize = 11,
  runchk = TRUE,
  warn = TRUE
)

Arguments

`res`	character string of path to the results file or `data.frame` for results returned by `readMWRresults`
`param`	character string of the parameter to plot, must conform to entries in the `"Simple Parameter"` column of `paramsMWR`
`acc`	character string of path to the data quality objectives file for accuracy or `data.frame` returned by `readMWRacc`
`fset`	optional list of inputs with elements named `res`, `acc`, `frecom`, `sit`, or `wqx`; if provided, it overrides the other arguments
`type`	character indicating `"box"`, `"jitterbox"`, or `"jitter"`, see details
`group`	character indicating whether the summaries are grouped by month, site, or week of year
`dtrng`	character string of length two for the date ranges as YYYY-MM-DD, optional
`repel`	logical indicating if overlapping outlier labels are offset
`outliers`	logical indicating if outliers are returned to the console instead of plotting
`labsize`	numeric indicating font size for the outlier labels
`fill`	character string indicating fill color for boxplots
`alpha`	numeric from 0 to 1 indicating transparency of fill color
`width`	numeric for width of boxplots
`yscl`	character indicating one of `"auto"` (default), `"log"`, or `"linear"`, see details
`ttlsize`	numeric value indicating font size of the title relative to other text in the plot
`bssize`	numeric for overall plot text scaling, passed to `theme_minimal`
`runchk`	logical to run data checks with `checkMWRresults` or `checkMWRacc`, applies only if `res` or `acc` are file paths
`warn`	logical to return warnings to the console (default)

Details

Outliers are defined following the standard ggplot definition as observations that fall more than 1.5 times the inter-quartile range beyond the first or third quartile of each boxplot. The data frame returned if outliers = TRUE may vary based on the boxplot groupings defined by group.
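
The 1.5 times inter-quartile range rule can be checked by hand with base R. The sketch below uses made-up values for a single boxplot group and the default quantile method; it illustrates the rule only and is not the package's code.

```r
x <- c(1, 2, 3, 4, 5, 100)  # made-up values for one boxplot group

# First and third quartiles and the inter-quartile range.
q <- quantile(x, c(0.25, 0.75), names = FALSE)
iqr <- q[2] - q[1]

# Flag values more than 1.5 * IQR below the first or above the third quartile.
is_out <- x < q[1] - 1.5 * iqr | x > q[2] + 1.5 * iqr
x[is_out]
```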

Specifying type = "box" (default) will produce standard boxplots. Specifying type = "jitterbox" will produce boxplots with non-outlier observations jittered on top. Specifying type = "jitter" will suppress the boxplots and show only the jittered points and the outliers.

Specifying group = "week" will group the samples by week of year using an integer specifying the week. Note that the month and day on which a given week starts can differ between years, so an integer week number is the only way to compare summaries if the results data span multiple years.
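
As an illustration of an integer week of year, the base R sketch below assigns a week number to the same calendar day in two made-up sampling years. It assumes weeks are counted as complete seven-day periods since January 1 plus one; the package may use a different convention (for example, ISO weeks), so treat the rule here as an assumption.

```r
# Same calendar day in two made-up sampling years.
d <- as.Date(c("2021-05-10", "2022-05-10"))

# Zero-based day of year split into seven-day blocks, plus one.
wk <- as.POSIXlt(d)$yday %/% 7 + 1
wk
```

Because both dates map to the same integer, results from different years can be pooled into one group.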

Value

A ggplot object that can be further modified if outliers = FALSE, otherwise a data frame of outliers is returned.

Examples

# results data path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# results data
resdat <- readMWRresults(respth)

# accuracy path
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', 
     package = 'MassWateR')

# accuracy data
accdat <- readMWRacc(accpth)

# outliers by month
anlzMWRoutlier(res = resdat, param = 'DO', acc = accdat, group = 'month')

# outliers by site
anlzMWRoutlier(res = resdat, param = 'DO', acc = accdat, group = 'site')

# outliers by site, May through July 2022 only
anlzMWRoutlier(res = resdat, param = 'DO', acc = accdat, group = 'site', 
     dtrng = c('2022-05-01', '2022-07-31'))

# outliers by month, type as jitterbox
anlzMWRoutlier(res = resdat, param = 'DO', acc = accdat, group = 'month', type = 'jitterbox')

# outliers by month, type as jitter
anlzMWRoutlier(res = resdat, param = 'DO', acc = accdat, group = 'month', type = 'jitter')

# data frame output
anlzMWRoutlier(res = resdat, param = 'DO', acc = accdat, group = 'month', outliers = TRUE)

Analyze outliers in results file for all parameters

Description

Analyze outliers in results file for all parameters

Usage

anlzMWRoutlierall(
  res = NULL,
  acc = NULL,
  fset = NULL,
  fig_height = 4,
  fig_width = 8,
  format = c("word", "png", "zip"),
  output_dir,
  output_file = NULL,
  type = c("box", "jitterbox", "jitter"),
  group,
  dtrng = NULL,
  repel = TRUE,
  outliers = FALSE,
  labsize = 3,
  fill = "lightgrey",
  alpha = 0.8,
  width = 0.8,
  yscl = "auto",
  ttlsize = 1.2,
  bssize = 11,
  runchk = TRUE,
  warn = TRUE
)

Arguments

`res`	character string of path to the results file or `data.frame` for results returned by `readMWRresults`
`acc`	character string of path to the data quality objectives file for accuracy or `data.frame` returned by `readMWRacc`
`fset`	optional list of inputs with elements named `res`, `acc`, `frecom`, `sit`, or `wqx`; if provided, it overrides the other arguments
`fig_height`	numeric for plot heights in inches
`fig_width`	numeric for plot width in inches
`format`	character string indicating if results are placed in a word file, as separate png files, or as a zipped file of separate png files in `output_dir`
`output_dir`	character string of the output directory for the results
`output_file`	optional character string for the file name if `format = "word"`
`type`	character indicating `"box"`, `"jitterbox"`, or `"jitter"`, see details
`group`	character indicating whether the summaries are grouped by month, site, or week of year
`dtrng`	character string of length two for the date ranges as YYYY-MM-DD, optional
`repel`	logical indicating if overlapping outlier labels are offset
`outliers`	logical indicating if outliers are returned to the console instead of plotting
`labsize`	numeric indicating font size for the outlier labels
`fill`	character string indicating fill color for boxplots
`alpha`	numeric from 0 to 1 indicating transparency of fill color
`width`	numeric for width of boxplots
`yscl`	character indicating one of `"auto"` (default), `"log"`, or `"linear"`, see details
`ttlsize`	numeric value indicating font size of the title relative to other text in the plot
`bssize`	numeric for overall plot text scaling, passed to `theme_minimal`
`runchk`	logical to run data checks with `checkMWRresults` or `checkMWRacc`, applies only if `res` or `acc` are file paths
`warn`	logical to return warnings to the console (default)

Details

This function is a wrapper to anlzMWRoutlier to create plots for all parameters with appropriate data in the water quality monitoring results.

Value

A word document named outlierall.docx (or the name passed to output_file) if format = "word", separate png files for each parameter if format = "png", or a zipped file of the separate png files if format = "zip" will be saved in the directory specified by output_dir.

Examples

# results data path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# results data
resdat <- readMWRresults(respth)

# accuracy path
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', 
     package = 'MassWateR')

# accuracy data
accdat <- readMWRacc(accpth)


# create word output
anlzMWRoutlierall(resdat, accdat, group = 'month', format = 'word', output_dir = tempdir())

# create png output
anlzMWRoutlierall(resdat, accdat, group = 'month', format = 'png', output_dir = tempdir())

# create zipped png output
anlzMWRoutlierall(resdat, accdat, group = 'month', format = 'zip', output_dir = tempdir())

Analyze seasonal trends in results file

Description

Analyze seasonal trends in results file

Usage

anlzMWRseason(
  res = NULL,
  param,
  acc = NULL,
  sit = NULL,
  fset = NULL,
  thresh,
  group = c("month", "week"),
  type = c("box", "jitterbox", "bar", "jitterbar", "jitter"),
  threshlab = NULL,
  threshcol = "tan",
  site = NULL,
  resultatt = NULL,
  locgroup = NULL,
  dtrng = NULL,
  confint = FALSE,
  fill = "lightblue",
  alpha = 0.8,
  width = 0.8,
  yscl = "auto",
  sumfun = yscl,
  ttlsize = 1.2,
  bssize = 11,
  runchk = TRUE,
  warn = TRUE
)

Arguments

`res`	character string of path to the results file or `data.frame` for results returned by `readMWRresults`
`param`	character string of the parameter to plot, must conform to entries in the `"Simple Parameter"` column of `paramsMWR`
`acc`	character string of path to the data quality objectives file for accuracy or `data.frame` returned by `readMWRacc`
`sit`	optional character string of path to the site metadata file or `data.frame` of site metadata returned by `readMWRsites`, required if `locgroup` is not `NULL`
`fset`	optional list of inputs with elements named `res`, `acc`, `frecom`, `sit`, or `wqx`; if provided, it overrides the other arguments
`thresh`	character indicating if relevant freshwater or marine threshold lines are included, one of `"fresh"`, `"marine"`, or `"none"`, or a single numeric value to override the values included with the package
`group`	character indicating whether the summaries are grouped by month (default) or week of year
`type`	character indicating `"box"`, `"jitterbox"`, `"bar"`, `"jitterbar"` or `"jitter"`, see details
`threshlab`	optional character string indicating legend label for the threshold, required only if `thresh` is numeric
`threshcol`	character indicating color of threshold lines if available
`site`	character string of sites to include, default all
`resultatt`	character string of result attributes to plot, default all
`locgroup`	character string of location groups to plot from the `"Location Group"` column in the site metadata file, optional and only if `sit` is not `NULL`
`dtrng`	character string of length two for the date ranges as YYYY-MM-DD, default all
`confint`	logical indicating if confidence intervals are shown, only applies if `type = "bar"`
`fill`	character string indicating fill color for boxplots or barplots
`alpha`	numeric from 0 to 1 indicating transparency of fill color
`width`	numeric for width of boxplots or barplots
`yscl`	character indicating one of `"auto"` (default), `"log"`, or `"linear"`, see details
`sumfun`	character indicating one of `"auto"`, `"mean"`, `"geomean"`, `"median"`, `"min"`, or `"max"`, see details
`ttlsize`	numeric value indicating font size of the title relative to other text in the plot
`bssize`	numeric for overall plot text scaling, passed to `theme_minimal`
`runchk`	logical to run data checks with `checkMWRresults` or `checkMWRacc`, applies only if `res` or `acc` are file paths
`warn`	logical to return warnings to the console (default)

Details

Summaries of a parameter are shown as boxplots if type = "box" or as barplots if type = "bar". Points can be jittered over the boxplots by setting type = "jitterbox" or jittered over the barplots by setting type = "jitterbar". Setting type = "jitter" will show only the jittered points. For type = "bar" or type = "jitterbar", 95% confidence intervals can also be shown if confint = TRUE and they can be estimated (i.e., more than one result value per bar and sumfun is "auto", "mean", or "geomean").
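
To show why more than one result value per bar is needed, the sketch below computes a 95% confidence interval for a mean in base R using made-up values. It assumes a conventional t-based interval; the package may estimate the interval differently, so this is an illustration of the idea rather than its implementation.

```r
x <- c(6, 7, 8, 9, 10)  # made-up result values for one bar

n <- length(x)
se <- sd(x) / sqrt(n)  # sd() is NA when n is 1, so no interval is possible

# Mean plus or minus the t critical value times the standard error.
ci <- mean(x) + c(-1, 1) * qt(0.975, df = n - 1) * se
ci
```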

If type is "bar" or "jitterbar", the data will be summarized based on the value passed to sumfun. The default if no value is provided to sumfun is to use the appropriate summary based on the value provided to yscl. If yscl = "auto" (default), then sumfun = "auto", and the mean or geometric mean is used for the summary based on information in the data quality objective file for accuracy. Using yscl = "linear" or yscl = "log" will default to the mean or geometric mean summary, respectively, if no value is provided to sumfun. Any other appropriate value passed to sumfun will override the value passed to yscl. Valid summary functions for sumfun include "auto", "mean", "geomean", "median", "min", or "max".

Value

A ggplot object that can be further modified.

Examples

# results data path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# results data
resdat <- readMWRresults(respth)

# accuracy path
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', 
     package = 'MassWateR')

# accuracy data
accdat <- readMWRacc(accpth)

# site data path
sitpth <- system.file('extdata/ExampleSites.xlsx', package = 'MassWateR')

# site data
sitdat <- readMWRsites(sitpth)

# seasonal trends by month, boxplot
anlzMWRseason(res = resdat, param = 'DO', acc = accdat, thresh = 'fresh', group = 'month', 
     type = 'box')

# seasonal trends by week, boxplot
anlzMWRseason(res = resdat, param = 'DO', acc = accdat, thresh = 'fresh', group = 'week', 
     type = 'box')

# seasonal trends by month, May to July only
anlzMWRseason(res = resdat, param = 'DO', acc = accdat, thresh = 'fresh', group = 'month', 
     type = 'bar', dtrng = c('2022-05-01', '2022-07-31'))
     
# seasonal trends by month, barplot
anlzMWRseason(res = resdat, param = 'DO', acc = accdat, thresh = 'fresh', group = 'month', 
     type = 'bar')

# seasonal trends by week, barplot
anlzMWRseason(res = resdat, param = 'DO', acc = accdat, thresh = 'fresh', group = 'week', 
     type = 'bar')
     
# seasonal trends by location group, requires sitdat
anlzMWRseason(res = resdat, param = 'DO', acc = accdat, sit = sitdat, thresh = 'fresh', 
     group = 'month', type = 'box', locgroup = 'Assabet')

Analyze data by sites in results file

Description

Analyze data by sites in results file

Usage

anlzMWRsite(
  res = NULL,
  param,
  acc = NULL,
  sit = NULL,
  fset = NULL,
  type = c("box", "jitterbox", "bar", "jitterbar", "jitter"),
  thresh,
  threshlab = NULL,
  threshcol = "tan",
  site = NULL,
  resultatt = NULL,
  locgroup = NULL,
  dtrng = NULL,
  confint = FALSE,
  fill = "lightgreen",
  alpha = 0.8,
  width = 0.8,
  yscl = "auto",
  sumfun = yscl,
  byresultatt = FALSE,
  ttlsize = 1.2,
  bssize = 11,
  runchk = TRUE,
  warn = TRUE
)

Arguments

`res`	character string of path to the results file or `data.frame` for results returned by `readMWRresults`
`param`	character string of the parameter to plot, must conform to entries in the `"Simple Parameter"` column of `paramsMWR`
`acc`	character string of path to the data quality objectives file for accuracy or `data.frame` returned by `readMWRacc`
`sit`	optional character string of path to the site metadata file or `data.frame` of site metadata returned by `readMWRsites`, required if `locgroup` is not `NULL`
`fset`	optional list of inputs with elements named `res`, `acc`, `frecom`, `sit`, or `wqx` overrides the other arguments
`type`	character indicating `"box"`, `"jitterbox"`, `"bar"`, `"jitterbar"` or `"jitter"`, see details
`thresh`	character indicating if relevant freshwater or marine threshold lines are included, one of `"fresh"`, `"marine"`, or `"none"`, or a single numeric value to override the values included with the package
`threshlab`	optional character string indicating legend label for the threshold, required only if `thresh` is numeric
`threshcol`	character indicating color of threshold lines if available
`site`	character string of sites to include, default all
`resultatt`	character string of result attributes to plot, default all
`locgroup`	character string of location groups to plot from the `"Location Group"` column in the site metadata file, optional and only if `sit` is not `NULL`
`dtrng`	character string of length two for the date ranges as YYYY-MM-DD, default all
`confint`	logical indicating if confidence intervals are shown, only applies if `type` is `"bar"` or `"jitterbar"`
`fill`	character string indicating fill color for boxplots or barplots
`alpha`	numeric from 0 to 1 indicating transparency of fill color
`width`	numeric for width of boxplots or barplots
`yscl`	character indicating one of `"auto"` (default), `"log"`, or `"linear"`, see details
`sumfun`	character indicating one of `"auto"`, `"mean"`, `"geomean"`, `"median"`, `"min"`, or `"max"`, see details
`byresultatt`	logical indicating if the plot has sites grouped separately by result attributes, see details
`ttlsize`	numeric value indicating font size of the title relative to other text in the plot
`bssize`	numeric for overall plot text scaling, passed to `theme_minimal`
`runchk`	logical to run data checks with `checkMWRresults` or `checkMWRacc`, applies only if `res` or `acc` are file paths
`warn`	logical to return warnings to the console (default)

Details

Summaries of a parameter for each site are shown as boxplots if type = "box" or as barplots if type = "bar". Points can be jittered over the boxplots by setting type = "jitterbox" or jittered over the barplots by setting type = "jitterbar". Setting type = "jitter" will show only the jittered points. For type = "bar" or type = "jitterbar", 95% confidence intervals can also be shown if confint = TRUE and they can be estimated (i.e., more than one result value per bar and sumfun is "auto", "mean", or "geomean").

The byresultatt argument can be used to group sites separately by result attributes. For example, sites with E. coli samples can be grouped by "Dry" or "Wet" conditions if present in the "Result Attribute" column. Filtering by sites first using the site argument is advised to reduce the amount of data that are plotted. The grouping can be filtered further by passing appropriate values in the "Result Attribute" column to the resultatt argument. Note that specifying result attributes with resultatt and setting byresultatt = FALSE will filter the plot data by the result attributes but will not plot the results separately.
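
The difference between filtering with resultatt and grouping with byresultatt can be sketched in base R on a toy data frame. The column names and values below are hypothetical and the code is not the package implementation; it only mimics the two behaviors.

```r
# hypothetical results; column names and values are illustrative only
dat <- data.frame(
  site = c("ABT-062", "ABT-062", "ABT-077", "ABT-077"),
  attribute = c("Dry", "Wet", "Dry", "Wet"),
  value = c(120, 840, 95, 610)
)

# like resultatt = "Dry" with byresultatt = FALSE:
# the data are filtered, then summarized once per site
dry <- dat[dat$attribute == "Dry", ]
by_site <- aggregate(value ~ site, data = dry, FUN = mean)

# like byresultatt = TRUE:
# one summary for each combination of site and result attribute
by_site_att <- aggregate(value ~ site + attribute, data = dat, FUN = mean)
```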

Value

A ggplot object that can be further modified.

Examples

# results data path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# results data
resdat <- readMWRresults(respth)

# accuracy path
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', 
     package = 'MassWateR')

# accuracy data
accdat <- readMWRacc(accpth)

# site data path
sitpth <- system.file('extdata/ExampleSites.xlsx', package = 'MassWateR')

# site data
sitdat <- readMWRsites(sitpth)

# site trends, boxplot
anlzMWRsite(res = resdat, param = 'DO', acc = accdat, type = 'box', thresh = 'fresh')

# site trends, barplot
anlzMWRsite(res = resdat, param = 'DO', acc = accdat, type = 'bar', thresh = 'fresh')

# site trends, May to July only
anlzMWRsite(res = resdat, param = 'DO', acc = accdat, type = 'box', thresh = 'fresh',
     dtrng = c('2022-05-01', '2022-07-31'))
     
# grouping by result attribute
anlzMWRsite(res = resdat, param = 'DO', acc = accdat, type = 'box', thresh = 'fresh',
     site = c('ABT-062', 'ABT-077'), byresultatt = TRUE)
     
# site trends by location group, requires sitdat
anlzMWRsite(res = resdat, param = 'DO', acc = accdat, sit = sitdat, type = 'box', 
     thresh = 'fresh', locgroup = 'Assabet')
     

Check data quality objective accuracy data

Description

Check data quality objective accuracy data

Usage

checkMWRacc(accdat, warn = TRUE)

Arguments

`accdat`	input data frame
`warn`	logical to return warnings to the console (default)

Details

This function is used internally within readMWRacc to run several checks on the input data for completeness and conformance to WQX requirements.

The following checks are made:

Column name spelling: Should be the following: Parameter, uom, MDL, UQL, Value Range, Field Duplicate, Lab Duplicate, Field Blank, Lab Blank, Spike/Check Accuracy
Columns present: All columns from the previous check should be present
Column types: All columns should be characters/text, except for MDL and UQL
Value Range column na check: The character string "na" should not be in the Value Range column, "all" should be used if the entire range applies
Unrecognized characters: Fields describing accuracy checks should not include symbols or text other than <=, ≤, <, >=, ≥, >, ±, "%", "BDL", "AQL", "log", or "all"
Overlap in Value Range column: Entries in Value Range should not overlap for a parameter (excludes ascending ranges)
Gap in Value Range column: Entries in Value Range should not include a gap for a parameter, warning only
Parameter: Should match parameter names in the Simple Parameter or WQX Parameter columns of the paramsMWR data
Units: No missing entries in units (uom), except pH which can be blank
Single unit: Each unique Parameter should have only one type for the units (uom)
Correct units: Each unique Parameter should have an entry in the units (uom) that matches one of the acceptable values in the Units of measure column of the paramsMWR data
Empty columns: Columns with all missing or NA values will return a warning
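
Two of the checks above (columns present and the Value Range na check) can be sketched in base R as follows. The input data frame is hypothetical and the code is a simplified illustration, not the logic used inside checkMWRacc.

```r
# expected column names, taken from the list above
expected <- c("Parameter", "uom", "MDL", "UQL", "Value Range", "Field Duplicate",
  "Lab Duplicate", "Field Blank", "Lab Blank", "Spike/Check Accuracy")

# hypothetical input with missing columns and an "na" entry in Value Range
accex <- data.frame(
  Parameter = c("DO", "pH"),
  uom = c("mg/l", NA),
  MDL = c("0.5", NA),
  UQL = c(NA, NA),
  `Value Range` = c("all", "na"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# columns present: expected names that are absent from the input
missing_cols <- setdiff(expected, names(accex))

# Value Range na check: rows where "na" was entered instead of "all"
na_rows <- which(accex[["Value Range"]] == "na")
```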

Value

accdat is returned as is if no errors are found, otherwise an informative error message is returned prompting the user to make the required correction to the raw data before proceeding.

Examples

# accuracy path
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', 
     package = 'MassWateR')

# accuracy data with no checks
accdat <- readxl::read_excel(accpth, na = c('NA', ''), col_types = 'text')
accdat <- dplyr::mutate(accdat, dplyr::across(-c(`Value Range`), ~ dplyr::na_if(.x, 'na'))) 
      
checkMWRacc(accdat)

Check censored data

Description

Check censored data

Usage

checkMWRcens(censdat, warn = TRUE)

Arguments

`censdat`	input data frame for results
`warn`	logical to return warnings to the console (default)

Details

This function is used internally within readMWRcens to run several checks on the input data for completeness and conformance.

The following checks are made:

Column name spelling: Should be the following: Parameter, Missed and Censored Records
Columns present: All columns from the previous check should be present
Non-numeric or empty entries in Missed and Censored Records: All values should be numbers
Negative Missed and Censored Records: All values should be greater than or equal to zero
Parameter: Should match parameter names in the Simple Parameter or WQX Parameter columns of the paramsMWR data (warning only)
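
The numeric checks on Missed and Censored Records can be approximated in base R as shown below. The entries are made up, and this is a sketch of the idea rather than the code used by checkMWRcens.

```r
# hypothetical entries as text, as they might arrive from a spreadsheet
entries <- c("12", "0", "-3", "five", NA)

# non-numeric or empty entries: conversion to numeric yields NA
num <- suppressWarnings(as.numeric(entries))
bad_numeric <- which(is.na(num))

# negative entries: converted values below zero
bad_negative <- which(!is.na(num) & num < 0)
```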

Value

censdat is returned as is if no errors are found, otherwise an informative error message is returned prompting the user to make the required correction to the raw data before proceeding. Checks with warnings can be fixed at the discretion of the user before proceeding.

Examples

censpth <- system.file('extdata/ExampleCensored.xlsx', package = 'MassWateR')

censdat <- suppressWarnings(readxl::read_excel(censpth, na = c('NA', 'na', ''), guess_max = Inf)) 
             
checkMWRcens(censdat)

Check data quality objective frequency and completeness data

Description

Check data quality objective frequency and completeness data

Usage

checkMWRfrecom(frecomdat, warn = TRUE)

Arguments

`frecomdat`	input data frame
`warn`	logical to return warnings to the console (default)

Details

This function is used internally within readMWRfrecom to run several checks on the input data for frequency and completeness and conformance to WQX requirements.

The following checks are made:

Column name spelling: Should be the following: Parameter, Field Duplicate, Lab Duplicate, Field Blank, Lab Blank, Spike/Check Accuracy, % Completeness
Columns present: All columns from the previous check should be present
Non-numeric values: Values entered in columns other than the first should be numeric
Values outside of 0 - 100: Values entered in columns other than the first should not be outside of 0 and 100
Parameter: Should match parameter names in the Simple Parameter or WQX Parameter columns of the paramsMWR data
Empty columns: Columns with all missing or NA values will return a warning
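
The range check on the percentage columns can be sketched in base R as follows. The data frame is hypothetical and contains only a subset of the expected columns; the code illustrates the rule and is not taken from checkMWRfrecom.

```r
# hypothetical frequency and completeness entries (subset of columns)
frecomex <- data.frame(
  Parameter = c("DO", "TP", "E.coli"),
  `Field Duplicate` = c(10, 110, NA),
  `% Completeness` = c(90, 95, -5),
  check.names = FALSE
)

# all columns other than the first should hold values from 0 to 100
vals <- as.matrix(frecomex[, -1])
out_of_range <- which(!is.na(vals) & (vals < 0 | vals > 100), arr.ind = TRUE)

# report the parameter and column for each flagged cell
flagged <- data.frame(
  Parameter = frecomex$Parameter[out_of_range[, "row"]],
  column = colnames(vals)[out_of_range[, "col"]]
)
```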

Value

frecomdat is returned as is if no errors are found, otherwise an informative error message is returned prompting the user to make the required correction to the raw data before proceeding.

Examples

library(dplyr)

frecompth <- system.file('extdata/ExampleDQOFrequencyCompleteness.xlsx', 
     package = 'MassWateR')

frecomdat <- suppressMessages(readxl::read_excel(frecompth, 
      skip = 1, na = c('NA', 'na', ''), 
      col_types = c('text', 'numeric', 'numeric', 'numeric', 'numeric', 'numeric', 'numeric')
    )) %>% 
    rename(`% Completeness` = `...7`)
    
checkMWRfrecom(frecomdat)

Check water quality monitoring results

Description

Check water quality monitoring results

Usage

checkMWRresults(resdat, warn = TRUE)

Arguments

`resdat`	input data frame for results
`warn`	logical to return warnings to the console (default)

Details

This function is used internally within readMWRresults to run several checks on the input data for completeness and conformance to WQX requirements.

The following checks are made:

Column name spelling: Should be the following: Monitoring Location ID, Activity Type, Activity Start Date, Activity Start Time, Activity Depth/Height Measure, Activity Depth/Height Unit, Activity Relative Depth Name, Characteristic Name, Result Value, Result Unit, Quantitation Limit, QC Reference Value, Result Measure Qualifier, Result Attribute, Sample Collection Method ID, Project ID, Local Record ID, Result Comment
Columns present: All columns from the previous check should be present
Activity Type: Should be one of Field Msr/Obs, Sample-Routine, Quality Control Sample-Field Blank, Quality Control Sample-Lab Blank, Quality Control Sample-Lab Duplicate, Quality Control Sample-Lab Spike, Quality Control-Calibration Check, Quality Control-Meter Lab Duplicate, Quality Control-Meter Lab Blank
Date formats: Should be mm/dd/yyyy and parsed correctly on import
Depth data present: Depth data should be included in Activity Depth/Height Measure or Activity Relative Depth Name for all rows where Activity Type is Field Msr/Obs or Sample-Routine
Non-numeric Activity Depth/Height Measure: All depth values should be numbers, excluding missing values
Activity Depth/Height Unit: All entries should be ft, m, or blank
Activity Relative Depth Name: Should be either Surface, Bottom, Midwater, Near Bottom, or blank (warning only)
Activity Depth/Height Measure out of range: All depth values should be less than or equal to 1 meter / 3.3 feet or entered as Surface in the Activity Relative Depth Name column (warning only)
Characteristic Name: Should match parameter names in the Simple Parameter or WQX Parameter columns of the paramsMWR data (warning only)
Result Value: Should be a numeric value or a text value as AQL or BDL
Non-numeric Quantitation Limit: All values should be numbers, excluding missing values
QC Reference Value: Should be a numeric value or a text value as AQL or BDL
Result Unit: No missing entries in Result Unit, except pH which can be blank
Single Result Unit: Each unique parameter in Characteristic Name should have only one entry in Result Unit (excludes entries for lab spikes reported as % or % recovery)
Correct Result Unit: Each unique parameter in Characteristic Name should have an entry in Result Unit that matches one of the acceptable values in the Units of measure column of the paramsMWR data (excludes entries for lab spikes reported as % or % recovery)
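
As one example of the checks above, the Single Result Unit rule can be sketched in base R. The data frame below is hypothetical and the code does not handle the lab spike exclusion; it only shows how parameters with more than one unit might be detected.

```r
# hypothetical results with two different units entered for TP
resex <- data.frame(
  `Characteristic Name` = c("DO", "DO", "TP", "TP", "pH"),
  `Result Unit` = c("mg/l", "mg/l", "ug/l", "mg/l", NA),
  check.names = FALSE
)

# count distinct non-missing units for each parameter
nunits <- tapply(
  resex[["Result Unit"]],
  resex[["Characteristic Name"]],
  function(x) length(unique(x[!is.na(x)]))
)

# parameters with more than one unit would fail the single unit check
multi_unit <- names(nunits)[nunits > 1]
```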

Value

resdat is returned as is if no errors are found, otherwise an informative error message is returned prompting the user to make the required correction to the raw data before proceeding. Checks with warnings can be fixed at the discretion of the user before proceeding.

Examples

library(dplyr)

respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

resdat <- suppressWarnings(readxl::read_excel(respth, na = c('NA', 'na', ''), guess_max = Inf)) %>% 
  dplyr::mutate_if(function(x) !lubridate::is.POSIXct(x), as.character)
             
checkMWRresults(resdat)

Check site metadata file

Description

Check site metadata file

Usage

checkMWRsites(sitdat)

Arguments

`sitdat`	input data frame

Details

This function is used internally within readMWRsites to run several checks on the input data for completeness and conformance to WQX requirements.

The following checks are made:

Column name spelling: Should be the following: Monitoring Location ID, Monitoring Location Name, Monitoring Location Latitude, Monitoring Location Longitude, Location Group
Columns present: All columns from the previous check should be present
Missing longitude or latitude: No missing entries in Monitoring Location Latitude or Monitoring Location Longitude
Non-numeric latitude values: Values entered in Monitoring Location Latitude must be numeric
Non-numeric longitude values: Values entered in Monitoring Location Longitude must be numeric
Positive longitude values: Values in Monitoring Location Longitude must be negative
Missing Location ID: No missing entries for Monitoring Location ID
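
The coordinate and identifier checks can be sketched in base R as shown below. The site entries are made up for illustration, and the code is a simplified stand-in for the checks rather than the internals of checkMWRsites.

```r
# hypothetical site metadata with one problem in each of the last three rows
sitex <- data.frame(
  `Monitoring Location ID` = c("ABT-026", "ABT-062", NA),
  `Monitoring Location Latitude` = c(42.43, NA, 42.40),
  `Monitoring Location Longitude` = c(-71.45, -71.39, 71.36),
  check.names = FALSE
)

lat <- sitex[["Monitoring Location Latitude"]]
lon <- sitex[["Monitoring Location Longitude"]]

# missing longitude or latitude
missing_coord <- which(is.na(lat) | is.na(lon))

# positive longitude values (longitudes are expected to be negative)
positive_lon <- which(!is.na(lon) & lon > 0)

# missing Monitoring Location ID
missing_id <- which(is.na(sitex[["Monitoring Location ID"]]))
```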

Value

sitdat is returned as is if no errors are found, otherwise an informative error message is returned prompting the user to make the required correction to the raw data before proceeding.

Examples

library(dplyr)

sitpth <- system.file('extdata/ExampleSites.xlsx', package = 'MassWateR')

sitdat <- readxl::read_excel(sitpth, na = c('NA', 'na', ''))
    
checkMWRsites(sitdat)

Check water quality exchange (wqx) metadata input

Description

Check water quality exchange (wqx) metadata input

Usage

checkMWRwqx(wqxdat, warn = TRUE)

Arguments

`wqxdat`	input data frame
`warn`	logical to return warnings to the console (default)

Details

This function is used internally within readMWRwqx to run several checks on the input data for conformance with downstream functions.

The following checks are made:

Column name spelling: Should be the following: Parameter, Sampling Method Context, Method Speciation, Result Sample Fraction, Analytical Method, Analytical Method Context
Columns present: All columns from the previous check should be present
Unique parameters: Values in Parameter should be unique (no duplicates)
Parameter: Should match parameter names in the Simple Parameter or WQX Parameter columns of the paramsMWR data (warning only)
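
The unique parameters check amounts to finding repeated entries in the Parameter column. A minimal base R sketch, using made-up parameter names, could look like this:

```r
# hypothetical entries in the Parameter column
params <- c("DO", "pH", "TP", "DO")

# parameters that appear more than once
dup_params <- unique(params[duplicated(params)])
```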

Value

wqxdat is returned as is if no errors are found, otherwise an informative error message is returned prompting the user to make the required correction to the raw data before proceeding. Checks with warnings can be fixed at the discretion of the user before proceeding.

Examples

library(dplyr)

wqxpth <- system.file('extdata/ExampleWQX.xlsx', package = 'MassWateR')

wqxdat <- readxl::read_excel(wqxpth, na = c('NA', 'na', ''), col_types = 'text')
    
checkMWRwqx(wqxdat)

Format data quality objective accuracy data

Description

Format data quality objective accuracy data

Usage

formMWRacc(accdat)

Arguments

`accdat`	input data frame

Details

This function is used internally within readMWRacc to format the input data for downstream analysis. The formatting includes:

Minor formatting for units: For conformance to WQX, e.g., ppt is changed to ppth, s.u. is changed to NA in uom
Convert Parameter: All parameters are converted to Simple Parameter in paramsMWR as needed
Remove unicode: Remove or replace unicode characters with those that can be used in logical expressions in qcMWRacc, e.g., replace ≥ with >=
Convert limits to numeric: Convert MDL and UQL columns to numeric
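
The formatting steps above can be illustrated in base R on a few made-up values. This is a sketch only: the replacement of ≤ is an assumption by analogy with the ≥ example, and the package may handle other symbols differently.

```r
# minor formatting for units: ppt becomes ppth, s.u. becomes NA
uom <- c("ppt", "s.u.", "mg/l")
uom[uom == "ppt"] <- "ppth"
uom[uom == "s.u."] <- NA

# remove unicode: swap comparison symbols for ASCII operators
# (\u2265 is the greater-than-or-equal sign, \u2264 is less-than-or-equal)
limits <- c("\u2265 10", "\u2264 5")
limits <- gsub("\u2265", ">=", limits, fixed = TRUE)
limits <- gsub("\u2264", "<=", limits, fixed = TRUE)

# the ASCII form can be evaluated as a logical expression
ok <- eval(parse(text = paste(12, limits[1])))

# convert limits to numeric
mdl <- as.numeric(c("0.5", NA, "0.01"))
```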

Value

A formatted data frame of the data quality objectives file for accuracy

Examples

accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', package = 'MassWateR')

accdat <- readxl::read_excel(accpth, na = c('NA', ''))
accdat <- dplyr::mutate(accdat, dplyr::across(-c(`Value Range`), ~ dplyr::na_if(.x, 'na')))

formMWRacc(accdat)

Format censored data

Description

Format censored data

Usage

formMWRcens(censdat)

Arguments

`censdat`	input data frame

Details

This function is used internally within readMWRcens to format the input data for downstream analysis. The formatting includes:

Convert Parameter: All parameters are converted to Simple Parameter in paramsMWR as needed
Convert Missed and Censored Records: All values are converted to numeric

Value

A formatted data frame of the censored data

Examples

library(dplyr)

censpth <- system.file('extdata/ExampleCensored.xlsx', 
     package = 'MassWateR')

censdat <- suppressMessages(readxl::read_excel(censpth, 
      na = c('NA', 'na', '')
    )) 
    
formMWRcens(censdat)

Format data quality objective frequency and completeness data

Description

Format data quality objective frequency and completeness data

Usage

formMWRfrecom(frecomdat)

Arguments

`frecomdat`	input data frame

Details

This function is used internally within readMWRfrecom to format the input data for downstream analysis. The formatting includes:

Convert Parameter: All parameters are converted to Simple Parameter in paramsMWR as needed

Value

A formatted data frame of the data quality objectives file for frequency and completeness

Examples

library(dplyr)

frecompth <- system.file('extdata/ExampleDQOFrequencyCompleteness.xlsx', 
     package = 'MassWateR')

frecomdat <- suppressMessages(readxl::read_excel(frecompth, 
      skip = 1, na = c('NA', 'na', ''), 
      col_types = c('text', 'numeric', 'numeric', 'numeric', 'numeric', 'numeric', 'numeric')
    )) %>% 
    rename(`% Completeness` = `...7`)
    
formMWRfrecom(frecomdat)

Format water quality monitoring results

Description

Format water quality monitoring results

Usage

formMWRresults(resdat, tzone = "America/Jamaica")

Arguments

`resdat`	input data frame for results
`tzone`	character string for time zone

Details

This function is used internally within readMWRresults to format the input data for downstream analysis. The formatting includes:

Fix date and time inputs: Activity Start Date is converted to YYYY-MM-DD as a date object, Activity Start Time is converted to HH:MM as a character to fix artifacts from Excel import
Minor formatting for Result Unit: For conformance to WQX, e.g., ppt is changed to ppth, s.u. is changed to NA
Convert characteristic names: All parameters in Characteristic Name are converted to Simple Parameter in paramsMWR as needed
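
The date and time fix can be sketched in base R as follows. The input values are hypothetical, and the assumption that an imported time arrives as a date-time on a placeholder date in 1899 reflects how spreadsheet times often appear after import; the package may handle other cases as well.

```r
# time zone, matching the default of the tzone argument
tzone <- "America/Jamaica"

# hypothetical Activity Start Date imported as a date-time
start_date <- as.POSIXct("2022-05-15", tz = tzone)

# hypothetical Activity Start Time imported with a placeholder date (assumed artifact)
start_time <- as.POSIXct("1899-12-31 09:30:00", tz = "UTC")

# date as YYYY-MM-DD date object, time as HH:MM character
date_out <- as.Date(start_date, tz = tzone)
time_out <- format(start_time, "%H:%M")
```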

Value

A formatted data frame of the water quality monitoring results file

Examples

library(dplyr)

respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

resdat <- suppressWarnings(readxl::read_excel(respth, na = c('NA', 'na', ''), guess_max = Inf)) %>% 
  dplyr::mutate_if(function(x) !lubridate::is.POSIXct(x), as.character)
  
formMWRresults(resdat)

Format WQX metadata input

Description

Format WQX metadata input

Usage

formMWRwqx(wqxdat)

Arguments

`wqxdat`	input data frame for wqx metadata

Details

This function is used internally within readMWRwqx to format the input data for downstream analysis. The formatting includes:

Convert characteristic names: All parameters in Characteristic Name are converted to Simple Parameter in paramsMWR as needed

Value

A formatted data frame of the WQX metadata file

Examples

library(dplyr)

wqxpth <- system.file('extdata/ExampleWQX.xlsx', package = 'MassWateR')

wqxdat <- suppressWarnings(readxl::read_excel(wqxpth, na = c('NA', 'na', ''), col_types = 'text'))
  
formMWRwqx(wqxdat)

Master parameter list and units for Characteristic Name column in results data

Description

Master parameter list and units for Characteristic Name column in results data

Usage

paramsMWR

Format

A data.frame

Details

This information is used to verify the correct format of input data and for formatting output data for upload to WQX. A column showing the corresponding WQX names is also included.

Examples

paramsMWR

Run quality control accuracy checks for water quality monitoring results

Description

Run quality control accuracy checks for water quality monitoring results

Usage

qcMWRacc(
  res = NULL,
  acc = NULL,
  frecom = NULL,
  fset = NULL,
  runchk = TRUE,
  warn = TRUE,
  accchk = c("Field Blanks", "Lab Blanks", "Field Duplicates", "Lab Duplicates",
    "Lab Spikes / Instrument Checks"),
  suffix = "%"
)

Arguments

`res`	character string of path to the results file or `data.frame` for results returned by `readMWRresults`
`acc`	character string of path to the data quality objectives file for accuracy or `data.frame` returned by `readMWRacc`
`frecom`	character string of path to the data quality objectives file for frequency and completeness or `data.frame` returned by `readMWRfrecom`
`fset`	optional list of inputs with elements named `res`, `acc`, `frecom`, `sit`, or `wqx` overrides the other arguments
`runchk`	logical to run data checks with `checkMWRresults` and `checkMWRacc`, applies only if `res` or `acc` are file paths
`warn`	logical to return warnings to the console (default)
`accchk`	character string indicating which accuracy checks to return, one or more of `"Field Blanks"`, `"Lab Blanks"`, `"Field Duplicates"`, `"Lab Duplicates"`, or `"Lab Spikes / Instrument Checks"`
`suffix`	character string indicating suffix to append to percentage values

Details

The function can be used with inputs as paths to the relevant files or as data frames returned by readMWRresults and readMWRacc. For the former, the full suite of data checks can be evaluated with runchk = TRUE (default) or suppressed with runchk = FALSE. In the latter case, downstream analyses may not work if data are formatted incorrectly. For convenience, a named list with the input arguments as paths or data frames can be passed to the fset argument instead. See the help file for utilMWRinput.

Note that accuracy is only evaluated on parameters in the Parameter column in the data quality objectives accuracy file. A warning is returned if there are parameters in Parameter in the accuracy file that are not in Characteristic Name in the results file.

Similarly, parameters in the results file in the Characteristic Name column that are not found in the data quality objectives accuracy file are not evaluated. A warning is returned if there are parameters in Characteristic Name in the results file that are not in Parameter in the accuracy file.

The data quality objectives file for frequency and completeness is used to screen parameters in the results file for inclusion in the accuracy tables. Parameters with empty values in the frequency and completeness table are not returned.
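
The parameter matching between the results file and the accuracy file described above can be sketched in base R. The two vectors below are hypothetical stand-ins for the Characteristic Name column of the results file and the Parameter column of the accuracy file; this illustrates the logic only and is not the package's internal code.

```r
# hypothetical parameters in the results file (Characteristic Name)
resprm <- c("DO", "pH", "TP", "Chl a")

# hypothetical parameters in the accuracy file (Parameter)
accprm <- c("DO", "pH", "TP", "Nitrate")

# parameters shared by both files, i.e., those that can be evaluated
evalprm <- intersect(resprm, accprm)

# in the accuracy file but not the results file (first warning)
accnotres <- setdiff(accprm, resprm)

# in the results file but not the accuracy file (second warning)
resnotacc <- setdiff(resprm, accprm)

evalprm
accnotres
resnotacc
```

Running the same comparison on the real files before calling qcMWRacc may help anticipate which warnings to expect.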

Value

The output shows the accuracy checks from the input files returned as a list, with each element of the list corresponding to a specific accuracy check specified with accchk.

Examples

##
# using file paths

# results path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# accuracy path
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', package = 'MassWateR')

# frequency and completeness path
frecompth <- system.file('extdata/ExampleDQOFrequencyCompleteness.xlsx', 
     package = 'MassWateR')

qcMWRacc(res = respth, acc = accpth, frecom = frecompth)

Run quality control completeness checks for water quality monitoring results

Description

Run quality control completeness checks for water quality monitoring results

Usage

qcMWRcom(
  res = NULL,
  frecom = NULL,
  cens = NULL,
  fset = NULL,
  runchk = TRUE,
  warn = TRUE
)

Arguments

`res`	character string of path to the results file or `data.frame` for results returned by `readMWRresults`
`frecom`	character string of path to the data quality objectives file for frequency and completeness or `data.frame` returned by `readMWRfrecom`
`cens`	character string of path to the censored data file or `data.frame` returned by `readMWRcens`, optional
`fset`	optional list of inputs with elements named `res`, `acc`, `frecom`, `sit`, or `wqx`, overrides the other arguments
`runchk`	logical to run data checks with `checkMWRresults` and `checkMWRfrecom`, applies only if `res` or `frecom` are file paths
`warn`	logical to return warnings to the console (default)

Details

The function can be used with inputs as paths to the relevant files or as data frames returned by readMWRresults, readMWRfrecom, and readMWRcens (optional). For the former, the full suite of data checks can be evaluated with runchk = TRUE (default) or suppressed with runchk = FALSE. In the latter case, downstream analyses may not work if data are formatted incorrectly. For convenience, a named list with the input arguments as paths or data frames can be passed to the fset argument instead. See the help file for utilMWRinput.

Note that frequency is only evaluated on parameters in the Parameter column in the data quality objectives frequency and completeness file. A warning is returned if there are parameters in Parameter in the frequency and completeness file that are not in Characteristic Name in the results file.

Similarly, parameters in the results file in the Characteristic Name column that are not found in the data quality objectives frequency and completeness file are not evaluated. A warning is returned if there are parameters in Characteristic Name in the results file that are not in Parameter in the frequency and completeness file.

A similar warning is returned if there are parameters in the censored data, if provided, that are not in the results file. However, an error is returned if there are parameters in the data quality objectives frequency and completeness file that are not in the censored data file.

All warnings can be suppressed by setting warn = FALSE.

Value

The output shows the completeness checks from the combined files. Each row applies to a completeness check for a parameter. The datarec and qualrec columns show the number of data records and qualified records, respectively. The datarec column counts only records that are not for quality control, i.e., duplicates, blanks, and spikes are excluded from the count. The standard column shows the relevant percentage required for the quality control check from the data quality objectives file, the complete column shows the calculated completeness taken from the input data, and the met column shows whether the standard was met, i.e., whether complete is greater than or equal to standard.
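
As a minimal sketch of how the met column relates to the complete and standard columns, the data frame below mimics the structure of the output. All values are made up for illustration and do not come from the example files.

```r
# hypothetical completeness summary; values are made up for illustration
comdat <- data.frame(
  Parameter = c("DO", "pH", "TP"),
  datarec = c(120, 118, 40),
  qualrec = c(2, 0, 6),
  standard = c(90, 90, 90),
  complete = c(98, 100, 85),
  stringsAsFactors = FALSE
)

# the standard is met when complete is greater than or equal to standard
comdat$met <- comdat$complete >= comdat$standard

# parameters that did not meet the completeness standard
comdat$Parameter[!comdat$met]
```

The same subsetting could be applied to the data frame returned by qcMWRcom to list parameters that need attention.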

Examples

##
# using file paths

# results path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# frequency and completeness path
frecompth <- system.file('extdata/ExampleDQOFrequencyCompleteness.xlsx', 
     package = 'MassWateR')

# censored path
censpth <- system.file('extdata/ExampleCensored.xlsx', 
     package = 'MassWateR')

qcMWRcom(res = respth, frecom = frecompth, cens = censpth)

##
# using data frames

# results data
resdat <- readMWRresults(respth)

# frequency and completeness data
frecomdat <- readMWRfrecom(frecompth)

# censored data
censdat <- readMWRcens(censpth)

qcMWRcom(res = resdat, frecom = frecomdat, cens = censdat)

Run quality control frequency checks for water quality monitoring results

Description

Run quality control frequency checks for water quality monitoring results

Usage

qcMWRfre(
  res = NULL,
  acc = NULL,
  frecom = NULL,
  fset = NULL,
  runchk = TRUE,
  warn = TRUE
)

Arguments

`res`	character string of path to the results file or `data.frame` for results returned by `readMWRresults`
`acc`	character string of path to the data quality objectives file for accuracy or `data.frame` returned by `readMWRacc`
`frecom`	character string of path to the data quality objectives file for frequency and completeness or `data.frame` returned by `readMWRfrecom`
`fset`	optional list of inputs with elements named `res`, `acc`, `frecom`, `sit`, or `wqx`, overrides the other arguments
`runchk`	logical to run data checks with `checkMWRresults` and `checkMWRfrecom`, applies only if `res` or `frecom` are file paths
`warn`	logical to return warnings to the console (default)

Details

The function can be used with inputs as paths to the relevant files or as data frames returned by readMWRresults, readMWRacc, and readMWRfrecom. For the former, the full suite of data checks can be evaluated with runchk = TRUE (default) or suppressed with runchk = FALSE. In the latter case, downstream analyses may not work if data are formatted incorrectly. For convenience, a named list with the input arguments as paths or data frames can be passed to the fset argument instead. See the help file for utilMWRinput.
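
A minimal sketch of the fset option, assuming the example files shipped with the package. The requireNamespace guard is only there so the block is skipped when MassWateR is not installed.

```r
# skipped if MassWateR is not installed
if (requireNamespace("MassWateR", quietly = TRUE)) {

  # named list of inputs, here as file paths to the example files
  fset <- list(
    res = system.file('extdata/ExampleResults.xlsx', package = 'MassWateR'),
    acc = system.file('extdata/ExampleDQOAccuracy.xlsx', package = 'MassWateR'),
    frecom = system.file('extdata/ExampleDQOFrequencyCompleteness.xlsx',
      package = 'MassWateR')
  )

  # the list is used in place of the res, acc, and frecom arguments
  fredat <- MassWateR::qcMWRfre(fset = fset)
  head(fredat)

}
```

The list elements could also be data frames returned by the read functions, in which case the data checks are not run again.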

Value

The output shows the frequency checks from the input files. Each row applies to a frequency check for a parameter. The Parameter column shows the parameter, the obs column shows the total records that apply to regular activity types, the check column shows the relevant activity type for each frequency check, the count column shows the number of records that apply to a check, the standard column shows the relevant percentage required for the quality control check from the data quality objectives file, and the met column shows whether the standard was met, i.e., whether percent is greater than or equal to standard.
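
The relationship between the columns can be sketched with a made-up data frame. The values and check labels are hypothetical, and the percent calculation (check records relative to regular records) is an assumption for illustration, not a statement of how the package computes it.

```r
# hypothetical frequency summary; values and labels are made up
fredat <- data.frame(
  Parameter = c("DO", "DO", "TP"),
  obs = c(100, 100, 40),
  check = c("Field Duplicate", "Field Blank", "Field Duplicate"),
  count = c(12, 8, 4),
  standard = c(10, 10, 10),
  stringsAsFactors = FALSE
)

# assumption: percent is the share of check records relative to obs
fredat$percent <- 100 * fredat$count / fredat$obs

# the standard is met when percent is greater than or equal to standard
fredat$met <- fredat$percent >= fredat$standard

fredat
```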

Examples

##
# using file paths

# results path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# dqo accuracy data path
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', package = 'MassWateR')

# frequency and completeness path
frecompth <- system.file('extdata/ExampleDQOFrequencyCompleteness.xlsx', 
     package = 'MassWateR')

qcMWRfre(res = respth, acc = accpth, frecom = frecompth)

##
# using data frames

# results data
resdat <- readMWRresults(respth)

# accuracy data
accdat <- readMWRacc(accpth)

# frequency and completeness data
frecomdat <- readMWRfrecom(frecompth)

qcMWRfre(res = resdat, acc = accdat, frecom = frecomdat)

Create the quality control review report

Description

Create the quality control review report

Usage

qcMWRreview(
  res = NULL,
  acc = NULL,
  frecom = NULL,
  cens = NULL,
  fset = NULL,
  output_dir,
  output_file = NULL,
  savesheet = FALSE,
  rawdata = TRUE,
  dqofontsize = 7.5,
  tabfontsize = 9,
  padding = 0,
  warn = TRUE,
  runchk = TRUE
)

Arguments

`res`	character string of path to the results file or `data.frame` for results returned by `readMWRresults`
`acc`	character string of path to the data quality objectives file for accuracy or `data.frame` returned by `readMWRacc`
`frecom`	character string of path to the data quality objectives file for frequency and completeness or `data.frame` returned by `readMWRfrecom`
`cens`	character string of path to the censored data file or `data.frame` returned by `readMWRcens`
`fset`	optional list of inputs with elements named `res`, `acc`, `frecom`, and `cens`, overrides the other arguments
`output_dir`	character string of the output directory for the rendered file
`output_file`	optional character string for the file name
`savesheet`	logical indicating if a spreadsheet of the tables in the report is also saved (default `FALSE`)
`rawdata`	logical to include quality control accuracy summaries for raw data, e.g., field blanks, etc.
`dqofontsize`	numeric for font size in the data quality objective tables in the first page of the review
`tabfontsize`	numeric for font size in the review tables
`padding`	numeric for row padding for table output
`warn`	logical indicating if warnings from the table functions are included in the file output
`runchk`	logical to run data checks with `checkMWRresults`, `checkMWRacc`, `checkMWRfrecom`, applies only if `res`, `acc`, or `frecom` are file paths

Details

The function compiles a review report as a Word document for all quality control checks included in the MassWateR package. The report shows several tables, including the data quality objectives files for accuracy, frequency, and completeness, summary results for all accuracy checks, summary results for all frequency checks, summary results for all completeness checks, and individual results for all accuracy checks. The report uses the individual table functions (which can be used separately) to return the results, which include tabMWRacc, tabMWRfre, and tabMWRcom. The help files for each of these functions can be consulted for a more detailed explanation of the quality control checks.

The workflow for using this function is to import the required data (results, data quality objective, and censored files) and to fix any errors noted on import prior to creating the review report. Additional warnings that may be of interest as returned by the individual table functions can be returned in the console by setting warn = TRUE.

Optional arguments that can be changed as needed include specifying the file name with output_file, suppressing the raw data summaries at the end of the report with rawdata = FALSE, and changing the table font sizes (dqofontsize for the data quality objectives on the first page, tabfontsize for the remainder). Set savesheet = TRUE to also save a spreadsheet of the tables in the report.

The function can be used with inputs as paths to the relevant files or as data frames returned by readMWRresults, readMWRacc, readMWRfrecom, and readMWRcens (optional). For the former, the full suite of data checks can be evaluated with runchk = TRUE (default) or suppressed with runchk = FALSE, as explained in the relevant help files. In the latter case, downstream analyses may not work if data are formatted incorrectly. For convenience, a named list with the input arguments as paths or data frames can be passed to the fset argument instead. See the help file for utilMWRinput.
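
The optional arguments and the fset input described above can be combined as in the sketch below. The file name myqcreview.docx is a hypothetical choice, and the block is skipped when MassWateR is not installed. Rendering the Word document also depends on the local setup, so treat this as an outline rather than a guaranteed recipe.

```r
# skipped if MassWateR is not installed
if (requireNamespace("MassWateR", quietly = TRUE)) {

  # named list of inputs as file paths to the example files
  fset <- list(
    res = system.file('extdata/ExampleResults.xlsx', package = 'MassWateR'),
    acc = system.file('extdata/ExampleDQOAccuracy.xlsx', package = 'MassWateR'),
    frecom = system.file('extdata/ExampleDQOFrequencyCompleteness.xlsx',
      package = 'MassWateR'),
    cens = system.file('extdata/ExampleCensored.xlsx', package = 'MassWateR')
  )

  # custom file name (hypothetical), no raw data summaries,
  # and a spreadsheet of the report tables
  MassWateR::qcMWRreview(fset = fset, output_dir = tempdir(),
    output_file = 'myqcreview.docx', rawdata = FALSE, savesheet = TRUE)

}
```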

Value

A compiled review report named qcreview.docx (or name passed to output_file) will be saved in the directory specified by output_dir

Examples

# results data path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# dqo accuracy data path
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', package = 'MassWateR')

# dqo completeness data path
frecompth <- system.file('extdata/ExampleDQOFrequencyCompleteness.xlsx', package = 'MassWateR')

# censored data path
censpth <- system.file('extdata/ExampleCensored.xlsx', package = 'MassWateR')

# results data
resdat <- readMWRresults(respth)

# accuracy data
accdat <- readMWRacc(accpth)

# frequency and completeness data
frecomdat <- readMWRfrecom(frecompth)

# censored data
censdat <- readMWRcens(censpth)


# create report
qcMWRreview(res = resdat, acc = accdat, frecom = frecomdat, cens = censdat, output_dir = tempdir())

Read data quality objectives for accuracy from an external file

Description

Read data quality objectives for accuracy from an external file

Usage

readMWRacc(accpth, runchk = TRUE, warn = TRUE)

Arguments

`accpth`	character string of path to the data quality objectives file for accuracy
`runchk`	logical to run data checks with `checkMWRacc`
`warn`	logical to return warnings to the console (default)

Details

Data are imported with read_excel and checked with checkMWRacc.

Value

A formatted data frame of data quality objectives for accuracy that can be used for downstream analysis

Examples

accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', package = 'MassWateR')

accdat <- readMWRacc(accpth)
head(accdat)

Read censored data from an external file

Description

Read censored data from an external file

Usage

readMWRcens(censpth, runchk = TRUE, warn = TRUE)

Arguments

`censpth`	character string of path to the censored file
`runchk`	logical to run data checks with `checkMWRcens`
`warn`	logical to return warnings to the console (default)

Details

Data are imported with read_excel, checked with checkMWRcens, and formatted with formMWRcens. The input file includes one row for each parameter and two columns indicating the parameter name and the number of missed or censored records for that parameter. The data are an optional input and are used to complete the number of missed and censored records column in the completeness table created with tabMWRcom. The parameters in this file must match those in the data quality objectives file for frequency and completeness.
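
The file layout and the matching requirement can be sketched in base R. The column names and values below are placeholders for illustration and are not taken from the example file.

```r
# hypothetical censored data: one row per parameter and two columns
# (column names are placeholders for illustration)
censdat <- data.frame(
  Parameter = c("DO", "pH", "TP"),
  Censored = c(3, 0, 5),
  stringsAsFactors = FALSE
)

# hypothetical parameters in the frequency and completeness file
frecomprm <- c("DO", "pH", "TP")

# parameters in the frequency and completeness file that are
# missing from the censored data
missprm <- setdiff(frecomprm, censdat$Parameter)

# an empty result means the parameters match
length(missprm) == 0
```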

Value

A formatted censored data frame that can be used for downstream analysis

Examples

censpth <- system.file('extdata/ExampleCensored.xlsx', package = 'MassWateR')

censdat <- readMWRcens(censpth)
head(censdat)

Read data quality objectives for frequency and completeness from an external file

Description

Read data quality objectives for frequency and completeness from an external file

Usage

readMWRfrecom(frecompth, runchk = TRUE, warn = TRUE)

Arguments

`frecompth`	character string of path to the data quality objectives file for frequency and completeness
`runchk`	logical to run data checks with `checkMWRfrecom`
`warn`	logical to return warnings to the console (default)

Details

Data are imported with read_excel and checked with checkMWRfrecom.

Value

A formatted data frame of data quality objectives for frequency and completeness that can be used for downstream analysis

Examples

frecompth <- system.file('extdata/ExampleDQOFrequencyCompleteness.xlsx', 
     package = 'MassWateR')

frecomdat <- readMWRfrecom(frecompth)
head(frecomdat)

Read water quality monitoring results from an external file

Description

Read water quality monitoring results from an external file

Usage

readMWRresults(respth, runchk = TRUE, warn = TRUE, tzone = "America/Jamaica")

Arguments

`respth`	character string of path to the results file
`runchk`	logical to run data checks with `checkMWRresults`
`warn`	logical to return warnings to the console (default)
`tzone`	character string for time zone, passed to `formMWRresults`

Details

Data are imported with read_excel, checked with checkMWRresults, and formatted with formMWRresults.
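
The default tzone of America/Jamaica is a zone that does not observe daylight saving time, so it keeps the same offset from UTC all year. The base R sketch below shows the difference from a zone that does shift. America/New_York is used only for comparison and is not a package default; the result also assumes the time zone database is available on the system.

```r
# same clock time in July under two time zones
jam <- as.POSIXct("2022-07-01 12:00:00", tz = "America/Jamaica")
nyc <- as.POSIXct("2022-07-01 12:00:00", tz = "America/New_York")

# UTC offsets: Jamaica stays at -0500, New York shifts to -0400 in summer
format(jam, "%z")
format(nyc, "%z")
```

A different zone can be supplied through the tzone argument if the sample times in the results file were recorded with daylight saving time.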

Value

A formatted water quality monitoring results data frame that can be used for downstream analysis

Examples

respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

resdat <- readMWRresults(respth)
head(resdat)

Create summary spreadsheet of the water quality monitoring results

Description

Create summary spreadsheet of unique values for each column in the water quality results file to check for data mistakes prior to running the readMWRresults function

Usage

readMWRresultsview(
  respth,
  columns = NULL,
  output_dir,
  output_file = NULL,
  maxlen = 8
)

Arguments

`respth`	character string of path to the results file
`columns`	character string indicating which columns to view, defaults to all
`output_dir`	character string of the output directory for the rendered file
`output_file`	optional character string for the name of the .csv file output, must include the file extension
`maxlen`	numeric to truncate numeric values to the specified length

Details

Acceptable options for the columns argument include any of the column names in the results file. The default setting (NULL) will show every column in the results file.

The output of this function can be useful to troubleshoot the checks when importing the water quality monitoring result file with readMWRresults (see https://massbays-tech.github.io/MassWateR/articles/MassWateR.html#data-import-and-checks).

Unique entries for the Result Value column will include NA entries if present; unique entries for all other columns will not.
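
The handling of unique values and NA entries can be sketched in base R. The data frame is a hypothetical two-column subset of a results file, and the code illustrates the behavior described above rather than reproducing the function's internals.

```r
# hypothetical results data with two of the columns
resdat <- data.frame(
  `Characteristic Name` = c("DO", "DO", NA, "pH"),
  `Result Value` = c("8.2", NA, "7.1", "8.2"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# unique entries by column, dropping NA except for Result Value
unival <- lapply(names(resdat), function(col) {
  out <- unique(resdat[[col]])
  if (col != "Result Value") out <- out[!is.na(out)]
  out
})
names(unival) <- names(resdat)

unival
```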

Value

Creates a spreadsheet at the location specified by output_dir. Each column shows the unique values.

Examples

respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# all columns
readMWRresultsview(respth, output_dir = tempdir())

# parameters and units
readMWRresultsview(respth, columns = c('Characteristic Name', 'Result Unit'),
   output_dir = tempdir())

Read site metadata from an external file

Description

Read site metadata from an external file

Usage

readMWRsites(sitpth, runchk = TRUE)

Arguments

`sitpth`	character string of path to the site metadata file
`runchk`	logical to run data checks with `checkMWRsites`

Details

Data are imported with read_excel and checked with checkMWRsites.

Value

A formatted data frame of site metadata that can be used for downstream analysis

Examples

sitpth <- system.file('extdata/ExampleSites.xlsx', package = 'MassWateR')

sitdat <- readMWRsites(sitpth)
head(sitdat)

Read water quality exchange (wqx) metadata input from an external file

Description

Read water quality exchange (wqx) metadata input from an external file

Usage

readMWRwqx(wqxpth, runchk = TRUE, warn = TRUE)

Arguments

`wqxpth`	character string of path to the wqx metadata file
`runchk`	logical to run data checks with `checkMWRwqx`
`warn`	logical to return warnings to the console (default)

Details

Data are imported with read_excel and checked with checkMWRwqx.

Value

A formatted data frame that can be used for downstream analysis

Examples

wqxpth <- system.file('extdata/ExampleWQX.xlsx', package = 'MassWateR')

wqxdat <- readMWRwqx(wqxpth)
head(wqxdat)

Create a formatted table of quality control accuracy checks

Description

Create a formatted table of quality control accuracy checks

Usage

tabMWRacc(
  res = NULL,
  acc = NULL,
  frecom = NULL,
  fset = NULL,
  runchk = TRUE,
  warn = TRUE,
  accchk = c("Field Blanks", "Lab Blanks", "Field Duplicates", "Lab Duplicates",
    "Lab Spikes / Instrument Checks"),
  type = c("individual", "summary", "percent"),
  pass_col = "#57C4AD",
  fail_col = "#DB4325",
  suffix = "%",
  caption = TRUE
)

Arguments

`res`	character string of path to the results file or `data.frame` for results returned by `readMWRresults`
`acc`	character string of path to the data quality objectives file for accuracy or `data.frame` returned by `readMWRacc`
`frecom`	character string of path to the data quality objectives file for frequency and completeness or `data.frame` returned by `readMWRfrecom`, applies only if `type = "summary"` or `type = "percent"`
`fset`	optional list of inputs with elements named `res`, `acc`, `frecom`, `sit`, or `wqx`, overrides the other arguments
`runchk`	logical to run data checks with `checkMWRresults` and `checkMWRacc`, applies only if `res` or `acc` are file paths
`warn`	logical to return warnings to the console (default)
`accchk`	character string indicating which accuracy checks to return, one or more of `"Field Blanks"`, `"Lab Blanks"`, `"Field Duplicates"`, `"Lab Duplicates"`, or `"Lab Spikes / Instrument Checks"`
`type`	character string indicating `individual`, `summary` or `percent` tabular output, see details
`pass_col`	character string (as hex code) for the cell color of checks that pass, applies only if `type = 'percent'`
`fail_col`	character string (as hex code) for the cell color of checks that fail, applies only if `type = 'percent'`
`suffix`	character string indicating suffix to append to percentage values
`caption`	logical to include a caption from `accchk`, only applies if `type = "individual"`

Details

The function can be used with inputs as paths to the relevant files or as data frames returned by readMWRresults and readMWRacc. For the former, the full suite of data checks can be evaluated with runchk = TRUE (default) or suppressed with runchk = FALSE, as explained in the relevant help files. In the latter case, downstream analyses may not work if data are formatted incorrectly. For convenience, a named list with the input arguments as paths or data frames can be passed to the fset argument instead. See the help file for utilMWRinput.

Also note that accuracy is only evaluated on parameters that are shared between the results file and data quality objectives file for accuracy. A warning is returned for parameters that do not match between the files. This warning can be suppressed by setting warn = FALSE.

The function can return three types of tables as specified with the type argument: "individual", "summary", or "percent". The individual tables are specific to each type of accuracy check for each parameter (e.g., field blanks, lab blanks, etc.). The summary table summarizes all accuracy checks by the number of checks and how many hits and misses are returned for each across all parameters. The percent table is similar to the summary table, but shows only percentages with color-coding for hits and misses. The data quality objectives file for frequency and completeness is required if type = "summary" or type = "percent".

For type = "individual", the quality control tables for accuracy are retrieved by specifying the check with the accchk argument. The accchk argument can be used to specify one of the following values to retrieve the relevant tables: "Field Blanks", "Lab Blanks", "Field Duplicates", "Lab Duplicates", or "Lab Spikes / Instrument Checks".

For type = "summary", the function summarizes all accuracy checks by counting the number of quality control checks, number of misses, and percent acceptance for each parameter. All accuracy checks are used and the accchk argument does not apply.

For type = "percent", the function returns a similar table as for the summary option, except only the percentage of checks that pass for each parameter are shown in wide format. Cells are color-coded based on the percentage of checks that have passed using the percent thresholds from the % Completeness column of the data quality objectives file for frequency and completeness. Parameters without an entry for % Completeness are not color-coded and an appropriate warning is returned. All accuracy checks are used and the accchk argument does not apply.

Inputs for the results and data quality objectives for accuracy are processed internally with qcMWRacc and the same arguments are accepted for this function, in addition to others listed above.
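
The summary and percent table types described above can be requested as in the sketch below, which assumes the example files shipped with the package. The block is skipped when MassWateR is not installed.

```r
# skipped if MassWateR is not installed
if (requireNamespace("MassWateR", quietly = TRUE)) {

  respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')
  accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx',
    package = 'MassWateR')
  frecompth <- system.file('extdata/ExampleDQOFrequencyCompleteness.xlsx',
    package = 'MassWateR')

  # summary of all accuracy checks, accchk does not apply
  tabsum <- MassWateR::tabMWRacc(res = respth, acc = accpth,
    frecom = frecompth, type = 'summary')

  # percentage of passing checks in wide format with color-coded cells
  tabper <- MassWateR::tabMWRacc(res = respth, acc = accpth,
    frecom = frecompth, type = 'percent')

}
```

Both calls need the frequency and completeness file, unlike type = "individual".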

Value

A flextable object with formatted results.

Examples


##
# using file paths

# results path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# accuracy path
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', 
     package = 'MassWateR')

# frequency and completeness path
frecompth <- system.file('extdata/ExampleDQOFrequencyCompleteness.xlsx', 
     package = 'MassWateR')

# table as individual
tabMWRacc(res = respth, acc = accpth, frecom = frecompth, type = 'individual', 
     accchk = 'Field Blanks')

Create a formatted table of quality control completeness checks

Description

Create a formatted table of quality control completeness checks

Usage

tabMWRcom(
  res = NULL,
  frecom = NULL,
  cens = NULL,
  fset = NULL,
  runchk = TRUE,
  warn = TRUE,
  pass_col = "#57C4AD",
  fail_col = "#DB4325",
  digits = 0,
  suffix = "%",
  parameterwd = 1.15,
  noteswd = 3
)

Arguments

`res`	character string of path to the results file or `data.frame` for results returned by `readMWRresults`
`frecom`	character string of path to the data quality objectives file for frequency and completeness or `data.frame` returned by `readMWRfrecom`
`cens`	character string of path to the censored data file or `data.frame` returned by `readMWRcens`, optional
`fset`	optional list of inputs with elements named `res`, `acc`, `frecom`, `sit`, or `wqx`, overrides the other arguments
`runchk`	logical to run data checks with `checkMWRresults` and `checkMWRfrecom`, applies only if `res` or `frecom` are file paths
`warn`	logical to return warnings to the console (default)
`pass_col`	character string (as hex code) for the cell color of checks that pass
`fail_col`	character string (as hex code) for the cell color of checks that fail
`digits`	numeric indicating number of significant digits to report for percentages
`suffix`	character string indicating suffix to append to percentage values
`parameterwd`	numeric indicating width of the parameter column
`noteswd`	numeric indicating width of notes column

Details

The function can be used with inputs as paths to the relevant files or as data frames returned by readMWRresults, readMWRfrecom, and readMWRcens (optional). For the former, the full suite of data checks can be evaluated with runchk = TRUE (default) or suppressed with runchk = FALSE, as explained in the relevant help files. In the latter case, downstream analyses may not work if data are formatted incorrectly. For convenience, a named list with the input arguments as paths or data frames can be passed to the fset argument instead. See the help file for utilMWRinput.

Also note that completeness is only evaluated on parameters that are shared between the results file and data quality objectives file for frequency and completeness. A warning is returned for parameters that do not match between the files. A similar warning is returned if there are parameters in the censored data, if provided, that are not in the results file and vice versa. These warnings can be suppressed by setting warn = FALSE.

A summary table showing the number of data records, number of qualified records, and percent completeness is created. The % Completeness column shows cells as green or red depending on whether the required percentage of observations for completeness, as specified in the data quality objectives file, is present. The Hit/ Miss column shows similar information but in text format, i.e., MISS is shown if the quality control standard for completeness is not met.
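As a rough illustration of the logic behind the table, the sketch below computes percent completeness from record counts and flags each parameter. It uses base R only. The data frame `comdat`, its column names, the formula (records that are not qualified divided by all records), and the empty string for a passing check are assumptions for illustration and are not taken from the package source.

```r
# toy counts per parameter (hypothetical values and column names)
comdat <- data.frame(
  parameter = c("DO", "TP", "pH"),
  datarec   = c(50, 40, 20),  # number of data records
  qualrec   = c(2, 8, 0),     # number of qualified records
  dqo       = c(90, 90, 95)   # required % completeness from the DQO file
)

# assumed formula: share of records that are not qualified
comdat$complete <- 100 * (comdat$datarec - comdat$qualrec) / comdat$datarec

# text flag comparable to the Hit/ Miss column (empty when met is an assumption)
comdat$hitmiss <- ifelse(comdat$complete >= comdat$dqo, "", "MISS")

comdat
```

In this toy input only the second parameter falls below its required percentage, so it is the only row flagged as a miss.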

Inputs for the results and data quality objectives for frequency and completeness are processed internally with qcMWRcom and the same arguments are accepted for this function, in addition to others listed above.

Value

A flextable object with formatted results showing summary counts for all completeness checks for each parameter.

Examples

##
# using file paths

# results path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# frequency and completeness path
frecompth <- system.file('extdata/ExampleDQOFrequencyCompleteness.xlsx', 
     package = 'MassWateR')
     
# censored path
censpth <- system.file('extdata/ExampleCensored.xlsx', package = 'MassWateR')

tabMWRcom(res = respth, frecom = frecompth, cens = censpth)

##
# using data frames

# results data
resdat <- readMWRresults(respth)

# frequency and completeness data
frecomdat <- readMWRfrecom(frecompth)

# censored data
censdat <- readMWRcens(censpth)

tabMWRcom(res = resdat, frecom = frecomdat, cens = censdat)

Create a formatted table of quality control frequency checks

Description

Create a formatted table of quality control frequency checks

Usage

tabMWRfre(
  res = NULL,
  acc = NULL,
  frecom = NULL,
  fset = NULL,
  runchk = TRUE,
  warn = TRUE,
  type = c("summary", "percent"),
  pass_col = "#57C4AD",
  fail_col = "#DB4325",
  digits = 0,
  suffix = "%"
)

Arguments

`res`	character string of path to the results file or `data.frame` for results returned by `readMWRresults`
`acc`	character string of path to the data quality objectives file for accuracy or `data.frame` returned by `readMWRacc`
`frecom`	character string of path to the data quality objectives file for frequency and completeness or `data.frame` returned by `readMWRfrecom`
`fset`	optional list of inputs with elements named `res`, `acc`, `frecom`, `sit`, or `wqx`, overrides the other arguments
`runchk`	logical to run data checks with `checkMWRresults` and `checkMWRfrecom`, applies only if `res` or `frecom` are file paths
`warn`	logical to return warnings to the console (default)
`type`	character string indicating `summary` or `percent` tabular output, see details
`pass_col`	character string (as hex code) for the cell color of checks that pass, applies only if `type = 'percent'`
`fail_col`	character string (as hex code) for the cell color of checks that fail, applies only if `type = 'percent'`
`digits`	numeric indicating number of significant digits to report for percentages
`suffix`	character string indicating suffix to append to percentage values

Details

The function can be used with inputs as paths to the relevant files or as data frames returned by readMWRresults, readMWRacc, and readMWRfrecom. For the former, the full suite of data checks can be evaluated with runchk = TRUE (default) or suppressed with runchk = FALSE, as explained in the relevant help files. In the latter case, downstream analyses may not work if data are formatted incorrectly. For convenience, a named list with the input arguments as paths or data frames can be passed to the fset argument instead. See the help file for utilMWRinput.

The quality control tables for frequency show the number of records that apply to a given check (e.g., Lab Blank, Field Blank, etc.) relative to the number of "regular" data records (e.g., field samples or measures) for each parameter. A summary of all frequency checks for each parameter is provided if type = "summary" or a color-coded table showing similar information as percentages for each parameter is provided if type = "percent".
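The comparison described above can be sketched in base R as follows. This is a minimal illustration only: the counts, the check names used as vector names, and the required percentages are hypothetical, and the package may compute and round these values differently.

```r
# toy counts for one parameter (hypothetical values)
n_regular <- 40  # "regular" field samples or measures
n_checks  <- c("Field Blank" = 4, "Lab Blank" = 2, "Field Duplicate" = 5)
dqo_pct   <- c("Field Blank" = 10, "Lab Blank" = 10, "Field Duplicate" = 10)

# QC records as a percentage of regular records
pct <- 100 * n_checks / n_regular

# TRUE where the required frequency is met
met <- pct >= dqo_pct

data.frame(check = names(pct), percent = pct, met = met, row.names = NULL)
```

The `percent` column corresponds to what a table with type = "percent" would color-code, while the raw counts are closer to the content of a summary table.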

Inputs for the results and data quality objectives for accuracy and frequency and completeness are processed internally with qcMWRfre and the same arguments are accepted for this function, in addition to others listed above.

Value

A flextable object with formatted results.

Examples


##
# using file paths

# results path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# dqo accuracy data path
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', package = 'MassWateR')

# frequency and completeness path
frecompth <- system.file('extdata/ExampleDQOFrequencyCompleteness.xlsx', 
     package = 'MassWateR')

# table as summary
tabMWRfre(res = respth, acc = accpth, frecom = frecompth, type = 'summary')

# table as percent
tabMWRfre(res = respth, acc = accpth, frecom = frecompth, type = 'percent')

##
# using data frames

# results data
resdat <- readMWRresults(respth)

# accuracy data
accdat <- readMWRacc(accpth)

# frequency and completeness data
frecomdat <- readMWRfrecom(frecompth)

# table as summary
tabMWRfre(res = resdat, acc = accdat, frecom = frecomdat, type = 'summary')

# table as percent
tabMWRfre(res = resdat, acc = accdat, frecom = frecomdat, type = 'percent')

Create and save tables in a single workbook for WQX upload

Description

Create and save tables in a single workbook for WQX upload

Usage

tabMWRwqx(
  res = NULL,
  acc = NULL,
  sit = NULL,
  wqx = NULL,
  fset = NULL,
  output_dir,
  output_file = NULL,
  listout = FALSE,
  warn = TRUE,
  runchk = TRUE
)

Arguments

`res`	character string of path to the results file or `data.frame` for results returned by `readMWRresults`
`acc`	character string of path to the data quality objectives file for accuracy or `data.frame` returned by `readMWRacc`
`sit`	character string of path to the site metadata file or `data.frame` for site metadata returned by `readMWRsites`
`wqx`	character string of path to the wqx metadata file or `data.frame` for wqx metadata returned by `readMWRwqx`
`fset`	optional list of inputs with elements named `res`, `acc`, `frecom`, `sit`, or `wqx`, overrides the other arguments
`output_dir`	character string of the output directory for the results
`output_file`	optional character string for the file name, must include .xlsx suffix
`listout`	logical to return a list of the output for each sheet of the workbook (default is `FALSE`)
`warn`	logical to return warnings to the console (default)
`runchk`	logical to run data checks with `checkMWRresults`, `checkMWRacc`, `checkMWRsites`, `checkMWRwqx`, applies only if `res`, `acc`, `sit`, or `wqx` are file paths

Details

This function will export a single Excel workbook with three sheets, named "Projects", "Locations", and "Results". The output is populated with as much content as possible based on information in the input files. The remainder of the information not included in the output will need to be manually entered before uploading the data to WQX. All required columns are present, but individual rows will need to be verified for completeness. It is the responsibility of the user to verify this information is complete and correct before uploading the data.

The workflow for using this function is to import the required data (results, data quality objectives file for accuracy, site metadata, and wqx metadata) and to fix any errors noted on import prior to creating the output. The function can be used with inputs as paths to the relevant files or as data frames returned by readMWRresults, readMWRacc, readMWRsites, and readMWRwqx. For the former, the full suite of data checks can be evaluated with runchk = TRUE (default) or suppressed with runchk = FALSE, as explained in the relevant help files. In the latter case, downstream analyses may not work if data are formatted incorrectly. For convenience, a named list with the input arguments as paths or data frames can be passed to the fset argument instead. See the help file for utilMWRinput.

The name of the output file can also be changed using the output_file argument, the default being wqxtab.xlsx. Warnings can also be turned off or on (default) using the warn argument. This returns any warnings when data are imported and only applies if the file inputs are paths.

Value

An Excel workbook named wqxtab.xlsx (or name passed to output_file) will be saved in the directory specified by output_dir. The workbook will include three sheets named "Projects", "Locations", and "Results".

Examples

# results data path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# dqo accuracy data path
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', package = 'MassWateR')

# site data path
sitpth <- system.file('extdata/ExampleSites.xlsx', package = 'MassWateR')

# wqx data path
wqxpth <- system.file('extdata/ExampleWQX.xlsx', package = 'MassWateR')

# results data
resdat <- readMWRresults(respth)

# accuracy data
accdat <- readMWRacc(accpth)

# site data
sitdat <- readMWRsites(sitpth)

# wqx data
wqxdat <- readMWRwqx(wqxpth)

# create workbook
tabMWRwqx(res = resdat, acc = accdat, sit = sitdat, wqx = wqxdat, output_dir = tempdir())

Master thresholds list for analysis of results data

Description

Master thresholds list for analysis of results data

Usage

thresholdMWR

Format

A data.frame of 28 rows and 10 columns

Details

This file includes appropriate threshold values of water quality parameters for marine and freshwater environments based on state standards or typical ranges in Massachusetts.

Examples

thresholdMWR

Filter results data by parameter, date range, site, result attributes, and/or location group

Description

Filter results data by parameter, date range, site, result attributes, and/or location group

Usage

utilMWRfilter(
  resdat,
  sitdat = NULL,
  param,
  dtrng = NULL,
  site = NULL,
  resultatt = NULL,
  locgroup = NULL,
  alllocgroup = FALSE,
  allresultatt = FALSE
)

Arguments

`resdat`	results data as returned by `readMWRresults`
`sitdat`	site metadata file as returned by `readMWRsites`
`param`	character string to filter results by a parameter in `"Characteristic Name"`
`dtrng`	character string of length two for the date ranges as YYYY-MM-DD
`site`	character string of sites to include, default all
`resultatt`	character string of result attributes to include, default all
`locgroup`	character string of location groups to include from the `"Location Group"` column in the site metadata file
`alllocgroup`	logical indicating if results data are filtered by all location groups in `"Location Group"` in the site metadata file if `locgroup = NULL`, used only in `anlzMWRdate`
`allresultatt`	logical indicating if results data are filtered by all result attributes if `resultatt = NULL`, used only in `anlzMWRsite`

Value

resdat filtered by param, dtrng, site, resultatt, and/or locgroup, otherwise resdat filtered only by param if other arguments are NULL
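The filtering behavior described above can be sketched in base R on a toy data frame. The helper `filter_sketch` is hypothetical and is not the package implementation. The column names "Monitoring Location ID" and "Activity Start Date" are assumptions about the results file; only "Characteristic Name" is taken from the argument descriptions above.

```r
# toy results data with a subset of columns (names partly assumed)
resdat <- data.frame(
  `Monitoring Location ID` = c("ABT-026", "ABT-062", "ABT-026", "ABT-077"),
  `Activity Start Date` = as.Date(c("2022-05-15", "2022-06-10", "2022-06-20", "2022-07-01")),
  `Characteristic Name` = c("DO", "DO", "TP", "DO"),
  check.names = FALSE
)

# hypothetical helper: always filter by param, then by optional arguments
filter_sketch <- function(resdat, param, dtrng = NULL, site = NULL) {
  out <- resdat[resdat[["Characteristic Name"]] == param, , drop = FALSE]
  if (!is.null(dtrng)) {
    dtrng <- as.Date(dtrng)
    dts <- out[["Activity Start Date"]]
    out <- out[dts >= dtrng[1] & dts <= dtrng[2], , drop = FALSE]
  }
  if (!is.null(site)) {
    out <- out[out[["Monitoring Location ID"]] %in% site, , drop = FALSE]
  }
  out
}

# parameter and date range
filter_sketch(resdat, param = "DO", dtrng = c("2022-06-01", "2022-06-30"))

# parameter only, other arguments left as NULL
filter_sketch(resdat, param = "DO")
```

Leaving an optional argument as NULL skips that filter, which mirrors the described fallback of filtering by param alone.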

Examples

# results file path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# results data
resdat <- readMWRresults(respth)

# site data path
sitpth <- system.file('extdata/ExampleSites.xlsx', package = 'MassWateR')

# site data
sitdat <- readMWRsites(sitpth)

# filter by parameter, date range
utilMWRfilter(resdat, param = 'DO', dtrng = c('2022-06-01', '2022-06-30'))

# filter by parameter, site
utilMWRfilter(resdat, param = 'DO', site = c('ABT-026', 'ABT-062', 'ABT-077'))

# filter by parameter, result attribute
utilMWRfilter(resdat, param = 'DO', resultatt = 'DRY')

# filter by parameter, location group, date range
utilMWRfilter(resdat, param = 'DO', sitdat = sitdat, 
     locgroup = 'Assabet', dtrng = c('2022-06-01', '2022-06-30'))

Filter results data to surface measurements

Description

Filter results data to surface measurements

Usage

utilMWRfiltersurface(resdat)

Arguments

`resdat`	results data as returned by `readMWRresults`

Details

This function is used internally by all analysis functions.

Value

resdat filtered by Activity Depth/Height Measure less than or equal to 1 meter or 3.3 feet or Activity Relative Depth Name as "Surface"
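The rule above can be sketched in base R as a logical filter. This is a minimal illustration, not the package code. The unit column name "Activity Depth/Height Unit" and the unit labels "m" and "ft" are assumptions, and the handling of rows with no depth information (dropped here) may differ in the package.

```r
# toy results data (unit column name and unit labels are assumed)
resdat <- data.frame(
  `Activity Depth/Height Measure` = c(0.5, 2, 3, 10, NA),
  `Activity Depth/Height Unit` = c("m", "m", "ft", "ft", NA),
  `Activity Relative Depth Name` = c(NA, NA, NA, NA, "Surface"),
  check.names = FALSE
)

depth <- resdat[["Activity Depth/Height Measure"]]
unit  <- resdat[["Activity Depth/Height Unit"]]
rel   <- resdat[["Activity Relative Depth Name"]]

# keep shallow measurements or rows labelled as surface
# %in% returns FALSE rather than NA for missing values
is_surface <- (unit %in% "m" & depth <= 1) |
  (unit %in% "ft" & depth <= 3.3) |
  rel %in% "Surface"

# any remaining NA (unit present, depth missing) is treated as not surface
is_surface[is.na(is_surface)] <- FALSE

surfdat <- resdat[is_surface, , drop = FALSE]
surfdat
```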

Examples

# results file path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# results data
resdat <- readMWRresults(respth)

# filter surface data
utilMWRfiltersurface(resdat)

Prep results data for frequency checks

Description

Prep results data for frequency checks

Usage

utilMWRfre(resdat, param, accdat, warn = TRUE)

Arguments

`resdat`	results data as returned by `readMWRresults`
`param`	character string to filter results and check if a parameter in the `"Characteristic Name"` column in the results file is also found in the data quality objectives file for accuracy, see details
`accdat`	`data.frame` for data quality objectives file for accuracy as returned by `readMWRacc`
`warn`	logical to return warnings to the console (default)

Details

This function is similar to utilMWRlimits with some additional processing appropriate for creating the frequency table in tabMWRfre. The param argument is used to identify the appropriate "MDL" or "UQL" values in the data quality objectives file for accuracy. A warning is returned to the console if the accuracy file does not contain the appropriate information for the parameter. Results will be filtered by param regardless of any warning.

Value

resdat filtered by param with any entries in "Result Value" as "BDL" or "AQL" replaced with appropriate values in the "Quantitation Limit" column, if present, otherwise the "MDL" or "UQL" columns from the data quality objectives file for accuracy are used. Values as "BDL" use one half of the appropriate limit. Values not in the "Value Range" column of the accuracy file are removed from the output.

Examples

# results file path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# results data
resdat <- readMWRresults(respth)

# accuracy path
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', 
     package = 'MassWateR')

# accuracy data
accdat <- readMWRacc(accpth)

# apply to total phosphorus
utilMWRfre(resdat, accdat, param = 'TP')

# apply to E.coli
utilMWRfre(resdat, accdat, param = 'E.coli')

Load external file from remote source, fail gracefully

Description

Load external file from remote source, fail gracefully

Usage

utilMWRhttpgrace(remote_file)

Arguments

`remote_file`	URL of the external file

Value

The external file as an RData object
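A general pattern for failing gracefully on a remote resource is to wrap the download in tryCatch and return NULL with a message instead of an error. The sketch below shows that pattern in base R. The helper `load_gracefully` is hypothetical and is not the package implementation, which may use a different download method and different messages.

```r
# hypothetical helper, not the package implementation
load_gracefully <- function(remote_file) {
  tmp <- tempfile(fileext = ".RData")
  on.exit(unlink(tmp), add = TRUE)
  tryCatch(
    {
      # download to a temporary file, then load into a private environment
      utils::download.file(remote_file, tmp, mode = "wb", quiet = TRUE)
      env <- new.env()
      nms <- load(tmp, envir = env)
      # return the first object stored in the file
      get(nms[1], envir = env)
    },
    warning = function(w) {
      message("Could not load file: ", conditionMessage(w))
      NULL
    },
    error = function(e) {
      message("Could not load file: ", conditionMessage(e))
      NULL
    }
  )
}

# returns NULL with a message because the resource does not exist
load_gracefully("file:///no/such/file.RData")
```

Treating warnings as failures is a deliberate simplification in this sketch; it also means a harmless warning during an otherwise successful download would return NULL.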

Examples

# fails gracefully
utilMWRhttpgrace('http://httpbin.org/status/404')

# imports data or fails gracefully
fl <- 'https://github.com/massbays-tech/MassWateRdata/raw/main/data/streamsMWR.RData'
utilMWRhttpgrace(fl)

Utility function to import data as paths or data frames

Description

Utility function to import data as paths or data frames

Usage

utilMWRinput(
  res = NULL,
  acc = NULL,
  frecom = NULL,
  sit = NULL,
  wqx = NULL,
  cens = NULL,
  fset = NULL,
  runchk = TRUE,
  warn = TRUE
)

Arguments

`res`	character string of path to the results file or `data.frame` for results returned by `readMWRresults`
`acc`	character string of path to the data quality objectives file for accuracy or `data.frame` returned by `readMWRacc`
`frecom`	character string of path to the data quality objectives file for frequency and completeness or `data.frame` returned by `readMWRfrecom`
`sit`	character string of path to the site metadata file or `data.frame` for site metadata returned by `readMWRsites`
`wqx`	character string of path to the wqx metadata file or `data.frame` for wqx metadata returned by `readMWRwqx`
`cens`	character string of path to the censored data file or `data.frame` for censored data returned by `readMWRcens`
`fset`	optional list of inputs with elements named `res`, `acc`, `frecom`, `sit`, `wqx`, or `cens`, overrides the other arguments, see details
`runchk`	logical to run data checks with `checkMWRresults`, `checkMWRacc`, `checkMWRfrecom`, `checkMWRsites`, `checkMWRwqx`, or `checkMWRcens`, applies only if `res`, `acc`, `frecom`, `sit`, `wqx`, or `cens` are file paths
`warn`	logical to return warnings to the console (default)

Details

The function is used internally by others to import data from paths to the relevant files or as data frames returned by readMWRresults, readMWRacc, readMWRfrecom, readMWRsites, readMWRwqx, or readMWRcens. For the former, the full suite of data checks can be evaluated with runchk = TRUE (default) or suppressed with runchk = FALSE.

The fset argument can be used in place of the preceding arguments. The argument accepts a list with elements named res, acc, frecom, sit, wqx, or cens, where each element is either a character string of the path to the corresponding input or a data frame for that input. Missing elements will be interpreted as NULL values. This argument is provided as a convenience to apply a single list as input versus separate inputs for each argument.
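The override behavior of fset can be sketched in base R as follows. The helper `resolve_inputs` is hypothetical, covers only three of the inputs, and is not the package implementation; the file names are placeholders.

```r
# hypothetical helper, not the package implementation
resolve_inputs <- function(res = NULL, acc = NULL, frecom = NULL, fset = NULL) {
  if (!is.null(fset)) {
    # fset takes precedence; elements missing from the list come back as NULL
    res    <- fset[["res"]]
    acc    <- fset[["acc"]]
    frecom <- fset[["frecom"]]
  }
  list(res = res, acc = acc, frecom = frecom)
}

# the separate res argument is ignored because fset is supplied
inp <- resolve_inputs(
  res = "ignored.xlsx",
  fset = list(res = "results.xlsx", acc = "accuracy.xlsx")
)
str(inp)
```

Using `[[` rather than `$` avoids partial matching of list names, so an element is only picked up when its name matches exactly.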

Any of the arguments for the data files can be NULL, used as a convenience for downstream functions that do not require all.

Value

A six element list with the imported results, data quality objective files, site metadata, wqx metadata, and censored data named "resdat", "accdat", "frecomdat", "sitdat", "wqxdat", and "censdat" respectively.

Examples

##
# using file paths

# results path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# accuracy path
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', package = 'MassWateR')

# frequency and completeness path
frecompth <- system.file('extdata/ExampleDQOFrequencyCompleteness.xlsx', 
     package = 'MassWateR')

# site path
sitpth <- system.file('extdata/ExampleSites.xlsx', package = 'MassWateR')

# wqx path
wqxpth <- system.file('extdata/ExampleWQX.xlsx', package = 'MassWateR')

# censored path
censpth <- system.file('extdata/ExampleCensored.xlsx', package = 'MassWateR')

inp <- utilMWRinput(res = respth, acc = accpth, frecom = frecompth, sit = sitpth, 
  wqx = wqxpth, cens = censpth)
inp$resdat
inp$accdat
inp$frecomdat
inp$sitdat
inp$wqxdat
inp$censdat

##
# using data frames

# results data
resdat <- readMWRresults(respth)

# accuracy data
accdat <- readMWRacc(accpth)

# frequency and completeness data
frecomdat <- readMWRfrecom(frecompth)

# site data
sitdat <- readMWRsites(sitpth)

# wqx data
wqxdat <- readMWRwqx(wqxpth)

# censored data
censdat <- readMWRcens(censpth)

inp <- utilMWRinput(res = resdat, acc = accdat, frecom = frecomdat, sit = sitdat, 
   wqx = wqxdat, cens = censdat)
inp$resdat
inp$accdat
inp$frecomdat
inp$sitdat
inp$wqxdat
inp$censdat

##
# using fset as list input

# input with paths to files
fset <- list(
  res = respth, 
  acc = accpth, 
  frecom = frecompth,
  sit = sitpth, 
  wqx = wqxpth, 
  cens = censpth
)
utilMWRinput(fset = fset)

Check if required inputs are present for a function

Description

Check if required inputs are present for a function

Usage

utilMWRinputcheck(inputs, nocheck = NULL)

Arguments

`inputs`	list of arguments passed from the parent function
`nocheck`	optional character vector of inputs not to check, allows for optional inputs

Value

NULL if all inputs are present, otherwise an error message indicating which inputs are missing
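A check of this kind can be sketched in base R by looking for NULL entries among the data inputs. The helper `check_inputs`, the fixed set of input names it inspects, and the wording of the error are assumptions for illustration; the package function may handle fset and other cases differently.

```r
# hypothetical helper, not the package implementation
check_inputs <- function(inputs, nocheck = NULL) {
  # data inputs to inspect, minus any that are optional
  tocheck <- setdiff(c("res", "acc", "frecom", "sit", "wqx", "cens"), nocheck)
  tocheck <- intersect(tocheck, names(inputs))

  # names of inputs that were left as NULL
  missing_inputs <- tocheck[vapply(inputs[tocheck], is.null, logical(1))]

  if (length(missing_inputs) > 0) {
    stop("Missing inputs: ", paste(missing_inputs, collapse = ", "), call. = FALSE)
  }
  NULL
}

# cens is NULL but marked as optional, so the check passes
check_inputs(list(res = "a.xlsx", frecom = "b.xlsx", cens = NULL), nocheck = "cens")
```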

Examples

inputchk <- formals(tabMWRcom)
inputchk$res <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')
inputchk$frecom <- system.file('extdata/ExampleDQOFrequencyCompleteness.xlsx', 
  package = 'MassWateR')
inputchk$cens <- system.file('extdata/ExampleCensored.xlsx', package = 'MassWateR')

utilMWRinputcheck(inputchk)

Fill results data as BDL or AQL with appropriate values

Description

Fill results data as BDL or AQL with appropriate values

Usage

utilMWRlimits(resdat, param, accdat, warn = TRUE)

Arguments

`resdat`	results data as returned by `readMWRresults`
`param`	character string to filter results and check if a parameter in the `"Characteristic Name"` column in the results file is also found in the data quality objectives file for accuracy, see details
`accdat`	`data.frame` for data quality objectives file for accuracy as returned by `readMWRacc`
`warn`	logical to return warnings to the console (default)

Details

The param argument is used to identify the appropriate "MDL" or "UQL" values in the data quality objectives file for accuracy. A warning is returned to the console if the accuracy file does not contain the appropriate information for the parameter. Results will be filtered by param regardless of any warning.

Value

resdat filtered by param with any entries in "Result Value" as "BDL" or "AQL" replaced with appropriate values in the "Quantitation Limit" column, if present, otherwise the "MDL" or "UQL" columns from the data quality objectives file for accuracy are used. Values as "BDL" use one half of the appropriate limit. Output only includes rows with the activity type as "Field Msr/Obs" or "Sample-Routine".
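The replacement rule can be sketched in base R on a toy data frame. This is an illustration under stated assumptions, not the package code: the `mdl` and `uql` values are hypothetical stand-ins for the "MDL" and "UQL" entries of the accuracy file, and using the full limit for "AQL" is inferred from the statement that only "BDL" values use one half of the limit.

```r
# toy results for one parameter; column names follow the text above
resdat <- data.frame(
  `Result Value` = c("0.12", "BDL", "AQL", "BDL"),
  `Quantitation Limit` = c(NA, 0.02, NA, NA),
  check.names = FALSE
)
mdl <- 0.01  # hypothetical "MDL" from the accuracy file
uql <- 5     # hypothetical "UQL" from the accuracy file

val <- resdat[["Result Value"]]
lim <- resdat[["Quantitation Limit"]]

# limit to use: the quantitation limit if present, otherwise MDL (BDL) or UQL (AQL)
lim_use <- ifelse(!is.na(lim), lim,
  ifelse(val == "BDL", mdl, ifelse(val == "AQL", uql, NA)))

# numeric results, with BDL as half the limit and AQL as the limit (assumed)
filled <- suppressWarnings(as.numeric(val))
filled[val == "BDL"] <- lim_use[val == "BDL"] / 2
filled[val == "AQL"] <- lim_use[val == "AQL"]

filled
```

The second row uses its own quantitation limit, while the last row falls back to the hypothetical MDL because no quantitation limit is present.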

Examples

# results file path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# results data
resdat <- readMWRresults(respth)

# accuracy path
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', 
     package = 'MassWateR')

# accuracy data
accdat <- readMWRacc(accpth)

# apply to total phosphorus
utilMWRlimits(resdat, accdat, param = 'TP')

# apply to E.coli
utilMWRlimits(resdat, accdat, param = 'E.coli')

Identify outliers in a numeric vector

Description

Identify outliers in a numeric vector

Usage

utilMWRoutlier(x, logscl)

Arguments

`x`	numeric vector of any length
`logscl`	logical to indicate if vector should be log10-transformed first

Details

Outliers are identified as values that fall more than 1.5 times the interquartile range beyond the quartiles.

Value

A logical vector equal in length to x indicating TRUE for outliers or FALSE for within normal range
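The conventional 1.5 times interquartile range rule can be sketched in base R as follows. The helper `outlier_sketch` is hypothetical, and the quartile definition used by the package may differ slightly (for example, hinges as used for boxplots rather than the default of `quantile`).

```r
# hypothetical helper, not the package implementation
outlier_sketch <- function(x, logscl = FALSE) {
  # optionally evaluate on the log10 scale
  if (logscl) x <- log10(x)

  # first and third quartiles and their difference
  qs <- stats::quantile(x, c(0.25, 0.75), na.rm = TRUE)
  iqr <- qs[2] - qs[1]

  # TRUE for values beyond 1.5 * IQR from the quartiles
  out <- x < qs[1] - 1.5 * iqr | x > qs[2] + 1.5 * iqr
  unname(out)
}

outlier_sketch(c(1, 2, 3, 4, 5, 100))
outlier_sketch(c(1, 2, 3, 4, 5, 100), logscl = TRUE)
```

The result has the same length as the input, which makes it convenient for subsetting or for flagging points in a plot.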

Examples

x <- rnorm(20)
utilMWRoutlier(x, logscl = FALSE)

Format a list of QC tables for spreadsheet export

Description

Format a list of QC tables for spreadsheet export

Usage

utilMWRsheet(datin, rawdata = TRUE)

Arguments

`datin`	list of input QC tables
`rawdata`	logical to include quality control accuracy summaries for raw data, e.g., field blanks, etc.

Details

The function is used internally with qcMWRreview to format data quality objective and quality control tables for export into an Excel spreadsheet. These changes are specific to the spreadsheet format and may not reflect the formatting in the Word document produced by qcMWRreview.

The datin list is expected to contain the following elements:

`frecomdat`	Data Quality Objectives for frequency and completeness data frame as returned by `readMWRfrecom`
`accdat`	Data Quality Objectives for accuracy data frame as returned by `readMWRacc`
`tabfreper`	Frequency checks percent table, created with `tabMWRfre`
`tabfresum`	Frequency checks summary table, created with `tabMWRfre`
`tabaccper`	Accuracy checks percent table, created with `tabMWRacc`
`tabaccsum`	Accuracy checks summary table, created with `tabMWRacc`
`tabcom`	Completeness table, created with `tabMWRcom`
`indflddup`	Individual accuracy checks table for field duplicates, created with `tabMWRacc`, can be `NULL`
`indlabdup`	Individual accuracy checks table for lab duplicates, created with `tabMWRacc`, can be `NULL`
`indfldblk`	Individual accuracy checks table for field blanks, created with `tabMWRacc`, can be `NULL`
`indlabblk`	Individual accuracy checks table for lab blanks, created with `tabMWRacc`, can be `NULL`
`indlabins`	Individual accuracy checks table for lab spikes and instrument checks, created with `tabMWRacc`, can be `NULL`

Value

A list similar to the input with formatting applied

Examples

# results data path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# dqo accuracy data path
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', package = 'MassWateR')

# dqo completeness data path
frecompth <- system.file('extdata/ExampleDQOFrequencyCompleteness.xlsx', package = 'MassWateR')

# censored data path
censpth <- system.file('extdata/ExampleCensored.xlsx', package = 'MassWateR')

# results data
resdat <- readMWRresults(respth)

# accuracy data
accdat <- readMWRacc(accpth)

# frequency and completeness data
frecomdat <- readMWRfrecom(frecompth)

# censored data
censdat <- readMWRcens(censpth)

# frequency table percent
tabfreper <- tabMWRfre(res = resdat, acc = accdat, frecom = frecomdat, type = 'percent', 
  warn = FALSE) 

# frequency summary table
tabfresum <- tabMWRfre(res = resdat, acc = accdat, frecom = frecomdat, type = 'summary', 
  warn = FALSE)

# accuracy table percent
tabaccper <- tabMWRacc(res = resdat, acc = accdat, frecom = frecomdat, type = 'percent', 
  warn = FALSE)

# accuracy table summary
tabaccsum <- tabMWRacc(res = resdat, acc = accdat, frecom = frecomdat, type = 'summary', 
  warn = FALSE)
  
# completeness table
tabcom <- tabMWRcom(res = resdat, frecom = frecomdat, cens = censdat, warn = FALSE, 
  parameterwd = 1.15, noteswd = 2)

# individual accuracy checks for raw data
indflddup <- tabMWRacc(res = resdat, acc = accdat, frecom = frecomdat, type = 'individual', 
  accchk = 'Field Duplicates', warn = FALSE, caption = FALSE)
indlabdup <- tabMWRacc(res = resdat, acc = accdat, frecom = frecomdat, type = 'individual', 
  accchk = 'Lab Duplicates', warn = FALSE, caption = FALSE)
indfldblk <- tabMWRacc(res = resdat, acc = accdat, frecom = frecomdat, type = 'individual', 
  accchk = 'Field Blanks', warn = FALSE, caption = FALSE)
indlabblk <- tabMWRacc(res = resdat, acc = accdat, frecom = frecomdat, type = 'individual', 
  accchk = 'Lab Blanks', warn = FALSE, caption = FALSE)
indlabins <- tabMWRacc(res = resdat, acc = accdat, frecom = frecomdat, type = 'individual', 
  accchk = 'Lab Spikes / Instrument Checks', warn = FALSE, caption = FALSE)

# input  
datin <- list(
  frecomdat = frecomdat, 
  accdat = accdat,
  tabfreper = tabfreper,
  tabfresum = tabfresum,
  tabaccper = tabaccper,
  tabaccsum = tabaccsum,
  tabcom = tabcom,
  indflddup = indflddup,
  indlabdup = indlabdup,
  indfldblk = indfldblk,
  indlabblk = indlabblk,
  indlabins = indlabins
)

utilMWRsheet(datin)

Verify summary function

Description

Verify summary function

Usage

utilMWRsumfun(accdat, param, sumfun = "auto")

Arguments

`accdat`	`data.frame` for data quality objectives file for accuracy as returned by `readMWRacc`
`param`	character string for the parameter to evaluate as provided in the `"Parameter"` column of `"accdat"`
`sumfun`	character indicating one of `"auto"` (default), `"mean"`, `"geomean"`, `"median"`, `"min"`, or `"max"`, see details

Details

This function verifies that an appropriate summary function is passed to sumfun. For sumfun = "auto", the mean or geometric mean is chosen based on information in the data quality objective file for accuracy: parameters with "log" in any of the columns are summarized with the geometric mean, and all others with the arithmetic mean. Using "mean" or "geomean" for sumfun applies that function regardless of the information in the data quality objective file for accuracy.
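
The selection rule described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helper name `pick_sumfun`, the toy `acc` data frame, and the values in its `Field Duplicate` column are made up for the example. Only the `Parameter` column name and the allowed values of sumfun come from the text above.

```r
# Hypothetical helper illustrating the rule in Details; not part of MassWateR
pick_sumfun <- function(accdat, param, sumfun = "auto") {
  valid <- c("auto", "mean", "geomean", "median", "min", "max")
  if (!sumfun %in% valid) {
    stop("sumfun must be one of: ", paste(valid, collapse = ", "))
  }

  # anything other than "auto" is returned as given
  if (sumfun != "auto") {
    return(sumfun)
  }

  # rows of the accuracy table for the requested parameter
  rows <- accdat[accdat$Parameter == param, , drop = FALSE]

  # TRUE if "log" appears in any column for this parameter
  haslog <- any(vapply(rows, function(x) any(grepl("log", x)), logical(1)))

  if (haslog) "geomean" else "mean"
}

# toy accuracy table; the values are made up for illustration
acc <- data.frame(
  Parameter = c("E.coli", "DO"),
  `Field Duplicate` = c("<= 30% logRPD", "<= 10%"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

pick_sumfun(acc, "E.coli")                  # "geomean"
pick_sumfun(acc, "DO")                      # "mean"
pick_sumfun(acc, "DO", sumfun = "geomean")  # "geomean"
```

In this sketch, an explicit choice always wins over the contents of the accuracy table, which mirrors the behavior described for "mean" and "geomean".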

Value

Character indicating the appropriate summary function based on the value passed to sumfun.

Examples

# accuracy path
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', 
     package = 'MassWateR')

# accuracy data
accdat <- readMWRacc(accpth)

# geomean auto
utilMWRsumfun(accdat, param = 'E.coli')

# mean force
utilMWRsumfun(accdat, param = 'E.coli', sumfun = 'mean')

# mean auto
utilMWRsumfun(accdat, param = 'DO')

# geomean force
utilMWRsumfun(accdat, param = 'DO', sumfun = 'geomean')

Summarize a results data frame by a grouping variable

Description

Summarize a results data frame by a grouping variable

Usage

utilMWRsummary(dat, accdat, param, sumfun = "auto", confint)

Arguments

`dat`	input data frame
`accdat`	`data.frame` for data quality objectives file for accuracy as returned by `readMWRacc`
`param`	character string for the parameter to evaluate as provided in the `"Parameter"` column of `"accdat"`
`sumfun`	character indicating one of `"auto"` (default), `"mean"`, `"geomean"`, `"median"`, `"min"`, or `"max"`, see details
`confint`	logical if user expects a confidence interval to be returned with the summary

Details

This function summarizes a results data frame by an existing grouping variable using the function supplied to sumfun. For sumfun = "auto", the mean or geometric mean is chosen based on information in the data quality objective file for accuracy: parameters with "log" in any of the columns are summarized with the geometric mean, and all others with the arithmetic mean. Using "mean" or "geomean" for sumfun applies that function regardless of the information in the data quality objective file for accuracy.
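
The difference between the two automatic choices can be seen in a small base R sketch. This is an illustration only, not the package's code: the `geomean` helper, the toy `dat` data frame, and its `site` and `value` columns are assumptions for the example, and the geometric mean is computed as the exponentiated mean of the logged values, which requires strictly positive inputs.

```r
# geometric mean, assuming strictly positive values
geomean <- function(x) exp(mean(log(x)))

# toy results; column names and values are made up for illustration
dat <- data.frame(
  site = c("A", "A", "B", "B"),
  value = c(10, 1000, 4, 9)
)

# summarize by the grouping variable with each function
arith <- tapply(dat$value, dat$site, mean)
geo <- tapply(dat$value, dat$site, geomean)

arith  # A = 505, B = 6.5
geo    # A = 100, B = 6
```

For site A the single large value pulls the arithmetic mean to 505, while the geometric mean stays at 100, which is why log-distributed parameters are typically summarized with the latter.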

Value

A summarized data frame. A warning is issued if confint = TRUE and the confidence interval cannot be estimated.

Examples

library(dplyr)

# results data path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# results data
resdat <- readMWRresults(respth)

# accuracy path
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', 
     package = 'MassWateR')

# accuracy data
accdat <- readMWRacc(accpth)

# fill BDL, AQL
resdat <- utilMWRlimits(resdat = resdat, accdat = accdat, param = "DO")

dat <- resdat %>% 
  group_by(`Monitoring Location ID`)
 
# summarize sites by mean 
utilMWRsummary(dat, accdat, param = 'DO', sumfun = 'auto', confint = TRUE)

# summarize sites by minimum
utilMWRsummary(dat, accdat, param = 'DO', sumfun = 'min', confint = FALSE)

Get threshold lines from thresholdMWR

Description

Get threshold lines from thresholdMWR

Usage

utilMWRthresh(resdat, param, thresh, threshlab = NULL)

Arguments

`resdat`	results data as returned by `readMWRresults`
`param`	character string to first filter results by a parameter in `"Characteristic Name"`
`thresh`	character indicating if relevant freshwater or marine threshold lines are included, one of `"fresh"`, `"marine"`, or `"none"`, or a single numeric value to override the values included with the package
`threshlab`	optional character string indicating legend label for the threshold, required only if `thresh` is numeric

Value

If thresh is not numeric and thresholds are available for param, a data.frame of relevant marine or freshwater thresholds, otherwise NULL. If thresh is numeric, a data.frame of the threshold with the label supplied to threshlab.

Examples

# results file path
respth <- system.file('extdata/ExampleResults.xlsx', package = 'MassWateR')

# results data
resdat <- readMWRresults(respth)

# get threshold lines
utilMWRthresh(resdat = resdat, param = 'E.coli', thresh = 'fresh')

# user-defined numeric threshold line
utilMWRthresh(resdat = resdat, param = 'TP', thresh = 5, threshlab = 'My threshold')

Format the title for analyze functions

Description

Format the title for analyze functions

Usage

utilMWRtitle(
  param,
  accdat = NULL,
  sumfun = NULL,
  site = NULL,
  dtrng = NULL,
  resultatt = NULL,
  locgroup = NULL
)

Arguments

`param`	character string of the parameter to plot
`accdat`	optional `data.frame` for data quality objectives file for accuracy as returned by `readMWRacc`
`sumfun`	optional character indicating one of `"auto"`, `"mean"`, `"geomean"`, `"median"`, `"min"`, or `"max"`
`site`	character string of sites to include
`dtrng`	character string of length two for the date ranges as YYYY-MM-DD
`resultatt`	character string of result attributes to plot
`locgroup`	character string of location groups to plot from the `"Location Group"` column in the site metadata file

Details

All arguments are optional except param. For each optional argument that is supplied, an appropriate text string is appended to param to indicate the level of filtering used in the plot and, if appropriate, the data summary.

Value

A formatted character string used for the title in analysis plots

Examples

# no filters
utilMWRtitle(param = 'DO')

# filter by date only
utilMWRtitle(param = 'DO', dtrng = c('2021-05-01', '2021-07-31'))

# filter by all
utilMWRtitle(param = 'DO', site = 'test', dtrng = c('2021-05-01', '2021-07-31'), 
     resultatt = 'test', locgroup = 'test')
     
# title using summary 
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', package = 'MassWateR')
accdat <- readMWRacc(accpth, runchk = FALSE)
utilMWRtitle(param = 'DO', accdat = accdat, sumfun = 'auto', site = 'test', 
     dtrng = c('2021-05-01', '2021-07-31'), resultatt = 'test', locgroup = 'test')

Check if incomplete range in `Value Range` column

Description

Check if incomplete range in Value Range column

Usage

utilMWRvaluerange(accdat)

Arguments

`accdat`	`data.frame` for data quality objectives file for accuracy as returned by `readMWRacc`

Details

The function evaluates whether an incomplete or overlapping range is present in the Value Range column of the data quality objectives file for accuracy.

Value

A named vector of "gap", "nogap", or "overlap" indicating if a gap is present, no gap is present, or an overlap is present in the ranges provided by the value range for each parameter. The names correspond to the parameters.
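
The three outcomes can be illustrated with a simplified base R sketch. It assumes the ranges for each parameter have already been reduced to numeric bounds of the form [lower, upper), which is not how the Value Range column is stored. The helper `classify_ranges`, the `ranges` list, and the parameter names are hypothetical and only show the comparison between adjacent ranges, not the package's parsing or checks.

```r
# Hypothetical helper: classify numeric ranges [lower, upper) for one parameter
classify_ranges <- function(lower, upper) {
  ord <- order(lower)
  lower <- lower[ord]
  upper <- upper[ord]

  # compare the end of each range with the start of the next
  ends <- upper[-length(upper)]
  starts <- lower[-1]

  if (any(ends > starts)) {
    "overlap"
  } else if (any(ends < starts)) {
    "gap"
  } else {
    "nogap"
  }
}

# made-up ranges for three parameters
ranges <- list(
  paramA = list(lower = c(-Inf, 50), upper = c(50, Inf)),
  paramB = list(lower = c(-Inf, 60), upper = c(50, Inf)),
  paramC = list(lower = c(-Inf, 40), upper = c(50, Inf))
)

# named character vector, one entry per parameter
res <- vapply(ranges, function(r) classify_ranges(r$lower, r$upper), character(1))
res
```

Here `paramA` yields "nogap" because the first range ends exactly where the second begins, `paramB` yields "gap" because values from 50 up to 60 are not covered, and `paramC` yields "overlap" because values from 40 up to 50 fall in both ranges.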

Examples

# accuracy path
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', 
     package = 'MassWateR')

# accuracy data with no checks
accdat <- readxl::read_excel(accpth, na = c('NA', ''), col_types = 'text')
accdat <- dplyr::mutate(accdat, dplyr::across(-c(`Value Range`), ~ dplyr::na_if(.x, 'na'))) 

utilMWRvaluerange(accdat)

Get logical value for y axis scaling

Description

Get logical value for y axis scaling

Usage

utilMWRyscale(accdat, param, yscl = "auto")

Arguments

`accdat`	`data.frame` for data quality objectives file for accuracy as returned by `readMWRacc`
`param`	character string for the parameter to evaluate as provided in the `"Parameter"` column of `"accdat"`
`yscl`	character indicating one of `"auto"` (default), `"log"`, or `"linear"`

Value

A logical value indicating TRUE for log10-scale, FALSE for arithmetic (linear)

Examples

# accuracy path
accpth <- system.file('extdata/ExampleDQOAccuracy.xlsx', 
     package = 'MassWateR')

# accuracy data
accdat <- readMWRacc(accpth)

# log auto
utilMWRyscale(accdat, param = 'E.coli')

# linear force
utilMWRyscale(accdat, param = 'E.coli', yscl = 'linear')

# linear auto
utilMWRyscale(accdat, param = 'DO')

# log force
utilMWRyscale(accdat, param = 'DO', yscl = 'log')

Package 'MassWateR'

Help Index

Analyze trends by date in results file
Analyze results with maps
Analyze outliers in results file
Analyze outliers in results file for all parameters
Analyze seasonal trends in results file
Analyze data by sites in results file
Check data quality objective accuracy data
Check censored data
Check data quality objective frequency and completeness data
Check water quality monitoring results
Check site metadata file
Check water quality exchange (wqx) metadata input