The goal of fecalcanuga is to provide data from 5 months of field work on non-sewered sanitation, carried out from April - June 2023 in Southern Coastal British Columbia (Canada) and from January - February 2024 in Kampala (Uganda). Household and commercial containment data were collected, covering demographic, technical, environmental, physical, chemical and biological characteristics of 22 non-sewered sanitation sites in Southern Coastal British Columbia and 19 sites in Kampala.
Installation
You can install the development version of fecalcanuga from GitHub with:
# install.packages("devtools")
devtools::install_github("openwashdata/fecalcanuga")
## Run the following code in console if you don't have the packages
## install.packages(c("dplyr", "knitr", "readr", "stringr", "gt", "kableExtra"))
library(dplyr)
library(knitr)
library(readr)
library(stringr)
library(gt)
library(kableExtra)
Alternatively, you can download the individual datasets as CSV files from the table below.
| dataset | CSV |
|---|---|
| containment | Download CSV |
| household_survey | Download CSV |
| phys_chem_parameter | Download CSV |
Data
The package provides access to 3 datasets: containment, household_survey, and phys_chem_parameter.
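All three datasets carry a sample_id column, so they can be combined with a join. The sketch below is a minimal illustration that assumes sample_id values match across datasets. It uses small made-up data frames (`containment_toy`, `phys_chem_toy`) rather than the real data, so every value in it is invented.

```r
library(dplyr)

# Toy stand-ins for two of the datasets; all values below are invented.
containment_toy <- data.frame(
  sample_id = c("S1", "S2"),
  fs_volume = c(1.2, 0.8)
)
phys_chem_toy <- data.frame(
  sample_id = c("S1", "S1", "S1", "S2", "S2"),
  depth_id  = c("T", "M", "B", "T", "B"),
  COD       = c(500, 900, 1500, 300, 1100)
)

# One row per depth sample, with the containment-level columns repeated.
joined <- left_join(phys_chem_toy, containment_toy, by = "sample_id")
nrow(joined)
```

With the real data, the same call would presumably be `left_join(phys_chem_parameter, containment, by = "sample_id")`. Because phys_chem_parameter holds several depth samples per containment, the result keeps one row per depth sample.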
containment
The dataset containment contains technical data (containment size, volume, fecal sludge depth, etc.) for each non-sewered sanitation containment that was sampled as part of this field work. It has 41 observations and 10 variables.
containment |>
  head(3) |>
  gt::gt() |>
  gt::as_raw_html()
| sample_id | location_id | date | area | sludge_depth | containment_depth | containment_volume | fs_volume | accumulation_rate | scum_depth |
|---|---|---|---|---|---|---|---|---|---|
For an overview of the variable names, see the following table.
| variable_name | variable_type | description |
|---|---|---|
| sample_id | factor | unique identifier for each sampling location in the format ##-Location-ddmmyyyy |
| location_id | factor | unique location identifier for each sampling day in the format of (#)-day of sampling (a) - location in that day |
| date | c(“POSIXct”, “POSIXt”) | date of sampling |
| area | numeric | surface area of the liquid surface in the containment, in m^2 |
| sludge_depth | numeric | depth, in meters, below the top of the sludge at which the sample was taken |
| containment_depth | numeric | depth, in meters, from the top of the containment to the bottom of the containment |
| containment_volume | numeric | total volume of the containment in m^3 |
| fs_volume | numeric | total volume of the sludge in containment in m^3 |
| accumulation_rate | numeric | the accumulation rate in litres/capita-year (based on total volume and number of users) |
| scum_depth | numeric | the depth of the top scum layer in centimeters |
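The table describes accumulation_rate only as litres/capita-year based on total volume and number of users. The sketch below shows one plausible way to obtain such a rate. It is an assumption, not the documented method of this package: sludge volume is converted to litres and divided by the number of users and the years since the containment was last emptied. The function name `accumulation_rate_lcy` and its arguments are hypothetical.

```r
# Hypothetical helper: assumes the rate is sludge volume per user per year
# since the last emptying. Not confirmed as the method used for the dataset.
accumulation_rate_lcy <- function(fs_volume_m3, users, years_since_emptied) {
  litres <- fs_volume_m3 * 1000  # 1 m^3 = 1000 L
  litres / (users * years_since_emptied)
}

# Invented example: 2 m^3 of sludge, 5 users, emptied 4 years ago.
rate <- accumulation_rate_lcy(2, 5, 4)
rate  # 100 litres/capita-year
```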
household_survey
The dataset household_survey contains the household / institutional survey data collected at each location where a non-sewered sanitation containment was sampled. This includes demographic information, operational and maintenance information, and some technical and environmental parameters. It has 41 observations and 39 variables.
household_survey |>
  head(3) |>
  gt::gt() |>
  gt::as_raw_html()
| sample_id | location_id | date | local_area_name | establishment_type | users | last_emptied | shared_toilet | rent_or_own | containment | lining | lining_material | change_in_liquid_level | baffles | outflow | outflow_location | toilet_type | anal_cleansing_material | paper | water | additives | frequency | chemicals | wastewater_type | toilet | bathing | laundry | kitchen | water_connection | tap_inside_building | standpipe | containment_age | containment_constructed | containment_volume | fully_emptied | emptying_interval | rainy_season | solid_waste | type |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
For an overview of the variable names, see the following table.
| variable_name | variable_type | description |
|---|---|---|
| sample_id | factor | unique identifier for each sampling location in the format ##-Location-ddmmyyyy |
| location_id | factor | unique location identifier for each sampling day in the format of (#)-day of sampling (a) - location in that day |
| date | c(“POSIXct”, “POSIXt”) | date of sampling |
| local_area_name | factor | community / area name that locals commonly use where samples were taken |
| establishment_type | factor | type of establishment where sampling occurred (household, commercial, school, office building, other) |
| users | numeric | average number of users for the sanitation system that was sampled |
| last_emptied | numeric | time, in years, between the last emptying of the containment and the sampling date |
| shared_toilet | factor | Are the toilet(s) connected to the sanitation containment shared or not (Yes/No) |
| rent_or_own | factor | Is the establishment where the containment is located rented or owned by the head of household? |
| containment | factor | The type of containment (septic tank, pit latrine, treatment plant unit, etc.) |
| lining | factor | The type of containment lining (fully lined, unlined, don’t know) |
| lining_material | factor | The material the lining is made of if it is lined (fiberglass, concrete, PVC, unlined, etc.) |
| change_in_liquid_level | factor | Is there a change in liquid level due to seasonal variation (Y/N) |
| baffles | factor | Are there baffles in the containment (Y/N/I don’t know) |
| outflow | factor | Is there an outflow for the containment (Y/N/I don’t know) |
| outflow_location | factor | If there is an outflow, where is it located (e.g., leach field) |
| toilet_type | factor | What is the type of toilet connected to the sanitation system (cistern flush, pour-flush, etc.) |
| anal_cleansing_material | factor | What type of material is used for anal cleansing (water, paper, both) |
| paper | numeric | Is paper used for anal cleansing (Yes = 1, No = 0) |
| water | numeric | Is water used for anal cleansing (Yes = 1, No = 0) |
| additives | factor | Are there additives added to the containment (Y/N/I don’t know) |
| frequency | factor | If additives are added, in what frequency? |
| chemicals | factor | What chemicals are added to the sanitation system? |
| wastewater_type | factor | What are the types of wastewater that the containment collects (toilet, kitchen, laundry, bathing) |
| toilet | numeric | Does the containment collect toilet wastewater (Yes = 1, No = 0) |
| bathing | numeric | Does the containment collect bathing wastewater (Yes = 1, No = 0) |
| laundry | numeric | Does the containment collect laundry wastewater (Yes = 1, No = 0) |
| kitchen | numeric | Does the containment collect kitchen wastewater (Yes = 1, No = 0) |
| water_connection | factor | What type of water connection does the establishment where sampling is occurring have (tap inside building, standpipe, other)? |
| tap_inside_building | numeric | Does the establishment have a tap inside the building (Yes = 1, No = 0) |
| standpipe | numeric | Does the establishment have a standpipe (Yes = 1, No = 0) |
| containment_age | factor | What is the age (in years) of the containment system? |
| containment_constructed | factor | Who constructed the containment system (Technician, professional engineering, myself, don’t know)? |
| containment_volume | numeric | What (in m^3) is the total volume of the containment? |
| fully_emptied | factor | When the system was last emptied was it emptied fully? (Y/N/I don’t know) |
| emptying_interval | numeric | What is the average / typical emptying interval (in years) of the containment? |
| rainy_season | factor | Is it currently the rainy season? (Y/N) |
| solid_waste | factor | Does the containment contain solid waste (i.e., garbage) (Y/N) |
| type | factor | If there is solid waste, what type is it? |
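Several numeric columns in this table (toilet, bathing, laundry, kitchen) are 0/1 indicators that mirror the multi-choice column wastewater_type. The sketch below shows how such indicators could be derived from a text column. It assumes the types appear as plain words inside the string; the actual encoding in the dataset may differ, and the rows in `toy` are invented.

```r
# Invented rows; the real wastewater_type encoding may differ.
toy <- data.frame(
  wastewater_type = c("toilet, kitchen", "toilet", "toilet, bathing, laundry"),
  stringsAsFactors = FALSE
)

# Add one 0/1 column per wastewater type, based on a literal substring match.
types <- c("toilet", "bathing", "laundry", "kitchen")
for (type in types) {
  toy[[type]] <- as.numeric(grepl(type, toy$wastewater_type, fixed = TRUE))
}
toy
```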
phys_chem_parameter
The dataset phys_chem_parameter contains the physical, chemical and biological parameters measured in situ and analyzed in the laboratory for each containment sampled, at different vertical locations in each containment (i.e., top, middle and bottom). It has 119 observations and 29 variables.
phys_chem_parameter |>
  head(3) |>
  gt::gt() |>
  gt::as_raw_html()
| sample_id | location_id | date | depth_id | sludge_depth | temperature | DO | pH | ORP | EC | COD | soluble_COD | sulphide | total_nitrogen | nitrite | nitrate | ammonia | TKN | ortho_phosphorous | total_phosphorous | BOD | TOC | ts | vs | vs_percent | sand_content | tss | vss | vss_tss |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
For an overview of the variable names, see the following table.
| variable_name | variable_type | description |
|---|---|---|
| sample_id | factor | unique identifier for each sampling location in the format ##-Location-ddmmyyyy |
| location_id | factor | unique location identifier for each sampling day in the format of (#)-day of sampling (a) - location in that day |
| date | c(“POSIXct”, “POSIXt”) | date of sampling |
| depth_id | factor | unique identifier for each sample indicating the location in the containment (T = top, M = middle, B = bottom) |
| sludge_depth | numeric | depth, in meters, below the top of the sludge at which the sample was taken |
| temperature | numeric | Temperature of the sludge at the in situ sampling location (in degrees Celsius) |
| DO | numeric | dissolved oxygen concentration in mg/L at the in situ sampling location |
| pH | numeric | pH value at the in situ sampling location |
| ORP | numeric | oxidation reduction potential, measured in millivolts, at the in situ sampling location |
| EC | numeric | electrical conductivity, measured in microSiemens, at the in situ sampling location |
| COD | numeric | Chemical oxygen demand, mg/L |
| soluble_COD | numeric | Soluble chemical oxygen demand, mg/L |
| sulphide | numeric | Sulphide, mg/L |
| total_nitrogen | numeric | Total Nitrogen, mg/L |
| nitrite | numeric | Nitrite, mg/L |
| nitrate | numeric | Nitrate, mg/L |
| ammonia | numeric | Ammonia, mg/L |
| TKN | numeric | Total Kjeldahl Nitrogen, mg/L |
| ortho_phosphorous | numeric | Ortho-phosphorus, mg/L |
| total_phosphorous | numeric | Total phosphorus, mg/L |
| BOD | numeric | Biochemical oxygen demand, mg/L |
| TOC | numeric | Total organic carbon, mg/L |
| ts | numeric | Total solids, g/L |
| vs | numeric | Volatile Solids, g/L |
| vs_percent | numeric | Volatile Solids, % total solids |
| sand_content | numeric | Sand content, g/L |
| tss | numeric | Total suspended solids, g/L |
| vss | numeric | Volatile suspended solids, g/L |
| vss_tss | numeric | VSS/TSS (ratio) |
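The columns vs_percent and vss_tss are described as derived quantities (volatile solids as a percentage of total solids, and the VSS/TSS ratio). A minimal sketch of those two calculations follows, assuming they are computed directly from the vs, ts, vss and tss columns. The numbers in `toy` are invented for illustration.

```r
# Invented solids measurements in g/L; not taken from the dataset.
toy <- data.frame(
  ts  = c(20, 10),
  vs  = c(12, 5),
  tss = c(16, 8),
  vss = c(8, 6)
)

# Volatile solids as a percentage of total solids.
toy$vs_percent <- 100 * toy$vs / toy$ts
# Ratio of volatile suspended solids to total suspended solids.
toy$vss_tss <- toy$vss / toy$tss
toy
```

Recomputing the two columns this way and comparing them with the stored values is a quick consistency check when working with the real data.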
Example
# Plot of Volatile Suspended Solids (VSS) versus Chemical Oxygen Demand (COD) for all samples.
# The plot illustrates how organic load (COD) relates to biomass (VSS).
library(fecalcanuga)
library(dplyr)    # provides %>% and filter()
library(ggplot2)
# Filter out rows with NA values in COD or vss
filtered_data <- phys_chem_parameter %>%
  filter(!is.na(COD) & !is.na(vss))
# Plot VSS vs COD with a linear trend line (vss is recorded in g/L, COD in mg/L)
ggplot(filtered_data, aes(x = vss, y = COD)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "blue") +
  theme_minimal() +
  labs(title = "VSS vs COD",
       x = "VSS (g/L)",
       y = "COD (mg/L)")
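Because each containment was sampled at several depths (depth_id: T = top, M = middle, B = bottom), a per-depth summary is another natural first look at the data. The sketch below uses an invented data frame `toy` in place of phys_chem_parameter and shows the pattern with dplyr.

```r
library(dplyr)

# Invented COD values (mg/L) at top, middle and bottom of two containments.
toy <- data.frame(
  depth_id = c("T", "M", "B", "T", "M", "B"),
  COD      = c(100, 200, 300, 300, 400, NA)
)

# Count the non-missing measurements and average COD per depth.
cod_by_depth <- toy |>
  group_by(depth_id) |>
  summarise(
    n_measured = sum(!is.na(COD)),
    mean_COD   = mean(COD, na.rm = TRUE)
  )
cod_by_depth
```

Replacing `toy` with phys_chem_parameter should give the same kind of table for the real samples.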
License
Data are available as CC-BY.
Citation
Please cite this package using:
citation("fecalcanuga")
#> To cite package 'fecalcanuga' in publications use:
#>
#> Shaw K, Dorea C, Strande L, Niwagaba C, Zhong M (2026). "fecalcanuga:
#> Demographic, Environmental, Technical and Physio-chemical Data on Non
#> Sewered Sanitation Containments in Rural Canada and Urban Uganda."
#> doi:10.5281/zenodo.20795917
#> <https://doi.org/10.5281/zenodo.20795917>.
#> <https://github.com/openwashdata/fecalcanuga>.
#>
#> A BibTeX entry for LaTeX users is
#>
#> @Misc{shaw_etall:2026,
#> title = {fecalcanuga: Demographic, Environmental, Technical and Physio-chemical Data on Non Sewered Sanitation Containments in Rural Canada and Urban Uganda},
#> author = {Kelsey Shaw and Caetano Dorea and Linda Strande and Charles Niwagaba and Mian Zhong},
#> year = {2026},
#> doi = {10.5281/zenodo.20795917},
#> url = {https://github.com/openwashdata/fecalcanuga},
#> abstract = {This data package contains data representing 5 months of field work from April - June 2023 (Canada) and January - February 2024 (Uganda) collecting household and commercial wastewater containment data and characterizing physical, chemical and greenhouse gas data for 22 non-sewered sanitation sites on southern Vancouver island and the southern gulf islands in British Columbia, Canada and 19 sites in Kampala, Uganda.},
#> version = {0.1.0},
#> }