The goal of waschoolpiracema
is to describe the profile of schools from the basic education system in the municipality of Piracema (Minas Gerais, Brazil). Moreover, the data compare the characteristics of schools, with a special concern to WASH, pre- (2020), peri- (2021) and post-COVID-19 pandemic (2022) to evaluate to what extend schools in Piracema made progress in providing WASH since the beginning of the COVID-19 pandemic.
Installation
You can install the development version of waschoolpiracema from GitHub with:
# install.packages("devtools")
devtools::install_github("openwashdata/waschoolpiracema")
## Run the following code in console if you don't have the packages
## install.packages(c("dplyr", "knitr", "readr", "stringr", "gt", "kableExtra"))
library(dplyr)
library(knitr)
library(readr)
library(stringr)
library(gt)
library(kableExtra)
Alternatively, you can download the individual datasets as a CSV or XLSX file from the table below.
dataset | CSV | XLSX |
---|---|---|
waschoolpiracema | Download CSV | Download XLSX |
Data
The municipality of Piracema is located in the southeast region of Brazil, in the state of Minas Gerais. Piracema is a small size city adding up to 6,700 inhabitants (IBGE, 2023). Among all Brazilian territories, it ranks as the 3,734º smallest municipality out of 5,570 and in Minas Gerais it ranks as the 492º out of 853 (IBGE, 2023).
Piracema is located approximately 120 km away from the capital of its state (Belo Horizonte), and it is inaccessible by public transportation (IBGE, 2023). Piracema will be the study area for the next phase of the research (collection of primary data).
waschoolpiracema
The dataset waschoolpiracema
contains data about the water supply, the sewage disposal, the waste collection and the sanitary equipment of the schools in Piracema. It also provides information about gender, race and education levels of the school’s students. It has 21 observations and 36 variables
waschoolpiracema |>
head(3) |>
gt::gt() |>
gt::as_raw_html()
year | sch_id | admin | loc | drink_water | public_water | borehole_water | well_water | surface_water | no_water | sewage_rede_publica | sewage_fossa_septica | waste_servico_coleta | waste_queima | waste_enterra | waste_destino_final_publico | waste_descarta_outra_area | sanitary | sanitary_ei | sanitary_pne | sanitary_funcionarios | sanitary_chuveiro | qt_mat_bas | pc_girl | pc_boy | pc_white | pc_brown | pc_black | pc_indian | pc_asian | pc_nd | pc_cre | pc_pre | pc_prim_1 | pc_prim_2 | pc_sec |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
For an overview of the variable names, see the following table.
variable_name | variable_type | description |
---|---|---|
year | double | Year of Survey |
sch_id | double | Numerical code of the school |
admin | factor | The administration of the school is federal (1); state (2); municipal (3), or private (4). Federal, state, and municipal schools are considered public. |
loc | factor | The school is located in an urban (1) or rural area (2) |
drink_water | logical | The school provides drinking water with quality suitable for human consumption (i.e., ingestion, preparation, and production of food) according to the Brazilian national water quality standards (former Portaria nº 2.914/2011 now Portaria de Consolidação nº5/2017) |
public_water | logical | The water in the school is supplied by a public network . |
borehole_water | logical | The water in the school is supplied by a borehole |
well_water | logical | The water in the school is supplied by a cacimba, cistern, or well |
surface_water | logical | The water in the school is supplied by surface water source |
no_water | logical | There is no water supply in the school |
sewage_rede_publica | logical | The school dispose their sewage into a public sewerage system |
sewage_fossa_septica | logical | The school dispose their sewage into septic tank |
waste_servico_coleta | logical | The solid waste in the school is regularly collected by the public cleaning service |
waste_queima | logical | The solid waste in the school is disposed in an area licensed by environmental agencies, intended to receive solid waste in a planned manner (e.g., landfills) |
waste_enterra | logical | The solid waste in the school is burned or incinerated |
waste_destino_final_publico | logical | The solid waste in the school is buried |
waste_descarta_outra_area | logical | The solid waste in the school is disposed in another area (none of the other options) |
sanitary | logical | The school is equipped with sanitary facilities for personal hygiene/physiological needs |
sanitary_ei | logical | The school is equipped with sanitary facilities for children 0 to 5 years old |
sanitary_pne | logical | The school is equipped with disability-friendly sanitary facilities following the national guidelines (ABNT - NBR 9050) |
sanitary_funcionarios | logical | The school is equipped with sanitary facilities for personal hygiene/physiological needs exclusively for staff |
sanitary_chuveiro | logical | The school is equipped with sanitary facilities or changing room or washing room with appropriate equipment (shower) for bathing, exclusively for students |
qt_mat_bas | double | Total number of students per school |
pc_girl | double | Percentage of girls per school |
pc_boy | double | Percentage of boys per school |
pc_white | double | Percentage of students that are classified or self-identified as white race/skin color per school |
pc_brown | double | Percentage of students that are classified or self-identified as black race/skin color per school |
pc_black | double | Percentage of students that are classified or self-identified as brown race/skin color per school |
pc_indian | double | Percentage of students that are classified or self-identified as Asian race/skin color per school |
pc_asian | double | Percentage of students that are classified or self-identified as indigenous race/skin color per school |
pc_nd | double | Percentage of students that did not declared race/skin color per school |
pc_cre | double | Percentage of students in daycare (0 - 3 years old) |
pc_pre | double | Percentage of students in preschool (4 - 5 years old) |
pc_prim_1 | double | Percentage of students in primary education first cycle (6 - 10 years old) |
pc_prim_2 | double | Percentage of students in primary education second cycle (11 - 14 years old) |
pc_sec | double | Percentage of students in secondary education (15 - 18 years old) |
Examples
# Load necessary libraries
library(waschoolpiracema)
library(ggplot2)
library(dplyr)
library(tidyr)
# Load the dataset
load("data/waschoolpiracema.rda")
# Convert admin to a factor with descriptive labels
waschoolpiracema$admin <- factor(waschoolpiracema$admin, levels = c(1, 2, 3, 4),
labels = c("Federal", "State", "Municipal", "Private"))
# Create the plot
ggplot(waschoolpiracema, aes(x = qt_mat_bas, y = pc_girl, color = as.factor(admin))) +
geom_point() +
labs(title = "Percentage of Girls vs Total Number of Students per School",
x = "Total Number of Students",
y = "Percentage of Girls",
color = "Administration Type") +
theme_minimal()
# Summarize the data to get average percentages per year
summary_data <- waschoolpiracema %>%
group_by(year) %>%
summarise(avg_pc_girl = mean(pc_girl, na.rm = TRUE),
avg_pc_boy = mean(pc_boy, na.rm = TRUE))
# Create the plot
ggplot(summary_data, aes(x = year)) +
geom_line(aes(y = avg_pc_girl, color = "Girls")) +
geom_line(aes(y = avg_pc_boy, color = "Boys")) +
labs(title = "Average Percentage of Girls and Boys over the Years",
x = "Year",
y = "Average Percentage",
color = "Gender") +
theme_minimal()
# List of columns related to sanitary, sewage, and waste facilities
sanitary_sewage_waste_cols <- c(
"sanitary", "sanitary_ei", "sanitary_pne", "sanitary_funcionarios", "sanitary_chuveiro",
"sewage_rede_publica", "sewage_fossa_septica",
"waste_servico_coleta", "waste_queima", "waste_enterra", "waste_destino_final_publico", "waste_descarta_outra_area"
)
# Convert relevant columns to integers
waschoolpiracema[sanitary_sewage_waste_cols] <- lapply(waschoolpiracema[sanitary_sewage_waste_cols], function(x) as.integer(x))
# Summarize the waschoolpiracema data to get the count and percentage of schools with facilities per year
summary_data <- waschoolpiracema %>%
group_by(year) %>%
summarise(across(all_of(sanitary_sewage_waste_cols), ~ mean(.x, na.rm = TRUE))) %>%
pivot_longer(cols = sanitary_sewage_waste_cols, names_to = "facility", values_to = "percentage")
# Create the plot
ggplot(summary_data, aes(x = factor(year), y = percentage, fill = facility)) +
geom_bar(stat = "identity", position = "dodge") +
labs(title = "Percentage of Schools with Sanitary, Sewage, and Waste Facilities by Year",
x = "Year",
y = "Percentage of Schools",
fill = "Facility Type") +
theme_minimal() +
scale_y_continuous(labels = scales::percent)
# List of columns related to race
race_cols <- c("pc_white", "pc_brown", "pc_black", "pc_indian", "pc_asian", "pc_nd")
# Summarize the waschoolpiracema data to get the average percentage of students per race per year
summary_data <- waschoolpiracema %>%
group_by(year) %>%
summarise(across(all_of(race_cols), ~ mean(.x, na.rm = TRUE))) %>%
pivot_longer(cols = race_cols, names_to = "race", values_to = "percentage")
# Create the line plot
ggplot(summary_data, aes(x = factor(year), y = percentage, color = race, group = race)) +
geom_line(size = 1.2) +
geom_point(size = 3) +
labs(title = "Evolution of Racial Composition of Students Over the Years",
x = "Year",
y = "Average Percentage of Students",
color = "Race") +
theme_bw() + # Use a different theme for better visualization
theme(
plot.title = element_text(hjust = 0.5, size = 16, face = "bold"),
axis.title = element_text(size = 14),
axis.text = element_text(size = 12),
legend.title = element_text(size = 14),
legend.text = element_text(size = 12)
) +
scale_color_brewer(palette = "Set1")
water_cols <- c("drink_water", "public_water", "borehole_water", "well_water", "surface_water", "no_water")
# Convert relevant columns to integers
waschoolpiracema[water_cols] <- lapply(waschoolpiracema[water_cols], function(x) as.integer((x)))
# Summarize the data to get the count and percentage of schools with each type of water supply per year
summary_data <- waschoolpiracema %>%
group_by(year) %>%
summarise(across(all_of(water_cols), ~ sum(.x, na.rm = TRUE))) %>%
pivot_longer(cols = water_cols, names_to = "water_supply", values_to = "count")
# Create the stacked bar plot
ggplot(summary_data, aes(x = factor(year), y = count, fill = water_supply)) +
geom_bar(stat = "identity") +
labs(title = "Distribution of Water Supply Types in Schools Over the Years",
x = "Year",
y = "Number of Schools",
fill = "Water Supply Type") +
theme_minimal() +
theme(
plot.title = element_text(hjust = 0.5, size = 16, face = "bold"),
axis.title = element_text(size = 14),
axis.text = element_text(size = 12),
legend.title = element_text(size = 14),
legend.text = element_text(size = 12)
) +
scale_fill_brewer(palette = "Set3")
Capstone Project
This dataset is shared as part of a capstone project in Data Science for openwashdata. For more information about the project and to explore further insights, please visit the project page at https://ds4owd-001.github.io/project-poaguek/ (to be public available)
This study is a sub-project of a PhD project. It is also an initial study comparing the BNSC from 2020 and 2021(#TODO: add reference). Findings will be essential for the next phase of the research, which will be the collection of primary data in schools in the municipality of Piracema through qualitative methods (interviews, on-spot observations and art-based research).
License
Data are available as CC-BY.
Citation
Please cite this package using:
citation("waschoolpiracema")
#> To cite package 'waschoolpiracema' in publications use:
#>
#> Tabin A, Poague K, Zhong M (2024). "waschoolpiracema: WASH in Schools
#> in Piracema, Brazil." doi:10.5281/zenodo.12701107
#> <https://doi.org/10.5281/zenodo.12701107>,
#> <https://github.com/openwashdata/waschoolpiracema>.
#>
#> A BibTeX entry for LaTeX users is
#>
#> @Misc{tabin_etall:2024,
#> title = {waschoolpiracema: WASH in Schools in Piracema, Brazil},
#> author = {Alexis Tabin and Kasandra Poague and Mian Zhong},
#> year = {2024},
#> doi = {10.5281/zenodo.12701107},
#> url = {https://github.com/openwashdata/waschoolpiracema},
#> abstract = {The main goal of this study was to describe the profile of schools from the basic education system in the municipality of Piracema (Minas Gerais, Brazil). Moreover, we also aimed to compare the characteristics of schools, with a special concern to WASH, pre- (2020), peri- (2021) and post-COVID-19 pandemic (2022) to evaluate to what extend schools in Piracema made progress in providing WASH since the beginning of the COVID-19 pandemic. This study is a sub-project of a PhD project and an initial study comparing the BNSC from 2020 and 2021 has already been conducted and published by the author of this project (for more details see references). Findings will be essential for the next phase of the research, which will be the collection of primary data in schools in the municipality of Piracema through qualitative methods (interviews, on-spot observations and art-based research).},
#> keywords = {brazil,opendata,openwashdata,piracema,r,sanitation,wash},
#> version = {0.0.1},
#> }