Convert tidy data into R packages with documentation websites and intelligent AI descriptions.
## Installation

```r
# install.packages("devtools")
devtools::install_github("openwashdata/fairenough")
```
## Usage

### One-Click Pipeline

For the simplest experience, move your datasets into an empty directory, then run the main `fairenough()` function:

```r
library(fairenough)

# Set up your LLM chat object
Sys.setenv(OPENAI_API_KEY = "YOUR_API_TOKEN_HERE")
chat <- elmer::chat_openai(model = "gpt-4o-mini", api_args = list(temperature = 0.3))

# Run the complete pipeline with one command
fairenough(chat)
```
This automatically:

- Sets up your R package structure
- Processes all CSV/Excel files in the directory
- Collects package metadata through interactive prompts
- Builds documentation, citation files, README, and website
- Generates intelligent data documentation using LLMs
- Prepares your package for publication
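Because `fairenough()` accepts the flags of the individual wrapper functions, a run can be tuned in a single call. A minimal sketch, assuming the global `overwrite` and `verbose` arguments described in the Features section:

```r
library(fairenough)

chat <- elmer::chat_openai(model = "gpt-4o-mini", api_args = list(temperature = 0.3))

# Re-run the pipeline over an existing project, overwriting
# previously generated files and printing progress messages
fairenough(chat, overwrite = TRUE, verbose = TRUE)
```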
### Granular Control

For step-by-step control, use individual wrapper functions:

```r
library(fairenough)

# Step 1: Initialize project structure
setup()

# Step 2: Process your data files
process()

# Step 3: Collect package metadata
collect()

# Step 4: Generate LLM-powered documentation
chat <- elmer::chat_openai(model = "gpt-4o-mini", api_args = list(temperature = 0.3))
generate(chat)

# Step 5: Build all package components
build()
```
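Step-specific arguments can be combined with the global ones. A sketch, assuming the argument names documented in the Features section (`auto_clean`, `extended`, `context`, `overwrite`, `verbose`):

```r
library(fairenough)

setup(overwrite = TRUE)
process(auto_clean = TRUE, verbose = TRUE)  # minimal automated cleaning
collect(extended = TRUE)                    # extended metadata prompts

chat <- elmer::chat_openai(model = "gpt-4o-mini", api_args = list(temperature = 0.3))
generate(chat, context = "Household water quality survey, 2023")  # illustrative context string

build()
```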
## Features

fairenough provides a complete pipeline for R data package creation, following these logical steps:

### `fairenough()` - One-Click Pipeline

- Complete R data package creation with a single `fairenough(chat)` call
- Automated workflow from tidy data to finished package
- Args: supports all flags that can be passed to the individual wrapper functions

### Granular Control Options

- Individual wrapper functions: `setup()`, `process()`, `collect()`, `generate()`, `build()`
- Global support for args: `overwrite = TRUE` and `verbose = TRUE`
### 1. `setup()` - R Project Setup

- R package structure initialization with `usethis`
- Directory and file organization (`data_raw`, `.gitignore`, etc.)

### 2. `process()` - Automated Data Processing

- Processes all your tidy data with `readr` and `readxl`
- Validates data structure and formats
- Arg: `auto_clean = TRUE` for automated minimal data cleaning and tidying
### 3. `collect()` - Interactive Metadata Collection

- Guided prompts for package metadata (title, description, authors, etc.) using `cli`
- Saves directly to the DESCRIPTION file with `desc`
- Arg: `extended = TRUE` for comprehensive documentation
### 4. `generate()` - LLM-Powered Documentation

- Data dictionary generation using `elmer` chat/LLM integration
- Variable descriptions use the package's description as context, plus actual data samples
- Arg: `context = "YOUR DATASETS CONTEXT"`
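Passing a short description of your datasets helps ground the generated variable descriptions. A sketch (the context string is illustrative; the argument name is assumed to be `context`):

```r
chat <- elmer::chat_openai(model = "gpt-4o-mini", api_args = list(temperature = 0.3))
generate(chat, context = "Latrine construction costs collected across 12 districts")
```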
### 5. `build()` - Complete Package Infrastructure

- Roxygen documentation generation with `roxygen2`
- Citation file creation with validation using `cffr`
- README generation with `rmarkdown`
- Package website building, ready for deployment
## Contributions

Welcome!

### Philosophy

Conventional commits! https://www.conventionalcommits.org/

Feature functions must maintain architectural consistency. This means that aspects like supported formats and path handling should be uniform across all functions. For more details, refer to `R/utils.R`.

To test new functions:

- Due to their utility outside of {fairenough}, `gendict.R`, `build_license.R`, and `promptit.R` have been kept general and not tied to the package's architecture.