The STOPeData app uses what the developers of Golem call the “stratégie du petit r”; that is, almost all the app’s data is stored in a big reactiveValues object in the main server function.

    initialise_userData <- function() {
      list(
        ENTERED_BY = character(0),

        # Standard validated data ----
        # All userData and module_state$data data is stored in a tabular (tibble) format centrally,
        # even for campaign and reference (which currently only have one row).
        # This means we can use a consistent set of functions to check for presence (nrow(tibble) > 0),
        # and not have any nasty surprises when we expect one and get the other.

        # Data entry modules
        sitesData = initialise_sites_tibble(),
        sitesDataValid = FALSE,
        parametersData = initialise_parameters_tibble(),
        parametersDataValid = FALSE,
        compartmentsData = initialise_compartments_tibble(),
        compartmentsDataValid = FALSE,
        referenceData = initialise_references_tibble(),
        referenceDataValid = FALSE,
        campaignData = initialise_campaign_tibble(),
        campaignDataValid = FALSE,
        methodsData = initialise_methods_tibble(),
        methodsDataValid = FALSE,
        samplesData = initialise_samples_tibble(),
        samplesDataValid = FALSE,
        biotaData = initialise_biota_tibble(),
        biotaDataValid = FALSE,
        samplesDataWithBiota = tibble(NULL),
        measurementsData = initialise_measurements_tibble(),
        measurementsDataValid = FALSE,

        # CREED Data
        datasetDetails = tibble(NULL),
        creedRelevance = initialise_CREED_data_tibble(),
        creedReliability = initialise_CREED_data_tibble(),
        creedScores = initialise_CREED_scores_tibble(),
        creedReport = "",

        # CREED reactive objects that just exist to trigger reactivity. Probably bad coding!
        creedGetData = 0, # watched by multiple observers in nested CREED modules. +1 every time we input$get_data in mod_CREED
        creedCalculateScores = 0, # same

        # LLM extracted data and metadata ----
        schemaLLM = "",
        promptLLM = "",
        rawLLM = "",
        pdfPath = NULL,
        campaignDataLLM = tibble(NULL),
        referenceDataLLM = tibble(NULL),
        sitesDataLLM = tibble(NULL),
        parametersDataLLM = tibble(NULL),
        compartmentsDataLLM = tibble(NULL),
        methodsDataLLM = tibble(NULL),
        samplesDataLLM = tibble(NULL),
        biotaDataLLM = tibble(NULL),

        # LLM extraction status flags ----
        llmExtractionComplete = FALSE, # tracks if the LLM data extraction process has completed, or the user has pressed the dummy data button
        llmExtractionSuccessful = FALSE, # tracks if the LLM data extraction process (or dummy data) returned a tibble in the expected format
        llmPopulateModules = FALSE, # tracks if the user has sent LLM data to modules
        llmExtractionComments = tibble(NULL),

        # Import data from save status flags ----
        saveExtractionComplete = FALSE,
        saveExtractionSuccessful = FALSE
      )
    }

This has its pros and cons.
- Pro: No need to write specific code for each transfer of data between modules
- Pro: Data can be initialised using a list function, as above. This makes testing easier, since we can mock the app data structure outside of a Shiny session
- Pro: Better centralised overview of flags, etc.
- Con: Assigning to a name that doesn’t exist in a list creates it, so there’s no way to defend against e.g. typos in flags, which will create a new flag (that doesn’t do anything)
- Con: At least the way I’m using it, it leads to long, difficult-to-read variable names like `session$userData$reactiveValues$sitesData`
- Con: Our checks for data presence are fairly messy, because we have to use `!is.null(data) && nrow(data) != 0` for tibbles, but different checks for strings, vectors, etc.
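
The typo and presence-check cons above could be softened with two small helpers while keeping the list-based approach. This is only a sketch: `set_user_data()` and `has_data()` are hypothetical names, not functions from the app, and the example runs on a plain list rather than a live `reactiveValues` object.

```r
# Hypothetical helpers, not part of STOPeData. Shown on a plain list so
# they can run outside a Shiny session.

# Refuse to assign to a name that was not created at initialisation, so a
# typo in a flag name fails loudly instead of creating a new, unused flag.
set_user_data <- function(store, name, value) {
  if (!name %in% names(store)) {
    stop("Unknown userData field: ", name, call. = FALSE)
  }
  # Single-bracket assignment keeps the element even when value is NULL
  store[name] <- list(value)
  store
}

# One presence check for tibbles/data frames, strings, and other vectors.
has_data <- function(x) {
  if (is.null(x)) {
    return(FALSE)
  }
  if (is.data.frame(x)) {
    return(nrow(x) > 0)
  }
  if (is.character(x)) {
    return(any(nzchar(x)))
  }
  length(x) > 0
}

# A cut-down stand-in for the list returned by initialise_userData()
store <- list(
  sitesData = data.frame(),
  sitesDataValid = FALSE,
  creedReport = ""
)

store <- set_user_data(store, "sitesDataValid", TRUE)

# A misspelt flag name is caught rather than silently added
typo_result <- tryCatch(
  set_user_data(store, "sitesDataVlaid", TRUE),
  error = function(e) conditionMessage(e)
)
```

Because tibbles are data frames, `has_data()` treats both the same way. The setter returns a modified copy, so a version for `reactiveValues` would need to assign in place instead.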
It would probably be much safer to use an R6 object because then we could defend against invalid inputs. And I have actually done this in another branch, but it’s not up to date with other changes, and I think it’s something I’ll save for when I (perhaps) have more time and money.
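
For illustration, here is a minimal sketch of what the R6 idea could look like. It is not the code from the other branch: the `UserData` class, the `set_sites()` method, and the `SITE_CODE` column are assumptions, and only two field names are borrowed from `initialise_userData()`.

```r
library(R6)

# Hypothetical class for illustration only
UserData <- R6Class(
  "UserData",
  public = list(
    sitesData = NULL,
    sitesDataValid = FALSE,

    initialize = function() {
      self$sitesData <- data.frame()
    },

    # Validate before storing, rather than accepting any value
    set_sites = function(data) {
      if (!is.data.frame(data)) {
        stop("sitesData must be a data frame or tibble", call. = FALSE)
      }
      self$sitesData <- data
      self$sitesDataValid <- nrow(data) > 0
      invisible(self)
    }
  )
)

user_data <- UserData$new()
user_data$set_sites(data.frame(SITE_CODE = "A1")) # SITE_CODE is a made-up column

# R6 objects are locked by default (lock_objects = TRUE), so assigning to a
# misspelt field name raises an error instead of creating a new flag.
typo_error <- tryCatch(
  {
    user_data$sitesDataVlaid <- TRUE
    NULL
  },
  error = function(e) conditionMessage(e)
)
```

R6 objects are not reactive on their own, so a real version would still need some way to trigger Shiny reactivity when a field changes.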