Generate a semi-unique string ID for a protocol (PROTOCOL_ID) based on its type, name, the campaign name, and a sequence number; used as a key between Methods and Measurements table.
Source: R/fct_generate_IDs.R
generate_protocol_id.RdGenerates a standardised protocol identifier based on protocol type, name, sequence number, and campaign. The function is fully vectorised and can handle multiple protocols simultaneously.
Arguments
- protocol_type
Character vector. The category of protocol (e.g., "Sampling Protocol", "Analytical Protocol"). Must match one of the predefined categories or will default to "X".
- protocol_name
Character vector. The specific name/method within the protocol category. Will be abbreviated and cleaned for ID generation.
- sequence_number
Numeric vector. Sequential number for protocols within the same category. Defaults to 1. Will be zero-padded to 2 digits.
- campaign_name
Character vector. Name of the campaign/study. Will be abbreviated to first 10 alphanumeric characters. Defaults to "".
Value
Character vector of protocol IDs with format:
TypeCodeSequenceNumber_AbbreviatedName_CampaignAbbrev
Where:
TypeCode: S (Sampling), F (Fractionation), E (Extraction), A (Analytical), X (Unknown)
SequenceNumber: Zero-padded 2-digit number
AbbreviatedName: Up to 15 alphanumeric characters (optional)
CampaignAbbrev: Up to 10 alphanumeric characters (optional)
Details
Edge cases:
NULL or empty values result in appropriate defaults
Invalid protocol types default to "X"
Names are cleaned of special characters and spaces
Empty components are omitted from the final ID
See also
Other generate_id:
abbreviate_string(),
generate_reference_id(),
generate_sample_id_with_components(),
protocol_id_regex(),
sample_id_regex()
Other method:
analytical_protocols_vocabulary(),
example_methods_tibble(),
extraction_protocols_vocabulary(),
fractionation_protocols_vocabulary(),
initialise_methods_tibble(),
protocol_categories_vocabulary(),
protocol_id_regex(),
protocol_options_vocabulary(),
sampling_protocols_vocabulary()
Examples
# Single protocol
generate_protocol_id("Sampling Protocol", "Water Sample", 1, "MyStudy")
#> [1] "S01_WaterSample_MyStudy"
# Returns: "S01_WaterSample_MyStudy"
# Multiple protocols (vectorised)
types <- c("Sampling Protocol", "Analytical Protocol")
names <- c("Water Sample", "LC-MS Analysis")
sequences <- c(1, 2)
generate_protocol_id(types, names, sequences, "Study2024")
#> [1] "S01_WaterSample_Study2024" "A02_LCMSAnalysis_Study2024"
# Returns: c("S01_WaterSample_Study2024", "A02_LCMSAnalysis_Study2024")