'data_dict' returns a table describing each curated feature in sampleMetadata, that is, each column not prefixed with "uncurated_". For every feature it gives the column name, data type, description, and allowed values, and states whether the field is required and whether it allows multiple values.

Usage

data_dict()

Value

Data frame: A table of metadata feature information with columns: 'ColName', 'ColClass', 'Unique', 'Required', 'MultipleValues', 'Description', 'AllowedValues', 'Delimiter', 'Separater', 'DynamicEnum', and 'DynamicEnumProperty'.

Examples

# View the sample metadata data dictionary
metadata_info <- data_dict()
head(metadata_info)
#>            ColName  ColClass     Unique Required MultipleValues
#> 1       study_name character non-unique optional          FALSE
#> 2       subject_id character non-unique required          FALSE
#> 3        sample_id character     unique required          FALSE
#> 4 target_condition character non-unique required           TRUE
#> 5          control character non-unique required          FALSE
#> 6        body_site character non-unique required          FALSE
#>                                                                                                                                                              Description
#> 1                                                                                                                                                          Dataset name.
#> 2                                                                                                                                                    Subject identifier.
#> 3                                                                                                                                                     Sample identifier.
#> 4                                                                              The primary phenotype/condition of interest in the study from which the sample is derived
#> 5                                                                                                          Whether the sample is control, case, or not used in the study
#> 6 Named locations of or within the body. The anatomical location(s) affected by the patient's disease/condition/cancer, often the site from which the sample was derived
#>                                                                                                       AllowedValues
#> 1 [a-zA-Z-]+_[0-9]{4}|[a-zA-Z-]+_[0-9]{4}[a-zA-Z-]+|[a-zA-Z-]+_[0-9]{4}_[a-zA-Z-]+|[a-zA-Z-]+_[0-9]{4}_[a-zA-Z0-9]+
#> 2                                                                                                   [0-9a-zA-Z]\\S+
#> 3                                                                                                   [0-9a-zA-Z]\\S+
#> 4                                                                                                              <NA>
#> 5                                                                                       Study Control;Case;Not Used
#> 6                                                         feces;milk;nasal cavity;oral cavity;skin epidermis;vagina
#>   Delimiter Separater            DynamicEnum DynamicEnumProperty
#> 1      <NA>        NA                   <NA>                <NA>
#> 2      <NA>        NA                   <NA>                <NA>
#> 3      <NA>        NA                   <NA>                <NA>
#> 4         ;        NA NCIT:C7057;EFO:0000408          descendant
#> 5      <NA>        NA                   <NA>                <NA>
#> 6      <NA>        NA                   <NA>                <NA>
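
The 'MultipleValues' and 'Delimiter' columns can be used together to split a multi-valued cell into its parts. The following is a minimal sketch only: it uses a one-row stand-in copied from the printed output above instead of calling data_dict(), the cell value 'raw_value' is made up for illustration, and 'MultipleValues' is coerced with as.logical() because its stored type is not confirmed here.

```r
# Stand-in for one row of data_dict() output, copied from the printed example.
dict <- data.frame(
  ColName = "target_condition",
  MultipleValues = TRUE,
  Delimiter = ";",
  stringsAsFactors = FALSE
)

# Hypothetical cell value; real sampleMetadata values will differ.
raw_value <- "condition A;condition B"

# Look up the dictionary entry for the field of interest.
field <- dict[dict$ColName == "target_condition", ]

# Split only when the field allows multiple values and a delimiter is given.
values <- if (isTRUE(as.logical(field$MultipleValues)) && !is.na(field$Delimiter)) {
  strsplit(raw_value, field$Delimiter, fixed = TRUE)[[1]]
} else {
  raw_value
}
values
```

Using fixed = TRUE treats the delimiter as a literal string, so a delimiter such as "|" would not be interpreted as a regular expression.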

# Check required fields
required_fields <- metadata_info[metadata_info$Required == "required", ]
required_fields$ColName
#> [1] "subject_id"       "sample_id"        "target_condition" "control"         
#> [5] "body_site"        "curator"          "curation_id"
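
For fields whose 'AllowedValues' entry is an enumerated list, candidate values can be checked against it. This sketch assumes the entry is a ";"-separated string, as the printed rows for 'control' and 'body_site' suggest; it does not apply to fields whose 'AllowedValues' is a pattern (such as 'study_name') or NA. The helper 'allowed_values' and the 'candidate' vector are hypothetical, and the small data frame is a stand-in for data_dict() output.

```r
# Stand-in for data_dict() output: two rows copied from the printed example.
dict <- data.frame(
  ColName = c("control", "body_site"),
  AllowedValues = c(
    "Study Control;Case;Not Used",
    "feces;milk;nasal cavity;oral cavity;skin epidermis;vagina"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split a ";"-separated AllowedValues entry into a vector.
# Assumes the entry is an enumerated list, not a regular expression.
allowed_values <- function(dict, col) {
  entry <- dict$AllowedValues[dict$ColName == col]
  strsplit(entry, ";", fixed = TRUE)[[1]]
}

control_levels <- allowed_values(dict, "control")

# Made-up candidate values to validate against the dictionary.
candidate <- c("Case", "Healthy")
is_allowed <- candidate %in% control_levels
candidate[!is_allowed]
```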