The Studies Table
This table captures information about the provenance of the data. An example of a correctly formatted Studies table is given below (scroll to see the whole table):
| study_id | study_label | description | access_level | contributors | reference | reference_year | PMID |
|---|---|---|---|---|---|---|---|
| Dama_2017 | Reduced ex vivo susceptibility of Plasmodium falciparum after oral artemether-lumefantrine treatment in Mali | NA | public | Dama et al. | https://pubmed.ncbi.nlm.nih.gov/28148267/ | 2017 | 28148267 |
| Asua_2019 | Changing Molecular Markers of Antimalarial Drug Sensitivity across Uganda | NA | public | Asua et al. | https://pubmed.ncbi.nlm.nih.gov/30559133/ | 2019 | 30559133 |
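As a rough illustration, the two example rows above could be held in code as plain dictionaries and written out as CSV text, with missing values spelled NA. This is only a sketch using the Python standard library; the names `STUDIES_COLUMNS`, `studies` and `studies_to_csv` are hypothetical and are not part of STAVE.

```python
import csv
import io

# Column order taken from the example table above.
STUDIES_COLUMNS = [
    "study_id", "study_label", "description", "access_level",
    "contributors", "reference", "reference_year", "PMID",
]

# The two example studies; None stands in for NA.
studies = [
    {
        "study_id": "Dama_2017",
        "study_label": "Reduced ex vivo susceptibility of Plasmodium falciparum "
                       "after oral artemether-lumefantrine treatment in Mali",
        "description": None,
        "access_level": "public",
        "contributors": "Dama et al.",
        "reference": "https://pubmed.ncbi.nlm.nih.gov/28148267/",
        "reference_year": 2017,
        "PMID": 28148267,
    },
    {
        "study_id": "Asua_2019",
        "study_label": "Changing Molecular Markers of Antimalarial Drug "
                       "Sensitivity across Uganda",
        "description": None,
        "access_level": "public",
        "contributors": "Asua et al.",
        "reference": "https://pubmed.ncbi.nlm.nih.gov/30559133/",
        "reference_year": 2019,
        "PMID": 30559133,
    },
]


def studies_to_csv(rows):
    """Return the rows as CSV text, writing None as the string 'NA'."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=STUDIES_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow({key: ("NA" if value is None else value)
                         for key, value in row.items()})
    return buffer.getvalue()


csv_text = studies_to_csv(studies)
```

Keeping the column order in one list means the header row and every data row stay aligned with the table layout shown above.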
Fields and requirements
Mandatory fields are shown in blue. These are the fields whose type does not allow NA: study_id, access_level and reference.
| Column | Type | Notes |
|---|---|---|
| study_id | Character string. Must be a valid identifier (see below). | The primary key for this study. |
| study_label | Character string or NA | A few words to identify the study, for example the title of an academic paper. |
| description | Character string or NA | A longer space to describe any relevant study details (free text). |
| access_level | Character string. One of {'public', 'restricted', 'private'}. | The access level of the data. Private data are allowed, but it is down to the user to ensure the resulting STAVE object is not shared beyond those with permissions. |
| contributors | Character string or NA | List of key contributors, for example authors of an academic paper. |
| reference | Character string | A pointer that unambiguously defines the data source. For example a URL to a published paper, or for unpublished data a permanent path to the location where the data are stored. This is one of the most important fields in maintaining data provenance - please complete it as best you can. |
| reference_year | Numeric or NA | The calendar year of the source reference. |
| PMID | Numeric or NA | For academic papers, the unique PubMed ID. This optional field can be very useful for data de-duplication. |
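The type rules in the table can be checked row by row before a Studies table is handed to STAVE. The sketch below is one possible pre-check, not STAVE's own validation: `check_study_row` and `ACCESS_LEVELS` are hypothetical names, None stands in for NA, and it assumes that a field is required whenever its type does not include "or NA". It does not test whether study_id is a valid identifier.

```python
ACCESS_LEVELS = {"public", "restricted", "private"}


def check_study_row(row):
    """Return a list of problems found in one Studies row (empty if none).

    `row` is a dict keyed by column name; None stands in for NA.
    """
    problems = []

    # Character string with no NA allowed (assumed to mean required).
    for field in ("study_id", "reference"):
        value = row.get(field)
        if not isinstance(value, str) or not value:
            problems.append(f"{field} must be a non-empty string")

    # Must be exactly one of the three permitted levels.
    level = row.get("access_level")
    if not isinstance(level, str) or level not in ACCESS_LEVELS:
        problems.append(
            "access_level must be one of " + ", ".join(sorted(ACCESS_LEVELS))
        )

    # Character string or NA.
    for field in ("study_label", "description", "contributors"):
        value = row.get(field)
        if value is not None and not isinstance(value, str):
            problems.append(f"{field} must be a string or NA")

    # Numeric or NA (bool is excluded even though it is an int subclass).
    for field in ("reference_year", "PMID"):
        value = row.get(field)
        if value is not None and (
            isinstance(value, bool) or not isinstance(value, (int, float))
        ):
            problems.append(f"{field} must be numeric or NA")

    return problems
```

Returning a list of messages rather than raising on the first failure lets every problem in a row be reported at once.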
Valid identifiers
All relational keys must be valid identifiers. This means they:
- Contain only English letters (uppercase or lowercase), numbers (0-9), or underscores (_).
- Do not begin with a number or an underscore.
Beyond these restrictions, any naming convention can be used. However, it is recommended to adopt a systematic approach to avoid potential conflicts. For instance, using generic IDs like “study1” is not a good idea, as such IDs could overlap with those from other datasets (although STAVE will prevent you from appending a study with an ID matching an existing loaded study). A better approach is to use a concise, descriptive format, such as the first author’s surname and the year of publication for an academic paper, e.g., Bloggs_2024.
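The identifier rules and the surname-plus-year convention can be sketched in a few lines. The regular expression below encodes the two rules listed above (only English letters, digits and underscores; first character a letter). The helper names `is_valid_identifier`, `make_study_id` and `find_id_clashes` are hypothetical, and the clash check only illustrates the idea of spotting overlapping IDs; it is not how STAVE itself detects them.

```python
import re

# Letters, digits and underscores only; the first character must be a letter,
# which rules out a leading digit or underscore.
VALID_ID = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_identifier(value):
    """True if `value` is a string that satisfies the identifier rules."""
    return isinstance(value, str) and VALID_ID.fullmatch(value) is not None


def make_study_id(surname, year):
    """Build an ID such as 'Bloggs_2024' from a surname and a year."""
    # Drop characters that are not allowed in an identifier
    # (spaces, hyphens, apostrophes, accented letters).
    cleaned = re.sub(r"[^A-Za-z0-9_]", "", surname)
    candidate = f"{cleaned}_{year}"
    if not is_valid_identifier(candidate):
        raise ValueError(f"cannot build a valid identifier from {surname!r}")
    return candidate


def find_id_clashes(existing_ids, new_ids):
    """Return the new IDs that already appear among the existing ones, sorted."""
    return sorted(set(existing_ids) & set(new_ids))
```

For example, `is_valid_identifier("study1")` is true even though such a generic ID is discouraged, `make_study_id("Bloggs", 2024)` gives `"Bloggs_2024"`, and `find_id_clashes(["Dama_2017", "Asua_2019"], ["Asua_2019", "Bloggs_2024"])` gives `["Asua_2019"]`.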
The next page shows how to specify location and time data in the Surveys table.