The Surveys Table
This table captures information about the context within which data were collected. We can think of a survey as a single sampling event (or tightly bounded collection period) at a specific location. Different surveys may occupy different spatial locations, different sampling times, or both.
An example of a correctly formatted Surveys table is given below (scroll to see the whole table):
| study_id | survey_id | country_name | site_name | latitude | longitude | location_method | location_notes | collection_start | collection_end | collection_day | time_method | time_notes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Dama_2017 | Dama_2017_Bamako_2014 | Mali | Koulikoro | 12.612900 | -8.13560 | WWARN lat and long | NA | 2014-01-01 | 2014-12-31 | 2014-07-02 | automated midpoint | NA |
| Asua_2019 | Asua_2019_Agago_2017 | Uganda | Agago | 2.984722 | 33.33055 | WWARN lat and long | NA | 2017-01-01 | 2017-12-31 | 2017-07-02 | automated midpoint | NA |
| Asua_2019 | Asua_2019_Arua_2017 | Uganda | Arua | 3.030000 | 30.91000 | WWARN lat and long | NA | 2017-01-01 | 2017-12-31 | 2017-07-02 | automated midpoint | NA |
| Asua_2019 | Asua_2019_Kole_2017 | Uganda | Kole | 2.428611 | 32.80111 | WWARN lat and long | NA | 2017-01-01 | 2017-12-31 | 2017-07-02 | automated midpoint | NA |
| Asua_2019 | Asua_2019_Lamwo_2017 | Uganda | Lamwo | 3.533333 | 32.80000 | WWARN lat and long | NA | 2017-01-01 | 2017-12-31 | 2017-07-02 | automated midpoint | NA |
| Asua_2019 | Asua_2019_Mubende_2017 | Uganda | Mubende | 0.557500 | 31.39500 | WWARN lat and long | NA | 2017-01-01 | 2017-12-31 | 2017-07-02 | automated midpoint | NA |
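As a rough sketch of how such a table might be assembled in base R, the snippet below builds the first two rows of the example by hand. The object name `surveys` and the midpoint rule are illustrative assumptions rather than part of the package; the rule is used here only because it reproduces the `collection_day` values shown in the example.

```r
# Illustrative only: build the first two example rows as a base R data.frame.
collection_start <- as.Date(c("2014-01-01", "2017-01-01"))
collection_end   <- as.Date(c("2014-12-31", "2017-12-31"))

# Assumed midpoint rule: start date plus half the number of days in the
# collection range, rounded down to a whole day.
half_range <- as.integer(collection_end - collection_start) %/% 2L
collection_day <- collection_start + half_range

surveys <- data.frame(
  study_id         = c("Dama_2017", "Asua_2019"),
  survey_id        = c("Dama_2017_Bamako_2014", "Asua_2019_Agago_2017"),
  country_name     = c("Mali", "Uganda"),
  site_name        = c("Koulikoro", "Agago"),
  latitude         = c(12.612900, 2.984722),
  longitude        = c(-8.13560, 33.33055),
  location_method  = "WWARN lat and long",   # recycled to both rows
  location_notes   = NA_character_,
  collection_start = collection_start,
  collection_end   = collection_end,
  collection_day   = collection_day,
  time_method      = "automated midpoint",
  time_notes       = NA_character_,
  stringsAsFactors = FALSE
)
```

Storing the three date columns as class Date, rather than as character strings, keeps date arithmetic such as the midpoint calculation straightforward.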
Fields and requirements
Mandatory fields are shown in blue.
| Column | Type | Notes |
|---|---|---|
| study_id | Character string. Must be a valid identifier. | The foreign key for this study, pointing back to the Studies table. |
| survey_id | Character string. Must be a valid identifier. | The primary key for this survey. Must be unique across the whole table, not just within a study. |
| country_name | Character string or NA | Useful for quickly identifying the country, but should not be used for analysis or when joining data (see Core Design Principles). |
| site_name | Character string or NA | Useful for quickly identifying the site, but should not be used for analysis or when joining data (see Core Design Principles). |
| latitude | Numeric [-90, 90] | Spatial coordinates. If imputed, details of the imputation should be recorded in location_method and/or location_notes. |
| longitude | Numeric [-180, 180] | Spatial coordinates. If imputed, details of the imputation should be recorded in location_method and/or location_notes. |
| location_method | Character string or NA | How the location was determined, e.g. ‘exact coordinates of health facility’ or ‘imputed as the centroid of the province’. |
| location_notes | Character string or NA | Any other important notes pertaining to the location (free text). |
| collection_start | Proper date or NA (see ?as.Date) | The first day of a sample collection period. |
| collection_end | Proper date or NA (see ?as.Date) | The last day of a sample collection period. |
| collection_day | Proper date (see ?as.Date) | The specific day of sample collection. If imputed, details of the imputation should be recorded in time_method and/or time_notes. |
| time_method | Character string or NA | How the collection day was determined, e.g. ‘exact date’ or ‘imputed as the midpoint of the collection range’. |
| time_notes | Character string or NA | Any other important notes pertaining to the collection time (free text). |
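The requirements above can be checked programmatically before a table is submitted. The function below is a minimal sketch in base R, not the package's own validator: the name `check_surveys()` is hypothetical, and it assumes that any type listed without "or NA" does not permit missing values. It does not attempt to test what counts as a "valid identifier", since that rule is not defined on this page.

```r
# Hypothetical helper (not part of the package). Returns a character vector
# describing the problems found, or character(0) if these checks all pass.
check_surveys <- function(surveys) {
  expected <- c("study_id", "survey_id", "country_name", "site_name",
                "latitude", "longitude", "location_method", "location_notes",
                "collection_start", "collection_end", "collection_day",
                "time_method", "time_notes")
  missing_cols <- setdiff(expected, names(surveys))
  if (length(missing_cols) > 0) {
    return(paste("missing column:", missing_cols))
  }

  problems <- character(0)

  # Keys: assumed non-missing; survey_id must be unique across the whole table.
  if (anyNA(surveys$study_id)) {
    problems <- c(problems, "study_id must not be NA")
  }
  if (anyNA(surveys$survey_id) || anyDuplicated(surveys$survey_id) > 0) {
    problems <- c(problems, "survey_id must be non-missing and unique")
  }

  # Coordinates: numeric, non-missing (assumption), and within range.
  in_range <- function(x, limit) {
    is.numeric(x) && !anyNA(x) && all(abs(x) <= limit)
  }
  if (!in_range(surveys$latitude, 90)) {
    problems <- c(problems, "latitude must be numeric in [-90, 90]")
  }
  if (!in_range(surveys$longitude, 180)) {
    problems <- c(problems, "longitude must be numeric in [-180, 180]")
  }

  # Dates: all three columns must be of class Date (an all-NA column should be
  # created with as.Date(NA)); only collection_day must be non-missing.
  for (col in c("collection_start", "collection_end", "collection_day")) {
    if (!inherits(surveys[[col]], "Date")) {
      problems <- c(problems, paste(col, "must be of class Date"))
    }
  }
  if (inherits(surveys$collection_day, "Date") && anyNA(surveys$collection_day)) {
    problems <- c(problems, "collection_day must not be NA")
  }

  problems
}
```

Calling `check_surveys()` on a data frame laid out like the example table at the top of this page, with the date columns stored as class Date, should return an empty character vector.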
The next page shows how to encode genetic data in the Counts table.