R/data.R
Taylor_2020.Rd
Data from Taylor et al. (2020). Only a brief summary of the data is given here;
see the original paper for full details.
The study by Taylor et al. (2020) uses data from a previously published study
by Echeverry et al. (2013). Samples were obtained from patients with
symptomatic uncomplicated malaria, and were collected between 1993 and 2007
from five cities in four provinces of Colombia. Samples were genotyped using
a 250-SNP barcode. All samples were considered to be monoclonal infections,
so any heterozygous genotype calls were recoded as missing data. Note that
markers were re-ordered post-publication, but this did not qualitatively
alter any of the major conclusions (see the GitHub README).
data(Taylor_2020)
A data frame with 257 columns: the sample ID (column 1), the multi-locus genotype ID (column 2), the collection place and time (columns 3:5), the genotype calls at all 250 SNPs (columns 6:255), the number of loci that were heterozygous in the original data (column 256), and the collection year (column 257, matching the information in column 5). Genotype values are 0 for the minor allele, 1 for the major allele, or NA for missing data. Because samples were considered monoclonal, all heterozygous calls were recoded as missing data.
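The column layout described above can be used to separate the genotype calls from the sample metadata. The following is a minimal sketch only: it builds a small synthetic stand-in with the same 257-column layout rather than loading the real data, and every value and column name in it (sample_id, mlg_id, city, province, date, snp_1 to snp_250, n_het, year) is a hypothetical placeholder, not taken from the dataset.

```r
# Synthetic stand-in with the documented 257-column layout.
# All values and column names here are made up for illustration.
set.seed(1)
n_samples <- 4
n_snps <- 250

# Genotype calls: 0 = minor allele, 1 = major allele, NA = missing
geno <- matrix(
  sample(c(0L, 1L, NA), n_samples * n_snps, replace = TRUE,
         prob = c(0.3, 0.6, 0.1)),
  nrow = n_samples,
  dimnames = list(NULL, paste0("snp_", seq_len(n_snps)))
)

toy <- data.frame(
  sample_id = paste0("s", seq_len(n_samples)),  # column 1
  mlg_id    = c(1L, 1L, 2L, 3L),                # column 2
  city      = c("A", "A", "B", "C"),            # columns 3:5: place and time
  province  = c("P1", "P1", "P2", "P3"),
  date      = c("1999", "2001", "2003", "2005"),
  geno,                                         # columns 6:255: SNP calls
  n_het     = c(0L, 2L, 0L, 1L),                # column 256
  year      = c(1999L, 2001L, 2003L, 2005L),    # column 257
  stringsAsFactors = FALSE
)

# Pull out the genotype block by position, as in the documented layout
snp_cols <- 6:255
genotypes <- as.matrix(toy[, snp_cols])

# Proportion of missing calls per sample, and major-allele frequency per SNP
missing_per_sample <- rowMeans(is.na(genotypes))
major_freq_per_snp <- colMeans(genotypes, na.rm = TRUE)
```

With the real data loaded via data(Taylor_2020), the same positional indexing (columns 6:255) should apply in place of the toy object, assuming the layout matches the description above.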
Taylor AR, Echeverry DF, Anderson TJC, Neafsey DE, Buckee CO (2020). “Identity-by-descent with uncertainty characterises connectivity of Plasmodium falciparum populations on the Colombian-Pacific coast.” PLOS Genetics, 16(11), e1009101. ISSN 1553-7404, doi: 10.1371/journal.pgen.1009101, Publisher: Public Library of Science, https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1009101.
Echeverry DF, Nair S, Osorio L, Menon S, Murillo C, Anderson TJ (2013). “Long term persistence of clonal malaria parasite Plasmodium falciparum lineages in the Colombian Pacific region.” BMC Genetics, 14(1), 2. ISSN 1471-2156, doi: 10.1186/1471-2156-14-2.