Data from Chang et al. (2017). Here we give a brief summary of the data; see
the original paper for full details.
Dried blood spots were collected during 2012-13 cross-sectional surveys in
several provinces in Uganda. Households (n = 200) were randomly selected from
each province. All samples with detectable asexual parasitemia were selected
for Sequenom SNP genotyping. Medium- to high-frequency SNPs (n = 128) were
chosen based on their frequency in malaria parasite populations
(pf-community-project); 105 SNPs remained after filtering out variants with
lower or missing frequencies.
Genotyping was based on the intensity of the SNPs. In addition, merozoite
surface protein 2 (msp2) genotyping was conducted on an age-stratified subset
of the samples. Capillary electrophoresis was used to distinguish msp2 allele
sizes.
The dataset is a list of three data frames. The first data frame contains SNP
calls for the 105 filtered SNPs. The values {-1, 0, 0.5, 1} denote a missing
value / no call (-1), a heterozygous call (0.5), or a homozygous call (0 or
1). The second data frame contains 95% credible intervals for the COI and the
third contains 95% credible intervals for allele frequencies, both estimated
using the categorical and proportional methods described in the paper, which
model homozygous/heterozygous calls and within-host allele frequency,
respectively.
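The call encoding can be made easier to read by translating the numeric codes into labels. The following is only a minimal sketch: `toy_calls`, its column names, and the helper `decode_calls()` are hypothetical and are not part of the dataset or of any package. The labels "homozygous (0)" and "homozygous (1)" simply echo the two homozygous codes, since this page does not say which allele each code refers to.

```r
# Hypothetical stand-in for the SNP call table; names and values are invented.
toy_calls <- data.frame(
  sample_id = c("s1", "s2", "s3"),
  location  = c("A", "A", "B"),
  snp_1     = c(0, 0.5, -1),
  snp_2     = c(1, 1, 0.5)
)

# Translate the numeric codes into descriptive labels.
# -1 = no call / missing, 0.5 = heterozygous, 0 and 1 = homozygous.
decode_calls <- function(x) {
  labels <- c("missing", "homozygous (0)", "heterozygous", "homozygous (1)")
  labels[match(x, c(-1, 0, 0.5, 1))]
}

# Apply the helper to every SNP column (columns 3 and 4 in this toy table).
decoded <- toy_calls
decoded[, 3:4] <- lapply(toy_calls[, 3:4], decode_calls)
print(decoded)
```

Any value outside the four expected codes would come back as `NA`, which makes unexpected entries easy to spot.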
data(Chang_2017)
A list of multiple data objects
S1_Table_SNP_data
: SNP calls for each sample {-1, 0, 0.5, 1}.
95_cred_interval_coi
: Calculated COI of Uganda Samples
using categorical and proportional methods described in paper.
95_cred_interval_allele_freq
: Calculated allele frequency
of Uganda Samples using categorical and proportional methods described
in paper.
S1_Table_SNP_data
: A data frame of 107 columns. The sample id and
location are included in the first 2 columns. SNP calls (columns 3:107) are
either no call / missing data (-1), heterozygous (0.5) or homozygous (0 or
1).
S2_Table_95_cred_int_coi_uganda_samples
: A data frame of 15 columns
containing calculated complexity of infection summary statistics for
categorical, proportional, and COIL methods.
S3_Table_95_cred_int_allele_freq
: A data frame of 25 columns
containing summary statistics of allele frequency from categorical,
proportional, and COIL methods.
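Given the layout described above (identifier columns first, SNP calls after), one simple per-sample summary is the share of successfully called SNPs that are heterozygous. The sketch below uses an invented table, `toy_snp`, with three SNP columns instead of 105; the column names are assumptions, and the summary itself is an illustration rather than a method from the paper.

```r
# Hypothetical table mimicking the layout: id, location, then SNP calls.
toy_snp <- data.frame(
  sample_id = c("s1", "s2", "s3"),
  location  = c("A", "A", "B"),
  snp_1     = c(0, 0.5, -1),
  snp_2     = c(1, 1, 0.5),
  snp_3     = c(0.5, 0, 0)
)

# SNP calls start at column 3 (columns 3:107 in the real table).
snp_cols <- 3:ncol(toy_snp)
calls <- as.matrix(toy_snp[, snp_cols])

# Treat -1 (no call / missing data) as NA so it is excluded from the summary.
calls[calls == -1] <- NA

# Proportion of heterozygous calls (0.5) among non-missing calls per sample.
prop_het <- rowMeans(calls == 0.5, na.rm = TRUE)
names(prop_het) <- toy_snp$sample_id
print(prop_het)
```

Recoding -1 to `NA` before summarising keeps missing calls out of both the numerator and the denominator.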
Chang H, Worby CJ, Yeka A, Nankabirwa J, Kamya MR, Staedke SG, Dorsey G, Murphy M, Neafsey DE, Jeffreys AE, Hubbart C, Rockett KA, Amato R, Kwiatkowski DP, Buckee CO, Greenhouse B (2017). “THE REAL McCOIL: A method for the concurrent estimation of the complexity of infection and SNP allele frequency for malaria parasites.” PLoS Comput Biol, 13(1), e1005348. ISSN 1553-734X, doi: 10.1371/journal.pcbi.1005348, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5300274/.