Data from Chang et al. (2017). Here we give a brief summary of the data; see the original paper for full details.

Dried blood spots were collected during 2012-13 cross-sectional surveys in several provinces in Uganda. Households (n = 200) were randomly selected from each province. All samples with detectable asexual parasitemia were selected for Sequenom SNP genotyping. A panel of 128 SNPs with medium to high frequency in malaria parasite populations (pf-community-project) was chosen, leaving 105 SNPs after filtering out variants with lower or missing frequencies. Genotype calls were based on SNP signal intensity. In addition, merozoite surface protein 2 (msp2) genotyping was conducted on an age-stratified subset of the samples, using capillary electrophoresis to distinguish msp2 allele sizes.

The data are a list of three data frames. The first data frame contains SNP calls for the 105 filtered SNPs, coded as -1 (missing value / no call), 0.5 (heterozygous), or 0 or 1 (homozygous). The second and third data frames contain 95% credible intervals for complexity of infection (COI) and allele frequency. These were estimated with the categorical and proportional methods described in the paper, which model homozygous/heterozygous calls and within-host allele frequency, respectively.

data(Chang_2017)

Format

A list of three data frames:

  • S1_Table_SNP_data: SNP calls for each sample {-1, 0, 0.5, 1}.

  • 95_cred_interval_coi: Calculated COI of the Uganda samples using the categorical and proportional methods described in the paper.

  • 95_cred_interval_allele_freq: Calculated allele frequency of the Uganda samples using the categorical and proportional methods described in the paper.

S1_Table_SNP_data: A data frame of 107 columns. The sample id and location are in the first 2 columns. Each SNP call (columns 3:107) is one of: no call / missing data (-1), heterozygous (0.5), or homozygous (0 or 1).

S2_Table_95_cred_int_coi_uganda_samples: A data frame of 15 columns containing calculated complexity of infection summary statistics for categorical, proportional, and COIL methods.

S3_Table_95_cred_int_allele_freq: A data frame of 25 columns containing summary statistics of allele frequency from categorical, proportional, and COIL methods.
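
The SNP coding described above can be summarised per sample with a few lines of base R. The sketch below is illustrative only: it builds a small toy data frame with the same layout (two metadata columns followed by SNP calls), and the column names and values are hypothetical rather than taken from the data set. With the real data, the first element of the list would presumably take the place of the toy table.

```r
# Toy stand-in for the SNP table: 2 metadata columns followed by SNP calls.
# Column names and values are hypothetical, not taken from the data set.
snp <- data.frame(
  sample_id = c("s1", "s2", "s3"),
  location  = c("A", "A", "B"),
  snp_1     = c(0, 0.5, -1),
  snp_2     = c(1, 0.5, 0),
  snp_3     = c(-1, 1, 0.5)
)

# With the real data, one would presumably start from (assumption):
#   data(Chang_2017); snp <- Chang_2017[[1]]

calls <- as.matrix(snp[, -(1:2)])  # drop the id and location columns
calls[calls == -1] <- NA           # recode no call / missing data as NA

# Per-sample summaries: number of called SNPs and share of heterozygous calls
n_called <- rowSums(!is.na(calls))
prop_het <- rowSums(calls == 0.5, na.rm = TRUE) / n_called

summary_df <- data.frame(sample_id = snp$sample_id, n_called, prop_het)
print(summary_df)
```

Recoding -1 to NA first keeps missing calls out of both the numerator and the denominator, so the heterozygous share is computed over called SNPs only.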

References

Chang H, Worby CJ, Yeka A, Nankabirwa J, Kamya MR, Staedke SG, Dorsey G, Murphy M, Neafsey DE, Jeffreys AE, Hubbart C, Rockett KA, Amato R, Kwiatkowski DP, Buckee CO, Greenhouse B (2017). “THE REAL McCOIL: A method for the concurrent estimation of the complexity of infection and SNP allele frequency for malaria parasites.” PLoS Comput Biol, 13(1), e1005348. ISSN 1553-734X, doi: 10.1371/journal.pcbi.1005348, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5300274/.