Calculating Prevalence

This tutorial covers the following topics:

Calculating prevalence at a single locus
Calculating prevalence at multiple loci and dealing with ambiguities

Let’s begin by creating a new STAVE object and appending the example data:

# create new object
s <- STAVE_object$new()

# append example data
s$append_data(studies_dataframe = example_input$studies,
              surveys_dataframe = example_input$surveys,
              counts_dataframe = example_input$counts)
#> data correctly appended

Before calculating prevalence, it is often useful to inspect the set of variants encoded in the object:

s$get_variants()
#> [1] "crt:76:T"  "k13:469:F" "k13:469:Y" "k13:675:V" "mdr1:86:Y"

By default, get_variants() lists single-locus variants. To list all multi-locus haplotypes instead, set report_haplo = TRUE:

s$get_variants(report_haplo = TRUE)
#> [1] "crt:76:T"  "k13:469:F" "k13:469:Y" "k13:675:V" "mdr1:86:Y"

In this example the output is unchanged because no multi-locus haplotypes are loaded.
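The variant strings above can also be summarised programmatically. The following is a minimal sketch in base R, assuming each single-locus variant follows a gene:position:amino acid layout separated by colons (an assumption read off the example output, not a documented guarantee). The names variant_table and variants_per_gene are illustrative only.

```r
# Variant strings copied from the get_variants() output above
variants <- c("crt:76:T", "k13:469:F", "k13:469:Y", "k13:675:V", "mdr1:86:Y")

# Split each string on ":" (ASSUMED layout: gene, position, amino acid)
parts <- do.call(rbind, strsplit(variants, ":", fixed = TRUE))

variant_table <- data.frame(
  gene = parts[, 1],
  position = as.integer(parts[, 2]),
  amino_acid = parts[, 3],
  stringsAsFactors = FALSE
)

# Count how many distinct variants are recorded for each gene
variants_per_gene <- table(variant_table$gene)

print(variant_table)
print(variants_per_gene)
```

A table like this makes it easier to see which genes and positions are available before choosing a target_variant.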

Prevalence at a single locus

To calculate the prevalence of a specific variant, use get_prevalence(). For example, here is the prevalence of the mutation crt:76:T:

s$get_prevalence(target_variant = "crt:76:T")

study_id	study_label	description	access_level	contributors	reference	reference_year	PMID	survey_id	country_name	site_name	latitude	longitude	location_method	location_notes	collection_start	collection_end	collection_day	time_method	time_notes	numerator	denominator	prevalence	prevalence_lower	prevalence_upper
Dama_2017	Reduced ex vivo susceptibility of Plasmodium falciparum after oral artemether-lumefantrine treatment in Mali	NA	public	Dama et al.	https://pubmed.ncbi.nlm.nih.gov/28148267/	2017	28148267	Dama_2017_Bamako_2014	Mali	Koulikoro	12.612900	-8.13560	WWARN lat and long	NA	2014-01-01	2014-12-31	2014-07-02	automated midpoint	NA	130	170	76.47059	69.36751	82.62694
Asua_2019	Changing Molecular Markers of Antimalarial Drug Sensitivity across Uganda	NA	public	Asua et al.	https://pubmed.ncbi.nlm.nih.gov/30559133/	2019	30559133	Asua_2019_Agago_2017	Uganda	Agago	2.984722	33.33055	WWARN lat and long	NA	2017-01-01	2017-12-31	2017-07-02	automated midpoint	NA	0	0	NA	NA	NA
Asua_2019	Changing Molecular Markers of Antimalarial Drug Sensitivity across Uganda	NA	public	Asua et al.	https://pubmed.ncbi.nlm.nih.gov/30559133/	2019	30559133	Asua_2019_Arua_2017	Uganda	Arua	3.030000	30.91000	WWARN lat and long	NA	2017-01-01	2017-12-31	2017-07-02	automated midpoint	NA	0	0	NA	NA	NA
Asua_2019	Changing Molecular Markers of Antimalarial Drug Sensitivity across Uganda	NA	public	Asua et al.	https://pubmed.ncbi.nlm.nih.gov/30559133/	2019	30559133	Asua_2019_Kole_2017	Uganda	Kole	2.428611	32.80111	WWARN lat and long	NA	2017-01-01	2017-12-31	2017-07-02	automated midpoint	NA	0	0	NA	NA	NA
Asua_2019	Changing Molecular Markers of Antimalarial Drug Sensitivity across Uganda	NA	public	Asua et al.	https://pubmed.ncbi.nlm.nih.gov/30559133/	2019	30559133	Asua_2019_Lamwo_2017	Uganda	Lamwo	3.533333	32.80000	WWARN lat and long	NA	2017-01-01	2017-12-31	2017-07-02	automated midpoint	NA	0	0	NA	NA	NA
Asua_2019	Changing Molecular Markers of Antimalarial Drug Sensitivity across Uganda	NA	public	Asua et al.	https://pubmed.ncbi.nlm.nih.gov/30559133/	2019	30559133	Asua_2019_Mubende_2017	Uganda	Mubende	0.557500	31.39500	WWARN lat and long	NA	2017-01-01	2017-12-31	2017-07-02	automated midpoint	NA	0	0	NA	NA	NA

The output is a joined table containing study, survey, and count information, as well as the estimated prevalence and its 95% confidence interval for each survey.
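To make the prevalence columns concrete, the sketch below recomputes the point estimate for the Dama_2017_Bamako_2014 row by hand. It also computes an exact (Clopper-Pearson) binomial interval using base R's binom.test(). This choice of interval is an assumption made for illustration; the method used internally by get_prevalence() is not stated here and may differ.

```r
# Counts taken from the Dama_2017_Bamako_2014 row above
numerator <- 130
denominator <- 170

# Point estimate, expressed as a percentage
prevalence <- 100 * numerator / denominator

# Exact (Clopper-Pearson) 95% interval from base R.
# ASSUMPTION: get_prevalence() may use a different interval method.
ci <- 100 * as.numeric(
  binom.test(numerator, denominator, conf.level = 0.95)$conf.int
)

round(c(prevalence = prevalence, lower = ci[1], upper = ci[2]), 5)
```

Comparing these values with the prevalence, prevalence_lower, and prevalence_upper columns is a quick way to check how the reported interval was derived.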

Note that we have a row for every loaded survey, even when the denominator is zero. To return only surveys with non-zero denominators, use:

s$get_prevalence(target_variant = "crt:76:T", return_full = FALSE)

study_id	study_label	description	access_level	contributors	reference	reference_year	PMID	survey_id	country_name	site_name	latitude	longitude	location_method	location_notes	collection_start	collection_end	collection_day	time_method	time_notes	numerator	denominator	prevalence	prevalence_lower	prevalence_upper
Dama_2017	Reduced ex vivo susceptibility of Plasmodium falciparum after oral artemether-lumefantrine treatment in Mali	NA	public	Dama et al.	https://pubmed.ncbi.nlm.nih.gov/28148267/	2017	28148267	Dama_2017_Bamako_2014	Mali	Koulikoro	12.6129	-8.1356	WWARN lat and long	NA	2014-01-01	2014-12-31	2014-07-02	automated midpoint	NA	130	170	76.47059	69.36751	82.62694
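The effect of return_full = FALSE can be pictured as a row filter on the denominator. The sketch below applies that filter to a toy data frame standing in for the full output. The object names are hypothetical, and only three of the real columns are reproduced.

```r
# Toy stand-in for the full get_prevalence() output; values are copied
# from the rows shown above, and most columns are omitted for brevity
full_output <- data.frame(
  survey_id = c("Dama_2017_Bamako_2014",
                "Asua_2019_Agago_2017",
                "Asua_2019_Arua_2017"),
  numerator = c(130, 0, 0),
  denominator = c(170, 0, 0),
  stringsAsFactors = FALSE
)

# Keep only the surveys with a non-zero denominator
nonzero_output <- full_output[full_output$denominator > 0, ]

print(nonzero_output)
```

The same subsetting can be applied to the full table after the fact, which is useful when both versions are needed.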

Prevalence of a haplotype, and ambiguous matches

Here is another example, this time keeping ambiguous matches by setting keep_ambiguous = TRUE:

s$get_prevalence("crt:76:T", keep_ambiguous = TRUE, prev_from_min = TRUE)

study_id	study_label	description	access_level	contributors	reference	reference_year	PMID	survey_id	country_name	site_name	latitude	longitude	location_method	location_notes	collection_start	collection_end	collection_day	time_method	time_notes	numerator	numerator_min	numerator_max	denominator	prevalence	prevalence_lower	prevalence_upper
Dama_2017	Reduced ex vivo susceptibility of Plasmodium falciparum after oral artemether-lumefantrine treatment in Mali	NA	public	Dama et al.	https://pubmed.ncbi.nlm.nih.gov/28148267/	2017	28148267	Dama_2017_Bamako_2014	Mali	Koulikoro	12.612900	-8.13560	WWARN lat and long	NA	2014-01-01	2014-12-31	2014-07-02	automated midpoint	NA	130	130	130	170	76.47059	69.36751	82.62694
Asua_2019	Changing Molecular Markers of Antimalarial Drug Sensitivity across Uganda	NA	public	Asua et al.	https://pubmed.ncbi.nlm.nih.gov/30559133/	2019	30559133	Asua_2019_Agago_2017	Uganda	Agago	2.984722	33.33055	WWARN lat and long	NA	2017-01-01	2017-12-31	2017-07-02	automated midpoint	NA	0	0	0	0	NA	NA	NA
Asua_2019	Changing Molecular Markers of Antimalarial Drug Sensitivity across Uganda	NA	public	Asua et al.	https://pubmed.ncbi.nlm.nih.gov/30559133/	2019	30559133	Asua_2019_Arua_2017	Uganda	Arua	3.030000	30.91000	WWARN lat and long	NA	2017-01-01	2017-12-31	2017-07-02	automated midpoint	NA	0	0	0	0	NA	NA	NA
Asua_2019	Changing Molecular Markers of Antimalarial Drug Sensitivity across Uganda	NA	public	Asua et al.	https://pubmed.ncbi.nlm.nih.gov/30559133/	2019	30559133	Asua_2019_Kole_2017	Uganda	Kole	2.428611	32.80111	WWARN lat and long	NA	2017-01-01	2017-12-31	2017-07-02	automated midpoint	NA	0	0	0	0	NA	NA	NA
Asua_2019	Changing Molecular Markers of Antimalarial Drug Sensitivity across Uganda	NA	public	Asua et al.	https://pubmed.ncbi.nlm.nih.gov/30559133/	2019	30559133	Asua_2019_Lamwo_2017	Uganda	Lamwo	3.533333	32.80000	WWARN lat and long	NA	2017-01-01	2017-12-31	2017-07-02	automated midpoint	NA	0	0	0	0	NA	NA	NA
Asua_2019	Changing Molecular Markers of Antimalarial Drug Sensitivity across Uganda	NA	public	Asua et al.	https://pubmed.ncbi.nlm.nih.gov/30559133/	2019	30559133	Asua_2019_Mubende_2017	Uganda	Mubende	0.557500	31.39500	WWARN lat and long	NA	2017-01-01	2017-12-31	2017-07-02	automated midpoint	NA	0	0	0	0	NA	NA	NA

The output now includes minimum and maximum numerators (numerator_min and numerator_max). In this example there is no ambiguity because we are calculating prevalence at a single locus, so the two values are equal, but for longer haplotypes they can differ. The prevalence and its 95% CI are calculated from either the minimum or the maximum numerator, as specified by the prev_from_min argument.
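The sketch below illustrates how the choice of numerator changes the estimate when the two bounds differ. The counts are hypothetical (they are not from the example data), and prevalence_pct() is an illustrative helper that only mirrors the role of prev_from_min; it is not part of the package.

```r
# HYPOTHETICAL counts for a single survey (not from the example data)
numerator_min <- 10   # lower bound on the number of matching samples
numerator_max <- 15   # upper bound once ambiguous matches are included
denominator <- 50

# Hypothetical helper mirroring the role of the prev_from_min argument
prevalence_pct <- function(prev_from_min = TRUE) {
  numerator <- if (prev_from_min) numerator_min else numerator_max
  100 * numerator / denominator
}

prevalence_pct(prev_from_min = TRUE)   # conservative estimate
prevalence_pct(prev_from_min = FALSE)  # upper estimate
```

Reporting both values gives a simple bound on how much the ambiguous matches could shift the estimated prevalence.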