The package popim provides tools to evaluate and track the POPulation IMmunity of an age-structured population that results from vaccination. Because vaccination has long-lasting effects, this requires tracking the status of the population through time.

Data structures

To this end the package defines two S3 classes to store data in an appropriate format. Both classes are dataframes, but have additional requirements such as certain columns that must be present, and the format and range of these columns.

S3 class popim_population

The class popim_population is designed to hold information on a population of interest.

The dataframe

The dataframe holds information on the population size and immunity status of the population through time. The population is disaggregated into 1-year age groups, and potentially several spatial regions. Each row refers to a particular (1 year) birth cohort in a given location and year.

The mandatory columns are:

  • region: The population of interest may be spatially disaggregated into separate geographical regions (e.g., countries or subnational administrative regions). This column holds a string that identifies which region each entry refers to.
  • year: The population is tracked through time using annual time steps.
  • age: The age-structure of the population is tracked in annual age groups.
  • cohort: This column is redundant as it equals year - age, and therefore refers to the year in which this cohort was born. It is included for ease of handling.
  • immunity: Gives the proportion of each cohort that is immune due to past vaccination.
  • pop_size: Number of people in any given age group and year. To facilitate using common units such as thousands, this is not limited to integer values. It is, however, up to the user to ensure that such units are used consistently.

column    type       range
region    character
year      integer
age       integer    non-negative
cohort    integer    year - age
immunity  numeric    [0, 1]
pop_size  numeric    non-negative

Class attributes

In addition to the attributes that every dataframe has (names, class, row.names), the objects of the popim_population class also retain the input parameters of the basic constructor popim_population():

  • region: a character vector of all regions in this object,
  • year_min and year_max: the range of years, and
  • age_min and age_max: the age range covered in this population.
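
The layout described above can be illustrated outside R. The following is a minimal sketch in plain Python, not popim's constructor: the helper name make_population(), the initial immunity of 0 and the use of None for unknown population sizes are assumptions made for illustration only.

```python
from itertools import product


def make_population(regions, year_min, year_max, age_min, age_max):
    """Build one row per (region, year, age).

    Hypothetical helper that mirrors the columns listed above;
    it is not part of popim.
    """
    rows = []
    for region, year, age in product(
        regions,
        range(year_min, year_max + 1),
        range(age_min, age_max + 1),
    ):
        rows.append({
            "region": region,
            "year": year,
            "age": age,
            "cohort": year - age,  # birth year, redundant by construction
            "immunity": 0.0,       # proportion immune, in [0, 1] (assumed start value)
            "pop_size": None,      # unknown until population data are supplied
        })
    return rows


pop = make_population(["A", "B"], 2000, 2002, age_min=0, age_max=4)
print(len(pop))  # 2 regions * 3 years * 5 ages = 30 rows
print(pop[0])
```

Each row is one birth cohort in one region and year, so the number of rows is the product of the number of regions, years and age groups.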

Create, modify and validate objects of this class

create: popim_population(), read_popim_pop(), as_popim_pop()

modify: apply_vacc()

validate: is_population() (currently not exported)

S3 class popim_vacc_activities

The class popim_vacc_activities is designed to hold information on vaccination activities that have occurred or are planned in the population of interest. Each row of the dataframe relates to one vaccination activity.

The dataframe

The mandatory columns are:

  • region: The region in which the vaccination activity takes place. While this is a free character field, when applied to a population, this must match a region that occurs in the population.
  • year: year during which the vaccination activity takes place.
  • age_first: youngest age group to be targeted.
  • age_last: oldest age group to be targeted.
  • coverage: proportion of the target population to be immunised.
  • doses: number of doses used in the vaccination activity.
  • targeting: determines how doses are allocated when there is pre-existing immunity in the population, see below for a description of the different targeting methods.

column     type       range
region     character
year       integer
age_first  integer    non-negative
age_last   integer    ≥ age_first
coverage   numeric    [0, 1]
doses      numeric    non-negative
targeting  character  "random", "correlated", "targeted"

Relationship between coverage, doses and target population

If the target population size is known, the information on coverage and doses is redundant (and potentially conflicting), as coverage = doses / target population size. However, the popim_vacc_activities object does not store information on population size; this is held in the popim_population object to which the vaccination activities will be applied. This potential conflict can therefore not be detected in a popim_vacc_activities object in isolation. The validator for this class, validate_vacc_activities(), checks that the required columns exist and are of the correct data type and range (where applicable). It also currently requires that at least one of coverage and doses is non-missing in each row. (This behaviour may not be sensible, as it is inconsistent with the treatment of the other columns.)
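
The kind of per-row check described here can be sketched in plain Python. This is an illustration only, not the code of validate_vacc_activities(): the helper names check_activity() and is_missing() are hypothetical, and the rules are simply those listed in the table above plus the "at least one of coverage and doses" requirement.

```python
import math

TARGETING_OPTIONS = {"random", "correlated", "targeted"}


def is_missing(x):
    """Treat None and NaN as missing (an assumption of this sketch)."""
    return x is None or (isinstance(x, float) and math.isnan(x))


def check_activity(row):
    """Return a list of problems found in one activity row (empty if none)."""
    problems = []
    if not isinstance(row["region"], str):
        problems.append("region must be a character string")
    if row["age_first"] < 0:
        problems.append("age_first must be non-negative")
    if row["age_last"] < row["age_first"]:
        problems.append("age_last must be >= age_first")
    cov, doses = row["coverage"], row["doses"]
    if is_missing(cov) and is_missing(doses):
        problems.append("at least one of coverage and doses must be given")
    if not is_missing(cov) and not 0 <= cov <= 1:
        problems.append("coverage must lie in [0, 1]")
    if not is_missing(doses) and doses < 0:
        problems.append("doses must be non-negative")
    if row["targeting"] not in TARGETING_OPTIONS:
        problems.append("unknown targeting option")
    return problems


ok_row = {"region": "A", "year": 2001, "age_first": 0, "age_last": 4,
          "coverage": 0.5, "doses": None, "targeting": "random"}
bad_row = {"region": "A", "year": 2001, "age_first": 5, "age_last": 2,
           "coverage": None, "doses": None, "targeting": "random"}

print(check_activity(ok_row))   # []
print(check_activity(bad_row))  # two problems: age range, and coverage/doses both missing
```

Note that no check compares coverage with doses, because the row alone carries no population size.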

Class attributes

As this is a subclass of dataframe, it has the same attributes as a dataframe (names, class and row.names). It has no additional attributes.

Create, modify and validate objects of this class

create: popim_vacc_activities(), read_vacc_activities(), as_vacc_activities(), vacc_from_immunity()

modify: complete_vacc_activities()

validate: validate_vacc_activities() (currently not exported)

Primary functionality: applying vaccination activities to a population

Once objects holding data on the population and the vaccination activities have been set up, the primary functionality of popim is to apply the vaccination activities to the population with the function apply_vacc(). This evaluates the population immunity by age and over time that results from those activities.

Assumptions when applying vaccination activities to the population

  • 100% vaccine effectiveness
  • no wastage of doses
  • no waning immunity
  • mortality is independent of immunity status

While the first two are clearly unrealistic assumptions, if there is a constant vaccine effectiveness (<100%), or a constant wastage factor, the results can easily be scaled up by these constant factors to obtain more realistic values for vaccine demand.
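
One possible reading of this scaling is sketched below in plain Python. The interpretation is an assumption, not something popim computes: effectiveness is taken to be a constant probability that an administered dose immunises, and wastage a constant fraction of procured doses that is never administered. The function name adjusted_doses() is hypothetical.

```python
def adjusted_doses(ideal_doses, effectiveness=1.0, wastage=0.0):
    """Doses to procure so that `ideal_doses` people end up immune.

    effectiveness: assumed constant probability that a dose immunises (0 < e <= 1)
    wastage: assumed constant fraction of procured doses that is wasted (0 <= w < 1)
    """
    if not 0 < effectiveness <= 1:
        raise ValueError("effectiveness must lie in (0, 1]")
    if not 0 <= wastage < 1:
        raise ValueError("wastage must lie in [0, 1)")
    return ideal_doses / (effectiveness * (1 - wastage))


# 9000 doses under the idealised assumptions, 90% effectiveness, 20% wastage:
print(adjusted_doses(9000, effectiveness=0.9, wastage=0.2))  # roughly 12500
```

With both factors left at their defaults the function returns the idealised number of doses unchanged.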

For a potentially deadly disease one would expect mortality due to the disease to be strongly correlated with immunity status. However, even the most devastating outbreaks typically affect only a small part of the population, so all-cause mortality is much less affected by disease-specific mortality. This assumption is therefore closer to reality than it might appear at first.

Waning immunity is a potentially important factor to consider for many vaccines, particularly when looking at the long term benefits of vaccination. However, allowing for this is currently not implemented.

Description of the algorithm

The vaccination activities object has columns for coverage, the proportion of the target population that will be vaccinated, and doses, the number of vaccine doses to be administered in the activity. Given a target population size, this is redundant (and potentially conflicting) information. The function apply_vacc() will always use coverage in preference to doses if coverage is non-missing. If coverage is missing, the target population size (based on the population size of the region and age groups targeted) will be calculated, and the corresponding coverage obtained as doses / target_population.

The function apply_vacc() calculates any missing coverage from the target population size without modifying the input popim_vacc_activities object. In contrast, the function complete_vacc_activities() fills in missing coverage or doses information from the other, and checks for inconsistencies in activities that have data on both coverage and doses. This is done with respect to a particular popim_population object, which contains the relevant population sizes.
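
The arithmetic behind this can be sketched in plain Python. This is an illustration under stated assumptions, not the implementation of apply_vacc() or complete_vacc_activities(): the helper names, the dictionary layout of the population sizes and the tolerance used for the consistency check are all hypothetical.

```python
def target_pop_size(pop_size, region, year, age_first, age_last):
    """Sum the population over the targeted ages.

    pop_size maps (region, year, age) -> population size.
    """
    return sum(pop_size[(region, year, age)]
               for age in range(age_first, age_last + 1))


def complete_activity(activity, pop_size, rel_tol=1e-6):
    """Return a copy of the activity with coverage and doses both filled in."""
    target = target_pop_size(pop_size, activity["region"], activity["year"],
                             activity["age_first"], activity["age_last"])
    cov, doses = activity.get("coverage"), activity.get("doses")
    if cov is None and doses is None:
        raise ValueError("need at least one of coverage and doses")
    if cov is None:
        cov = doses / target      # coverage = doses / target population
    elif doses is None:
        doses = cov * target
    elif abs(doses - cov * target) > rel_tol * max(target, 1.0):
        raise ValueError("coverage and doses are inconsistent")
    return {**activity, "coverage": cov, "doses": doses}


# 1000 people in each age group 0..4 of region "A" in 2001
pop_size = {("A", 2001, age): 1000.0 for age in range(0, 5)}
activity = {"region": "A", "year": 2001, "age_first": 0, "age_last": 2,
            "coverage": None, "doses": 1500.0}
completed = complete_activity(activity, pop_size)
print(completed["coverage"])  # 0.5, i.e. 1500 doses for a target population of 3000
```

The input dictionary is left unchanged; only the returned copy carries the completed values.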

Vaccinating a cohort with no previous immunity

When vaccination is applied to a particular birth cohort that has no immunity, the result is simply that the proportion of the cohort receiving the vaccine will be immune from the time of the vaccination onwards, i.e., the immunity for this cohort will be set to equal the coverage of the vaccination activity in question. Due to the annual time step, vaccination is assumed to take effect at the end of the year in which it is implemented, so the immunity is updated from the next year onwards.
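
The timing rule can be made concrete with a small plain-Python sketch. It is not popim code: the function name vaccinate_naive() and the dictionary keyed by (year, age) are assumptions for illustration, and the sketch covers a single region whose targeted cohorts have no previous immunity.

```python
def vaccinate_naive(immunity, year, age_first, age_last, coverage):
    """Apply one activity to cohorts without previous immunity, in place.

    immunity maps (year, age) -> proportion immune for one region.
    The vaccinated cohorts carry the new immunity from year + 1 onwards.
    """
    year_max = max(y for y, _ in immunity)
    age_max = max(a for _, a in immunity)
    for age in range(age_first, age_last + 1):
        cohort = year - age                    # birth year of the targeted group
        for later_year in range(year + 1, year_max + 1):
            later_age = later_year - cohort    # the cohort ages by one year per year
            if later_age > age_max:
                break                          # cohort has left the tracked age range
            immunity[(later_year, later_age)] = coverage


# Years 2000..2004, ages 0..5, nobody immune at the start
immunity = {(y, a): 0.0 for y in range(2000, 2005) for a in range(0, 6)}
vaccinate_naive(immunity, year=2001, age_first=1, age_last=2, coverage=0.6)

print(immunity[(2001, 1)])  # 0.0: no change in the year of vaccination
print(immunity[(2002, 2)])  # 0.6: the cohort aged 1 in 2001 is aged 2 in 2002
```

The immune cells form a diagonal band in the year-by-age grid, because each cohort moves up one age group per year.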

Vaccinating a cohort with previous immunity

When a vaccination activity is applied to a cohort that has some previous immunity, we need to make some assumptions about who will be vaccinated in the new activity. Three different options are implemented, governed by the parameter targeting, one of the input parameters to the function apply_vacc(). This parameter can take the values "targeted", "random" or "correlated".

"targeted" targeting

The most effective way to implement a new vaccination activity is to use "targeted" targeting, meaning that the vaccine is targeted at previously non-immunised individuals. This means that the new immunity after the vaccination activity is simply the sum of the previous immunity and the coverage of the current vaccination activity (capped at 1, which indicates complete immunity of the cohort in question):

$$\rm{immunity}_{\rm new} = \min(\rm{immunity}_{\rm prev} + \rm{coverage}, 1)$$

While this is not a realistic assumption in many settings, it can be useful for instance in multi-year campaigns that target the same region (but possibly proceed sub-region by sub-region) over several years. This multi-year campaign would be coded as separate vaccination activities for each year, using the option targeting = "targeted".

"random" targeting

A reasonable first assumption might be to use targeting = "random", meaning that individuals will receive vaccination irrespective of their previous immunity status. The immunity after the new campaign is applied is calculated according to

$$ \rm{immunity}_{\rm new} = \rm{coverage} + \rm{immunity}_{\rm prev} - \rm{coverage} \times \rm{immunity}_{\rm prev}$$

"correlated" targeting

On the opposite end of the spectrum is the most pessimistic assumption that might apply in situations where there is unequal access to vaccination: under "correlated" targeting, people who have previously been immunised get vaccinated first in the new vaccination activity, and those previously not immune will only receive any vaccine if the new coverage extends further:

$$ \rm{immunity}_{\rm new} = \max(\rm{immunity}_{\rm prev}, \rm{coverage})$$

Note that the order in which vaccination activities are applied to a population does not matter under any of the targeting assumptions.
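
The three update rules and the note on ordering can be checked numerically. The sketch below is plain Python written directly from the formulas above; the function name update_immunity() is hypothetical and the code is not taken from popim. The ordering check applies the same targeting rule to every activity in a sequence.

```python
import math
from functools import reduce
from itertools import permutations


def update_immunity(prev, coverage, targeting):
    """New proportion immune after one activity, following the formulas above."""
    if targeting == "targeted":
        return min(prev + coverage, 1.0)
    if targeting == "random":
        return coverage + prev - coverage * prev
    if targeting == "correlated":
        return max(prev, coverage)
    raise ValueError("unknown targeting option: " + repr(targeting))


# One activity with coverage 0.25 applied to a cohort that is already 50% immune
for targeting in ("targeted", "random", "correlated"):
    print(targeting, update_immunity(0.5, 0.25, targeting))
# targeted 0.75
# random 0.625
# correlated 0.5


def apply_sequence(coverages, targeting, start=0.0):
    """Apply several activities one after another under a single targeting rule."""
    return reduce(lambda prev, cov: update_immunity(prev, cov, targeting),
                  coverages, start)


# Every ordering of the same three activities gives the same final immunity
coverages = [0.25, 0.5, 0.125]
for targeting in ("targeted", "random", "correlated"):
    results = [apply_sequence(order, targeting) for order in permutations(coverages)]
    assert all(math.isclose(r, results[0]) for r in results)
```

For the same previous immunity and coverage, "targeted" gives the highest and "correlated" the lowest resulting immunity, with "random" in between.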

The inverse: inferring vaccination activities from a population

It is not possible to uniquely infer the vaccination activities that have given rise to a particular immunity profile of a population by age and over time: the same profile can be reached with different targeting choices (which will require different amounts of vaccine). Furthermore, if there are cohorts that reached 100% immunity, some individuals may have been vaccinated twice without leaving a trace in the immunity profile. However, the function vacc_from_immunity() does infer the vaccination activities from a population immunity profile as far as possible, returning an object of class popim_vacc_activities.

To this end, in addition to passing the popim_population object of interest, the user needs to specify a targeting option (which will be assumed for all vaccination activities that are identified). The inferred vaccination activities will be the minimal activities possible to give rise to the observed immunity profile given the targeting option supplied.
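
The per-cohort arithmetic that such an inference could rest on is sketched below in plain Python. This is an assumption about the approach, obtained by inverting the three update formulas given earlier; it is not the code of vacc_from_immunity(), and the function name minimal_coverage() is hypothetical.

```python
def minimal_coverage(prev, new, targeting):
    """Smallest coverage that lifts a cohort's immunity from prev to new."""
    if not 0 <= prev <= new <= 1:
        # without waning, a cohort's immunity cannot decrease
        raise ValueError("need 0 <= prev <= new <= 1")
    if new == prev:
        return 0.0                          # no activity needed
    if targeting == "targeted":
        return new - prev                   # inverts min(prev + coverage, 1)
    if targeting == "random":
        return (new - prev) / (1 - prev)    # inverts coverage + prev - coverage * prev
    if targeting == "correlated":
        return new                          # inverts max(prev, coverage)
    raise ValueError("unknown targeting option: " + repr(targeting))


# A cohort whose immunity rises from 50% to 75% between two consecutive years
for targeting in ("targeted", "random", "correlated"):
    print(targeting, minimal_coverage(0.5, 0.75, targeting))
# targeted 0.25
# random 0.5
# correlated 0.75
```

The same rise in immunity therefore implies different amounts of vaccine depending on the targeting option assumed.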

Population summary & visualisation

An object of class popim_population holds a somewhat complex dataset recording population size, age distribution and immunity status through time, and potentially disaggregated into various geographical regions. To facilitate interpretation of this kind of data, there are a couple of functions to aggregate and visualise these objects.


The functions plot_pop_size() and plot_immunity() serve to plot the age-disaggregated population size and immunity through time in a grid where year and age are plotted on the x- and y-axis, respectively. The size of the cohort or proportion of the cohort that is immune is indicated by the colour of the grid cell. If there are several regions, these will be shown in separate panels.

The implementation of these plot functions is based on ggplot2, and therefore the returned plot objects can be further modified using the ggplot2 syntax.

Aggregation over age

While the age structure of a population’s immunity profile is important to understand how immunity is likely to develop through time, for the purposes of understanding how well a population is protected at any point in time, often the overall proportion immune is of greater interest. This can be calculated using the function calc_pop_immunity() which will aggregate the population over age and return a dataframe with the whole population immunity over time and geographical region.
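
A natural way to compute such an overall proportion is a population-weighted average of the cohort immunities. The plain-Python sketch below illustrates this; it assumes that weighting and is not the implementation of calc_pop_immunity(). The function name pop_immunity() and the row layout are hypothetical.

```python
from collections import defaultdict


def pop_immunity(rows):
    """Population-weighted immunity per (region, year).

    rows: iterable of dicts with keys region, year, immunity and pop_size.
    Groups with a total population of zero are not handled in this sketch.
    """
    immune = defaultdict(float)
    total = defaultdict(float)
    for r in rows:
        key = (r["region"], r["year"])
        immune[key] += r["immunity"] * r["pop_size"]  # number of immune people
        total[key] += r["pop_size"]
    return {key: immune[key] / total[key] for key in total}


rows = [
    {"region": "A", "year": 2001, "age": 0, "immunity": 0.0, "pop_size": 100},
    {"region": "A", "year": 2001, "age": 1, "immunity": 0.5, "pop_size": 100},
    {"region": "A", "year": 2001, "age": 2, "immunity": 1.0, "pop_size": 200},
]
print(pop_immunity(rows))  # {('A', 2001): 0.625}
```

In this example 250 of 400 people are immune, so the larger age group dominates the overall value.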