IPUMS PMA Weighting Guide

Most PMA samples were designed to be nationally representative. The survey design also allows for representative estimates for subnational regions, such as urban and rural areas, or regions/provinces. The sample description page provides information about the representative geographic levels for each survey round, and the margin of error associated with each sample and geographic level. That document also identifies survey rounds conducted only within certain regions of a country.

PMA sample design and weights yield representative statistics for households in the general population, because households were randomly selected for interview within enumeration areas. However, only women age 15 to 49 were eligible to answer the female questionnaire, so weighted estimates of the female-level variables are representative of just that target population.

Service delivery point files do not include a service delivery point (SDP) level weight. The SDP files are designed to represent the services and facilities available to women in each enumeration area. See PMA's Service Delivery Point Sampling Memo for more information.

Which weight to use

Each cross-sectional IPUMS PMA survey sample includes normalized weights for the estimation of family planning and sanitation indicators. The appropriate weight to use depends on whether the variable is derived from the household or female questionnaire.

Household questionnaire variables

All samples except Nigeria

All variables based on the household questionnaire should be estimated using the household weight HQWEIGHT (except Nigerian samples - see below). HQWEIGHT consistently controls for non-response.

If you wish to know the proportion of people (i.e., members of sampled households, including the resident women of childbearing age) who have Internet access at home, for example, you should use HQWEIGHT with the variable INTERNET.

However, if you want to know the proportion of households that have Internet access, restrict the sample to one person per household using LINENO (usually the first person in the household, line number = 1). If you do not restrict the sample to one person per household, a five-person household is counted five times as often as a one-person household.
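The difference between the two estimates can be seen in a small sketch. It uses plain Python with invented records: the field names mirror HQWEIGHT, INTERNET, and LINENO, while the HHID household key, the 0/1 coding of INTERNET, and all values are assumptions made for illustration only.

```python
# Invented person records: household 1 has five members with Internet access,
# household 2 has one member without it. HHID is a hypothetical household key.
persons = [
    {"HHID": 1, "LINENO": n, "HQWEIGHT": 0.5, "INTERNET": 1} for n in range(1, 6)
] + [
    {"HHID": 2, "LINENO": 1, "HQWEIGHT": 1.5, "INTERNET": 0},
]


def weighted_proportion(records, var, weight):
    """Weighted share of records where a 0/1 variable equals 1."""
    total = sum(r[weight] for r in records)
    return sum(r[weight] * r[var] for r in records) / total


# Proportion of people living in a household with Internet access.
person_level = weighted_proportion(persons, "INTERNET", "HQWEIGHT")

# Proportion of households with Internet access: keep one record per household.
households = [r for r in persons if r["LINENO"] == 1]
household_level = weighted_proportion(households, "INTERNET", "HQWEIGHT")

print(person_level)     # 0.625
print(household_level)  # 0.25
```

In this toy data the person-level figure is much higher than the household-level figure because the larger household contributes five records instead of one.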

For samples (except Nigeria) that are not nationally representative, HQWEIGHT is used to calculate estimates representative of the surveyed region.

For the Indonesia 2015 Round 1 sample, the weight variables HQWEIGHT_MK and HQWEIGHT_SS were included to produce region-level estimates for Makassar and South Sulawesi, respectively.

Nigeria samples

The Nigeria 2016 Round 3 sample is nationally representative, and the weight HQWEIGHT should be used. However, the Nigeria Round 1 (2014) and Round 2 (2015) surveys did not sample the entire country. For Nigeria 2014 and 2015 samples, the following region-specific household weight variables are provided, rather than a single nationally representative weight. Each of these weight variables has a non-zero value only for households in the corresponding region and is zero for households outside the region. For comparability, region-specific weight variables are also included for nationally representative Nigerian samples.

Region-specific household weights for Nigeria

Female questionnaire variables (cross sectional)

All samples except Nigeria

All variables based on the female questionnaire (for women age 15-49 only) should be estimated using the female weight FQWEIGHT (except Nigerian samples - see below).

For the Indonesia 2015 Round 1 sample, the weight variables FQWEIGHT_MK and FQWEIGHT_SS were included to produce region-level estimates for Makassar and South Sulawesi, respectively.

Nigeria samples

The Nigeria 2016 Round 3 sample is nationally representative, and the weight FQWEIGHT should be used. The Nigeria Round 1 (2014) and Round 2 (2015) surveys did not sample the entire country. For Nigeria 2014 and 2015 samples, region-specific female weight variables are provided because the samples do not include nationally representative weights. Each of these weight variables has a non-zero value for women aged 15 to 49 living in households in the corresponding region, and a value of zero for all other individuals in the sample. For comparability, region-specific weight variables are also included for nationally representative Nigerian samples.
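Because a region-specific weight is zero outside its region, records from other regions add nothing to either the numerator or the denominator of a weighted proportion. The sketch below illustrates this with invented data. FQWEIGHT_REGION_A is a placeholder name rather than an actual IPUMS PMA variable, and REGION and the 0/1 indicator USER are likewise hypothetical.

```python
# Invented female records. FQWEIGHT_REGION_A is a placeholder for a
# region-specific weight: non-zero inside region "A", zero elsewhere.
women = [
    {"REGION": "A", "FQWEIGHT_REGION_A": 0.5, "USER": 1},
    {"REGION": "A", "FQWEIGHT_REGION_A": 1.5, "USER": 0},
    {"REGION": "B", "FQWEIGHT_REGION_A": 0.0, "USER": 1},
    {"REGION": "B", "FQWEIGHT_REGION_A": 0.0, "USER": 1},
]


def weighted_proportion(records, var, weight):
    """Weighted share of records where a 0/1 variable equals 1."""
    total = sum(r[weight] for r in records)
    if total == 0:
        raise ValueError("all weights are zero; no records from this region")
    return sum(r[weight] * r[var] for r in records) / total


# Estimate from the full file: region B records carry zero weight.
full_file = weighted_proportion(women, "USER", "FQWEIGHT_REGION_A")

# Estimate after explicitly keeping only region A records.
region_only = weighted_proportion(
    [w for w in women if w["REGION"] == "A"], "USER", "FQWEIGHT_REGION_A"
)

print(full_file, region_only)  # 0.25 0.25
```

Both calls give the same point estimate, which is the practical consequence of the zero values described above.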

Region-specific female weights for Nigeria

Female questionnaire variables (longitudinal)

Longitudinal weights should be used when analyzing data on eligible female respondents from two or more phases of the longitudinal panel survey. The correct longitudinal weights to use depend on which phases of the longitudinal survey are included in your analyses. This is because PMA longitudinal weights account for respondents entering, exiting and reentering the panel at different phases.

PANEL1_2_WEIGHT should be used when analyzing data only from Phase 1 and Phase 2. PANEL1_3_WEIGHT should be used when analyzing data only from Phase 1 and Phase 3. PANEL2_3_WEIGHT should be used when analyzing data only from Phase 2 and Phase 3. FULLPANELWEIGHT should be used when analyzing data from all three phases of the panel survey.

FULLPANELWEIGHT should be used whenever an analysis draws on all three phases. This is the case, for example, when using a predictor measured at Phase 1 (e.g., women's education) to estimate an outcome measured at Phase 3 (e.g., women's contraceptive use) mediated by a covariate measured at Phase 2 (e.g., women's marital status). It is also the case when using a predictor measured at Phase 1 (e.g., women's decision-making autonomy) to estimate an outcome that joins data from both Phase 2 and Phase 3 (e.g., women's upward economic mobility between Phase 2 and Phase 3). If omitting data from Phase 2 to focus solely on the association between a predictor at Phase 1 and an outcome at Phase 3, then PANEL1_3_WEIGHT should be used. The same logic applies to PANEL1_2_WEIGHT and PANEL2_3_WEIGHT.
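The selection rule above amounts to a lookup from the set of phases used in an analysis to a weight variable. A minimal sketch follows; the function name longitudinal_weight is hypothetical, and only the four weight names come from this guide.

```python
# Map each combination of panel phases to the weight named in this guide.
PANEL_WEIGHTS = {
    frozenset({1, 2}): "PANEL1_2_WEIGHT",
    frozenset({1, 3}): "PANEL1_3_WEIGHT",
    frozenset({2, 3}): "PANEL2_3_WEIGHT",
    frozenset({1, 2, 3}): "FULLPANELWEIGHT",
}


def longitudinal_weight(phases):
    """Return the longitudinal weight name for the phases used in an analysis."""
    key = frozenset(phases)
    if key not in PANEL_WEIGHTS:
        raise ValueError(
            "longitudinal weights apply to two or more of phases 1, 2, and 3; "
            f"got {sorted(key)}"
        )
    return PANEL_WEIGHTS[key]


# Phase 1 predictor, Phase 2 mediator, Phase 3 outcome: all three phases.
print(longitudinal_weight([1, 2, 3]))  # FULLPANELWEIGHT

# Phase 1 predictor and Phase 3 outcome, with Phase 2 omitted.
print(longitudinal_weight([1, 3]))     # PANEL1_3_WEIGHT
```

The key point is that the choice depends on every phase that contributes data, including phases that supply only a covariate.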

All longitudinal weights allow for representative estimates up to the largest area sampled. For example, in the Democratic Republic of the Congo (Kinshasa) sample, FULLPANELWEIGHT produces estimates representative of Kinshasa, whereas for Burkina Faso, FULLPANELWEIGHT produces a nationally representative estimate.

Population Count Expansion Factor

IPUMS PMA has constructed a population adjustment factor (POPWT) to yield population counts. POPWT inflates counts from female questionnaire variables to the national population of reproductive-age women. POPWT contrasts with FQWEIGHT, which produces correctly weighted proportions, but produces counts only for people included in the survey sample. POPWT is only constructed for nationally representative samples. See our memo about the creation of POPWT and how to apply it.
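The contrast between a normalized weight and an expansion factor can be illustrated generically. The sketch below is not the IPUMS PMA procedure, and the memo mentioned above remains the authority on how POPWT is built and applied. It assumes, purely for illustration, that POPWT can be summed directly over records. The 0/1 indicator USER is hypothetical, and every value is invented.

```python
# Invented female records. POPWT values here are made up and simply
# proportional to FQWEIGHT, so both weights give the same proportion.
women = [
    {"FQWEIGHT": 0.5, "POPWT": 1000.0, "USER": 1},
    {"FQWEIGHT": 1.5, "POPWT": 3000.0, "USER": 0},
    {"FQWEIGHT": 1.0, "POPWT": 2000.0, "USER": 1},
]

# Normalized weight: the weighted proportion is meaningful...
proportion = sum(w["FQWEIGHT"] * w["USER"] for w in women) / sum(
    w["FQWEIGHT"] for w in women
)

# ...but the weighted count stays on the scale of the sample (3 records here).
sample_scale_count = sum(w["FQWEIGHT"] * w["USER"] for w in women)

# Expansion factor (assumed usage): the weighted count is on a population scale.
population_count = sum(w["POPWT"] * w["USER"] for w in women)

print(proportion)          # 0.5
print(sample_scale_count)  # 1.5
print(population_count)    # 3000.0
```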

Service Delivery Point Variables

PMA surveyed service delivery points (SDPs), such as hospitals, pharmacies, and clinics, in the same sampling areas as households and females in the same survey round. These SDP data are not meant to be nationally representative. Instead, they are meant to portray the health provision environment of the surveyed households and women. Thus, there are no sampling weights for SDP variables. The files do contain a weight variable for the sampling units, EAWEIGHT, which is a probability weight representing the likelihood of the enumeration area (EA) being selected for sampling. The collectors of the original data do not recommend using EAWEIGHT to weight SDP variables.

The best use of SDP variables is to calculate summary statistics at the EA level and attach them to female records using the EAID variable as a source of contextual information for the woman's service delivery environment. For example, a researcher could calculate the percentage of facilities located in a woman's EA that supply injectables. Note that the variable series EASERVED1 to EASERVED42 reports EAs the facility serves beyond the EA in which it is located.

Only public service delivery points have information about the EAs the facility serves beyond the EA the facility is located in. PMA received this information from the national or local governments.
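The EA-level summary described above can be sketched as a two-step aggregate-and-attach operation. The example uses plain Python with invented records. EAID comes from this guide, while SUPPLIES_INJ (a 0/1 flag for supplying injectables), PERSONID, and the output field PCT_INJ_EA are hypothetical names. For simplicity it counts only facilities located in each EA and ignores the EASERVED1 to EASERVED42 series.

```python
from collections import defaultdict

# Invented SDP records; SUPPLIES_INJ is a hypothetical 0/1 indicator.
facilities = [
    {"EAID": 101, "SUPPLIES_INJ": 1},
    {"EAID": 101, "SUPPLIES_INJ": 0},
    {"EAID": 102, "SUPPLIES_INJ": 1},
]

# Invented female records; PERSONID is a hypothetical identifier.
women = [
    {"PERSONID": "w1", "EAID": 101},
    {"PERSONID": "w2", "EAID": 102},
    {"PERSONID": "w3", "EAID": 103},  # no facility record in this EA
]

# Step 1: unweighted tally per EA -> [facilities supplying, facilities total].
counts = defaultdict(lambda: [0, 0])
for f in facilities:
    counts[f["EAID"]][0] += f["SUPPLIES_INJ"]
    counts[f["EAID"]][1] += 1

pct_by_ea = {ea: 100.0 * hits / n for ea, (hits, n) in counts.items()}

# Step 2: attach the EA-level percentage to each woman via EAID.
# Women in an EA with no facility record get None rather than 0.
for w in women:
    w["PCT_INJ_EA"] = pct_by_ea.get(w["EAID"])

for w in women:
    print(w["PERSONID"], w["PCT_INJ_EA"])
```

The tally is deliberately unweighted, in line with the absence of SDP-level sampling weights noted above.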

Maternal and newborn health weights

All variables in the Maternal and Newborn Health module should be estimated using the weight variable MNHWEIGHT.

Nutrition module weights

Household variables

All variables based on the household questionnaire should be estimated using the household weight HQWEIGHT. HQWEIGHT controls for household non-response.

If you wish to know the proportion of people (i.e., members of sampled households, including the resident women of childbearing age) who have air conditioning at home, for example, you should use HQWEIGHT with the variable AIRCON.

However, if you want to know the proportion of households that have air conditioning, restrict the sample to one person per household using LINENO (usually the first person in the household, line number = 1). If you do not restrict the sample to one person per household, a five-person household is counted five times as often as a one-person household.

Female variables

Variables based on the female-child questionnaire whose universe is eligible women age 10-49 should be estimated using the female weight for the nutrition module, FNQWEIGHT. See our Sample Notes for more details on which women were eligible for the female questionnaire in each survey round.

Children variables

Variables based on the female-child questionnaire whose universe is eligible children under 5 should be estimated using the child weight for the nutrition module, CQWEIGHT. See our Sample Notes for more details on which children were eligible for certain questions in each survey round.

COVID-19 questionnaire variables

All variables based on the COVID-19 survey (for women of childbearing age only) should be estimated using the COVID-19 survey weight CVQWEIGHT. The women surveyed with this questionnaire had previously been interviewed for a family planning core survey, and were followed up by phone. See the PMA memo on constructing weights for more details.