Frequently Asked Questions (FAQ)
What is IPUMS-PMA (IPUMS Performance Monitoring and Accountability 2020)?
What's in the future for IPUMS-PMA?
How does the PMA2020 project differ from the Demographic and Health Surveys?
Why might estimates of some key indicators from PMA2020 data differ from estimates from the DHS?
How do I get access to IPUMS-PMA data?
How do I request access to additional countries after I've already registered?
Where can I send questions?
Where should a new user start?
What are microdata?
What are "integrated variables"?
What are "weights"?
What does "universe" mean in the variable descriptions?
How do I obtain data?
What format are the data in?
What is the best way to use the extract system?
How long does a data extract take?
How does "sample selection" work on the IPUMS-PMA web site?
What does "Add to cart" mean?
Why can't I open the data file?
Is there a preferred statistical package for using IPUMS-PMA?
Can I get the original data?
Using IPUMS-PMA data
Are there tricky aspects of IPUMS-PMA data to be particularly aware of?
What are the major limitations of the data?
Can I find particular individuals in the IPUMS-PMA data?
How can I cite IPUMS-PMA?
Using the variables page
Variables page menu
Variables page details
Using the data extract system
Your data cart
Why are some variables in my data cart preselected?
What is "Type"?
Extract request page
Extract option: Describe your extract
Extract option: Change data format
General information about the project
What is IPUMS-PMA (IPUMS Performance Monitoring and Accountability 2020)?
IPUMS-PMA is a project designed to help researchers conduct comparative analyses of data from the Performance Monitoring and Accountability 2020 (PMA2020) project, funded by the Bill & Melinda Gates Foundation and administered by a research team at Johns Hopkins University. The data come from nationally representative household, female, and service delivery point surveys carried out in 11 Family Planning 2020 (FP2020) pledging countries. The first round of data was collected in 2013. PMA2020 collects data in each participating country at least annually, sometimes with more than one round per year. Enumerators use mobile devices to enter survey responses. High-frequency, low-cost data collection allows PMA2020 to provide researchers with data to monitor key family planning indicators over time. Names and low-level geographic information are not included, to protect the confidentiality of respondents. The IPUMS-PMA variables have been given consistent codes and have been documented to facilitate cross-national and cross-temporal comparisons. This "integration" process is described more fully here.
IPUMS-PMA is not a collection of compiled statistics; it is composed of microdata. Each record is a person, with all characteristics numerically coded. Performance Monitoring and Accountability 2020 fields a household survey for sampled households, and also surveys women aged 15 to 49 years in the household with a separate questionnaire regarding family planning behaviors. PMA2020 also samples family planning service delivery points in the same enumeration areas as the households that are sampled. Enumerators ask qualified respondents at the service delivery point about their provision of family planning products and services. Service delivery point data are not currently available in IPUMS-PMA, though they will be added by the end of the 2018 calendar year.
IPUMS-PMA is similar to the IPUMS-DHS data system, which contains harmonized Demographic and Health Surveys data.
What's in the future for IPUMS-PMA?
IPUMS-PMA plans to harmonize and release service delivery point data by the end of the 2018 calendar year. As new household and female datasets are produced by the PMA2020 team at Johns Hopkins University, IPUMS-PMA will harmonize and release the new samples as quickly as possible.
In 2017, the PMA2020 team modified the survey design. Any comparability issues caused by the adjustment are documented on a variable-specific basis.
How does the PMA2020 project differ from the Demographic and Health Surveys?
While the survey design of PMA2020 is purposely similar to the Demographic and Health Surveys (DHS), there are several differences. PMA2020 began collecting data in 2013, and returns to each participating country at least annually. The DHS began fielding surveys in the 1980s, and typically several years pass between surveys in the same country. There is topical overlap, but PMA2020's focus is on family planning, water, and hygiene, whereas the focus of DHS is maternal and child health. PMA2020, which is funded by the Gates Foundation, employs mobile devices and local enumerators to reduce costs and improve efficiency. DHS collects data on births and children in addition to women of childbearing age (15 to 49), and also collects limited information on household members. PMA2020 collects information from women of childbearing age and limited information on household members.
DHS has collected service delivery point (SDP) data for select samples, but availability is very limited. PMA2020 collects SDP data for almost every year and country in which female and household data are collected.
Why might estimates of some key indicators from PMA2020 data differ from estimates from the DHS?
There are several differences between the PMA2020 data and DHS data that may influence key indicator estimates.
There is natural sampling variability that arises from random sampling in a population. DHS surveys are not conducted every year like PMA2020, so a 2010 DHS sample might be compared to a 2014 PMA sample, and the key indicators within a population could change over that interval. Furthermore, some measurement differences between PMA2020 and DHS might exist. For example, the reference period for a survey question may differ between the two survey series, the wording of a question may affect how a respondent answers it in either survey, or the universe of the question may differ. Lastly, for some samples, PMA2020 data are not nationally representative; for example, India 2015 is representative only within the region of Rajasthan.
How do I get access to IPUMS-PMA data?
Access to the documentation is freely available without restriction; however, users must register before extracting data from the website. Users request access to data on a country by country basis, and the research description will be evaluated to ensure that use of the data adheres to our terms and conditions.
How do I request access to additional countries after I've already registered?
Once you are registered with IPUMS-PMA and have been granted access to particular countries, you may request access to additional countries if your research question has expanded or you have a new research question that warrants access to additional countries. You may apply for access to additional countries' data here.
Where can I send questions?
If you need more information about an IPUMS-PMA variable, have problems making or opening a data extract, or have other questions specifically related to IPUMS-PMA, you can ask the Minnesota Population Center's User Support staff, whose contact information is here. If your question relates to the original PMA2020 data collection or to material in the original PMA2020 files that serve as the basis for IPUMS-PMA, you can contact the PMA2020 team here.
Where should a new user start?
First, you should register a free account with IPUMS-PMA in order to gain access to the data.
Next, the natural starting point is the data extract system, which is the primary tool for exploring the contents of the IPUMS-PMA database.
By default, the variables page displays one variable group at a time for all samples in the data series. You can filter the information at any point to include only the samples of interest to you ("Select samples").
After you select samples, the page will display only variables present in those samples. An "x" indicates the availability of a variable for a particular sample.
You can navigate through variables by topic area, alphabetically, or by keyword search. Add a variable to your cart by clicking the plus sign under the "Add to cart" column. Clicking on a variable's name brings up its documentation. The Codes tab is the default, showing the codes, labels, and unweighted frequencies for the variable, and the availability of categories across samples. The Description tab provides a brief statement of the meaning of the variable. Discussion of international and intra-national comparability issues, plus information about comparable variables in IPUMS-PMA, appears on the Comparability tab. The Survey Text tab shows the survey text associated with the variable.
Once you are logged in, the Data Cart in the upper right keeps track of your variable and sample selections. Once you have made some selections, you can click on "View Cart" to review your choices. Further instructions for the extraction system are here.
What are microdata?
Microdata are composed of individual records containing information collected on persons (and sometimes households). The unit of observation is the individual. The responses of each person to the different survey questions are recorded in separate variables.
Microdata stand in contrast to more familiar "summary" or "aggregate" data. Aggregate data are compiled statistics, such as a table of marital status by sex for some locality. IPUMS-PMA does not supply such tabular or summary statistics for analysis.
Microdata are inherently flexible. One need not depend on published statistics that compiled the data in a certain way, if at all. Users can generate their own statistics from the data in any manner desired, including individual-level multivariate analyses.
What are "integrated variables"?
Integration -- or "harmonization" -- is the process of making data from different surveys and countries comparable. Some level of consistency is already present in the original PMA2020 data. However, the response categories within these consistently-named variables in the IR files often differ across samples.
To create an integrated variable for any of these circumstances, we recode the relevant variable from each survey into a unified coding scheme that we design.
Because some samples provide more detail than others, a coding scheme that reduced variables down to the lowest common denominator across all samples would inevitably lose important information. As a result, some IPUMS-PMA integrated variables use composite coding schemes. The first one or two digits of the code provide information available across all samples, with trailing digits providing detail only intermittently available. All meaningful detail in the original surveys is therefore available to researchers if they need it, but they can confine their attention to the less-detailed digits if they wish. One example of this type of coding scheme is the IPUMS-PMA variable FPPROVIDER.
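As a rough illustration of how a composite code can be collapsed to its general category, the sketch below keeps only the leading digit by integer division. The two-digit codes and labels are invented for illustration and are not the actual FPPROVIDER codes; consult the Codes tab for the real values and for how many digits carry the general category.

```python
# Hypothetical two-digit composite codes (NOT the real FPPROVIDER codes):
# the first digit is the general category, the second digit adds detail
# that only some samples provide.
HYPOTHETICAL_LABELS = {
    10: "Public facility, unspecified",
    11: "Public hospital",
    12: "Public health center",
    20: "Private facility, unspecified",
    21: "Private clinic",
}


def general_code(detailed_code, detail_digits=1):
    """Zero out the trailing detail digits of a composite code."""
    factor = 10 ** detail_digits
    return (detailed_code // factor) * factor


detailed = [11, 12, 21, 20]
collapsed = [general_code(code) for code in detailed]
for code, general in zip(detailed, collapsed):
    print(code, "->", general, HYPOTHETICAL_LABELS[general])
```

Collapsing in this way gives categories that are comparable across samples, at the cost of the extra detail.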
The other component of integration is the variable documentation. The documentation aims to highlight important comparability issues that are not self-evident from the coding structure for the variable. The general comparability discussion emphasizes issues for international comparisons and notes changes across survey rounds.
What are "weights"?
To obtain representative statistics from the samples included in IPUMS-PMA, users must make use of weight variables.
1. For variables from the female questionnaire, users should employ the FQWEIGHT variable.
2. For household-level variables such as TOILETTYPE or FLOOR, users should apply the HQWEIGHT weight. However, all records represent individuals. Household-level analyses should be limited to only one person per household using the variable LINENO (limit to cases where LINENO=1). Users should note that households that do not include women of childbearing age are included in IPUMS-PMA.
Users should note that the weights provided with IPUMS-PMA do not provide nationally representative counts, only proportions. The weights adjust the sample of surveyed people to become representative of the national population.
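The guidance above can be sketched in a few lines of Python. This is a minimal illustration with invented records, not output from a real extract: FQWEIGHT, HQWEIGHT, HHID, and LINENO are the variables named above, while USES_FP and TOILET_IMPROVED are hypothetical 0/1 indicators, and representing a missing female weight as None is an assumption made for the example.

```python
def weighted_proportion(records, is_hit, weight_name):
    """Share of the total weight carried by records where is_hit(record) is true."""
    total = sum(r[weight_name] for r in records)
    hits = sum(r[weight_name] for r in records if is_hit(r))
    return hits / total


# Invented person records for two households (values are illustrative only).
people = [
    {"HHID": "A", "LINENO": 1, "HQWEIGHT": 1.5, "FQWEIGHT": None,
     "TOILET_IMPROVED": 1, "USES_FP": None},
    {"HHID": "A", "LINENO": 2, "HQWEIGHT": 1.5, "FQWEIGHT": 2.0,
     "TOILET_IMPROVED": 1, "USES_FP": 1},
    {"HHID": "B", "LINENO": 1, "HQWEIGHT": 0.5, "FQWEIGHT": 1.0,
     "TOILET_IMPROVED": 0, "USES_FP": 0},
    {"HHID": "B", "LINENO": 2, "HQWEIGHT": 0.5, "FQWEIGHT": 1.0,
     "TOILET_IMPROVED": 0, "USES_FP": 0},
]

# Female-questionnaire variable: weight female respondents by FQWEIGHT.
women = [r for r in people if r["FQWEIGHT"] is not None]
fp_share = weighted_proportion(women, lambda r: r["USES_FP"] == 1, "FQWEIGHT")

# Household-level variable: keep one person per household (LINENO == 1),
# then weight by HQWEIGHT.
households = [r for r in people if r["LINENO"] == 1]
toilet_share = weighted_proportion(
    households, lambda r: r["TOILET_IMPROVED"] == 1, "HQWEIGHT"
)

print(fp_share, toilet_share)
```

The results are proportions, consistent with the note that the weights do not yield nationally representative counts. A statistical package's survey procedures would normally be used for variance estimation.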
What does "universe" mean in the variable descriptions?
The universe is the population responding to the question. In IPUMS-PMA, the universe generally includes the women to whom the question was asked, as reflected on the survey questionnaire. For example, only women who have ever had sex are asked their age at first intercourse.
Cases that are outside of the universe for a variable are labeled "NIU (not in universe)" on the codes tab. Differences in a variable's universe across samples are a common data comparability issue.
In some cases, the variable universe specified in IPUMS-PMA will not exactly match the frequencies in the variable. For some variables in the original PMA2020 data, missing values were assigned the same codes as NIU cases, and the IPUMS-PMA team is not able to explain these cases. Users may find that the variables RESULTFQ and RESULTHQ, which describe why a survey was not completed, are helpful to explain missing values that should be in universe.
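In practice, NIU cases usually need to be set aside before computing statistics. The sketch below shows the idea with invented values; the NIU code of 99 is a hypothetical placeholder, so check the Codes tab for the code actually used by each variable.

```python
NIU = 99  # hypothetical "not in universe" code; the real code varies by variable

# Invented ages at first intercourse, with two women outside the universe.
ages_at_first_sex = [17, 19, NIU, 21, NIU, 18]

# Keep only in-universe cases before summarizing.
in_universe = [age for age in ages_at_first_sex if age != NIU]
mean_age = sum(in_universe) / len(in_universe)

print(len(in_universe), mean_age)
```

Leaving the NIU code in the calculation would pull the mean toward 99, which is why the filter comes first.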
How do I obtain data?
All IPUMS-PMA data are delivered through a data extraction system. Once logged in as a registered user, researchers can select the variables and samples they are interested in, and the system creates a custom-made extract containing only this information. The system will pool data from multiple samples into a single data file; in fact, it was primarily designed for this purpose.
Instructions for downloading and reading the data are available here.
Data are generated on a server. The system sends out an email message to the user when the extract is completed. The user must download the extract and analyze it on a local machine. The extract system is accessed through the Data Cart, which becomes clickable once you have selected variables and samples.
What format are the data in?
By default, IPUMS-PMA produces fixed-column ASCII data. Data are rectangular (rather than hierarchical) and mostly numeric (there are a small number of string variables), with one record per person. Only person-records are included in IPUMS-PMA; there are no separate records representing households.
In addition to the ASCII data file option, users may specify that they wish to receive their data as an SPSS, SAS, or Stata file or as a CSV (comma delimited) file.
A codebook file is also created with each extract. It records the characteristics of your extract and should be downloaded for record-keeping.
All data files are created in gzip compressed format. You must uncompress the file to analyze it. Most data compression utilities will handle the files.
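For readers who have not worked with fixed-column files, the sketch below shows how one record is cut into variables by column position. The layout is hypothetical (real column positions come from the codebook file delivered with each extract), AGE is an assumed variable name, and the two sample lines are invented.

```python
# Hypothetical layout: (variable name, start column, end column), 1-based and
# inclusive. Real positions are listed in the codebook for your extract.
LAYOUT = [
    ("SAMPLE", 1, 5),
    ("HHID", 6, 9),
    ("LINENO", 10, 11),
    ("AGE", 12, 13),
]


def parse_line(line, layout=LAYOUT):
    """Slice one fixed-column record into a dict of stripped string values."""
    return {name: line[start - 1:end].strip() for name, start, end in layout}


# Two invented person records, one line per person (rectangular format).
raw_lines = ["12345H001 125", "12345H001 231"]
records = [parse_line(line) for line in raw_lines]

for record in records:
    print(record["HHID"], int(record["LINENO"]), int(record["AGE"]))
```

Requesting the SPSS, SAS, Stata, or CSV format avoids this step, since those files carry the variable boundaries themselves.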
What is the best way to use the extract system?
The data extraction system is a flexible tool. There is no need to download variables or samples you don't expect to use for your current analysis. The system records every extract you make. You can reload and modify an old extract, dropping or adding variables or samples. Go to the "My Data Extracts" page and click on the "Revise" link.
Some IPUMS-PMA variables are preselected for you. They identify the sample (with the SAMPLE variable), in case your extract pools data from multiple sources, as well as the variables used for weighting (FQWEIGHT and HQWEIGHT) to create numbers representative of the sampled population. Other preselected variables, such as HHID and LINENO, uniquely identify records and households. Finally, variables required for variance estimation, such as STRATA, are also preselected for inclusion in your data extract.
How long does a data extract take?
The time needed to make an extract differs depending on the number and size of samples requested and the load on the server. Most extracts will only take a few moments. Sometimes extracts will take longer. The system sends an email when the extract is completed, so there is no need to stay active on the IPUMS-PMA site while the extract is being made.
How does "sample selection" work on the IPUMS-PMA web site?
When a user first enters the variable documentation system, all samples are selected by default. Every variable in the system will display on all relevant screens.
Users can filter the information displayed by selecting only the samples of interest to them. Only the variables available in at least one of the selected samples will appear in the variable lists. The variable descriptions and codes pages will also be filtered to display only the text and columns corresponding to the selected samples. Sample selections can be altered at any time in your session. Selections do not persist beyond the current session.
Users can also choose to filter the cases in an extract. The cases in PMA2020 data are female respondents, household members, female non-respondents, and persons who were not interviewed for the household questionnaire. On the samples selection page, users can choose one of the following groups:
1) Female respondents
2) Female respondents and household members
3) Female respondents and female non-respondents
4) All cases
What does "Add to cart" mean?
While browsing variables in the documentation system, you can select them to include in a data extract, sending them to your data cart (assuming you have been approved to download data from the sample(s) in question). You can deselect a variable by unchecking its box in the data cart. After you proceed to "create data extract," you can return to the variable list to make more selections.
Why can't I open the data file?
The data produced by the extract system are gzipped (the file has a .gz extension). You must use a data compression utility to uncompress the file before you can analyze it.
Detailed instructions for downloading and reading the data are available here.
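If no separate compression utility is at hand, Python's standard gzip module can do the job. The sketch below is one possible approach rather than an official procedure; the file name pma_00001.dat.gz is made up, and the demonstration first writes a tiny stand-in file so that it runs on its own.

```python
import gzip
import os
import shutil
import tempfile


def gunzip(src_path, dest_path):
    """Decompress a .gz file to dest_path, streaming rather than loading it whole."""
    with gzip.open(src_path, "rb") as src, open(dest_path, "wb") as dest:
        shutil.copyfileobj(src, dest)


# Create a small stand-in for a downloaded extract (hypothetical file name).
workdir = tempfile.mkdtemp()
gz_path = os.path.join(workdir, "pma_00001.dat.gz")
with gzip.open(gz_path, "wb") as f:
    f.write(b"12345H001 125\n")

# Strip the ".gz" suffix to name the uncompressed file, then decompress.
dat_path = gz_path[:-3]
gunzip(gz_path, dat_path)

with open(dat_path, "rb") as f:
    restored = f.read()
print(restored)
```

For a real extract, replace the stand-in step with the path to the downloaded file.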
Is there a preferred statistical package for using IPUMS-PMA?
IPUMS-PMA supports SPSS, SAS and Stata. Users may also request a comma-delimited (CSV) file and read the data into Excel for analysis.
Can I get the original data?
The source materials for IPUMS-PMA are the Performance Monitoring and Accountability 2020 (PMA2020) household and female files distributed through PMA2020. Researchers go through a similar process to apply for access to the original files or the IPUMS-PMA version of the data. PMA2020 variables may differ in their coding schemes and mnemonics from the IPUMS-PMA version of the same variable. To apply for access to the original PMA2020 files, go here.
Using IPUMS-PMA data
Are there tricky aspects of IPUMS-PMA data to be particularly aware of?
IPUMS-PMA samples are weighted: each individual does not represent the same number of persons in the population. It is important to use the weight variables when performing analyses with these samples. Use the FQWEIGHT variable for IPUMS-PMA variables on the female questionnaire; use the HQWEIGHT variable with questions asked on the household questionnaire.
For more information on the samples included in IPUMS-PMA, see the Sample Descriptions.
It is important to examine the documentation for the variables you are using. The codes and labels for variable categories do not tell the whole story. In other words, the syntax labels are not enough. There are two things to pay particular attention to. First, the universe for a variable -- the population answering the question -- can differ subtly or markedly across samples. Second, read the variable comparability discussions for the samples you are interested in. Important comparability issues should be mentioned there. If a variable is of particular importance in your research (for example, it is your dependent variable), you are also well served to read the survey text associated with it. This text is linked directly to the variable on the Survey Text tab.
What are the major limitations of the data?
IPUMS-PMA data are composed entirely of individual person records from a survey. There are no macroeconomic, business, or aggregate statistics. We do not deliver the published statistics from the survey; for aggregate statistics, please consult the key indicators briefs on the Source Documents page for the sample of interest.
Some databases of integrated microdata (such as the IPUMS-International census database) include records for all members of the household. IPUMS-PMA currently includes full information only for women of childbearing age, and very limited information (such as age and relationship to the household head) for all household members. Thus, IPUMS-PMA is not well-suited to studying some subpopulations, such as the elderly.
Because the data are public-use, measures have been taken to assure confidentiality. Names and other identifying information are suppressed. Most importantly for many researchers, detailed geographic information is limited.
Can I find particular individuals in the IPUMS-PMA data?
No. A variety of steps have been taken to ensure the confidentiality of the data. Most fundamentally, the samples do not contain names or addresses. The data are only samples, so there is no guarantee any given individual will be in the dataset. Low levels of geography are suppressed.
How can I cite IPUMS-PMA?
Our suggested citation can be found here.
Using the variables page
Variables page menu
Use the left side of the menu to browse variables:
Topic: person variables by group
A-Z: integrated variables by first letter of the IPUMS-PMA variable name
Search: display only variables that contain specified text in particular fields
Use the buttons and links on the right side of the menu to:
Select Samples: limit the display of variable information to selected samples
Display Options: alter how the variable list is displayed or get help for this page
Variables page details
The variables page allows you to browse harmonized variables while limiting and controlling how the information is displayed.
The left side of the menu is for browsing the variables.
When you select samples, you limit the variable list to display only variables that are available in at least one of those samples. The effect of selecting samples also extends to all the variable descriptions and codes pages you can access through the variable system. Only information relevant to your selected samples will be displayed in any context while you browse the variables. You can change your sample selections at any point.
Selecting samples is a good practice when exploring IPUMS-PMA, because the amount of information can be unwieldy. Selecting samples also makes sense if you know you are only interested in a specific country or countries. On the other hand, sometimes you need to see everything to determine what kinds of research are possible using the database.
"Search" lets you specify search terms for specific fields of variable metadata. The system will return a list of variables that include any of the search terms you indicate.
The final choices are "Display options" and "Help." The "Display options" item brings up a screen that offers a number of choices regarding the display of the variable list. Each selection has a default choice.
Use short country codes / Use long country codes
Switch between the 2-letter country abbreviations and longer abbreviations.
View one group / View all groups together
Switch between viewing one variable group at a time and viewing all variable groups on one screen. Unless you have a limited number of samples selected, your browser may be slow to display all groups. The default view is one group at a time.
Show availability detail / Show availability summary
Switch between displaying the full sample-specific availability matrix, and a view that only displays the total number of samples that contain each variable. Both views only display or sum the samples that the user has selected in "Select samples." The default view is the detailed availability information.
View available variables / View all variables
Switch between a view that only displays variables present in at least one of your selected samples, and a view that displays all variables, even if they are not available in your samples of interest. The default view is to only display available variables.
Samples are displayed oldest to newest / Samples . . . newest to oldest
Display the samples columns indicating variable availability in chronological order or reverse chronological order. The default is oldest to newest.
The Variable List
As you browse the variables, they are displayed in a list containing a number of columns. The variable name links to the variable description, which will include detailed comparability discussions, universes, and survey text. The "Type" column on the variables selection pages and in your data cart indicates the record type of the variable. Currently, all records in IPUMS-PMA are person records, so a "P" will always be shown. When the service delivery point data are released, variables from the SDP files will be type "S".
By default, all samples are selected for display. The country abbreviation and sample year identify each sample at the top of every column. Hover over the country code with the mouse to see the full sample name. If a variable is available in a given sample, an "x" is printed in that column.
In the column labeled "Add to cart," each variable has a yellow circle with a "+" on the far left. Click a circle to add that variable to your data cart. Once clicked, the icon changes to a checked box, indicating that the variable is in your data cart. To remove the variable from your data cart, simply click the checkbox.
Using the data extract system
Your data cart
You cannot create data from the extract system unless you are a registered user approved to download a given sample (or samples). If you are not registered, you must apply for access.
At the top right corner of the variables page is a summary of your data cart. This box displays the number of variables and samples you have selected. Clicking the yellow circle next to a variable places it in your data cart. You can view your data cart at any time by clicking "View Cart." The "View Cart" link only becomes operative when you have selected a variable or sample.
The data cart lists the variables preselected by the extract system as well as any variables you selected while browsing the documentation. As with the variable selection page, you can remove variables from your extract in this step by clicking the checkbox next to the variable in the "Add to cart" column. If you chose a variable but subsequently altered your sample selections in such a way that the variable is no longer available, it is indicated by an "i" icon.
The data cart also includes links to codes pages and sample availability for the variables in your cart.
Buttons are provided to return to the variable list to make more selections or to alter your sample choices. If you return to the variable list, click on "View Cart" again to return to the data cart.
When you are satisfied with your data selections, click "Create data extract" to finalize your extract request.
Why are some variables in my data cart preselected?
Certain variables appear in your data cart even if you did not select them, and they are not included in the constantly updated count of variables in your data cart.
Unless you are absolutely certain you will not need one of these variables, we recommend that you not remove them from your data cart.
What is "Type"?
The "Type" column on the variables selection pages and in your data cart indicates the record type of the variable. Currently, all records in IPUMS-PMA are person records, so a "P" will always be shown. When the service delivery point data are released, variables from the SDP files will be type "S".
Extract request page
When you click "Create data extract" in the Data Cart, you come to the Extract Request page. If you wish, you can simply hit the "Submit" button and create your data extract.
The page summarizes your data extract and provides options for modifying it. A link at the top expands to show the samples you selected. Click the appropriate links to go back to the variable browsing and sample selection pages to alter your choices. You return to the extract request page via the data cart, where you can review the availability matrix for selections and easily drop variables by unchecking them.
When you submit an extract, there will be a delay ranging from minutes to hours, depending on the size of the job. You do not need to wait on our site for the job to be completed. The system will send you an email when your extract is ready.
The definition of every extract will remain on the server indefinitely, but the data files are subject to deletion after three days. However, the screen where you download extracts has a feature that lets you revise old extracts. When you click on "revise," all your selections for that extract will be loaded into the system, after which you can edit or regenerate it. Note, however, that each successive data release can create difficulties for recreating old extracts, because codes might change.
Extract option: Describe your extract
You can describe your extract for future reference. The system will display the description on the page where you download your data extract.
Extract option: Change data format
There is a link across from "Data Format" called "Change". Click the link to request your extract in .csv, .sav, .dta, or .sas7bdat instead of the default .dat file.