Codes and Frequencies
IMPOVFLAG4 is a generated variable that indicates whether the value for the variable POVIMP4 was directly reported (IPUMS NHIS code 0) or was imputed from various levels of income information (IPUMS NHIS codes 10 to 22). POVIMP4 reports the ratio of family income to the poverty threshold and replaces missing data with imputed values; the original poverty threshold variable, POVERTY, includes a substantial number of cases with "unknown" responses.
Related Variables and Sources of Additional Information
The complementary variable POVIMP4 was created as part of a set of variables that provide complete (i.e., without missing values) data on family income; the accompanying flag variables allow researchers to identify in which cases imputation was used.
For more details on the purpose and methodology of imputation used in the NHIS, see EMPSTATIMP1.
Before using the imputed income and earnings variables, researchers are strongly advised to read the National Center for Health Statistics documentation on imputed income, such as 2008 Imputed Family/Personal Earnings Files and "Multiple Imputation of Family Income and Personal Earnings in the National Health Interview Survey: Methods and Examples".
The codes and categories for IMPOVFLAG4 in the original NHIS public use files differ between 1997-2006 and 2007-2008. These differences stem from a change in how interviewers probed for family income information, as explained below.
From 1997 to 2006, respondents were asked an open-ended question about their total family income. Those who refused to answer or responded "I don't know" were asked two follow-up questions: 1) whether the figure was above or below $20,000; and 2) which of 44 income categories their family income fell into.
The categories for the IMPOVFLAG4 variable in the original NHIS public use files reflect this system of collecting information.
Specifically, the original values distinguished between: a) cases that were not imputed, but were instead based on reported income; b) cases that were imputed, in which respondents supplied no income information; c) cases that were imputed with "2-category income reported" (i.e., income reported to be above or below $20,000); and d) cases that were imputed with "44-category income reported." (A small number of cases were also categorized as "indefinable," for instances in which all co-resident family members were under age 18.)
Because response rates for the follow-up questions used in 1997-2006 were low, a new series of questions using an unfolding bracket methodology was introduced in 2007. This method asked a series of closed-ended income range questions (e.g., "is it less than $50,000?"). The questions were constructed so that each successive question established a smaller range for the amount of the family's income. As a result, the income information available as a basis for imputation differs somewhat in its level of detail from that available in earlier years.
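The narrowing logic of an unfolding bracket sequence can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the threshold values, the function name unfold_brackets, and the midpoint-first question order are hypothetical and do not reproduce the actual NHIS questionnaire.

```python
from typing import Callable, Optional, Sequence, Tuple


def unfold_brackets(
    thresholds: Sequence[int],
    is_less_than: Callable[[int], bool],
) -> Tuple[Optional[int], Optional[int]]:
    """Narrow an income range with successive "is it less than X?" questions.

    thresholds must be sorted in ascending order (hypothetical values).
    is_less_than(x) stands in for the respondent's yes/no answer.
    Returns (lower, upper); None means that side of the range is open.
    """
    lo, hi = 0, len(thresholds)
    lower: Optional[int] = None
    upper: Optional[int] = None
    while lo < hi:
        mid = (lo + hi) // 2
        if is_less_than(thresholds[mid]):
            # "Yes": income is below this threshold, so tighten the upper bound.
            upper = thresholds[mid]
            hi = mid
        else:
            # "No": income is at or above this threshold, so raise the lower bound.
            lower = thresholds[mid]
            lo = mid + 1
    return lower, upper


# Hypothetical thresholds, chosen only for illustration.
EXAMPLE_THRESHOLDS = [35_000, 50_000, 75_000, 100_000]

# A respondent whose (unreported) family income is $60,000.
bracket = unfold_brackets(EXAMPLE_THRESHOLDS, lambda x: 60_000 < x)
```

Each answer shrinks the remaining range, so a respondent who declines to give an exact figure can still be placed in a bracket that informs the imputation.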
The system implemented in 2007 also produced fewer categories in the imputation flag for poverty ratios.
Beginning in 2007, the source variables for IMPOVFLAG4 in the original public use files distinguished between: a) cases that were not imputed, but were instead based on reported income; b) cases that were imputed, in which respondents supplied no income information; and c) cases that were imputed with income "reported in categories." (Again, a small number of cases were also categorized as "indefinable," for families in which all co-resident members were under age 18.)
IHIS uses composite coding to maximize comparability for IMPOVFLAG4 without losing detail.
The first digit in composite coding covers broad categories available across all years, while trailing digits preserve detail present in only some years. Specifically, in IHIS, cases in IMPOVFLAG4 that are not imputed are coded 0; cases that are imputed with no income information begin with a 1; cases that are imputed with some income information share a common first digit of 2; and "indefinable" cases begin with a 9. The differences between the various kinds of imputation with some income information are indicated in codes 20 (imputed, some income reported in 2007-2008), 21 (imputed, 2-category income reported in 1997-2006), and 22 (imputed, 44-category income reported in 1997-2006).
Composite coding provides a mechanical solution to the change in imputation methods and codes in IMPOVFLAG4. Across all years, researchers can combine cases sharing a common first digit to produce a standard 4-category system of "not imputed," "imputed with no income information," "imputed with some income information," and "indefinable." Even then, the underlying substantive difference in collecting income information persists; this in turn somewhat limits the comparability of IMPOVFLAG4 and the complementary variable POVIMP4.
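Collapsing by first digit, as described above, might look like the following Python sketch. The function name and label dictionary are hypothetical, and the specific code values 10 and 90 used in the example are assumptions: the text states only that those categories begin with 1 and 9, respectively.

```python
# Broad categories keyed by the first digit of the composite code.
BROAD_LABELS = {
    "0": "not imputed",
    "1": "imputed with no income information",
    "2": "imputed with some income information",
    "9": "indefinable",
}


def broad_category(code: int) -> str:
    """Return the 4-category label for an IMPOVFLAG4 composite code."""
    first_digit = str(code)[0]
    if first_digit not in BROAD_LABELS:
        raise ValueError(f"unexpected IMPOVFLAG4 code: {code}")
    return BROAD_LABELS[first_digit]


# Codes 0, 20, 21, and 22 come from the description above;
# 10 and 90 are assumed values for the "1" and "9" categories.
example_codes = [0, 10, 20, 21, 22, 90]
collapsed = [broad_category(c) for c in example_codes]
```

Codes 20, 21, and 22 all fall into the same broad category, which yields consistent categories across 1997-2008 at the cost of the year-specific detail carried by the trailing digit.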
- 1997-2008: All persons.
- 1997-2008: PERWEIGHT