Codes and Frequencies
EMPSTATIMP1 is a variable that includes imputed values to replace missing data for the original variable EMPSTAT (employment status). While the original EMPSTAT variable provides considerable detail about an individual's labor force status (distinguishing, for example, between those who were working for pay, those who were working without pay, and those who were temporarily absent from a job or business), EMPSTATIMP1 is a simple dichotomous variable distinguishing between adults who were "employed" and adults who were "not employed." The complementary imputation flag variable IMPEMPSFLAG1 indicates whether values in EMPSTATIMP1 were reported or imputed.
In EMPSTATIMP1, for 1997-2000, persons were categorized as "employed" (IPUMS NHIS code 1) under three circumstances: a) they reported working at a job or business last week; or b) they did not report working last week but did report working for pay in the last year; or c) they did not report their labor force status but were imputed the value "employed." Conversely, for 1997-2000, persons were categorized as "not employed" (IPUMS NHIS code 2) under three circumstances: a) they reported that they did not work at a job or business last week; or b) they reported that they did not work for pay in the last year; or c) they did not report their labor force status but were imputed the value "not employed."
For 2001 forward, EMPSTATIMP1 maintains the same dichotomous distinction between "employed" and "not employed," but the system for assigning that status was simplified. Beginning in 2001, persons were categorized as "employed" (IPUMS NHIS code 1) under two circumstances: a) they reported working for pay in the past year; or b) they did not answer the question about working for pay in the past year and were imputed the value "employed." Conversely, for 2001 forward, persons were categorized as "not employed" (IPUMS NHIS code 2) under two circumstances: a) they reported that they did not work for pay in the past year; or b) they did not answer that question but were imputed the value "not employed."
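The assignment rules for the two periods can be sketched in Python as follows. This is only an illustration of the logic described above, not the procedure NCHS or IPUMS actually ran. The function name, its parameters, and the use of None for an unreported answer are all hypothetical. The sketch also assumes that the "employed" conditions are checked before the "not employed" conditions for 1997-2000, which the text does not state explicitly.

```python
from typing import Optional

EMPLOYED = 1       # code 1 in the text
NOT_EMPLOYED = 2   # code 2 in the text


def empstatimp(year: int,
               worked_last_week: Optional[bool],
               worked_for_pay_past_year: Optional[bool],
               imputed_value: Optional[int]) -> int:
    """Return a dichotomous employment code (hypothetical helper).

    None means the respondent did not report that item.
    imputed_value is the code supplied by imputation, used only
    when the reported answers do not settle the status.
    """
    if year < 1997:
        raise ValueError("the variable is described for 1997 forward only")

    if year <= 2000:
        # 1997-2000: last-week work status is considered first.
        if worked_last_week is True:
            return EMPLOYED
        if worked_for_pay_past_year is True:
            return EMPLOYED
        # Assumption: any remaining reported "no" means not employed.
        if worked_last_week is False or worked_for_pay_past_year is False:
            return NOT_EMPLOYED
    else:
        # 2001 forward: only work for pay in the past year is used.
        if worked_for_pay_past_year is True:
            return EMPLOYED
        if worked_for_pay_past_year is False:
            return NOT_EMPLOYED

    # Nothing usable was reported, so fall back to the imputed value.
    if imputed_value not in (EMPLOYED, NOT_EMPLOYED):
        raise ValueError("an imputed value of 1 or 2 is required")
    return imputed_value
```

Note that from 2001 forward the sketch ignores the last-week item entirely, which mirrors the simplification described above.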
Use in Imputed Data on Family Income
EMPSTATIMP1 is the first of five EMPSTATIMP variables that contain imputed values for dichotomous employment status. It was created as part of a set of variables that provide complete (i.e., without missing values) data on family income.
One of the purposes of NHIS data is to study relationships between income and health and to monitor health and health care for persons at different income levels. However, as the technical documentation on "Multiple Imputation of Family Income and Personal Earnings in the National Health Interview Survey: Methods and Examples" describes, non-response rates are high for questions on total family income in the previous calendar year and personal earnings from employment in the previous calendar year.
To obtain estimates of family income and personal earnings for all survey participants, the National Center for Health Statistics (NCHS) created variables with values imputed for missing data for 1997 forward, using multiple-imputation methodology. The NHIS public use files with multiply imputed data consist of five files (and thus five versions of each variable containing imputed values for missing data), one for each set of imputed values, to allow the assessment of variability due to imputation. For each person, each file contains family income values, personal earnings values, dichotomous employment status, and the ratio of family income to the poverty line. Complementary flag variables indicate whether, for each individual, the value of each variable was reported or imputed. For a more detailed description of the imputation process, see IMPEMPSFLAG1.
Before using the imputed income and earnings variables, researchers are strongly advised to read the NCHS documentation on imputed income, such as the documentation for the 2018 Imputed Family/Personal Earnings Files. This documentation cautions that each of the five datasets must be merged with other data from the survey to form a single completed dataset. For IPUMS NHIS data users, the imputed income files have already been merged with other data from each survey year for 1997 through the current year of data, as part of the process of adding these imputed income files and variables to the IPUMS NHIS database.
The NCHS documentation for the imputed income files further directs that analysis of the five versions of each imputed income variable should be done separately, using methods and software that are appropriate for such survey data (for example, SAS-callable SUDAAN or SAS-callable IVEware). Only then can estimates and standard errors be combined using the combining rules described in the aforementioned document on "Multiple Imputation of Family Income and Personal Earnings in the National Health Interview Survey." The 2018 imputed income file documentation further warns:
Examples of correct data analyses and additional information about the procedures used to create the imputed data are provided in the technical documentation referred to above.
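As a rough illustration of the "analyze separately, then combine" workflow, the sketch below combines five point estimates and their standard errors. It assumes that the combining rules in the NCHS technical documentation are the standard multiple-imputation rules usually attributed to Rubin; that documentation should be consulted for the authoritative formulas, including the degrees-of-freedom calculation, which is omitted here. The function name combine_mi is hypothetical, and the per-file estimates are assumed to come from design-appropriate survey software.

```python
from math import sqrt
from statistics import mean, variance


def combine_mi(estimates, std_errors):
    """Combine per-imputation results (hypothetical helper).

    estimates:  one point estimate per imputed dataset
    std_errors: the matching design-based standard errors
    Returns (combined estimate, combined standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching lists with at least two entries")

    combined = mean(estimates)                     # average of the estimates
    within = mean([se ** 2 for se in std_errors])  # average within-imputation variance
    between = variance(estimates)                  # between-imputation variance (n - 1 denominator)
    total = within + (1 + 1 / m) * between         # total variance

    return combined, sqrt(total)
```

With the five versions of an imputed variable, each version would be analyzed on its own and the five resulting pairs of estimate and standard error passed to a function like this one.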
Comparability
EMPSTATIMP1, like all EMPSTATIMP variables, is largely comparable over time. However, as described above, the coding (and thus imputation) of dichotomous employment status changed over time. In sum, dichotomous employment status for 1997-2000 was determined by reported employment status last week, reported work-for-pay status in the past year, or, in the case of missing values, imputation. For 2001 forward, dichotomous employment status was determined by reported work-for-pay status in the past year or, in the case of missing values, imputation.
Small changes in the definition of "work" also occurred from 2001 forward; see EMPSTAT for the full description of these changes.
Universe
- 1997-2018: Persons age 18+.
Weights
- 1997-2017: PERWEIGHT