Codes and Frequencies
EARNIMP1 is a variable that includes imputed values to replace missing data for the original variable EARNINGS, a recoded variable reporting total personal earnings in the previous calendar year. The complementary imputation flag variable IMPEARNFLAG1 indicates whether responses in EARNIMP1 were reported or imputed.
Related Variables and Sources of Additional Information
EARNIMP1 is the first of five variables that contain imputed values for personal earnings. It was created as part of a set of variables that provide complete (i.e., without missing values) data on family income.
One of the purposes of NHIS data is to study relationships between income and health and to monitor health and health care for persons at different income levels. However, as the technical documentation on "Multiple Imputation of Family Income and Personal Earnings in the National Health Interview Survey: Methods and Examples" describes, non-response rates are high for questions on total family income in the previous calendar year and personal earnings from employment in the previous calendar year. For more information on the imputation methodology, see EMPSTATIMP1.
Before using the imputed income and earnings variables, researchers are strongly advised to read the NCHS documentation on imputed income.
This documentation includes sources such as the 2018 Imputed Family/Personal Earnings Files. It cautions that each of the five datasets must be merged with other data from the survey to form a single completed dataset. For IPUMS NHIS data users, the imputed income files have already been merged with other data from each survey year, from 1997 through the current year of data, as part of the process of adding these imputed income files and variables to the IHIS database.
The NCHS documentation for the imputed income files directs that analysis of the five versions of each imputed income variable should be done separately, using methods and software that are appropriate for such survey data (for example, SAS-callable SUDAAN or SAS-callable IVEware).
Only then can estimates and standard errors be combined using the combining rules described in the aforementioned document on "Multiple Imputation of Family Income and Personal Earnings in the National Health Interview Survey." The 2018 imputed income file documentation further warns:
Examples of correct data analyses and additional information about the procedures used to create the imputed data are provided in the technical documentation referred to above.
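As a rough illustration of the combining step, the sketch below assumes that the combining rules in the NCHS document are the standard multiple-imputation rules (often called Rubin's rules): the combined estimate is the mean of the five per-version estimates, and the combined variance is the average within-imputation variance plus an inflated between-imputation variance. The function name `combine_mi` and all numeric values are hypothetical. The per-version estimates and standard errors are assumed to come from design-appropriate survey software, as directed above. Consult the NCHS technical documentation for the authoritative rules, including degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def combine_mi(estimates, std_errors):
    """Combine per-version estimates and standard errors.

    Assumes the standard multiple-imputation combining rules;
    check the NCHS documentation before relying on this.
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching estimates and standard errors from at least two versions")
    q_bar = mean(estimates)                        # combined point estimate
    u_bar = mean([se ** 2 for se in std_errors])   # average within-imputation variance
    b = variance(estimates)                        # between-imputation variance (divisor m - 1)
    total_var = u_bar + (1 + 1 / m) * b            # total variance of the combined estimate
    return q_bar, sqrt(total_var)


# Hypothetical estimates of one proportion, one per imputed version
# (for example, from EARNIMP1 and the other four EARNIMP variables).
estimates = [0.30, 0.32, 0.31, 0.29, 0.33]
std_errors = [0.010, 0.011, 0.010, 0.012, 0.011]

combined_estimate, combined_se = combine_mi(estimates, std_errors)
print(f"combined estimate = {combined_estimate:.4f}, combined SE = {combined_se:.4f}")
```

The combined standard error is larger than any single per-version standard error here because it also reflects the variation between imputations, which is the reason the five versions should not be analyzed as if one of them were fully observed data.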
The comparability over time of EARNIMP1 (and of all the EARNIMP variables) is somewhat limited by changes in the recoded categories.
From 1997 to 2006, earnings were recoded into 11 brackets, with a top code of $75,000 and over. From 2007 forward, there were 21 brackets, with a top code of $100,000 and over.
To maximize comparability across years despite these changes, the IHIS variable EARNIMP1 employs composite coding, in which the first digit identifies broad groups that are consistent across years and the second digit provides additional detail present only in some years.
Consider, for example, the grouped income categories covering the range from $25,000 to $34,999. For 1997 to 2006, the original NHIS public use files for EARNIMP1 provide a single category for the entire income range $25,000-$34,999 (which has code 10 in the IHIS database). For 2007 forward, in the NHIS public use files, two separate categories cover the ranges $25,000-$29,999 (with IHIS code 11) and $30,000-$34,999 (with IHIS code 12). Under the composite coding system, these income categories share a common first digit of 1, indicating that researchers may wish to combine these categories to achieve comparability for 1997 forward. Researchers interested only in data for 2007 forward can take advantage of the full detail by distinguishing between IHIS category 11 (for $25,000-$29,999) and IHIS category 12 (for $30,000-$34,999).
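The collapsing step described above can be sketched as follows. This is a minimal illustration that assumes the codes are stored as two-digit integers. Only codes 10, 11, and 12 come from the text; the helper name `broad_group` is hypothetical. Whether any special codes (such as a not-in-universe code) follow the same first-digit pattern should be checked against the full codes list.

```python
def broad_group(code: int) -> int:
    """Return the first digit of a two-digit composite code.

    Codes that share a first digit belong to the same broad
    category, which is consistent across survey years.
    """
    if not 0 <= code <= 99:
        raise ValueError(f"expected a two-digit code, got {code}")
    return code // 10


# Codes from the example above:
# 10 = $25,000-$34,999 (1997-2006)
# 11 = $25,000-$29,999 (2007 forward)
# 12 = $30,000-$34,999 (2007 forward)
for code in (10, 11, 12):
    print(code, "->", broad_group(code))
```

All three codes map to broad group 1, so an analysis pooling years before and after 2007 can group on the first digit, while an analysis restricted to 2007 forward can keep the full two-digit detail.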
Researchers may also want to take note of the small changes in what counted as "earnings" for the original variable EARNINGS. For example, from 1997 to 2000 and 2003 forward, unemployment or worker's compensation was considered part of earnings, whereas in 2001 and 2002, it was not.
- 1997-2000: Persons age 18+ who worked at a job or business last week, or who did not work last week but worked for pay in the last year.
- 2001-2018: Persons age 18+ who worked for pay last year, or whose employment status is imputed as employed for pay.
- 1997-2018: PERWEIGHT
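As a rough sketch of how the person weight enters a point estimate, the example below computes a weighted share of persons in one broad earnings group for a single imputed version. The record layout (a list of dictionaries), the helper name `weighted_share`, the weight values, and code 20 are all hypothetical. The sketch yields a point estimate only. Standard errors require software that accounts for the survey design, and results from the five imputed versions must still be combined as the NCHS documentation describes.

```python
def weighted_share(records, in_category):
    """Weighted share of persons for whom in_category(record) is true."""
    total = sum(r["PERWEIGHT"] for r in records)
    if total <= 0:
        raise ValueError("total weight must be positive")
    matched = sum(r["PERWEIGHT"] for r in records if in_category(r))
    return matched / total


# Hypothetical pooled records; weights and code 20 are made up for illustration.
records = [
    {"EARNIMP1": 10, "PERWEIGHT": 1500.0},
    {"EARNIMP1": 11, "PERWEIGHT": 2500.0},
    {"EARNIMP1": 12, "PERWEIGHT": 1000.0},
    {"EARNIMP1": 20, "PERWEIGHT": 5000.0},
]

# Share of weighted persons whose code has first digit 1 ($25,000-$34,999).
share = weighted_share(records, lambda r: r["EARNIMP1"] // 10 == 1)
print(f"weighted share = {share:.2f}")
```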