User Note - Condition Records and Condition Variables

Prior to the 1997 NHIS redesign, most information about health conditions was collected in the form of many-to-one "condition records." In the NHIS public use files containing the condition data, each health condition appears as a single record, with one such condition record for every health problem reported for a survey participant (e.g., as the reason for hospitalization, bed days, doctor visits, or disability). In the public use files, conditions themselves are the unit of analysis, and thus the pre-1997 condition information is difficult for researchers to use in concert with other NHIS data that treat the person or household as the unit of analysis.

By contrast, beginning in 1997, information about health conditions is included on person records. As part of the 1997 NHIS redesign, most condition information was collected by asking directly whether the survey participant had a specific health condition and storing that information in a yes/no variable on the person record. Interviewers still asked about the causes of school and work disability and about the health conditions limiting adults' ability to perform activities of daily living. But within the public use files, the specific limiting conditions reported were grouped into broad categories (e.g., "bone/joint/muscle problem") to make person-level variables for 1997 forward.

To facilitate analysis of information stored on the pre-1997 condition records, and to harmonize the older conditions data with the data on conditions available for 1997 forward, IPUMS NHIS has converted the pre-1997 condition data into person-level variables. This user note describes how we constructed person-level condition variables and discusses general issues that affect these variables, such as comparability over time and weighting. Please consult the appropriate variable description(s) for specific information about particular condition variables.

I. NHIS Conditions Data Collection, 1969-1996

For 1969 to 1996, each person included in the NHIS could have 0 to 20 condition records--one record for each health condition the person reported. These records could be generated in response to two types of survey questions, which we classify as either indirect or direct. Below, we describe how condition data were collected prior to 1997, using asthma as an example.

Indirect questions

Direct question

An asthma record could have been generated by responses to any of the above questions. We create person-level condition variables only from condition records that were generated (or that could have been generated) via direct questions. We imposed this restriction primarily to avoid a selection bias (as explained below) and, to a lesser extent, to improve comparability with condition information for 1997 forward.

II. Avoiding Selection Bias

In survey years without a direct question about asthma, persons could only report their asthma via one of the indirect questions. A person who had asthma but who did not see a doctor for, was not limited by, or did not miss work/school because of her asthma would have no occasion to report her asthma. Thus, in years without a direct asthma question, asthma records would be biased toward including people with more serious cases of asthma and/or asthma sufferers with better access to medical care, and would not be representative of the total population of persons with asthma. To avoid misclassifying persons with asthma who were not treated for or limited by their condition, we do not make the asthma variable (ASTHMAYRC) available for years where there is no direct question about asthma.

III. Creating the Person-Level Variable

IPUMS NHIS uses condition records generated by the above means to create a variable that indicates whether each in-universe person has asthma. To create such a variable, we have to 1) identify the universe (i.e., who was asked the question), 2) identify which condition records are asthma records, 3) assign in-universe persons with an asthma record a code of "Yes", and 4) assign in-universe persons without an asthma record a code of "No."

Step 1: Identifying the Universe

In all years, the universe is those persons who were asked the direct survey question about asthma.

For 1969-1977, a single list of direct questions about conditions was asked of all persons each year. Asthma was included on that list in the 1970 survey, so the universe of ASTHMAYRC for that year is "All persons."

For 1978-1996, the universe narrows substantially. During that period, NHIS produced 6 lists of conditions (rather than 1), and gave each list to 1/6th of households. Therefore, the universe for a given condition depends on which condition list included that health problem. Asthma was always included on Condition List 6 (for respiratory conditions), so the universe for ASTHMAYRC for 1978-1996 is "Persons living in the 1/6th of households who received Condition List 6." Not all conditions remained on the same list for the whole period 1978 to 1996.

These universes determine which weight variable should be applied to a given condition variable in a given year. For 1969-1977, the variable PERWEIGHT should be used. For 1978-1996, one of the following weight variables should be used, depending on the list on which a given condition appears: CONDWT1, CONDWT2, CONDWT3, CONDWT4, CONDWT5, CONDWT6.

There are two exceptions. DIABETICYRC should be weighted with DIABWT for 1978-1981, and PARALANYNOWC should be weighted with PARALWT for 1978-1996. These conditions, diabetes and paralysis, require special weights because they were included on 2 condition lists in the same year. As a result, they have a universe that is "Persons living in 1/3rd of households," rather than persons living in 1/6th of households.
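The weighting rules above can be summarized in a short Python sketch. This is an illustration only, not IPUMS code: the function name `condition_weight` and its `condition_list` argument are hypothetical, and the caller is assumed to know (from the variable description) which condition list a variable appeared on in a given year.

```python
from typing import Optional

# Exceptions named in the text: variable -> (first year, last year, weight).
SPECIAL_WEIGHTS = {
    "DIABETICYRC": (1978, 1981, "DIABWT"),
    "PARALANYNOWC": (1978, 1996, "PARALWT"),
}


def condition_weight(variable: str, year: int,
                     condition_list: Optional[int] = None) -> str:
    """Return the name of the weight variable for a pre-1997 condition variable.

    `condition_list` (1-6) is a hypothetical argument; it is only needed
    for 1978-1996, when each condition list went to 1/6th of households.
    """
    if not 1969 <= year <= 1996:
        raise ValueError("this rule covers 1969-1996 only")

    # Conditions asked on two lists in the same year use their own weight.
    special = SPECIAL_WEIGHTS.get(variable)
    if special is not None and special[0] <= year <= special[1]:
        return special[2]

    # One list asked of all persons: use the person weight.
    if year <= 1977:
        return "PERWEIGHT"

    # Six lists, each given to 1/6th of households.
    if condition_list not in range(1, 7):
        raise ValueError("condition_list must be 1-6 for 1978-1996")
    return f"CONDWT{condition_list}"
```

For example, `condition_weight("ASTHMAYRC", 1980, 6)` returns `"CONDWT6"`, since asthma was on Condition List 6 throughout 1978-1996, while `condition_weight("ASTHMAYRC", 1970)` returns `"PERWEIGHT"`.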

Persons who are not in universe for a given condition are given a code of "NIU," meaning "not in universe." The NIU code indicates that the person was not asked a direct question about the condition. Even if a person has an asthma record, if she was not in a household that received Condition List 6, we label her as NIU. This is to avoid the selection bias described above.

Step 2: Identifying Condition Records and Assigning Codes of "Yes" or "No"

The condition records in the public use "condition files" can be linked to the public use "person files" using a linking key. We performed this linkage to determine which condition records belonged to each survey participant. To determine whether a person had a specific health problem (such as asthma), we looped through that individual's condition records, searching for a record with a code value identifying the condition in question.
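The linkage step might look like the following Python sketch, which groups condition records under the person they belong to. The field name `person_id` stands in for the linking key and is an assumption; the actual key in the public use files is not named here, and the record layout (a list of dictionaries) is purely illustrative.

```python
from collections import defaultdict


def group_conditions_by_person(condition_records, key_field="person_id"):
    """Group condition records by linking key.

    `key_field` is a hypothetical name for the linking key shared by the
    condition file and the person file.
    """
    grouped = defaultdict(list)
    for record in condition_records:
        grouped[record[key_field]].append(record)
    return grouped
```

A person with no condition records simply has no entry in the result, which matches the 0-to-20 records per person described earlier.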

In the case of asthma, we look for a code of "24" in the variable DIAGR3 for the years 1970 and 1978-1981. If a person is in-universe for ASTHMAYRC and has a condition record with a code of "24" in the variable DIAGR3, we code that person as "Yes" (i.e., the person has asthma) and give them an IPUMS NHIS code of either "21" or "22" (see the section on composite coding below for a clarification of the difference between these codes).

If a person is in-universe for ASTHMAYRC and does not have a condition record with a code of "24" in the variable DIAGR3 for the relevant years, we code that person as "No" (i.e., the person does not have asthma) and give them an IPUMS NHIS code of "10."
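The coding logic for asthma in the years 1970 and 1978-1981 can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the function name and the record layout (a list of dictionaries with a string-valued `DIAGR3` field) are hypothetical, and the sketch returns the labels "Yes", "No", and "NIU" rather than numeric IPUMS NHIS codes, because the choice between "21" and "22" depends on source-of-information variables not modeled here.

```python
NIU, NO, YES = "NIU", "No", "Yes"


def code_asthma(in_universe, records):
    """Assign a person-level asthma label from that person's condition records.

    in_universe: True if the person was asked the direct asthma question.
    records: the person's condition records (hypothetical layout: dicts
             with a "DIAGR3" key holding the diagnostic recode as a string).
    """
    # Universe is checked first: an asthma record does not count
    # unless the person was asked the direct question.
    if not in_universe:
        return NIU
    if any(record.get("DIAGR3") == "24" for record in records):
        return YES   # IPUMS NHIS code "21" or "22"
    return NO        # IPUMS NHIS code "10"
```

Note that a person outside the universe is labeled NIU even when an asthma record exists, which is the rule that guards against the selection bias described in Section II.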

The variable description for each pre-1997 condition variable reports the names of the NHIS source variables and the specific codes that are used to identify that condition for each year. Condition codes variously come from NCHS's Diagnostic Recodes #1, #3, B, and C, and the International Classification of Diseases (ICD) Revisions 8 and 9.

IV. Composite Coding

For 1969 to 1981, each condition record contains 3 variables (identified by NHIS names QUESTNS1, QUESTNS2, and QUESTNS3 in 1969 and CONQNFS1, CONQNFS2, and CONQNFS3 for 1970 to 1981) that report the source(s) of information (i.e., which of the six questions mentioned above) that generated a given condition record. One condition record could list up to three different survey questions yielding a response that the person had asthma. For example, a person could have reported their asthma by having had a doctor visit, having missed work days, and saying "Yes" to the direct question about having asthma in the past year.

IPUMS staff used these source-of-information variables in supplemental programming to distinguish between condition records generated in response to direct versus indirect survey questions. By focusing only on survey years with a direct question about the condition of interest, we avoid the selection bias mentioned previously, even though we code as "Yes" some condition records that do not list the direct survey question as one of their sources of information.

We distinguish between two types of "Yes" responses (generated by direct versus indirect questions) for 1969 to 1981 only. Starting in 1982, the 3 NHIS variables providing source information are no longer included on the condition records, so only one "Yes" code is used.

Composite codes for "Yes" responses

To facilitate comparison across the sets of years 1969-1981 and 1982-1996, users can combine all responses that begin with a "2" into one generic "Yes" category.
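A minimal Python sketch of this recode is shown below. The function name `is_yes` is hypothetical; the only rule taken from the text is that any code beginning with "2" counts as "Yes."

```python
def is_yes(code):
    """Return True for any composite code whose first digit is 2."""
    return str(code).startswith("2")
```

With this recode, the codes "21" and "22" used for 1969-1981 fall into the same "Yes" category as the single "Yes" code used for 1982-1996, while "10" ("No") does not.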

V. Comparability with 1997 Forward Condition Variables

Researchers should use caution when comparing prevalence estimates derived from pre-1997 (e.g., ASTHMAYRC) and 1997 forward (e.g., ASTHATAKYR) condition variables, as the manner of collecting information differs greatly between the two. As described above, the pre-1997 variables are created by recoding condition records that were generated in response to several different survey questions about health problems underlying disability days, sick days, or contacts with health care practitioners. By contrast, the variables for 1997 forward are based on a direct question about whether the person had a particular condition, with the information stored on the person record in the original public use files. In addition to differences in data collection, there can be differences in reference period (e.g., pre-1997 EMPHYSEMYRC [Had emphysema, past year] vs. 1997 forward EMPHYSEMEV [Ever told had emphysema]) or question wording that limit comparability over time.
