Hispanic oversample

HISPOVERSAMP reports the Hispanic oversample status for all persons in the 1992 National Health Interview Survey (NHIS). A value of 1 identifies the Hispanic oversample subjects. Values of 2 or 3 indicate those who resided in Hispanic households but who were not part of the oversample. A value of 4 indicates a non-Hispanic household.
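As an illustration of the coding scheme described above, the following minimal Python sketch maps each HISPOVERSAMP value to a descriptive group. The label strings and the helper names (describe_hispoversamp, in_oversample) are hypothetical and simply paraphrase this description; they are not official IPUMS NHIS value labels.

```python
# Hypothetical labels paraphrasing the variable description above;
# these are not the official IPUMS NHIS value labels.
HISPOVERSAMP_GROUPS = {
    1: "Hispanic oversample",
    2: "Hispanic household, not in oversample",
    3: "Hispanic household, not in oversample",
    4: "Non-Hispanic household",
}


def describe_hispoversamp(code):
    """Return a descriptive group for a HISPOVERSAMP code."""
    return HISPOVERSAMP_GROUPS.get(code, "Unknown or not in universe")


def in_oversample(code):
    """Return True only for oversample subjects (code 1)."""
    return code == 1
```

Codes 2 and 3 are deliberately collapsed into one group here, since this description distinguishes them only as Hispanic-household members outside the oversample.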

Description of Hispanic Oversample 

The Hispanic oversample is described on page 2 of the introduction to the NHIS 1992 survey documentation:

At the request of the National Cancer Institute, one of the cosponsors of the 1992 NHIS, the Hispanic population was oversampled for the 1992 data collection year. This was accomplished by recontacting Hispanic respondents who had participated in the 1991 NHIS and asking them to participate in the 1992 survey. The recontacted Hispanic households were assigned for interview in 1992 in the same week they were assigned for interview in the 1991 NHIS.

Analysts need to keep in mind this oversample feature when using the 1992 NHIS Core data sets. The sampling weights provided on the tapes take into account this oversampling. Those in the oversample may have also participated in the 1991 NHIS and analysts are cautioned to consider the impact of this if they combine the 1991 and 1992 data sets. The oversample respondents can be identified in the 1992 data set and potentially eliminated for analysis; however, the weights for the remainder of the 1992 respondents on the file would no longer be correct and the file would need to be reweighted.

IPUMS NHIS has retained the Hispanic oversample, in spite of these cautions. There are 7,712 subjects in the Hispanic oversample, representing 3.3 percent of the weighted population for 1992.

Consequences of Excluding Versus Including Oversample 

Users may choose to exclude the oversample, using the HISPOVERSAMP variable to identify this subgroup. However, as noted above, if members of the oversample are excluded, the sampling weights for the remaining 1992 respondents will be incorrect and the file would need to be reweighted. Conversely, users may choose to keep the full sample, with the knowledge that some subjects may be represented twice when the 1991 and 1992 data are combined. The impact of keeping the oversample will be negligible for population-level analyses, even when combining multiple years of data. However, analyses of the Hispanic subpopulation for the two-year period of 1991-1992 may produce standard error estimates that are too small, because respondents who appear in both years are treated as independent observations.
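The exclusion option can be sketched as follows. This is a minimal illustration, not a recommended procedure: it assumes each record is a dictionary holding HISPOVERSAMP and a person-level weight stored under the key PERWEIGHT (the weight name is an assumption here), and the toy weights are invented. The function drops oversample records and reports the share of total weight removed, which shows why the remaining weights no longer sum to the full population.

```python
OVERSAMPLE_CODE = 1  # per the description above, 1 marks oversample subjects


def exclude_oversample(records, weight_key="PERWEIGHT"):
    """Drop oversample records and report the share of weight removed.

    The weights on the kept records are left unchanged, so they no longer
    sum to the full population total; reweighting would still be required.
    """
    total = sum(r[weight_key] for r in records)
    kept = [r for r in records if r["HISPOVERSAMP"] != OVERSAMPLE_CODE]
    kept_total = sum(r[weight_key] for r in kept)
    removed_share = (total - kept_total) / total if total else 0.0
    return kept, removed_share


# Toy records with made-up weights, for illustration only.
toy = [
    {"HISPOVERSAMP": 1, "PERWEIGHT": 50},
    {"HISPOVERSAMP": 2, "PERWEIGHT": 150},
    {"HISPOVERSAMP": 4, "PERWEIGHT": 800},
]
kept, removed_share = exclude_oversample(toy)
print(len(kept), removed_share)
```

A nonzero removed share indicates how far the remaining weights fall short of the original population total. The sketch does not attempt the reweighting itself.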


This variable is only available in 1992.


Universe

  • 1992: All persons.

Availability

  • 1992