HISPETH identifies and classifies persons of Hispanic/Spanish/Latino origin or ancestry. According to the Field Representative's Manual for 1976-1981, "The term 'national origin or ancestry' refers to the national or cultural group from which the person is descended. A person may report his origin based on the origin of a parent, a grandparent, or some far-removed ancestor." The 1982 Field Representative's Manual noted, "There is no set rule as to how many generations are to be taken into account in determining origin."
According to the 2020 Survey Description, for some variables, including HISPETH, the 2020 responses of sample adults who were part of the 2020 longitudinal sample were overwritten with their 2019 responses "to mitigate disclosure risks associated with differences in response from repeated measures among the same Sample Adults" (33). The sample adults' actual 2020 responses can be accessed through a Research Data Center (RDC). For more information on the 2020 longitudinal sample, please see SALNGPRTFLG.
The procedures for collecting information about Hispanic origin changed over time, with the most striking shift separating the period 1976-1977 from 1978 and later years.
Data Collection Procedures for 1976-1977
For 1976-1977, this information was collected by handing respondents a card and asking, "Which of those groups best describes [your/person's] national origin or ancestry?" Along with 8 Hispanic categories, the card listed non-Hispanic ancestry groups such as "American," "Other European," "Black," and "Asian or Pacific Islander." If the response was "American," interviewers were to stress that origin or ancestry is determined by the country the person's ancestors came from and to re-ask the question to obtain a more specific response. (The response "American" was to be accepted only if the respondent could not be more specific.) If the respondent gave a response that did not fit into one of the specified categories, the interviewer was to record the response verbatim. If the respondent replied, "Latin American," the interviewer was to probe further to determine whether the person was of Mexican, Mexican-American, Chicano, or Central or South American descent. If multiple responses were given, the interviewer was to ask which of the responses "BEST" described the person's national origin or ancestry.
While the general approach was the same for 1976 and 1977, there were some slight differences between the two years. In 1976, if interviewers obtained responses from parents, they were to skip the question for children and transcribe the entries from the parents to the children's records; this directive was not given to interviewers in 1977. The 1976 data separately identify the categories "American Indian," "Another group not listed," "Multiple," "Russian," "Canadian," and "Two Origins, Unknown which is main"; the 1977 data group together these categories plus Unknown/Refused/Not reported under the single label "All Other" (which is recoded as "Origin unknown, refused, or not reported" in the IPUMS NHIS data on HISPETH).
Data Collection Procedures for 1978-1998
Beginning in 1978, the approach shifted to collecting information only on "Main Spanish Origin." Interviewers again handed the respondent a card listing categories and asked, "Are any of those groups [your/person's] national origin or ancestry?" and "Please give me the number of the group." (If the initial question was not understood, the interviewer could rephrase the question as, "Where did [your/person's] ancestors come from?") Categories on the card were now limited to 8 Hispanic categories, plus "No - Not Spanish origin" and "Unknown."
The approach initiated in 1978 continued through 1998.
Data Collection Procedures for 1999-2018
Beginning in 1999, interviewers asked, "Do you consider yourself to be Hispanic or Latino?" From 1999 to 2018, if the answer was affirmative, the interviewer handed the respondent a card listing categories and asked, "Please give me the number of the group that represents your Hispanic origin or ancestry." Starting in 2019, instead of handing the respondent a card, the interviewer asked, "What is your Hispanic or Latino ancestry or origin, such as Mexican, Mexican American, Chicano/a, Central or South American, Puerto Rican, Cuban, Dominican (Republic), or Other Hispanic, Latino/a, or Spanish -- and if you have more than one, tell me all of them."
Justification for asking about Hispanic/Spanish/Latino Ancestry
While the 1976-1977 question on "Main National Origin" superficially resembles a broad inquiry into ancestry, it appears to have been directed more toward gathering information about people of Hispanic origin than about people of other origins.
This is suggested by the greater degree of detail included for Hispanic categories (e.g., 8 as opposed to only 2 categories--"Russian" and "Other European"--for all non-Hispanic European ancestry groups). This is further implied by the fact that, if respondents gave a very specific answer referring to a non-Hispanic category (e.g., "Swedish" or "Vietnamese"), the interviewer simply marked the relevant broad category (e.g., "Other European" or "Asian and Pacific Islander"), rather than recording the response verbatim. A similar strategy of collecting information on Hispanic origin under the guise of collecting information about ancestry generally was employed in other U.S. government surveys; for example, the March Current Population Survey used the same general approach to collect information on Hispanic origin from 1971 through 2002.
The Field Representative's Manual for 1978 stated, "If you are questioned as to why we are asking only about Spanish ancestry, say that we collect information on different cultural groups at different times."
For 1982-1993, the suggested response to questions about why the survey asked only about Spanish ancestry was, "Say that we collect information on certain cultural groups." Beginning in 1994, interviewers were told to respond to such questions by saying "that we collect information on different groups of people and are trying to increase the reliability of the data on Hispanics." Beginning in 2001, the Field Representative's Manual included the following general justification for collecting information on race and national origin: "We collect information on race and national origin for several reasons. The first is to determine whether this household should be included in the sample based on the screening status of this case [for oversampling of Black and Hispanic households]. More is discussed about screening later in this section. The second reason for collecting racial and national origin information is so that data on doctor visits, hospitalizations, and other health variables can be linked to various racial and cultural groups throughout the nation."
Changes in Response Categories
The two responses "Other Latin American" and "Other Spanish" were included among the possible choices beginning in 1978, but the distinction between the two responses was not clarified until the 1990s. Specifically, according to the 1997 Field Representative's Manual, "Other Latin American" encompassed "Argentina, Bolivia, Chile, Honduras, Columbia, Costa Rica, Dominican Republic, Ecuador, El Salvador, Guatemala, Nicaragua, Panama, Paraguay, Peru, Uruguay, Venezuela"; "Other Spanish or Hispanic" encompassed "Balearic Islands, Basque, California, Canary Islands, Catalonia, Hispanic, Iberian (i.e. Spain), Majorcan, Spanish, Spaniard, Spanish-American, Spanish speaking."
Shifts in the recognized categories for Hispanic origin also occurred over time. For example, Mexico, Puerto Rico, and Cuba were the only countries of ancestral origin specifically identified prior to 1999; in that year, the Dominican Republic was also specifically identified. The category "Chicano" was sometimes listed separately, sometimes combined with "Mexican-American," and sometimes not listed at all.
To increase comparability in categories over time, IPUMS NHIS uses a composite coding system for the HISPETH variable. The first digit groups together similar responses that fall into broad categories available across years; the second digit supplies additional detail available only in some years. For example, all responses referring to Mexican origin share a first digit of "2," while codes 21 through 24 distinguish among "Mexican-Mexicano," "Mexicano," "Mexican-American," and "Chicano."
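The two-digit scheme can be illustrated with a minimal Python sketch. It assumes that every HISPETH code is a two-digit integer, so the broad category is simply the tens digit. The function names are hypothetical and are not part of any IPUMS tool, and only the four detailed labels named above are included; consult the codes page for the full list.

```python
# Detailed labels stated in the text for codes 21 through 24.
# Other codes are deliberately omitted rather than guessed.
MEXICAN_DETAIL_LABELS = {
    21: "Mexican-Mexicano",
    22: "Mexicano",
    23: "Mexican-American",
    24: "Chicano",
}


def broad_category(code: int) -> int:
    """Return the first digit of a two-digit HISPETH code.

    Assumption: codes are integers from 10 to 99.
    """
    if not 10 <= code <= 99:
        raise ValueError(f"expected a two-digit code, got {code!r}")
    return code // 10


def is_mexican_origin(code: int) -> bool:
    """True when the code falls in the broad Mexican-origin group (first digit 2)."""
    return broad_category(code) == 2


# Collapse the detailed codes to their shared broad category.
collapsed = {code: broad_category(code) for code in MEXICAN_DETAIL_LABELS}
```

Collapsing to the first digit in this way yields a category that is available across years, at the cost of the detail that the second digit records only in some years.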
The NHIS questionnaire was substantially redesigned in 2019 to introduce a different data collection structure and new content. For more information on changes in terminology, universes, and data collection methods beginning in 2019, please see the user note.
Universe
- 1976-2018: All persons.
- 2019-2021: Sample adults age 18+ and sample children age 0-17.
Weights
- 1976-2018: PERWEIGHT
- 2019-2021: SAMPWEIGHT
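When pooling survey years, the year ranges listed above determine which weight variable applies. The following Python sketch encodes only the ranges given here. The function name `hispeth_weight` is hypothetical, and years outside 1976-2021 raise an error because this list does not cover them.

```python
def hispeth_weight(year: int) -> str:
    """Return the weight variable listed for HISPETH in a given survey year."""
    if 1976 <= year <= 2018:
        return "PERWEIGHT"   # universe: all persons
    if 2019 <= year <= 2021:
        return "SAMPWEIGHT"  # universe: sample adults and sample children
    raise ValueError(f"no weight is listed here for survey year {year}")


# Example: map each year of a pooled extract to its weight variable.
weights_by_year = {year: hispeth_weight(year) for year in (1998, 2018, 2019)}
```

Because the universe also changes in 2019, estimates that span the 2018/2019 boundary describe different populations as well as using different weights.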