AGE is a 3-digit numeric variable.

085: Top code for 85 years or older (1963-1968 and 1997-forward)
090: Top code for 90 years or older (1996 only)
099: Top code for 99 years or older (1969-1995)
997: Unknown-refused
998: Unknown-not ascertained
999: Unknown-don't know
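
The code list above can be applied when reading raw AGE values. The following is a minimal sketch, not part of any IPUMS tooling: `top_code_for_year` and `decode_age` are hypothetical helper names, survey years are assumed to fall in 1963 or later, and the 997-999 codes are treated as missing regardless of year.

```python
# Unknown codes from the list above.
UNKNOWN_CODES = {997: "refused", 998: "not ascertained", 999: "don't know"}


def top_code_for_year(year: int) -> int:
    """Return the AGE top code in effect for a survey year (assumes year >= 1963)."""
    if 1969 <= year <= 1995:
        return 99
    if year == 1996:
        return 90
    return 85  # 1963-1968 and 1997 forward


def decode_age(code: int, year: int) -> dict:
    """Split a raw AGE code into an age value, a top-code flag, and a missing reason."""
    if code in UNKNOWN_CODES:
        return {"age": None, "top_coded": False, "missing": UNKNOWN_CODES[code]}
    # A value at the year's top code means "this age or older", not an exact age.
    return {"age": code, "top_coded": code >= top_code_for_year(year), "missing": None}
```

For example, `decode_age(85, 2005)` flags the value as top coded, while `decode_age(85, 1980)` does not, because the top code in 1980 was 99.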


AGE reports the individual's age, in years since their last birthday. Starting in 2019, "Unknown-refused" and "Unknown-don't know" are allowed responses. Prior to 2019, age is not coded as "unknown" for any persons included in the IPUMS NHIS data. As the public use file codebooks for 1998-2003 state, "Because age is an important variable for instrument check items and in developing the weights, all respondents must have data on age."


Starting in 2019, unknown responses are allowed on AGE. The values for AGE are consistent over time, except for changes in topcodes. The top-coded value was 99 (age 99+) for 1969-1995, 90 (age 90+) for 1996, and 85 (age 85+) for 1963-1968 and 1997 forward.
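
Because the top code varies across years, pooled analyses may want a single common top code. One possible approach, sketched here as an assumption rather than an IPUMS recommendation, is to collapse every valid age to the lowest top code used in any year (85) while leaving the unknown codes untouched. The name `harmonize_age` is hypothetical.

```python
COMMON_TOP_CODE = 85   # lowest top code used in any year
FIRST_UNKNOWN_CODE = 997  # 997, 998, and 999 are unknown codes


def harmonize_age(code: int) -> int:
    """Collapse ages above 85 to 85 so the value means "85 or older" in every year."""
    if code >= FIRST_UNKNOWN_CODE:
        return code  # keep unknown codes as they are
    return min(code, COMMON_TOP_CODE)


# Illustrative values only, not real NHIS records.
raw = [42, 88, 99, 85, 997]
harmonized = [harmonize_age(a) for a in raw]
```

After this recode, a value of 85 should be read as "85 or older" for all years, which discards the extra detail available in 1969-1996.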

While the meaning of AGE did not change, the basic question wording shifted over time.


For 1968-1972, the main source of information on age was responses to the question, "How old was (person) on his last birthday?" If the respondent did not know a person's age, he or she was to estimate it as closely as possible.

For 1973-1995, interviewers asked the person's date of birth [month, day, and year] rather than age. If the exact date could not be obtained, they were to elicit the approximate date or at least the year of birth. Using an Age Verification Chart, they converted the birthdate information into a figure on age at last birthday and confirmed that figure with the respondent.
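
The conversion from birthdate to age at last birthday can be illustrated with a short calculation. This sketch does not reproduce the Age Verification Chart itself; it only shows the arithmetic, and the function name and the choice of reference date (for example, the interview date) are assumptions.

```python
from datetime import date


def age_at_last_birthday(birth: date, reference: date) -> int:
    """Return completed years of age on the reference date."""
    if reference < birth:
        raise ValueError("reference date precedes date of birth")
    years = reference.year - birth.year
    # Subtract one year if this year's birthday has not yet occurred.
    if (reference.month, reference.day) < (birth.month, birth.day):
        years -= 1
    return years
```

A person born on June 15, 1950 would be 39 on June 14, 1990 and 40 one day later.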

In 1997, interviewers asked, "Would you say [person's] years of age?" They also collected information on the person's date of birth and asked such follow-up questions as, "That would make [FULLNAME] [AGE] years old. Is that correct?" to check the accuracy of responses.

For 1998-2003, interviewers asked about age and birthdate simultaneously, via the question, "What is [your/name] age and date of birth? Please give month, day, and year for the date of birth." For 2004 forward, interviewers asked about age and birthdate in two separate questions using wording similar to that used in 1998-2003. As in 1997, checks (e.g., comparing reported age to age indicated by birthdate) and follow-up questions were used in 1998 forward to make certain that age was recorded correctly.

For 1968 forward, if respondents refused to give their own ages or the ages of others, interviewers were to make their own best estimate (e.g., 45-55 years) and to footnote the fact that they had estimated the person's age.

Eliciting information about age via questions about birthdate--rather than simply asking age at last birthday--is intended to limit age heaping (the tendency to round off age to the nearest number ending in 5 or 0). While age heaping is a serious problem in survey data from some less developed countries, demographers generally consider age reporting to be quite accurate in countries like the modern United States. Moreover, the differences in question wording for the AGE variable will have minimal effect for researchers who choose to work with grouped data (e.g., five-year age groups) rather than with single years of age.
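
Grouping into five-year age bands could be done along the following lines. This is a sketch under stated assumptions: `five_year_group` is a hypothetical helper, codes 997-999 are treated as unknown, and the open-ended top group defaults to 85+ so that it is available in every year. The top code passed in should be a multiple of 5 for the bands to align.

```python
def five_year_group(age: int, top_code: int = 85) -> str:
    """Map a single-year AGE value to a five-year band label."""
    if age < 0:
        raise ValueError("age cannot be negative")
    if age >= 997:
        return "unknown"  # 997, 998, 999 are unknown codes
    if age >= top_code:
        return f"{top_code}+"  # open-ended top group
    lower = (age // 5) * 5
    return f"{lower}-{lower + 4}"
```

Under this scheme, ages 45 through 49 all fall in the same "45-49" band, so a response rounded from 47 to 45 would not change the person's group.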

The NHIS questionnaire was substantially redesigned in 2019 to introduce a different data collection structure and new content. For more information on changes in terminology, universes, and data collection methods beginning in 2019, please see the user note.


  • 1963-2018: All persons.
  • 2019-2022: Sample adults age 18+ and sample children age 0-17.


  • 1963-2022