Abstract: Data systems that collect information from different sources or over long periods of time can receive multiple reports on the same individual. An important example is public health surveillance systems that monitor conditions with long natural histories. Several state-level systems for surveillance of one such condition, infection with the human immunodeficiency virus (HIV), use codes composed of combinations of non-unique personal characteristics, such as birth date, Soundex (a code based on last name), and sex, as patient identifiers. As a result, these systems cannot distinguish between several different individuals who share an identical code and a single individual erroneously represented several times. We apply results for occupancy models to estimate the potential magnitude of duplicate case counting for AIDS cases reported to the Centers for Disease Control and Prevention with only non-unique partial personal identifiers. We consider occupancy models with both equal and unequal occupancy probabilities. We provide unbiased estimators for the numbers of true duplicates within and between case reporting areas, together with formulas for the variances of these estimators. These results can be applied to evaluate duplicate reporting in other data systems that lack a unique identifier for each individual.
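
To illustrate the kind of occupancy calculation involved, the following minimal sketch uses the classical equal-probability occupancy result: if n distinct individuals are assigned independently and uniformly to N possible codes, the expected number of codes in use is N(1 - (1 - 1/N)^n), so the expected number of records that coincide with an already used code is n minus that quantity. This is not the estimator developed in the paper; the function names and the uniform-assignment assumption are illustrative only.

```python
def expected_occupied_codes(n_people: int, n_codes: int) -> float:
    """Expected number of distinct codes in use when n_people distinct
    individuals are assigned independently and uniformly at random to
    n_codes possible codes (classical equal-probability occupancy model)."""
    if n_codes <= 0:
        raise ValueError("n_codes must be positive")
    if n_people < 0:
        raise ValueError("n_people must be non-negative")
    prob_code_unused = (1.0 - 1.0 / n_codes) ** n_people
    return n_codes * (1.0 - prob_code_unused)


def expected_coincidental_matches(n_people: int, n_codes: int) -> float:
    """Expected number of records that share a code with some other,
    genuinely different individual: people minus occupied codes.
    Under this model these matches arise by chance, not by duplication."""
    return n_people - expected_occupied_codes(n_people, n_codes)


if __name__ == "__main__":
    # Hypothetical sizes chosen only for illustration, not taken from the paper.
    people, codes = 10_000, 1_000_000
    print(expected_coincidental_matches(people, codes))
```

Comparing such an expected count of chance matches with the number of matching codes actually observed gives a rough sense of how many matches might be true duplicates. Real identifier codes are far from uniformly distributed, which is why the unequal-probability case matters.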