Guidance Regarding Methods for De-identification of Protected Health Information in Accordance with the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule

This page provides guidance about methods and approaches to achieve de-identification in accordance with the Health Insurance Portability and Accountability Act of 1996 (HIPAA) Privacy Rule. The guidance explains and answers questions regarding the two methods that can be used to satisfy the Privacy Rule’s de-identification standard: Expert Determination and Safe Harbor1.  This guidance is intended to assist covered entities in understanding what de-identification is, the general process by which de-identified information is created, and the options available for performing de-identification.

In developing this guidance, the Office for Civil Rights (OCR) solicited input from stakeholders with practical, technical, and policy experience in de-identification.  OCR convened stakeholders at a workshop consisting of multiple panel sessions held March 8-9, 2010, in Washington, DC. Each panel addressed a specific topic related to the Privacy Rule’s de-identification methodologies and policies. The workshop was open to the public, and each panel was followed by a question and answer period.  Read more on the Workshop on the HIPAA Privacy Rule's De-Identification Standard. Read the Full Guidance.

General

Protected Health Information
Covered Entities, Business Associates, and PHI
De-identification and its Rationale
The De-identification Standard
Preparation for De-identification

Guidance on Satisfying the Expert Determination Method

Have expert determinations been applied outside of the health field?
Who is an “expert?”
What is an acceptable level of identification risk for an expert determination?
How long is an expert determination valid for a given data set?
Can an expert derive multiple solutions from the same data set for a recipient?
How do experts assess the risk of identification of information?
What are the approaches by which an expert assesses the risk that health information can be identified?
What are the approaches by which an expert mitigates the risk of identification of an individual in health information?
Can an Expert determine a code derived from PHI is de-identified?
Must a covered entity use a data use agreement when sharing de-identified data to satisfy the Expert Determination Method?

Guidance on Satisfying the Safe Harbor Method

When can ZIP codes be included in de-identified information?
May parts or derivatives of any of the listed identifiers be disclosed consistent with the Safe Harbor Method?
What are examples of dates that are not permitted according to the Safe Harbor Method?
Can dates associated with test measures for a patient be reported in accordance with Safe Harbor?
What constitutes “any other unique identifying number, characteristic, or code” with respect to the Safe Harbor method of the Privacy Rule?
What is “actual knowledge” that the remaining information could be used either alone or in combination with other information to identify an individual who is a subject of the information?
If a covered entity knows of specific studies about methods to re-identify health information or use de-identified health information alone or in combination with other information to identify an individual, does this necessarily mean a covered entity has actual knowledge under the Safe Harbor method?
Must a covered entity suppress all personal names, such as physician names, from health information for it to be designated as de-identified?
Must a covered entity use a data use agreement when sharing de-identified data to satisfy the Safe Harbor Method?
Must a covered entity remove protected health information from free text fields to satisfy the Safe Harbor Method?

Glossary of Terms

Protected Health Information

The HIPAA Privacy Rule protects most “individually identifiable health information” held or transmitted by a covered entity or its business associate, in any form or medium, whether electronic, on paper, or oral. The Privacy Rule calls this information protected health information (PHI)2. Protected health information is information, including demographic information, which relates to:

  • the individual’s past, present, or future physical or mental health or condition,
  • the provision of health care to the individual, or
  • the past, present, or future payment for the provision of health care to the individual, and that identifies the individual or for which there is a reasonable basis to believe can be used to identify the individual. Protected health information includes many common identifiers (e.g., name, address, birth date, Social Security Number) when they can be associated with the health information listed above. 

For example, a medical record, laboratory report, or hospital bill would be PHI because each document would contain a patient’s name and/or other identifying information associated with the health data content.

By contrast, a health plan report that only noted the average age of health plan members was 45 years would not be PHI because that information, although developed by aggregating information from individual plan member records, does not identify any individual plan members and there is no reasonable basis to believe that it could be used to identify an individual.

The relationship with health information is fundamental.  Identifying information alone, such as personal names, residential addresses, or phone numbers, would not necessarily be designated as PHI.  For instance, if such information was reported as part of a publicly accessible data source, such as a phone book, then this information would not be PHI because it is not related to health data (see above).  If such information was listed with health condition, health care provision, or payment data, such as an indication that the individual was treated at a certain clinic, then this information would be PHI.

Back to top

Covered Entities, Business Associates, and PHI

In general, the protections of the Privacy Rule apply to information held by covered entities and their business associates.  HIPAA defines a covered entity as 1) a health care provider that conducts certain standard administrative and financial transactions in electronic form; 2) a health care clearinghouse; or 3) a health plan.3  A business associate is a person or entity (other than a member of the covered entity’s workforce) that performs certain functions or activities on behalf of, or provides certain services to, a covered entity that involve the use or disclosure of protected health information. A covered entity may use a business associate to de-identify PHI on its behalf only to the extent such activity is authorized by their business associate agreement.

See the OCR website for detailed information about the Privacy Rule and how it protects the privacy of health information.

Back to top

De-identification and its Rationale

The increasing adoption of health information technologies in the United States accelerates their potential to facilitate beneficial studies that combine large, complex data sets from multiple sources.  The process of de-identification, by which identifiers are removed from the health information, mitigates privacy risks to individuals and thereby supports the secondary use of data for comparative effectiveness studies, policy assessment, life sciences research, and other endeavors.

The Privacy Rule was designed to protect individually identifiable health information through permitting only certain uses and disclosures of PHI provided by the Rule, or as authorized by the individual subject of the information.  However, in recognition of the potential utility of health information even when it is not individually identifiable, §164.502(d) of the Privacy Rule permits a covered entity or its business associate to create information that is not individually identifiable by following the de-identification standard and implementation specifications in §164.514(a)-(b).  These provisions allow the entity to use and disclose information that neither identifies nor provides a reasonable basis to identify an individual.4 As discussed below, the Privacy Rule provides two de-identification methods: 1) a formal determination by a qualified expert; or 2) the removal of specified individual identifiers as well as absence of actual knowledge by the covered entity that the remaining information could be used alone or in combination with other information to identify the individual.

Both methods, even when properly applied, yield de-identified data that retains some risk of identification.  Although the risk is very small, it is not zero, and there is a possibility that de-identified data could be linked back to the identity of the patient to whom it corresponds.

Regardless of the method by which de-identification is achieved, the Privacy Rule does not restrict the use or disclosure of de-identified health information, as it is no longer considered protected health information.

Back to top

The De-identification Standard

Section 164.514(a) of the HIPAA Privacy Rule provides the standard for de-identification of protected health information.  Under this standard, health information is not individually identifiable if it does not identify an individual and if the covered entity has no reasonable basis to believe it can be used to identify an individual.

§164.514 Other requirements relating to uses and disclosures of protected health information.
(a) Standard: de-identification of protected health information. Health information that does not identify an individual and with respect to which there is no reasonable basis to believe that the information can be used to identify an individual is not individually identifiable health information.

Sections 164.514(b) and (c) of the Privacy Rule contain the implementation specifications that a covered entity must follow to meet the de-identification standard. As summarized in Figure 1, the Privacy Rule provides two methods by which health information can be designated as de-identified.

Figure 1. Two methods to achieve de-identification in accordance with the HIPAA Privacy Rule.

The first is the “Expert Determination” method:

(b) Implementation specifications: requirements for de-identification of protected health information. A covered entity may determine that health information is not individually identifiable health information only if:
(1) A person with appropriate knowledge of and experience with generally accepted statistical and scientific principles and methods for rendering information not individually identifiable:
(i) Applying such principles and methods, determines that the risk is very small that the information could be used, alone or in combination with other reasonably available information, by an anticipated recipient to identify an individual who is a subject of the information; and
(ii) Documents the methods and results of the analysis that justify such determination; or

The second is the “Safe Harbor” method:

(2)(i) The following identifiers of the individual or of relatives, employers, or household members of the individual, are removed:

(A) Names

(B) All geographic subdivisions smaller than a state, including street address, city, county, precinct, ZIP code, and their equivalent geocodes, except for the initial three digits of the ZIP code if, according to the current publicly available data from the Bureau of the Census:
(1) The geographic unit formed by combining all ZIP codes with the same three initial digits contains more than 20,000 people; and
(2) The initial three digits of a ZIP code for all such geographic units containing 20,000 or fewer people is changed to 000

(C) All elements of dates (except year) for dates that are directly related to an individual, including birth date, admission date, discharge date, death date, and all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older

(D) Telephone numbers

(E) Fax numbers

(F) Email addresses

(G) Social security numbers

(H) Medical record numbers

(I) Health plan beneficiary numbers

(J) Account numbers

(K) Certificate/license numbers

(L) Vehicle identifiers and serial numbers, including license plate numbers

(M) Device identifiers and serial numbers

(N) Web Universal Resource Locators (URLs)

(O) Internet Protocol (IP) addresses

(P) Biometric identifiers, including finger and voice prints

(Q) Full-face photographs and any comparable images

(R) Any other unique identifying number, characteristic, or code, except as permitted by paragraph (c) of this section [Paragraph (c) is presented below in the section “Re-identification”]; and

(ii) The covered entity does not have actual knowledge that the information could be used alone or in combination with other information to identify an individual who is a subject of the information.

Satisfying either method would demonstrate that a covered entity has met the standard in §164.514(a) above.  De-identified health information created following these methods is no longer protected by the Privacy Rule because it does not fall within the definition of PHI.  Of course, de-identification leads to information loss, which may limit the usefulness of the resulting health information in certain circumstances. As described in the forthcoming sections, covered entities may wish to select de-identification strategies that minimize such loss.

Re-identification

The implementation specifications further provide direction with respect to re-identification, specifically the assignment of a unique code to the set of de-identified health information to permit re-identification by the covered entity.

If a covered entity or business associate successfully undertook an effort to identify the subject of de-identified information it maintained, the health information now related to a specific individual would again be protected by the Privacy Rule, as it would meet the definition of PHI.  Disclosure of a code or other means of record identification designed to enable coded or otherwise de-identified information to be re-identified is also considered a disclosure of PHI.

(c) Implementation specifications: re-identification. A covered entity may assign a code or other means of record identification to allow information de-identified under this section to be re-identified by the covered entity, provided that:
(1) Derivation. The code or other means of record identification is not derived from or related to information about the individual and is not otherwise capable of being translated so as to identify the individual; and
(2) Security. The covered entity does not use or disclose the code or other means of record identification for any other purpose, and does not disclose the mechanism for re-identification.
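
The derivation and security requirements quoted above can be made concrete with a minimal Python sketch (the record identifiers and field names below are invented for illustration): a random code is attached to each de-identified record, and the mapping back to the original record is retained solely by the covered entity and never disclosed.

import secrets

reidentification_map = {}  # retained by the covered entity; never disclosed

def assign_reidentification_code(internal_record_id):
    # The code is random, so it is not derived from, and cannot be translated
    # back into, information about the individual.
    code = secrets.token_hex(16)
    reidentification_map[code] = internal_record_id
    return code

# Attach codes to de-identified records before disclosure (IDs are invented).
disclosed_records = [
    {"reid_code": assign_reidentification_code(rid), "diagnosis": dx}
    for rid, dx in [("rec-0001", "Diabetes"), ("rec-0002", "Influenza")]
]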

Back to top

Preparation for De-identification

Documentation of which values in health data correspond to PHI, as well as of the systems that manage PHI, is critically important for the de-identification process.  Esoteric notation, such as acronyms whose meanings are known to only a select few employees of a covered entity, and incomplete descriptions may lead those overseeing a de-identification procedure to unnecessarily redact information or to fail to redact when necessary.  When sufficient documentation is provided, it is straightforward to redact the appropriate fields.  See the discussion of free text fields under the Safe Harbor method below for a more complete discussion.

In the following two sections, we address questions regarding the Expert Determination method (Section 2) and the Safe Harbor method (Section 3).

Back to top

Guidance on Satisfying the Expert Determination Method

In §164.514(b), the Expert Determination method for de-identification is defined as follows:

 (1) A person with appropriate knowledge of and experience with generally accepted statistical and scientific principles and methods for rendering information not individually identifiable:
(i) Applying such principles and methods, determines that the risk is very small that the information could be used, alone or in combination with other reasonably available information, by an anticipated recipient to identify an individual who is a subject of the information; and
(ii) Documents the methods and results of the analysis that justify such determination

Back to top

Have expert determinations been applied outside of the health field?

Yes. The notion of expert certification is not unique to the health care field.  Professional scientists and statisticians in various fields routinely determine and accordingly mitigate risk prior to sharing data. The field of statistical disclosure limitation, for instance, has been developed within government statistical agencies, such as the Bureau of the Census, and applied to protect numerous types of data.5

Back to top

Who is an “expert?”

There is no specific professional degree or certification program for designating who is an expert at rendering health information de-identified.  Relevant expertise may be gained through various routes of education and experience. Experts may be found in the statistical, mathematical, or other scientific domains.  From an enforcement perspective, OCR would review the relevant professional experience and academic or other training of the expert used by the covered entity, as well as actual experience of the expert using health information de-identification methodologies.

Back to top

What is an acceptable level of identification risk for an expert determination?

There is no explicit numerical level of identification risk that is deemed to universally meet the “very small” level indicated by the method.  The ability of a recipient of information to identify an individual (i.e., the subject of the information) is dependent on many factors, which an expert will need to take into account while assessing the risk from a data set.  This is because the risk of identification that has been determined for one particular data set in the context of a specific environment may not be appropriate for the same data set in a different environment or a different data set in the same environment.  As a result, an expert will define an acceptable “very small” risk based on the ability of an anticipated recipient to identify an individual.  This issue is addressed in further depth below.

Back to top

How long is an expert determination valid for a given data set?

The Privacy Rule does not explicitly require that an expiration date be attached to the determination that a data set, or the method that generated such a data set, is de-identified information.  However, experts have recognized that technology, social conditions, and the availability of information changes over time.  Consequently, certain de-identification practitioners use the approach of time-limited certifications.  In this sense, the expert will assess the expected change of computational capability, as well as access to various data sources, and then determine an appropriate timeframe within which the health information will be considered reasonably protected from identification of an individual.

Information that had previously been de-identified may still be adequately de-identified when the certification limit has been reached.  The expiration of the certification timeframe does not imply that data which have already been disseminated are no longer sufficiently protected in accordance with the de-identification standard.  Covered entities will need to have an expert examine whether future releases of the data to the same recipient (e.g., monthly reporting) should be subject to additional or different de-identification processes consistent with current conditions to continue to meet the very small risk requirement.

Back to top

Can an expert derive multiple solutions from the same data set for a recipient?

Yes.  Experts may design multiple solutions, each of which is tailored to the covered entity’s expectations regarding information reasonably available to the anticipated recipient of the data set.  In such cases, the expert must take care to ensure that the data sets cannot be combined to compromise the protections set in place through the mitigation strategy. (Of course, the expert must also reduce the risk that the data sets could be combined with prior versions of the de-identified data set or with other publicly available data sets to identify an individual.) For instance, an expert may derive one data set that contains detailed geocodes and generalized age values (e.g., 5-year age ranges) and another data set that contains generalized geocodes (e.g., only the first two digits) and fine-grained age (e.g., days from birth).  The expert may certify a covered entity to share both data sets after determining that the two data sets could not be merged to individually identify a patient.  This certification may be based on a technical proof regarding the inability to merge such data sets.  Alternatively, the expert also could require additional safeguards through a data use agreement.

Back to top

How do experts assess the risk of identification of information?

No single universal solution addresses all privacy and identifiability issues. Rather, a combination of technical and policy procedures is often applied to the de-identification task. OCR does not require a particular process for an expert to use to reach a determination that the risk of identification is very small.  However, the Rule does require that the methods and results of the analysis that justify the determination be documented and made available to OCR upon request. The following information is meant to provide covered entities with a general understanding of the de-identification process applied by an expert.  It does not provide sufficient detail in statistical or scientific methods to serve as a substitute for working with an expert in de-identification.

A general workflow for expert determination is depicted in Figure 2. Stakeholder input suggests that the determination of identification risk can be a process that consists of a series of steps.  First, the expert will evaluate the extent to which the health information can (or cannot) be identified by the anticipated recipients.  Second, the expert often will provide guidance to the covered entity or business associate on which statistical or scientific methods can be applied to the health information to mitigate the anticipated risk.  The expert will then execute such methods as deemed acceptable by the covered entity or business associate data managers, i.e., the officials responsible for the design and operations of the covered entity’s information systems.  Finally, the expert will evaluate the identifiability of the resulting health information to confirm that the risk is no more than very small when disclosed to the anticipated recipients.  Stakeholder input suggests that a process may require several iterations until the expert and data managers agree upon an acceptable solution. Regardless of the process or methods employed, the information must meet the very small risk specification requirement.

Figure 2.  Process for expert determination of de-identification.

Data managers and administrators working with an expert to consider the risk of identification of a particular set of health information can look to the principles summarized in Table 1 for assistance.6  These principles build on those defined by the Federal Committee on Statistical Methodology (which was referenced in the original publication of the Privacy Rule).7 The table describes principles for considering the identification risk of health information. The principles should serve as a starting point for reasoning and are not meant to serve as a definitive list. In the process, experts are advised to consider how data sources that are available to a recipient of health information (e.g., computer systems that contain information about patients) could be utilized for identification of an individual.8

Table 1. Principles used by experts in the determination of the identifiability of health information.

Replicability
  Description: Prioritize health information features into levels of risk according to the chance they will consistently occur in relation to the individual.
  Low: Results of a patient’s blood glucose level test will vary.
  High: Demographics of a patient (e.g., birth date) are relatively stable.

Data Source Availability
  Description: Determine which external data sources contain the patients’ identifiers and the replicable features in the health information, as well as who is permitted access to the data source.
  Low: The results of laboratory reports are not often disclosed with identity beyond healthcare environments.
  High: Patient name and demographics are often in public data sources, such as vital records -- birth, death, and marriage registries.

Distinguishability
  Description: Determine the extent to which the subject’s data can be distinguished in the health information.
  Low: It has been estimated that the combination of Year of Birth, Gender, and 3-Digit ZIP Code is unique for approximately 0.04% of residents in the United States9.  This means that very few residents could be identified through this combination of data alone.
  High: It has been estimated that the combination of a patient’s Date of Birth, Gender, and 5-Digit ZIP Code is unique for over 50% of residents in the United States10,11.  This means that over half of U.S. residents could be uniquely described just with these three data elements.

Assess Risk
  Description: The greater the replicability, availability, and distinguishability of the health information, the greater the risk for identification.
  Low: Laboratory values may be very distinguishing, but they are rarely independently replicable and are rarely disclosed in multiple data sources to which many people have access.
  High: Demographics are highly distinguishing, highly replicable, and available in public data sources.

When evaluating identification risk, an expert often considers the degree to which a data set can be “linked” to a data source that reveals the identity of the corresponding individuals.  Linkage is a process that requires the satisfaction of certain conditions.  The first condition is that the de-identified data are unique or “distinguishing.”  It should be recognized, however, that the ability to distinguish data is, by itself, insufficient to compromise the corresponding patient’s privacy.  This is because of a second condition, which is the need for a naming data source, such as a publicly available voter registration database (see the example scenario below).  Without such a data source, there is no way to definitively link the de-identified health information to the corresponding patient. Finally, for the third condition, we need a mechanism to relate the de-identified and identified data sources. Without such a relational mechanism, a third party could do no better than randomly pairing de-identified records with named individuals. The lack of a readily available naming data source does not imply that data are sufficiently protected from future identification, but it does indicate that it is harder to re-identify an individual, or group of individuals, given the data sources at hand.

Example Scenario
Imagine that a covered entity is considering sharing the information in the table to the left in Figure 3. This table is devoid of explicit identifiers, such as personal names and Social Security Numbers.  The information in this table is distinguishing, such that each row is unique on the combination of demographics (i.e., Age, ZIP Code, and Gender).  Beyond this data, there exists a voter registration data source, which contains personal names, as well as demographics (i.e., Birthdate, ZIP Code, and Gender), which are also distinguishing.  Linkage between the records in the tables is possible through the demographics.  Notice, however, that the first record in the covered entity’s table is not linked because the patient is not yet old enough to vote.

Figure 3.  Linking two data sources to identify diagnoses.

Thus, an important aspect of identification risk assessment is the route by which health information can be linked to naming sources or sensitive knowledge can be inferred. A higher risk “feature” is one that is found in many places and is publicly available. These are features that could be exploited by anyone who receives the information.  For instance, patient demographics could be classified as high-risk features.  In contrast, lower risk features are those that do not appear in public records or are less readily available.  For instance, clinical features, such as blood pressure, or temporal dependencies between events within a hospital (e.g., minutes between dispensation of pharmaceuticals) may uniquely characterize a patient in a hospital population, but the data sources to which such information could be linked to identify a patient are accessible to a much smaller set of people. 
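
The linkage route described above can be illustrated with a short Python sketch.  All names, demographics, and diagnoses below are invented, and the table layouts are simplified stand-ins for the sources shown in Figure 3; the point is only that matching on shared demographics can join a de-identified record to a named record.

from datetime import date

# De-identified records disclosed by a covered entity (values invented).
deidentified = [
    {"age": 34, "zip": "90210", "gender": "F", "diagnosis": "Influenza"},
]

# A hypothetical naming data source, e.g., a voter registration list.
voter_list = [
    {"name": "Jane Doe", "birthdate": date(1989, 3, 12), "zip": "90210", "gender": "F"},
]

def age_on(birthdate, today):
    return today.year - birthdate.year - (
        (today.month, today.day) < (birthdate.month, birthdate.day))

today = date(2024, 1, 1)
links = [
    (v["name"], d["diagnosis"])
    for d in deidentified
    for v in voter_list
    if (age_on(v["birthdate"], today), v["zip"], v["gender"])
       == (d["age"], d["zip"], d["gender"])
]
print(links)  # [('Jane Doe', 'Influenza')]: identity linked to a diagnosis

A record whose demographics have no counterpart in the naming source, such as the patient in Figure 3 who is too young to appear in a voter registry, simply produces no link.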

Example Scenario
An expert is asked to assess the identifiability of a patient’s demographics.  First, the expert will determine if the demographics are independently replicable.  Features such as birth date and gender are strongly independently replicable -- the individual will always have the same birth date -- whereas ZIP code of residence is less so because an individual may relocate.  Second, the expert will determine which data sources that contain the individual’s identification also contain the demographics in question.  In this case, the expert may determine that public records, such as birth, death, and marriage registries, are the most likely data sources to be leveraged for identification.  Third, the expert will determine if the specific information to be disclosed is distinguishable.  At this point, the expert may determine that certain combinations of values (e.g., Asian males born in January of a given year and living in a particular 5-digit ZIP code) are unique, whereas others (e.g., white females born in March of that year and living in a different 5-digit ZIP code) are never unique.  Finally, the expert will determine if the data sources that could be used in the identification process are readily accessible, which may differ by region.  For instance, voter registration registries are free in the state of North Carolina, but cost over $15,000 in the state of Wisconsin.  Thus, data shared in the former state may be deemed more risky than data shared in the latter.12

Back to top

What are the approaches by which an expert assesses the risk that health information can be identified?

The de-identification standard does not mandate a particular method for assessing risk.

A qualified expert may apply generally accepted statistical or scientific principles to compute the likelihood that a record in a data set is expected to be unique, or linkable to only one person, within the population to which it is being compared. Figure 4 provides a visualization of this concept.13 This figure illustrates a situation in which the records in a data set are not a proper subset of the population for whom identified information is known.  This could occur, for instance, if the data set includes patients over one year-old but the population to which it is compared includes data on people over 18 years old (e.g., registered voters).

The computation of population uniques can be achieved in numerous ways, such as through the approaches outlined in published literature.14,15  For instance, if an expert is attempting to assess if the combination of a patient’s race, age, and geographic region of residence is unique, the expert may use population statistics published by the U.S. Census Bureau to assist in this estimation.  In instances when population statistics are unavailable or unknown, the expert may calculate and rely on the statistics derived from the data set.  This is because a record can only be linked between the data set and the population to which it is being compared if it is unique in both.  Thus, by relying on the statistics derived from the data set, the expert will make a conservative estimate regarding the uniqueness of records. 

Example Scenario
Imagine a covered entity has a data set in which there is one 25 year old male from a certain geographic region in the United States.  In truth, there are five 25 year old males in the geographic region in question (i.e., the population).  Unfortunately, there is no readily available data source to inform an expert about the number of 25 year old males in this geographic region.

By inspecting the data set, it is clear to the expert that there is at least one 25 year old male in the population, but the expert does not know if there are more.  So, without any additional knowledge, the expert assumes there are no more, such that the record in the data set is unique.  Based on this observation, the expert recommends removing this record from the data set.  In doing so, the expert has made a conservative decision with respect to the uniqueness of the record.

In the previous example, the expert provided a solution (i.e., removing a record from a dataset) to achieve de-identification, but this is one of many possible solutions that an expert could offer.  In practice, an expert may provide the covered entity with multiple alternative strategies, based on scientific or statistical principles, to mitigate risk.
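
One way to picture the conservative approach described in this scenario is the following Python sketch, which counts how often each combination of assumed quasi-identifiers (the field names and values are hypothetical) appears in the data set and flags combinations that occur only once as candidates for suppression or further generalization.

from collections import Counter

def flag_sample_uniques(records, quasi_identifiers):
    # Records whose quasi-identifier combination appears only once in the data
    # set are treated, conservatively, as potentially unique in the population.
    key = lambda r: tuple(r[q] for q in quasi_identifiers)
    counts = Counter(key(r) for r in records)
    return [r for r in records if counts[key(r)] == 1]

records = [
    {"age": 25, "gender": "M", "region": "Region A"},
    {"age": 25, "gender": "F", "region": "Region A"},
    {"age": 25, "gender": "F", "region": "Region A"},
]
print(flag_sample_uniques(records, ["age", "gender", "region"]))
# Only the 25 year old male is flagged; an expert might recommend removing
# or further generalizing that record.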

Figure 4. Relationship between uniques in the data set and the broader population, as well as the degree to which linkage can be achieved.

The expert may consider different measures of “risk,” depending on the concern of the organization looking to disclose information.  The expert will attempt to determine which record in the data set is the most vulnerable to identification.  However, in certain instances, the expert may not know which particular record to be disclosed will be most vulnerable for identification purposes.  In this case, the expert may attempt to compute risk from several different perspectives. 

Back to top

What are the approaches by which an expert mitigates the risk of identification of an individual in health information?

The Privacy Rule does not require a particular approach to mitigate, or reduce to very small, identification risk.  The following provides a survey of potential approaches.  An expert may find all or only one appropriate for a particular project, or may use another method entirely.

If an expert determines that the risk of identification is greater than very small, the expert may modify the information to mitigate the identification risk to that level, as required by the de-identification standard. In general, the expert will adjust certain features or values in the data to ensure that unique, identifiable elements no longer, or are not expected to, exist.  Some of the methods described below have been reviewed by the Federal Committee on Statistical Methodology16, which was referenced in the original preamble guidance to the Privacy Rule de-identification standard and recently revised.

Several broad classes of methods can be applied to protect data.  An overarching common goal of such approaches is to balance disclosure risk against data utility.  If one approach results in very small identity disclosure risk but also a set of data with little utility, another approach can be considered.  However, data utility does not determine when the de-identification standard of the Privacy Rule has been met.

Table 2 illustrates the application of such methods. In this example, we refer to columns as “features” about patients (e.g., Age and Gender) and rows as “records” of patients (e.g., the first and second rows correspond to records on two different patients).

Table 2. An example of protected health information.

Age (Years) | Gender | ZIP Code | Diagnosis
15          | Male   |          | Diabetes
21          | Female |          | Influenza
36          | Male   |          | Broken Arm
91          | Female |          | Acid Reflux

A first class of identification risk mitigation methods corresponds to suppression techniques. These methods remove or eliminate certain features about the data prior to dissemination.  Suppression of an entire feature may be performed if a substantial quantity of records is considered too risky (e.g., removal of the ZIP Code feature).  Suppression may also be performed on individual records, deleting records entirely if they are deemed too risky to share.  This can occur when a record is clearly very distinguishing (e.g., the only individual within a county with an unusually high reported annual income).   Alternatively, suppression of specific values within a record may be performed, such as when a particular value is deemed too risky (e.g., “President of the local university”, or ages or ZIP codes that may be unique).  Table 3 illustrates this last type of suppression by showing how specific values of features in Table 2 might be suppressed.

Table 3. A version of Table 2 with suppressed patient values.

Age (Years)  | Gender | ZIP Code | Diagnosis
(suppressed) | Male   |          | Diabetes
21           | Female |          | Influenza
36           | Male   |          | Broken Arm
(suppressed) | Female |          | Acid Reflux
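
The suppression techniques discussed above might be sketched as follows in Python; the field names, threshold, and risk rule are illustrative assumptions, not prescribed by the Privacy Rule.

def suppress_feature(records, feature):
    # Remove an entire column (e.g., ZIP code) from every record.
    return [{k: v for k, v in r.items() if k != feature} for r in records]

def suppress_values(records, feature, is_risky):
    # Blank out individual values judged too distinguishing.
    return [{**r, feature: None} if is_risky(r[feature]) else r for r in records]

records = [
    {"age": 15, "gender": "Male", "diagnosis": "Diabetes"},
    {"age": 91, "gender": "Female", "diagnosis": "Acid Reflux"},
]
# Example rule: treat ages at the extremes of this small data set as too risky.
print(suppress_values(records, "age", lambda a: a < 18 or a > 89))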

A second class of methods that can be applied for risk mitigation is based on generalization (sometimes referred to as abbreviation) of the information.  These methods transform data into more abstract representations.  For instance, a five-digit ZIP Code may be generalized to a four-digit ZIP Code, which in turn may be generalized to a three-digit ZIP Code, and onward so as to disclose data with lesser degrees of granularity.  Similarly, the age of a patient may be generalized from one- to five-year age groups. Table 4 illustrates how generalization might be applied to the information in Table 2.

Table 4. A version of Table 2 with generalized patient values.

Age (Years)       | Gender | ZIP Code | Diagnosis
Under 21          | Male   | *        | Diabetes
Between 21 and 34 | Female | *        | Influenza
Between 35 and 44 | Male   | *        | Broken Arm
45 and over       | Female | *        | Acid Reflux
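
A minimal Python sketch of generalization, using age ranges similar to Table 4 and ZIP code truncation, is shown below; the specific cut points and the number of retained ZIP digits are illustrative choices that an expert would tailor to the data set.

def generalize_age(age):
    if age < 21:
        return "Under 21"
    if age <= 34:
        return "Between 21 and 34"
    if age <= 44:
        return "Between 35 and 44"
    return "45 and over"

def generalize_zip(zip_code, keep_digits=3):
    # Retain the leading digits and mask the remainder.
    return zip_code[:keep_digits] + "*" * (len(zip_code) - keep_digits)

record = {"age": 36, "gender": "Male", "zip": "90210", "diagnosis": "Broken Arm"}
generalized = {**record,
               "age": generalize_age(record["age"]),
               "zip": generalize_zip(record["zip"])}
print(generalized)  # age -> "Between 35 and 44", zip -> "902**"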

A third class of methods that can be applied for risk mitigation corresponds to perturbation.  In this case, specific values are replaced with equally specific, but different, values.  For instance, a patient’s age may be reported as a random value within a 5-year window of the actual age.  Table 5 illustrates how perturbation might be applied to Table 2.  Notice that every age is within +/- 2 years of the original age.  Similarly, the final digit in each ZIP Code is within +/- 3 of the original ZIP Code.

Table 5. A version of Table 2 with randomized patient values.

Age (Years) | Gender | ZIP Code | Diagnosis
16          | Male   |          | Diabetes
20          | Female |          | Influenza
34          | Male   |          | Broken Arm
93          | Female |          | Acid Reflux

In practice, perturbation is performed to maintain statistical properties about the original data, such as mean or variance. 
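
A simple perturbation step might look like the following Python sketch, which replaces each age with a random value within +/- 2 years of the original; the window size and seed are illustrative, and production schemes are typically designed more carefully to preserve the statistical properties noted above.

import random

def perturb_ages(records, window=2, seed=0):
    # Replace each age with a random value within +/- `window` years.
    rng = random.Random(seed)
    return [{**r, "age": r["age"] + rng.randint(-window, window)} for r in records]

records = [{"age": 15}, {"age": 21}, {"age": 36}, {"age": 91}]
print(perturb_ages(records))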

The application of a method from one class does not necessarily preclude the application of a method from another class.  For instance, it is common to apply generalization and suppression to the same data set.

Using such methods, the expert will prove that the likelihood an undesirable event (e.g., future identification of an individual) will occur is very small.  For instance, one example of a data protection model that has been applied to health information is the k-anonymity principle.18,19  In this model, “k” refers to the number of people to which each disclosed record must correspond.  In practice, this correspondence is assessed using the features that could be reasonably applied by a recipient to identify a patient.  Table 6 illustrates an application of generalization and suppression methods to achieve 2-anonymity with respect to the Age, Gender, and ZIP Code columns in Table 2.  The first two rows and the last two rows correspond to patient records with the same combination of generalized and suppressed values for Age, Gender, and ZIP Code.  Notice that Gender has been suppressed completely.

Table 6, as well as a value of k equal to 2, is meant to serve as a simple example for illustrative purposes only.  Various state and federal agencies define policies regarding small cell counts (i.e., the number of people corresponding to the same combination of features) when sharing tabular, or summary, data.20,21,22,23,24,25,26,27  However, OCR does not designate a universal value for k that covered entities should apply to protect health information in accordance with the de-identification standard.  The value for k should be set at a level that is appropriate to mitigate risk of identification by the anticipated recipient of the data set.28

Table 6. A version of Table 2 that is 2-anonymized.

Age (Years) | Gender       | ZIP Code | Diagnosis
Under 30    | (suppressed) | *        | Diabetes
Under 30    | (suppressed) | *        | Influenza
Over 30     | (suppressed) | *        | Broken Arm
Over 30     | (suppressed) | *        | Acid Reflux
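
The k-anonymity check described above can be expressed in a few lines of Python: compute the size of each group of records sharing the same quasi-identifier values and verify that the smallest group has at least k members.  The records below mirror Table 6, with invented placeholder ZIP values.

from collections import Counter

def k_anonymity(records, quasi_identifiers):
    # The smallest group size over all quasi-identifier combinations.
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(counts.values())

table6 = [  # gender suppressed entirely; ZIP values are invented placeholders
    {"age": "Under 30", "zip": "902*", "diagnosis": "Diabetes"},
    {"age": "Under 30", "zip": "902*", "diagnosis": "Influenza"},
    {"age": "Over 30",  "zip": "902*", "diagnosis": "Broken Arm"},
    {"age": "Over 30",  "zip": "902*", "diagnosis": "Acid Reflux"},
]
print(k_anonymity(table6, ["age", "zip"]))  # 2, i.e., the table is 2-anonymous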

As can be seen, there are many different disclosure risk reduction techniques that can be applied to health information. However, it should be noted that no particular method is universally the best option for every covered entity and health information set.  Each method has benefits and drawbacks with respect to expected applications of the health information, which will be distinct for each covered entity and each intended recipient.  The determination of which method is most appropriate for the information will be assessed by the expert on a case-by-case basis and will be guided by input from the covered entity.

Finally, as noted in the preamble to the Privacy Rule, the expert may also consider the technique of limiting distribution of records through a data use agreement or restricted access agreement in which the recipient agrees to limits on who can use or receive the data, or agrees not to attempt identification of the subjects.  Of course, the specific details of such an agreement are left to the discretion of the expert and covered entity.

Back to top

Can an Expert determine a code derived from PHI is de-identified?

There has been confusion about what constitutes a code and how it relates to PHI.  For clarification, our guidance is similar to that provided by the National Institute of Standards and Technology (NIST)29, which states:

“De-identified information can be re-identified (rendered distinguishable) by using a code, algorithm, or pseudonym that is assigned to individual records.  The code, algorithm, or pseudonym should not be derived from other related information* about the individual, and the means of re-identification should only be known by authorized parties and not disclosed to anyone without the authority to re-identify records.  A common de-identification technique for obscuring PII [Personally Identifiable Information] is to use a one-way cryptographic function, also known as a hash function, on the PII.

*This is not intended to exclude the application of cryptographic hash functions to the information.”

In line with this guidance from NIST, a covered entity may disclose codes derived from PHI as part of a de-identified data set if an expert determines that the data meets the de-identification requirements at §164.514(b)(1).  The re-identification provision at §164.514(c) does not preclude the transformation of PHI into values derived by cryptographic hash functions using the expert determination method, provided the keys associated with such functions are not disclosed, including to the recipients of the de-identified information.
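
As an illustration of a code derived from PHI by a keyed one-way function, the following Python sketch applies an HMAC with a secret key held only by the covered entity; the medical record number shown is fictitious, and an expert determination would still be needed to conclude that the overall data set meets §164.514(b)(1).

import hashlib
import hmac
import secrets

SECRET_KEY = secrets.token_bytes(32)  # held only by the covered entity

def coded_identifier(medical_record_number):
    # Keyed one-way transformation; stable for a given key, but not reversible
    # or reproducible by recipients who do not hold the key.
    return hmac.new(SECRET_KEY, medical_record_number.encode("utf-8"),
                    hashlib.sha256).hexdigest()

print(coded_identifier("MRN-0001"))  # fictitious medical record number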

Back to top

Must a covered entity use a data use agreement when sharing de-identified data to satisfy the Expert Determination Method?

No. The Privacy Rule does not limit how a covered entity may disclose information that has been de-identified.  However, a covered entity may require the recipient of de-identified information to enter into a data use agreement to access files with known disclosure risk, such as is required for release of a limited data set under the Privacy Rule.  This agreement may contain a number of clauses designed to protect the data, such as prohibiting re-identification.30 Of course, the use of a data use agreement does not substitute for any of the specific requirements of the Expert Determination Method. Further information about data use agreements can be found on the OCR website.31  Covered entities may make their own assessments whether such additional oversight is appropriate.

Back to top

Guidance on Satisfying the Safe Harbor Method

In §164.514(b), the Safe Harbor method for de-identification is defined as follows:

(2)(i) The following identifiers of the individual or of relatives, employers, or household members of the individual, are removed:

(A) Names

(B) All geographic subdivisions smaller than a state, including street address, city, county, precinct, ZIP code, and their equivalent geocodes, except for the initial three digits of the ZIP code if, according to the current publicly available data from the Bureau of the Census:
(1) The geographic unit formed by combining all ZIP codes with the same three initial digits contains more than 20,000 people; and
(2) The initial three digits of a ZIP code for all such geographic units containing 20,000 or fewer people is changed to 000

(C) All elements of dates (except year) for dates that are directly related to an individual, including birth date, admission date, discharge date, death date, and all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older

(D) Telephone numbers

(E) Fax numbers

(F) Email addresses

(G) Social security numbers

(H) Medical record numbers

(I) Health plan beneficiary numbers

(J) Account numbers

(K) Certificate/license numbers

(L) Vehicle identifiers and serial numbers, including license plate numbers

(M) Device identifiers and serial numbers

(N) Web Universal Resource Locators (URLs)

(O) Internet Protocol (IP) addresses

(P) Biometric identifiers, including finger and voice prints

(Q) Full-face photographs and any comparable images

(R) Any other unique identifying number, characteristic, or code, except as permitted by paragraph (c) of this section; and

(ii) The covered entity does not have actual knowledge that the information could be used alone or in combination with other information to identify an individual who is a subject of the information.

Back to top

When can ZIP codes be included in de-identified information?

Covered entities may include the first three digits of the ZIP code if, according to the current publicly available data from the Bureau of the Census: (1) the geographic unit formed by combining all ZIP codes with the same three initial digits contains more than 20,000 people; or (2) the initial three digits of a ZIP code for all such geographic units containing 20,000 or fewer people are changed to 000.  This means that the initial three digits of ZIP codes may be included in de-identified information except when the ZIP codes contain the initial three digits listed below.  In those cases, the first three digits must be listed as 000.

OCR published a final rule on August 14, 2002, that modified certain standards in the Privacy Rule.  The preamble to this final rule identified the initial three digits of ZIP codes, or ZIP code tabulation areas (ZCTAs), that must change to 000 for release (67 FR 53182 (Aug. 14, 2002)).

Utilizing 2000 Census data, the following three-digit ZCTAs have a population of 20,000 or fewer persons. To produce a de-identified data set utilizing the Safe Harbor method, all records with three-digit ZIP codes corresponding to these three-digit ZCTAs must have the ZIP code changed to 000. Covered entities should not, however, rely upon this listing or the one found in the August 14, 2002 regulation if more current data has been published.

The 17 restricted ZIP codes are: 036, 059, 063, 102, 203, 556, 692, 790, 821, 823, 830, 831, 878, 879, 884, 890, and 893.
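
Applied programmatically, the Safe Harbor ZIP code rule might look like the following Python sketch; the function and variable names are illustrative, and the set of restricted three-digit prefixes should always be populated from the most current published Census-based listing rather than hard-coded.

def safe_harbor_zip(zip_code, restricted_prefixes):
    # Keep at most the first three digits; replace them with "000" when the
    # three-digit area contains 20,000 or fewer people per current Census data.
    prefix = zip_code[:3]
    return "000" if prefix in restricted_prefixes else prefix

restricted = {"036", "059"}  # illustrative subset; use the full current listing
print(safe_harbor_zip("03602", restricted))  # "000"
print(safe_harbor_zip("90210", restricted))  # "902"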

The Department notes that these three-digit ZIP codes are based on the five-digit ZIP Code Tabulation Areas created by the Census Bureau for the 2000 Census. This new methodology also is briefly described below, as it will likely be of interest to all users of data tabulated by ZIP code. The Census Bureau will not be producing data files containing U.S. Postal Service ZIP codes either as part of the Census 2000 product series or as a post-Census 2000 product. However, due to the public’s interest in having statistics tabulated by ZIP code, the Census Bureau has created a new statistical area called the ZIP Code Tabulation Area (ZCTA) for Census 2000. The ZCTAs were designed to overcome the operational difficulties of creating a well-defined ZIP code area by using Census blocks (and the addresses found in them) as the basis for the ZCTAs. In the past, there has been no correlation between ZIP codes and Census Bureau geography. ZIP codes can cross state, place, county, census tract, block group, and census block boundaries. The geographic designations the Census Bureau uses to tabulate data are relatively stable over time. For instance, census tracts are only defined every ten years. In contrast, ZIP codes can change more frequently. Because of the ill-defined nature of ZIP code boundaries, the Census Bureau has no file (crosswalk) showing the relationship between U.S. Census Bureau geography and U.S. Postal Service ZIP codes.

ZCTAs are generalized area representations of U.S. Postal Service (USPS) ZIP code service areas. Simply put, each one is built by aggregating the Census blocks whose addresses use a given ZIP code into a ZCTA, which gets that ZIP code assigned as its ZCTA code. They represent the majority USPS five-digit ZIP code found in a given area. For those areas where it is difficult to determine the prevailing five-digit ZIP code, the higher-level three-digit ZIP code is used for the ZCTA code. Further information is available on the Census Bureau’s website.

The Bureau of the Census provides information regarding population density in the United States.  Covered entities are expected to rely on the most current publicly available Bureau of the Census data regarding ZIP codes. This information can be downloaded from, or queried at, the American Fact Finder website.  As of the publication of this guidance, the information can be extracted from the detailed tables of the “Census 2000 Summary File 1 (SF 1) 100-Percent Data” files under the “Decennial Census” section of the website. The information is derived from the 2000 Decennial Census and was last updated in 2000.  It is expected that the Census Bureau will make data available from the 2010 Decennial Census in the near future.  This guidance will be updated when the Census Bureau makes new information available.

Back to top

May parts or derivatives of any of the listed identifiers be disclosed consistent with the Safe Harbor Method?

No.  For example, a data set that contained patient initials, or the last four digits of a Social Security number, would not meet the requirement of the Safe Harbor method for de-identification.

Back to top

What are examples of dates that are not permitted according to the Safe Harbor Method?

Elements of dates that are not permitted for disclosure include the day, month, and any other information that is more specific than the year of an event.  For instance, a date such as “January 1” of a given year could not be reported at this level of detail; it could, however, be reported in a de-identified data set by its year alone.

Many records contain dates of service or other events that imply age.  Ages that are explicitly stated, or implied, as over 89 years old must be recoded as 90 or above.  For example, if the patient’s year of birth is 1910 and the year of healthcare service is reported as 2010, then in the de-identified data set the year of birth should be reported as “on or before 1920.”  Otherwise, a recipient of the data set would learn that the age of the patient is approximately 100.
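
The Safe Harbor treatment of dates and ages might be sketched in Python as follows; the function names are illustrative, and the example inputs are invented.

from datetime import date

def safe_harbor_year(event_date):
    # Month and day are removed; only the year may be retained.
    return event_date.year

def safe_harbor_age(age):
    # Ages over 89 are aggregated into a single category.
    return "90 or older" if age > 89 else age

print(safe_harbor_year(date(2009, 1, 1)))  # 2009
print(safe_harbor_age(97))                 # "90 or older"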

Back to top

Can dates associated with test measures for a patient be reported in accordance with Safe Harbor?

No. Dates associated with test measures, such as those derived from a laboratory report, are directly related to a specific individual and relate to the provision of health care. Such dates are protected health information.  As a result, no element of a date (other than the year, as described above) may be reported to adhere to Safe Harbor.

Back to top

What constitutes “any other unique identifying number, characteristic, or code” with respect to the Safe Harbor method of the Privacy Rule?

This category corresponds to any unique features that are not explicitly enumerated in the Safe Harbor list (A-Q), but could be used to identify a particular individual.  Thus, a covered entity must ensure that a data set stripped of the explicitly enumerated identifiers also does not contain any of these unique features.  The following are examples of such features:

Identifying Number
There are many potential identifying numbers.  For example, the preamble to the Privacy Rule at 65 FR 82462 (Dec. 28, 2000) noted that “Clinical trial record numbers are included in the general category of ‘any other unique identifying number, characteristic, or code.’”

Identifying Code
A code corresponds to a value that is derived from a non-secure encoding mechanism.  For instance, a code derived from a secure hash function without a secret key (e.g., “salt”) would be considered an identifying element.  This is because the resulting value would be susceptible to compromise by the recipient of such data. As another example, an increasing quantity of electronic medical record and electronic prescribing systems assign and embed barcodes into patient records and their medications.  These barcodes are often designed to be unique for each patient, or event in a patient’s record, and thus can be easily applied for tracking purposes.  See the discussion of re-identification.

Identifying Characteristic
A characteristic may be anything that distinguishes an individual and allows for identification.  For example, a unique identifying characteristic could be the occupation of a patient, if it was listed in a record as “current President of State University.”

Many questions have been received regarding what constitutes “any other unique identifying number, characteristic or code” in the Safe Harbor approach, §164.514(b)(2)(i)(R), above.  Generally, a code or other means of record identification that is derived from PHI would have to be removed from data de-identified following the Safe Harbor method.  To clarify what must be removed under (R), the implementation specifications at §164.514(c) provide an exception with respect to “re-identification” by the covered entity.  The objective of the paragraph is to permit covered entities to assign certain types of codes or other record identification to the de-identified information so that it may be re-identified by the covered entity at some later date. Such codes or other means of record identification assigned by the covered entity are not considered direct identifiers that must be removed under (R) if the covered entity follows the directions provided in §164.514(c).

Back to top

What is “actual knowledge” that the remaining information could be used either alone or in combination with other information to identify an individual who is a subject of the information?

In the context of the Safe Harbor method, actual knowledge means clear and direct knowledge that the remaining information could be used, either alone or in combination with other information, to identify an individual who is a subject of the information.  This means that a covered entity has actual knowledge if it concludes that the remaining information could be used to identify the individual.  The covered entity, in other words, is aware that the information is not actually de-identified information. 

The following examples illustrate when a covered entity would fail to meet the “actual knowledge” provision.   

Example 1: Revealing Occupation
Imagine a covered entity was aware that the occupation of a patient was listed in a record as “former president of the State University.”  This information in combination with almost any additional data – like age or state of residence – would clearly lead to an identification of the patient.  In this example, a covered entity would not satisfy the de-identification standard by simply removing the enumerated identifiers in §164.514(b)(2)(i) because the risk of identification is of a nature and degree that the covered entity must have concluded that the information could identify the patient.   Therefore, the data would not have satisfied the de-identification standard’s Safe Harbor method unless the covered entity made a sufficient good faith effort to remove the “occupation” field from the patient record.

Example 2: Clear Familial Relation
Imagine a covered entity was aware that the anticipated recipient, a researcher who is an employee of the covered entity, had a family member in the data (e.g., spouse, parent, child, or sibling). In addition, the covered entity was aware that the data would provide sufficient context for the employee to recognize the relative.  For instance, the details of a complicated series of procedures, such as a primary surgery followed by a set of follow-up surgeries and examinations, for a person of a certain age and gender, might permit the recipient to comprehend that the data pertains to his or her relative’s case.  In this situation, the risk of identification is of a nature and degree that the covered entity must have concluded that the recipient could clearly and directly identify the individual in the data.  Therefore, the data would not have satisfied the de-identification standard’s Safe Harbor method. 

Example 3: Publicized Clinical Event
Rare clinical events may facilitate identification in a clear and direct manner.  For instance, imagine the information in a patient record revealed that a patient gave birth to an unusually large number of children at the same time.  During the year of this event, it is highly possible that this occurred for only one individual in the hospital (and perhaps the country).  As a result, the event was reported in the popular media, and the covered entity was aware of this media exposure.  In this case, the risk of identification is of a nature and degree that the covered entity must have concluded that the individual subject of the information could be identified by a recipient of the data.  Therefore, the data would not have satisfied the de-identification standard’s Safe Harbor method. 

Example 4: Knowledge of a Recipient’s Ability
Imagine a covered entity was told that the anticipated recipient of the data has a table or algorithm that can be used to identify the information, or a readily available mechanism to determine a patient’s identity.  In this situation, the covered entity has actual knowledge because it was informed outright that the recipient can identify a patient, unless it subsequently received information confirming that the recipient does not in fact have a means to identify a patient.  Therefore, the data would not have satisfied the de-identification standard’s Safe Harbor method.

Back to top

If a covered entity knows of specific studies about methods to re-identify health information or use de-identified health information alone or in combination with other information to identify an individual, does this necessarily mean a covered entity has actual knowledge under the Safe Harbor method?

No.  Much has been written about the capabilities of researchers with certain analytic and quantitative capacities to combine information in particular ways to identify health information.32,33,34,35  A covered entity may be aware of studies about methods to identify remaining information or using de-identified information alone or in combination with other information to identify an individual.  However, a covered entity’s mere knowledge of these studies and methods, by itself, does not mean it has “actual knowledge” that these methods would be used with the data it is disclosing.  OCR does not expect a covered entity to presume such capacities of all potential recipients of de-identified data.   This would not be consistent with the intent of the Safe Harbor method, which was to provide covered entities with a simple method to determine if the information is adequately de-identified.

Back to top

Must a covered entity suppress all personal names, such as physician names, from health information for it to be designated as de-identified?

No. Only names of the individuals associated with the corresponding health information (i.e., the subjects of the records) and of their relatives, employers, and household members must be suppressed.  There is no explicit requirement to remove the names of providers or workforce members of the covered entity or business associate.  At the same time, there is also no requirement to retain such information in a de-identified data set.

Beyond the removal of names related to the patient, the covered entity would need to consider whether additional personal names contained in the data should be suppressed to meet the actual knowledge specification.  Additionally, other laws or confidentiality concerns may support the suppression of this information.

Back to top

Must a covered entity use a data use agreement when sharing de-identified data to satisfy the Safe Harbor Method?

No. The Privacy Rule does not limit how a covered entity may disclose information that has been de-identified.  However, nothing prevents a covered entity from asking a recipient of de-identified information to enter into a data use agreement, such as is required for release of a limited data set under the Privacy Rule.  This agreement may prohibit re-identification. Of course, the use of a data use agreement does not substitute for any of the specific requirements of the Safe Harbor method. Further information about data use agreements can be found on the OCR website.36  Covered entities may make their own assessments whether such additional oversight is appropriate.

Back to top

Must a covered entity remove protected health information from free text fields to satisfy the Safe Harbor Method?

PHI may exist in different types of data in a multitude of forms and formats in a covered entity.  This data may reside in highly structured database tables, such as billing records. Yet, it may also be stored in a wide range of documents with less structure and written in natural language, such as discharge summaries, progress notes, and laboratory test interpretations.  These documents may vary with respect to the consistency and the format employed by the covered entity.

The de-identification standard makes no distinction between data entered into standardized fields and information entered as free text (i.e., structured and unstructured text) -- an identifier listed in the Safe Harbor standard must be removed regardless of its location in a record if it is recognizable as an identifier.

Whether additional information must be removed falls under the actual knowledge provision, that is, the extent to which the covered entity has actual knowledge that residual information could be used to individually identify a patient. Clinical narratives in which a physician documents the history and/or lifestyle of a patient are information rich and may provide context that readily allows for patient identification. 

Medical records are comprised of a wide range of structured and unstructured (also known as “free text”) documents.  In structured documents, it is relatively clear which fields contain the identifiers that must be removed following the Safe Harbor method.  For instance, it is simple to discern when a feature is a name or a Social Security Number, provided that the fields are appropriately labeled.  However, many researchers have observed that identifiers in medical information are not always clearly labeled.37,38  As such, in some electronic health record systems it may be difficult to discern what a particular term or phrase corresponds to (e.g., is 5/97 a date or a ratio?).  It also is important to document when fields are derived from the Safe Harbor listed identifiers.  For instance, if a field corresponds to the first initials of names, then this derivation should be noted.  De-identification is more efficient and effective when data managers explicitly document when a feature or value pertains to identifiers.  Health Level 7 (HL7) and the International Organization for Standardization (ISO) publish best practices in documentation and standards that covered entities may consult in this process.

Example Scenario 1
The free text field of a patient’s medical record notes that the patient is the Executive Vice President of the state university.  The covered entity must remove this information. 

Example Scenario 2
The intake notes for a new patient include the stand-alone notation, “Newark, NJ.”  It is not clear whether this relates to the patient’s address, the location of the patient’s previous health care provider, the location of the patient’s recent auto collision, or some other point.  The phrase may be retained in the data.

Back to top

Glossary of Terms

Glossary of terms used in Guidance Regarding Methods for De-identification of Protected Health Information in Accordance with the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule.  Note: some of these terms are paraphrased from the regulatory text; please see the HIPAA Rules for actual definitions.

Business Associate
A person or entity that performs certain functions or activities that involve the use or disclosure of protected health information on behalf of, or provides services to, a covered entity.  A member of the covered entity’s workforce is not a business associate.  A covered health care provider, health plan, or health care clearinghouse can be a business associate of another covered entity.
Covered Entity

Any entity that is

  • a health care provider that conducts certain transactions in electronic form (called here a "covered health care provider").
  • a health care clearinghouse.
  • a health plan.
Cryptographic Hash Function
A hash function that is designed to achieve certain security properties.
Disclosure
A “disclosure” of Protected Health Information (PHI) is the sharing of that PHI outside of a covered entity. The sharing of PHI outside of the health care component of a covered entity is a disclosure.
Hash Function
A mathematical function which takes binary data, called the message, and produces a condensed representation, called the message digest.
Health Information

Any information, whether oral or recorded in any form or medium, that:

  • (1) Is created or received by a health care provider, health plan, public health authority, employer, life insurer, school or university, or health care clearinghouse; and
  • (2) Relates to the past, present, or future physical or mental health or condition of an individual; the provision of health care to an individual; or the past, present, or future payment for the provision of health care to an individual.
Individually Identifiable Health Information

Information that is a subset of health information, including demographic information collected from an individual, and:
(1) Is created or received by a health care provider, health plan, employer, or health care clearinghouse; and
(2) Relates to the past, present, or future physical or mental health or condition of an individual; the provision of health care to an individual; or the past, present, or future payment for the provision of health care to the individual; and
(i) That identifies the individual; or
(ii) With respect to which there is a reasonable basis to believe the information can be used to identify the individual.

Protected Health Information
Individually identifiable health information:
(1) Except as provided in paragraph (2) of this definition, that is:
(i) Transmitted by electronic media;
(ii) Maintained in electronic media; or
(iii) Transmitted or maintained in any other form or medium.
(2) Protected health information excludes individually identifiable health information in:
(i) Education records covered by the Family Educational Rights and Privacy Act, as amended, 20 U.S.C. 1232g;
(ii) Records described at 20 U.S.C. 1232g(a)(4)(B)(iv); and
(iii) Employment records held by a covered entity in its role as employer.
Suppression
Withholding information in selected records from release.

Back to top

1. The Health Information Technology for Economic and Clinical Health (HITECH) Act was enacted as part of the American Recovery and Reinvestment Act of 2009 (ARRA).  Section 13424(c) of the HITECH Act requires the Secretary of HHS to issue guidance on how best to implement the requirements for the de-identification of health information contained in the Privacy Rule.
2. Protected health information (PHI) is defined as individually identifiable health information transmitted or maintained by a covered entity or its business associates in any form or medium (45 CFR 160.103).  The definition exempts a small number of categories of individually identifiable health information, such as individually identifiable health information found in employment records held by a covered entity in its role as an employer.

Cisco IOS VPN Configuration Guide


Site-to-Site and Extranet VPN Business Scenarios


This chapter explains the basic tasks for configuring IP-based, site-to-site and extranet Virtual Private Networks (VPNs) on a Cisco series router using generic routing encapsulation (GRE) and IPSec tunneling protocols. Basic security, Network Address Translation (NAT), encryption, Cisco IOS weighted fair queuing (WFQ), and extended access lists for basic traffic filtering are configured.


Note In this Guide, the term `Cisco series router' implies that an Integrated Service Adaptor (ISA) or a VAM (VAM, VAM2, or VAM2+) is installed in the Cisco series router.


This chapter describes basic features and configurations used in a site-to-site VPN scenario. Some Cisco IOS security software features not described in this document can be used to increase performance and scalability of your VPN. For up-to-date Cisco IOS security software features documentation, refer to the Cisco IOS Security Configuration Guide and the Cisco IOS Security Command Reference publications for your Cisco IOS Release. For information on how to access the publications, see "Related Documentation" section on page xi.

This chapter includes the following sections:

Scenario Descriptions

Step 1—Configuring the Tunnel

Step 2—Configuring Network Address Translation

Step 3—Configuring Encryption and IPSec

Step 4—Configuring Quality of Service

Step 5—Configuring Cisco IOS Firewall Features

Comprehensive Configuration Examples


Note Throughout this chapter, there are numerous configuration examples and sample configuration outputs that include unusable IP addresses. Be sure to use your own IP addresses when configuring your Cisco series router.


Scenario Descriptions

This section includes the following topics:

Site-to-Site Scenario

Extranet Scenario

Configuring a GRE Tunnel

Configuring an IPSec Tunnel

Configuring Static Inside Source Address Translation

Verifying Static Inside Source Address Translation

Configuring IKE Policies

Verifying IKE Policies

Configuring IPSec and IPSec Tunnel Mode

Configuring Crypto Maps

Configuring Network-Based Application Recognition

Configuring Weighted Fair Queuing

Verifying Weighted Fair Queuing

Configuring Class-Based Weighted Fair Queuing

Verifying Class-Based Weighted Fair Queuing

Creating Extended Access Lists Using Access List Numbers

Verifying Extended Access Lists

Applying Access Lists to Interfaces

Verifying Extended Access Lists Are Applied Correctly

Site-to-Site Scenario

Figure  shows a headquarters network providing a remote office access to the corporate intranet. In this scenario, the headquarters and remote office are connected through a secure GRE tunnel that is established over an IP infrastructure (the Internet). Employees in the remote office are able to access internal, private web pages and perform various IP-based network tasks.


Note Although the site-to-site VPN scenario in this chapter is configured with GRE tunneling, a site-to-site VPN can also be configured with IPSec only tunneling.


Figure  Site-to-Site VPN Business Scenario

Figure  shows the physical elements of the scenario. The Internet provides the core interconnecting fabric between the headquarters and remote office routers. Both the headquarters and remote office are using a Cisco IOS VPN gateway (a Cisco series with an Integrated Service Adaptor (ISA) or VAM (VAM, VAM2, or VAM2+), a Cisco series router, or a Cisco series router).


Note VPN Acceleration Module (VAM) information for your Cisco series router can be found on Cisco.com.


The GRE tunnel is configured on the first serial interface in chassis slot 1 (serial 1/0) of the headquarters and remote office routers. Fast Ethernet interface 0/0 of the headquarters router is connected to a corporate server and Fast Ethernet interface 0/1 is connected to a web server. Fast Ethernet interface 0/0 of the remote office router is connected to a PC client.

Figure  Site-to-Site VPN Scenario Physical Elements

The configuration steps in the following sections are for the headquarters router, unless noted otherwise. Comprehensive configuration examples for both the headquarters and remote office routers are provided in the "Comprehensive Configuration Examples" section.

The physical elements of the site-to-site scenario are as follows:

Headquarters network (hq-sanjose): serial interface 1/0 (WAN link), tunnel interface 0, Fast Ethernet interface 0/0 (connected to the corporate server), and Fast Ethernet interface 0/1 (connected to the web server).

Remote office network (ro-rtp): serial interface 1/0 (WAN link), tunnel interface 1, and Fast Ethernet interface 0/0 (connected to PC A).

Extranet Scenario

The extranet scenario introduced in Figure  builds on the site-to-site scenario by providing a business partner access to the same headquarters network. In the extranet scenario, the headquarters and business partner are connected through a secure IPSec tunnel and the business partner is given access only to the headquarters public server to perform various IP-based network tasks, such as placing and managing product orders.

Figure  Extranet VPN Business Scenario

Figure  shows the physical elements of the scenario. As in the site-to-site business scenario, the Internet provides the core interconnecting fabric between the headquarters and business partner routers. Like the headquarters office, the business partner is also using a Cisco IOS VPN gateway (a Cisco series with an Integrated Service Adaptor (ISA) or VAM (VAM, VAM2, or VAM2+), a Cisco series router, or a Cisco series router).


Note VPN Acceleration Module (VAM) information for your Cisco series router can be found on Cisco.com.


The IPSec tunnel between the two sites is configured on the second serial interface in chassis slot 2 (serial 2/0) of the headquarters router and the first serial interface in chassis slot 1 (serial 1/0) of the business partner router. Fast Ethernet interface 0/0 of the headquarters router is still connected to a private corporate server and Fast Ethernet interface 0/1 is connected to a public server. Fast Ethernet interface 0/0 of the business partner router is connected to a PC client.

Figure  Extranet VPN Scenario Physical Elements

The configuration steps in the following sections are for the headquarters router, unless noted otherwise. Comprehensive configuration examples for both the headquarters and business partner routers are provided in the "Comprehensive Configuration Examples" section.

The physical elements of the extranet scenario are as follows:

Headquarters network (hq-sanjose): serial interface 2/0 (WAN link), Fast Ethernet interface 0/0 (connected to the corporate server), and Fast Ethernet interface 0/1 (connected to the web server).

Business partner network (bus-ptnr): serial interface 1/0 (WAN link) and Fast Ethernet interface 0/0 (connected to PC B).


Step 1—Configuring the Tunnel

Tunneling provides a way to encapsulate packets inside of a transport protocol. Tunneling is implemented as a virtual interface to provide a simple interface for configuration. The tunnel interface is not tied to specific "passenger" or "transport" protocols, but rather, it is an architecture that is designed to provide the services necessary to implement any standard point-to-point encapsulation scheme. Because tunnels are point-to-point links, you must configure a separate tunnel for each link.

Tunneling has the following three primary components:

Passenger protocol, which is the protocol you are encapsulating (AppleTalk, Banyan VINES, Connectionless Network Service [CLNS], DECnet, IP, or Internetwork Packet Exchange [IPX]).

Carrier protocol, such as the generic routing encapsulation (GRE) protocol or IPSec protocol.

Transport protocol, such as IP, which is the protocol used to carry the encapsulated protocol.

Figure  illustrates IP tunneling terminology and concepts.

Figure  IP Tunneling Terminology and Concepts

This section contains the following topics:

Configuring a GRE Tunnel

Configuring an IPSec Tunnel

Configuring a GRE Tunnel

GRE is capable of handling the transportation of multiprotocol and IP multicast traffic between two sites that have only IP unicast connectivity. The importance of using tunnels in a VPN environment is based on the fact that IPSec encryption only works on IP unicast frames. Tunneling allows for the encryption and the transportation of multiprotocol traffic across the VPN, since the tunneled packets appear to the IP network as an IP unicast frame between the tunnel endpoints. If all connectivity must go through the home Cisco series router, tunnels also enable the use of private network addressing across a service provider's backbone without the need for running the Network Address Translation (NAT) feature.

Network redundancy (resiliency) is an important consideration in the decision to use GRE tunnels, IPSec tunnels, or tunnels which utilize IPSec over GRE. GRE can be used in conjunction with IPSec to pass routing updates between sites on an IPSec VPN. GRE encapsulates the clear text packet, then IPSec (in transport or tunnel mode) encrypts the packet. This packet flow of IPSec over GRE enables routing updates, which are generally multicast, to be passed over an encrypted link. IPSec alone cannot achieve this, because it does not support multicast.

Using redundant GRE tunnels protected by IPSec from a remote router to redundant headquarters routers, routing protocols can be employed to delineate the "primary" and "secondary" headquarters routers. Upon loss of connectivity to the primary router, routing protocols will discover the failure and route to the secondary Cisco series router, thereby providing network redundancy.

It is important to note that more than one router must be employed at HQ to provide resiliency. For VPN resilience, the remote site should be configured with two GRE tunnels, one to the primary HQ VPN router, and the other to the backup HQ VPN router.
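
The following fragment is a minimal sketch of that dual-tunnel arrangement on the remote router. All names and values here are hypothetical placeholders (192.0.2.1 and 192.0.2.5 for the primary and backup headquarters WAN addresses, 172.16.1.x for the tunnel subnets, 10.1.4.0/24 for the remote LAN, and EIGRP 100 as the routing protocol); they are not addresses from this guide.

! Primary GRE tunnel to the primary HQ VPN router (hypothetical peer 192.0.2.1)
interface Tunnel0
 ip address 172.16.1.2 255.255.255.252
 tunnel source Serial1/0
 tunnel destination 192.0.2.1
!
! Backup GRE tunnel to the backup HQ VPN router (hypothetical peer 192.0.2.5)
! Raising the delay on Tunnel1 makes the routing protocol prefer Tunnel0
interface Tunnel1
 ip address 172.16.1.6 255.255.255.252
 tunnel source Serial1/0
 tunnel destination 192.0.2.5
 delay 60000
!
! The routing protocol detects a failed primary tunnel and reroutes over Tunnel1
router eigrp 100
 network 172.16.1.0 0.0.0.255
 network 10.1.4.0 0.0.0.255

Each GRE tunnel would additionally be protected with IPSec as described later in this chapter.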

This section contains basic steps to configure a GRE tunnel and includes the following tasks:

Configuring the Tunnel Interface, Source, and Destination

Verifying the Tunnel Interface, Source, and Destination

Configuring the Tunnel Interface, Source, and Destination

To configure a GRE tunnel between the headquarters and remote office routers, you must configure a tunnel interface, source, and destination on the headquarters and remote office routers. To do this, complete the following steps starting in global configuration mode.


Note The following procedure assumes the tunnel interface, source, and destination on the remote office router are configured with the values listed in Table 


  Command Purpose

Step 1 

hq-sanjose(config)# interface tunnel 0
hq-sanjose(config-if)# ip address

Specify a tunnel interface number, enter interface configuration mode, and configure an IP address and subnet mask on the tunnel interface. This example configures IP address and subnet mask for tunnel interface 0 on the headquarters router.

Step 2 

tunnel source

Specify the tunnel interface source address and subnet mask. This example uses the IP address and subnet mask of T3 serial interface 1/0 of the headquarters router.

Step 3 

tunnel destination

Specify the tunnel interface destination address. This example uses the IP address and subnet mask of T3 serial interface 1/0 of the remote office router.

Step 4 

tunnel mode gre ip

Configure GRE as the tunnel mode.

GRE is the default tunnel encapsulation mode, so this command is considered optional.

Step 5 

hq-sanjose(config)# interface tunnel 0
hq-sanjose(config-if)# no shutdown
%LINKUPDOWN: Interface Tunnel0, changed state to up

Bring up the tunnel interface.1

Step 6 

hq-sanjose(config-if)# exit
hq-sanjose(config)# ip route

Exit back to global configuration mode and configure traffic from the remote office network through the tunnel. This example configures traffic from the remote office Fast Ethernet network through GRE tunnel 0.

1. This command changes the state of the tunnel interface from administratively down to up.


Note When configuring GRE, you must have only Cisco routers or access servers at both ends of the tunnel connection.
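
Taken together, the preceding steps produce a configuration along the lines of the following sketch. Because the addresses used in this guide are not reproduced above, the values shown here (172.17.3.3 255.255.255.0 on the tunnel interface, a remote WAN peer of 192.0.2.2, and a remote LAN of 10.1.4.0 255.255.255.0) are hypothetical examples only; substitute your own addressing.

hq-sanjose(config)# interface tunnel 0
hq-sanjose(config-if)# ip address 172.17.3.3 255.255.255.0
hq-sanjose(config-if)# tunnel source serial 1/0
hq-sanjose(config-if)# tunnel destination 192.0.2.2
hq-sanjose(config-if)# tunnel mode gre ip
hq-sanjose(config-if)# no shutdown
hq-sanjose(config-if)# exit
hq-sanjose(config)# ip route 10.1.4.0 255.255.255.0 tunnel 0

A mirror-image configuration is applied on the remote office router, with the tunnel source and destination reversed and a route pointing back to the headquarters networks.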


Verifying the Tunnel Interface, Source, and Destination

To verify the configuration:

Enter the show interfaces tunnel 0 EXEC command to view the tunnel interface status, configured IP addresses, and encapsulation type. Both the interface and the interface line protocol should be "up."

ski03_# show interfaces tunnel 1
Tunnel1 is up, line protocol is up
  Hardware is Tunnel
  MTU bytes, BW 9 Kbit, DLY usec,
     reliability /, txload 1/, rxload 1/
  Encapsulation TUNNEL, loopback not set
  Keepalive not set
  Tunnel source , destination
  Tunnel protocol/transport IPSEC/IPV6
  Tunnel TTL
  Tunnel transmit bandwidth (kbps)
  Tunnel receive bandwidth (kbps)
  Tunnel protection via IPSec (profile "tunpro")
  Last input , output , output hang never
  Last clearing of "show interface" counters never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 3
  Queueing strategy: fifo
  Output queue: 0/0 (size/max)
  5 minute input rate 0 bits/sec, 0 packets/sec
  5 minute output rate 0 bits/sec, 0 packets/sec
     39 packets input, bytes, 0 no buffer
     Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
     57 packets output, bytes, 0 underruns
     0 output errors, 0 collisions, 0 interface resets
     0 output buffer failures, 0 output buffers swapped out

Try pinging the tunnel interface of the remote office router (this example uses the IP address of tunnel interface 1):

hq-sanjose(config)# ping
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to , timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 4/5/8 ms

Tip If you have trouble, make sure you are using the correct IP address and that you enabled the tunnel interface with the no shutdown command.


Configuring an IPSec Tunnel

IPSec can be configured in tunnel mode or transport mode. IPSec tunnel mode can be used as an alternative to a GRE tunnel, or in conjunction with a GRE tunnel. In IPSec tunnel mode, the entire original IP datagram is encrypted, and it becomes the payload in a new IP packet. This mode allows a network device, such as a router, to act as an IPSec proxy. That is, the router performs encryption on behalf of the hosts. The source router encrypts packets and forwards them along the IPSec tunnel. The destination router decrypts the original IP datagram and forwards it on to the destination system. Tunnel mode protects against traffic analysis; with tunnel mode, an attacker can only determine the tunnel endpoints and not the true source and destination of the packets passing through the tunnel, even if they are the same as the tunnel endpoints.


Note IPSec tunnel mode configuration instructions are described in detail in the "Configuring IPSec and IPSec Tunnel Mode" section.


In IPSec transport mode, only the IP payload is encrypted, and the original IP headers are left intact. (See Figure ) This mode has the advantage of adding only a few bytes to each packet. It also allows devices on the public network to see the final source and destination of the packet. With this capability, you can enable special processing in the intermediate network based on the information in the IP header. However, the Layer 4 header will be encrypted, limiting the examination of the packet. Unfortunately, by passing the IP header in the clear, transport mode allows an attacker to perform some traffic analysis. (See the "Defining Transform Sets and Configuring IPSec Tunnel Mode" section for an IPSec transport mode configuration example.)

Figure  IPSec in Tunnel and Transport Modes
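
Whether IPSec operates in tunnel mode or transport mode is selected on the transform set. The lines below are a brief sketch rather than a configuration from this guide; the transform-set names and the esp-aes/esp-sha-hmac algorithm choices are illustrative assumptions, and because tunnel mode is the default, the mode tunnel command is shown only for emphasis.

hq-sanjose(config)# crypto ipsec transform-set vpn-tunnel esp-aes esp-sha-hmac
hq-sanjose(cfg-crypto-trans)# mode tunnel
hq-sanjose(cfg-crypto-trans)# exit
hq-sanjose(config)# crypto ipsec transform-set gre-transport esp-aes esp-sha-hmac
hq-sanjose(cfg-crypto-trans)# mode transport
hq-sanjose(cfg-crypto-trans)# exit

Transport mode is commonly paired with GRE, since the GRE tunnel already supplies an outer IP header; IPSec tunnel mode is typically used when IPSec itself provides the tunnel, as in the extranet scenario.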

Step 2—Configuring Network Address Translation


Note NAT is used if you have conflicting private address spaces in the extranet scenario. If you have no conflicting private address spaces, proceed to the "Step 3—Configuring Encryption and IPSec" section.


Network Address Translation (NAT) enables private IP internetworks with addresses that are not globally unique to connect to the Internet by translating those addresses into globally routable address space. NAT is configured on the router at the border of a stub domain (referred to as the inside network) and a public network such as the Internet (referred to as the outside network). NAT translates the internal local addresses to globally unique IP addresses before sending packets to the outside network. NAT also allows a more graceful renumbering strategy for organizations that are changing service providers or voluntarily renumbering into classless interdomain routing (CIDR) blocks.

This section only explains how to configure static translation to translate internal local IP addresses into globally unique IP addresses before sending packets to an outside network, and includes the following tasks:

Configuring Static Inside Source Address Translation

Verifying Static Inside Source Address Translation

Static translation establishes a one-to-one mapping between your internal local address and an inside global address. Static translation is useful when a host on the inside must be accessible by a fixed address from the outside.


Note For detailed, additional configuration information on NAT—for example, instructions on how to configure dynamic translation—refer to the "Configuring IP Addressing" chapter in the Network Protocols Configuration Guide, Part 1. NAT is also described in RFC


NAT uses the following definitions:

Inside local address—The IP address that is assigned to a host on the inside network. The address is probably not a legitimate IP address assigned by the Network Information Center (NIC) or service provider.

Inside global address—A legitimate IP address (assigned by the NIC or service provider) that represents one or more inside local IP addresses to the outside world.

Outside local address—The IP address of an outside host as it appears to the inside network. Not necessarily a legitimate address, it was allocated from address space routable on the inside.

Outside global address—The IP address assigned to a host on the outside network by the host owner. The address was allocated from a globally routable address or network space.

Figure  illustrates a router that is translating a source address inside a network to a source address outside the network.

Figure  NAT Inside Source Translation

The following process describes inside source address translation, as shown in Figure 

1. The user at Host opens a connection to Host B.

2. The first packet that the router receives from Host causes the router to check its NAT table.

If a static translation entry was configured, the router goes to Step 3.

If no translation entry exists, the router determines that source address (SA) must be translated dynamically, selects a legal, global address from the dynamic address pool, and creates a translation entry. This type of entry is called a simple entry.

3. The router replaces the inside local source address of Host with the translation entry global address, and forwards the packet.

4. Host B receives the packet and responds to Host by using the inside global IP destination address (DA)

5. When the router receives the packet with the inside global IP address, it performs a NAT table lookup by using the inside global address as a key. It then translates the address to the inside local address of Host  and forwards the packet to Host

6. Host receives the packet and continues the conversation. The router performs Steps 2 through 5 for each packet.

This section contains the following topics:

Configuring Static Inside Source Address Translation

Verifying Static Inside Source Address Translation

Configuring Static Inside Source Address Translation

To configure static inside source address translation, complete the following steps starting in global configuration mode:

  Command Purpose

Step 1 

ip nat inside source static

Establish static translation between an inside local address and an inside global address. This example translates inside local address (the server) to inside global address

Step 2 

interface fastethernet 0/1

Specify the inside interface. This example specifies Fast Ethernet interface 0/1 on the headquarters router.

Step 3 

ip nat inside

Mark the interface as connected to the inside.

Step 4 

interface serial 2/0

Specify the outside interface. This example specifies serial interface 2/0 on the headquarters router.

Step 5 

ip nat outside

Mark the interface as connected to the outside.

Step 6 

hq-sanjose(config-if)# exit
hq-sanjose(config)#

Exit back to global configuration mode.

The previous steps are the minimum you must configure for static inside source address translation. You could configure multiple inside and outside interfaces.
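
As a consolidated sketch, static inside source translation looks like the following; the inside local address 10.1.6.5 and the inside global address 192.0.2.10 are hypothetical placeholders, not addresses from this guide.

hq-sanjose(config)# ip nat inside source static 10.1.6.5 192.0.2.10
hq-sanjose(config)# interface fastethernet 0/1
hq-sanjose(config-if)# ip nat inside
hq-sanjose(config-if)# exit
hq-sanjose(config)# interface serial 2/0
hq-sanjose(config-if)# ip nat outside
hq-sanjose(config-if)# exit

With this entry in place, packets from 10.1.6.5 leaving serial interface 2/0 carry the source address 192.0.2.10, and packets arriving for 192.0.2.10 are translated back before being forwarded to the inside host.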

Verifying Static Inside Source Address Translation

To verify the configuration:

Enter the show ip nat translations verbose EXEC command to see the global and local address translations and to confirm static translation is configured.

hq-sanjose# show ip nat translations verbose
Pro  Inside global      Inside local       Outside local      Outside global
     create , use , flags: static

Enter the show running-config EXEC command to see the inside and outside interfaces, global and local address translations, and to confirm static translation is configured (display text has been omitted from the following sample output for clarity).

hq-sanjose# show running-config
interface FastEthernet0/1
 ip address
 no ip directed-broadcast
 ip nat inside
!
interface serial2/0
 ip address
 ip nat outside
ip nat inside source static

Step 3—Configuring Encryption and IPSec

IPSec is a framework of open standards, developed by the Internet Engineering Task Force (IETF), that provides data confidentiality, data integrity, and data authentication between participating peers. IPSec provides these security services at the IP layer; it uses IKE to handle negotiation of protocols and algorithms based on local policy, and to generate the encryption and authentication keys to be used by IPSec. IPSec can be used to protect one or more data flows between a pair of hosts, between a pair of security Cisco series routers, or between a security Cisco series router and a host.

IKE is a hybrid security protocol that implements Oakley and SKEME key exchanges inside the Internet Security Association and Key Management Protocol (ISAKMP) framework. While IKE can be used with other protocols, its initial implementation is with the IPSec protocol. IKE provides authentication of the IPSec peers, negotiates IPSec security associations, establishes IPSec keys, and provides IKE keepalives. IPSec can be configured without IKE, but IKE enhances IPSec by providing additional features, flexibility, ease of configuration for the IPSec standard, and keepalives, which are integral in achieving network resilience when configured with GRE.
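
A minimal IKE (ISAKMP) policy consistent with this description is sketched below, using pre-shared key authentication. The policy priority, algorithms, key string, and peer address are hypothetical choices for illustration, not values taken from this guide.

hq-sanjose(config)# crypto isakmp policy 1
hq-sanjose(config-isakmp)# encryption 3des
hq-sanjose(config-isakmp)# hash sha
hq-sanjose(config-isakmp)# authentication pre-share
hq-sanjose(config-isakmp)# group 2
hq-sanjose(config-isakmp)# lifetime 86400
hq-sanjose(config-isakmp)# exit
hq-sanjose(config)# crypto isakmp key example-preshared-key address 192.0.2.2

When digital certificates are used instead of pre-shared keys, the authentication rsa-sig method is configured and the CA interoperability steps referenced below apply.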

Certification authority (CA) interoperability is provided by the ISM in support of the IPSec standard. It permits Cisco IOS devices and CAs to communicate so that your Cisco IOS device can obtain and use digital certificates from the CA. Although IPSec can be implemented in your network without the use of a CA, using a CA provides manageability and scalability for IPSec.

The CA must be properly configured to issue certificates. You must also configure the peers to obtain certificates from the CA. Configure this certificate support as described in the "Configuring Certification Authority Interoperability" chapter of the Cisco IOS Security Configuration Guide (see "Related Documentation" section on page xi for additional information on how to access these documents).

To provide encryption and IPSec tunneling services on a Cisco series router, you must complete the following tasks:

Configuring IKE Policies

Verifying IKE Policies

Configuring IPSec and IPSec Tunnel Mode

Configuring Crypto Maps


Note You can configure a static crypto map, create a dynamic crypto map, or add a dynamic crypto map into a static crypto map. Refer to the "Configuring Crypto Maps" section.
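
For orientation, a static crypto map of the kind referred to in the note above might be sketched as follows. The map name, sequence number, access list, peer address, and transform-set name are hypothetical (the transform set is assumed to have been defined as in the earlier sketch), and the map is applied to the outside serial interface.

hq-sanjose(config)# access-list 101 permit ip 10.1.6.0 0.0.0.255 10.2.2.0 0.0.0.255
hq-sanjose(config)# crypto map example-map 10 ipsec-isakmp
hq-sanjose(config-crypto-map)# set peer 192.0.2.2
hq-sanjose(config-crypto-map)# set transform-set vpn-tunnel
hq-sanjose(config-crypto-map)# match address 101
hq-sanjose(config-crypto-map)# exit
hq-sanjose(config)# interface serial 2/0
hq-sanjose(config-if)# crypto map example-map

The access list defines which traffic is protected; for the extranet scenario it would match traffic between the headquarters public server network and the business partner network, while for IPSec over GRE it would instead match the GRE traffic between the tunnel endpoints.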


Optionally, you can configure CA interoperability. This guide does not explain how to configure CA interoperability on your Cisco series router. Refer to the "IP Security and Encryption" part of the Security Configuration Guide and the Cisco IOS Security Command Reference publication for detailed information on configuring CA interoperability. See "Related Documentation" section on page xi for additional information on how to access these publications.


Note This section only contains basic configuration information for enabling encryption and IPSec tunneling services. Refer to the "IP Security and Encryption" part of the Cisco IOS Security Configuration Guide and the Security Command Reference publications for detailed configuration information on IPSec, IKE, and CA. See "Related Documentation" section on page xi for information on how to access these publications.

Refer to the Integrated Service Adapter and Integrated Service Module Installation and Configuration publication for detailed configuration information on the ISM.


This section contains the following topics:

Configuring IKE Policies

Verifying IKE Policies

Configuring IPSec and IPSec Tunnel Mode

Configuring Crypto Maps

Configuring IKE Policies


You can view and update your serial number from within your software without reinstalling.


When New Serial Numbers Are Needed

Here are some of the situations in which you may need to enter a new serial number for your Autodesk software:

  • Converting an educational license to a commercial license
  • Starting a new subscription after the previous one expired
  • Receiving a new serial number from a Contract Manager or Software Coordinator
  • Extending the term of an existing license

Note about Suites: If you have a suite of products that use a single serial number, you must update the serial number for each product in the suite.

Return to Top

Update Serial Number from Renew License Screen

If your software subscription expires or your user permissions change, you may receive an activation screen with one of the following messages:

  • Renew your license
  • Contact your administrator to request permission to use this product

 

To renew your license for your Autodesk software:

  1. Get a new serial number. Individual users can renew their software subscription to replace an expired license. Enterprise users can contact their contract manager to get a new serial number.
  2. Enter the serial number in the renew license dialog box and click Activate.

Return to Top

Update Serial Number from Software Menu

You can change the serial number for your Autodesk software from within the Help menu of most products.

To change your serial number from the software menu:

  1. Start your Autodesk software.
  2. Follow one of these paths in your software Help menu (path may vary by product):
    • Help > About
    • Help > About [Product Name]
    • Help > Autodesk Product Information > About [Product Name]
  3. In the About window, click Manage License.


     
  4. In the License Manager window, click the arrow next to the product name to display product details.  Then click Update, next to Serial Number.
     
  5. Enter your product serial number and click the Activate button.

    See: Find Serial Numbers & Product Keys

    Note: In some cases, you must restart the product to display the updated serial number.

Return to Top

Additional Activation Resources

