This is the second in a series of five articles on the search for reliable health information.

EDITOR’S NOTE: This article is focused on that component of the health information domain related to the acquisition of data.   

This is the most involved article of the set, because it represents the critical requirement for any healthcare information. If the data captured at its source is incomplete or incorrect, no amount of technical or analytic intervention will make it right. Some might say that this is a clinical, rather than a health information, issue, but you can’t separate the two. Without clinical processes, there is no health information.

The process of acquiring data involves:

  • Observation of events and objects related to a patient’s health state, and the related interventions to preserve or improve that health state.
  • Documentation of those observations.
  • Codification and standardization of the documentation to support database storage and query, and allow analysis across time and enterprises.

This portion of the information process can be overlooked. There may be incorrect assumptions about the accuracy and completeness of factual data from the patient or other sources, as well as from clinical investigation, documentation, and the consistency of coding. If this first step in the information process is compromised, then all subsequent analytic efforts will be compromised. “Big data” and sophisticated technology are no solution for data that is compromised at the source.

As mentioned in the first article, the following are high-level requirements for data acquisition:

  • Complete and accurate observations relevant to the area of study.
  • Complete and accurate documentation of those observations.
  • Standardization of the observed facts through consistent comparable coding and terminology.

This article provides case studies as examples to illustrate the potential differences in the process of data acquisition. The first case represents a less-than-desirable approach, as compared to the second. While fictitious, these examples are based on variations found in reviews of actual patient charts.

Complete and Accurate Observations Relevant to the Area of Study

The first requirement relates to the process of gathering data relating to an encounter by the patient with any component of the healthcare delivery system, concerning the facts related to the patient’s health condition. This data can include clinical observations and findings, symptoms, personal and family history, demographic information, socioeconomic information, relevant financial transactions, or any other data that may be appropriate to the provision of healthcare to that patient.


The following illustrate some of the challenges related to observations about the patient’s health condition.

  • Inconsistent and incomplete data gathering from the patient and other sources about the patient.

A review of patient records across different care providers shows a wide variation in data capture, from the patient interview and other sources relevant to the patient’s health condition at any point. We might assume that clinicians were trained to collect information about the same condition in a consistent fashion, but the training is variable; the application of that training is even more variable. Clinicians often say that they simply do not have enough time to capture the ideal scope of patient information in delivering care, and therefore need to compromise to meet their productivity requirements. While this may be true, it represents a sorry state of the professionalism required to provide high-quality care. Data that was not captured in the record may significantly (or even adversely) impact the decisions made regarding the course of treatment, resulting in a less-than-optimal outcome.

  • Inconsistent and incomplete physical evaluation or study, relevant to the patient’s health condition.

The actual examination of the patient, including the performance of appropriate studies, also varies greatly among providers for the same type of condition, for the same reasons explained above. Unfortunately, clinicians’ use of ordered studies may also vary because of other incentives. The opportunity for financial benefit to the clinician or the healthcare organization as a whole may influence the nature and frequency of ordered studies. Similarly, the fear of malpractice risk may influence the ordering of studies that may or may not be relevant to the patient’s condition.

Case Studies

Case 1: Doctor A is seeing a young patient with an earache and asks the patient’s mother how long this complaint has been going on, and if there has been any fever noted. Based on this discussion, he orders an antibiotic and schedules a follow-up visit. He does not actually inspect the ear and does not order any tests.

Case 2: Doctor B is seeing an identical patient and asks a number of questions about the patient, including:

  • When did the complaint start?
  • Has there been any fever noted?
  • Does the pain keep the patient awake at night?
  • Has he had an ear infection or similar condition in the past?
  • Is there anyone in the family that has reported an infection of any type?
  • Has there been any drainage from the ear?
  • Which ear is involved?
  • Does he have any other health problems at this time?
  • What medications does he take? Is there any prior use of antibiotics? Any allergies?
  • Are there any other symptoms the mother has noticed?
  • … and a number of other questions relevant to the patient’s case.

The clinician’s examination includes:

  • Measurement of basic vital signs.
  • Detailed examination of the ear canal and eardrum.
  • Culture of drainage from the ear and from around the eardrum.
  • Basic physical examination of the eyes, throat, neck, chest, etc.

It is clear that the second case produced a much richer source of data, which could result in a substantial difference in the decisions made about appropriate patient care. As shown, unfortunately, there is little consistency in how two very similar cases would be handled in encounters by two different providers. Due to this lack of consistency in treatment approaches, not only can the patient’s care be compromised, but the value of data subsequently used in population-based analysis of different health conditions and across patient populations is significantly compromised.

Potential Solutions

The variation in healthcare delivery, influenced by training, productivity pressures, financial reimbursement, and individual provider temperament, results in difficult challenges in achieving a better state of data acquisition at the source. To improve data acquisition, a number of things need to happen:

  • Clinician training and testing of individual competencies need to be more standardized nationally. If a “best approach” can be identified, then those standards should be applied to all clinicians.
  • Clinical audits should be conducted, with actionable findings evaluated by clinical leadership, to confirm that appropriate evaluations of patients are being performed, in accordance with the nature of the clinical conditions.
  • Assessment of clinical data capture should be performed to ensure its completeness and accuracy, with proper incentives in place to reward good performance, as well as to remediate any inadequacies identified.
  • The appropriateness of productivity requirements should be evaluated to ensure that they align with requirements for good quality care and appropriate data capture.
  • Technology should be used to focus on easing the burden of data capture and providing guidance and support for accomplishing appropriate clinical observations that are consistent with the patient conditions.


Complete and Accurate Documentation of Observations

The best observations mean nothing if those observations are not documented. The general rule is: “if it wasn’t documented, it wasn’t done.” Although some clinicians may claim that they “know the patient,” and therefore, detailed documentation is not necessary to effectively manage their care, no clinician can remember all the facts relevant to a patient’s case. 


The following illustrate some of the challenges related to documentation of patient care encounters:

  • Clinician documentation: the documentation of what was observed is also highly variable across providers for the same patient case. While the level of clinical documentation required in medical school was much more complete, as the financial and administrative pressures of the real world impact the practice of medicine, documentation begins to suffer. Since there are no comprehensive, universal standards for what should be documented for any given health condition, it is difficult to hold clinicians to a “higher standard.”
  • System-driven errors: as the use of electronic health records (EHRs) has become more pervasive, there has been an unintended consequence: the “copy-and-paste” error. In the attempt to reduce the documentation burden on providers, some EHR systems provide a means to electronically copy clinical data from a previous encounter, or insert default “normal values” into an observation field. In doing so, such “default data” can copy forward historical data that may no longer be correct, inadvertently incorporating inaccurate data. Such errors can include examinations and other activities that were not performed, a historical patient status that differs from the patient’s present state, or clinical conditions that were never observed. The inclusion of inaccurate data is clearly not appropriate and, in some cases, can result in allegations and convictions of fraud. Unfortunately, these types of documentation issues are being seen more frequently.

Case Studies

Case 1: Based on the same clinical scenario listed above, Doctor A most commonly documented that the patient had “otalgia,” and that he prescribed an antibiotic. At some point, Doctor A acquired an EHR. Although he did not perform any more extensive observation of the patient, the EHR system offered the use of “default data” via a set of templates built around commonly occurring clinical parameters. As a result, while the documentation appears extensive, few of the observations recorded in the current encounter record were actually made.

Case 2: While Doctor B uses the same EHR system as Doctor A, the documentation was more complete, and the data recorded represented actual observed facts.

Potential Solutions

There is no doubt that the level of documentation, when compared across a wide range of clinical encounters, even those with more extensive evaluations, varies greatly in completeness and accuracy. Electronic health records offer the opportunity for better documentation, but do not necessarily improve the documentation of the actual facts.

  • The importance of documentation for achieving quality patient care should be constantly encouraged by leadership, and supported through training and audit-based incentives.
  • There needs to be a better system of control over the certification of EHRs to prevent the inappropriate inclusion of “default data.” Terms such as “within normal limits” and similar default values should not be available for automatic insertion by the EHR system. The clinician should be required to input the observed facts. For example, documentation of an eye examination should require the clinician to include “pupils equal, round, reactive to light and accommodation” or “PERRLA,” only based on actual observed findings.
  • Some current audit procedures include the use of a “secret patient,” whereby a person pretends to be a patient, has an encounter, and then reports back, to determine if the activities documented were actually performed. While some might view this approach as offensive, these types of audits are actually occurring now, as part of fraud and abuse investigations. Healthcare organizational leadership may wish to replicate the model as a component of their own internal auditing, for training purposes, and to help avoid potential allegations of fraud or litigation.
  • While reducing the burden of documentation is an important and desirable goal, there must be a balance to ensure that it is not too easy to insert inaccurate data into the record.
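As one illustration of how technology might support the audits described above, the following is a minimal sketch of a check that flags an encounter note for human review when it is nearly identical to the note from the previous encounter. The function names and the 0.9 threshold are hypothetical and chosen only for illustration; a high similarity score does not prove an error, it only suggests that the note deserves a closer look.

```python
from difflib import SequenceMatcher


def copy_forward_ratio(previous_note: str, current_note: str) -> float:
    """Return a 0.0-1.0 similarity score between two notes, compared word by word."""
    return SequenceMatcher(None, previous_note.split(), current_note.split()).ratio()


def flag_for_audit(previous_note: str, current_note: str, threshold: float = 0.9) -> bool:
    """Flag the current note when it is almost a copy of the previous one.

    The threshold is an assumption for this sketch, not a validated cutoff.
    """
    return copy_forward_ratio(previous_note, current_note) >= threshold


# Example notes (fictitious).
prior = "Pupils equal round reactive to light. Ears normal. Throat normal."
copied = "Pupils equal round reactive to light. Ears normal. Throat normal."
fresh = "Right ear canal red with purulent drainage. Fever 38.5."

print(flag_for_audit(prior, copied))
print(flag_for_audit(prior, fresh))
```

A check like this cannot tell whether a repeated finding is a legitimate unchanged observation or an unexamined default, which is why the result would feed a clinical audit rather than replace one.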

Standardization of Observed Facts Through Consistent Comparable Coding and Terminology

Assuming the documentation is accurate, the resultant findings need to be converted to a coding standard, such as ICD or CPT codes, so that population data across enterprises and over time is comparable and accurately reflects patient conditions and treatments. Medical terminology is so variable that, without such a standard, there is no way to uniformly represent clinical concepts for analysis in a manner that ensures that data comparisons are valid. While there is a great deal of interest in “natural language processing,” this technology has significant limitations in identifying common concepts across diverse documentation sources.


Even assuming the best possible clinical evaluation and documentation, achieving a standardized representation of clinical language as data poses significant challenges:

  • You can’t process language that isn’t in the record.
  • The clinical language used is extremely difficult to normalize, and may require clinical knowledge that the processing system may not have in its logic. For example:
    • The concept of “hip fracture” may have documentation that indicates:
      • Hip fracture, or…
      • Fracture of the upper femur, or…
      • Fracture of the neck of the femur, or…
      • Intertrochanteric fracture, or…
      • Intracapsular proximal femur fracture, or…
      • Any of a number of other phrases that all indicate the presence of a hip fracture.
    • The concept of a drug-induced condition may have documentation that states:
      • <Drug> induced
      • Caused by <drug>
      • Complication of <drug>
      • <Drug> response
      • Sequelae of <drug>
      • Secondary to <drug>
      • … or a long list of other terms for each named drug that all represent the medical concept of a drug-induced condition.
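The normalization problem in the list above can be made concrete with a small sketch. It maps the phrases listed here to a single concept label using a hand-written phrase list and a few regular expressions. The concept labels, the `normalize` function, and the patterns are all hypothetical; a real system would need a curated clinical terminology and clinical logic that this sketch does not have. The “<drug> response” wording is deliberately left out, because a simple pattern cannot tell it apart from ordinary uses of the word “response.”

```python
import re
from typing import Set

# Phrases from the list above that all mean "hip fracture".
HIP_FRACTURE_PHRASES = [
    "hip fracture",
    "fracture of the upper femur",
    "fracture of the neck of the femur",
    "intertrochanteric fracture",
    "intracapsular proximal femur fracture",
]

# Wordings from the list above that indicate a drug-induced condition.
# Each pattern captures a single word as the drug name (a simplification).
DRUG_PATTERNS = [
    re.compile(r"(?P<drug>\w+)[- ]induced"),
    re.compile(r"caused by (?P<drug>\w+)"),
    re.compile(r"complication of (?P<drug>\w+)"),
    re.compile(r"sequelae of (?P<drug>\w+)"),
    re.compile(r"secondary to (?P<drug>\w+)"),
]


def normalize(note: str) -> Set[str]:
    """Return the set of normalized concept labels found in a free-text note."""
    text = note.lower()
    concepts = set()
    if any(phrase in text for phrase in HIP_FRACTURE_PHRASES):
        concepts.add("hip_fracture")
    for pattern in DRUG_PATTERNS:
        match = pattern.search(text)
        if match:
            concepts.add("drug_induced:" + match.group("drug"))
    return concepts


print(normalize("Intertrochanteric fracture, left"))
print(normalize("Rash secondary to penicillin"))
print(normalize("Fx prox femur"))  # abbreviation not in the phrase list
```

The last call returns an empty set: the abbreviated note describes the same injury, but nothing in the phrase list recognizes it. That is the kind of gap that makes normalization so difficult in practice.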

There are many examples in which the language used in clinical documentation differs widely from the language used in coding. The reconciliation of this variable terminology to a normalized concept of data reporting is extremely complex. Despite systems that claim to have the magic solution, most fail actual scenario-based testing.

  • Clinicians with little or no coding experience are often asked to pick codes available within their EHR system. Coding requires differentiation across numerous codes, adherence to complex guidelines, and specific training; most clinicians are not trained as coders. Many times, the code selected by the clinician does not follow the standard coding guidelines.
  • For example, it is far easier for a clinician to pick a non-specific code such as “otitis” (ear inflammation) or “otalgia” (ear pain), rather than search for and select the more specific code that represents the actual condition of “acute suppurative otitis media without spontaneous rupture of eardrum, right ear.” In many instances, the EHR directs the clinician to a less specific code. Unspecified codes severely limit the level of analysis that can be performed using code-based data, because the rich degree of actual clinical details is not included in the non-specific codes.

Case Studies

Case 1: Based on the same clinical scenario shown above, Doctor A selects a code for “otalgia” (ear pain). This code is obviously vague or non-specific but is the only code that could be used, based on what was observed and documented. Even in the face of better observation and documentation, Doctor A might still select this non-specific code, because:

  • It is the easiest code to use.
  • It is the code that the EHR system presented to him.
  • He has no incentive to search for a more specific code.

Case 2: Doctor B has an EHR system that supports a much more detailed search of clinically applicable codes, based on the clinical concepts that he has documented.

Potential Solutions

While solutions are continuing to evolve, currently, the coding dilemma remains a challenge. A number of changes will be needed to improve coding and standardization:

  • The coding standards need a great deal of improvement to achieve a truly usable set of codes for clinical practice. While it is beyond the scope of this article, there has been a great deal written about the problems with the current design of service and diagnosis code sets.
  • Clinicians should be incentivized to use the most accurate and specific codes that represent their assessment of the patient condition and the actions they undertook and ordered to treat the condition.
  • EHR systems should provide a robust, “concept-based” search that converts commonly used medical terminology into standardized “metadata” tags that have been mapped to existing codes. For example, if the clinician determines that a patient has a middle ear infection, that the condition is acute rather than chronic, that there is a rupture of the eardrum, that there is purulent drainage, and that it involves the right ear, the search engine should be able to find the most specific code to fit the reported clinical concepts, independent of the specific language used.
  • The interface between clinical documentation and coding, be it code selection performed by an individual, or machine-driven, needs to be appropriately designed and thoroughly tested to achieve the data compilation described above.
  • Clinicians should not be required to collect volumes of irrelevant data, but should focus on all the data that would make a difference in understanding the nature of the patient condition and how that condition is treated.
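The “concept-based” search described above might be sketched as follows. Commonly used terms are first converted to standardized tags, and the search then returns the code whose tags are all present and that carries the most tags, that is, the most specific match. The tag names, `TERM_TO_TAG`, `CODE_INDEX`, and both functions are hypothetical. The four ICD-10-CM codes are included only as illustrations and should be verified against the current code set before any real use.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Set

# Hypothetical mapping from commonly used terms to standardized tags.
TERM_TO_TAG: Dict[str, str] = {
    "middle ear infection": "otitis_media",
    "otitis media": "otitis_media",
    "acute": "acute",
    "purulent drainage": "suppurative",
    "ruptured eardrum": "rupture",
    "right ear": "right_ear",
    "ear pain": "ear_pain",
    "otalgia": "ear_pain",
}

# Illustrative index: each code lists the tags it requires.
CODE_INDEX: Dict[str, FrozenSet[str]] = {
    "H92.01": frozenset({"ear_pain", "right_ear"}),  # otalgia, right ear
    "H66.90": frozenset({"otitis_media"}),           # otitis media, unspecified
    "H66.001": frozenset(
        {"otitis_media", "acute", "suppurative", "no_rupture", "right_ear"}
    ),
    "H66.011": frozenset(
        {"otitis_media", "acute", "suppurative", "rupture", "right_ear"}
    ),
}


def tags_for(terms: Iterable[str]) -> Set[str]:
    """Convert documented terms to tags, ignoring terms that are not mapped."""
    return {TERM_TO_TAG[t.lower()] for t in terms if t.lower() in TERM_TO_TAG}


def most_specific_code(concepts: Set[str]) -> Optional[str]:
    """Return the code with the most required tags, all of which are documented."""
    candidates = [
        (len(tags), code) for code, tags in CODE_INDEX.items() if tags <= concepts
    ]
    if not candidates:
        return None
    # Ties on tag count are broken by code string, only to keep the result stable.
    return max(candidates)[1]


documented = ["middle ear infection", "acute", "ruptured eardrum",
              "purulent drainage", "right ear"]
print(most_specific_code(tags_for(documented)))
print(most_specific_code(tags_for(["otitis media", "acute", "right ear"])))
```

In the second call, the documentation does not say whether the infection is suppurative or whether the eardrum has ruptured, so the search can only fall back to the unspecified code. The search can be no more specific than the documentation it is given.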


The acquisition of data at the source of clinical activity is by far the most complex part of acquiring more reliable healthcare information. The challenges are significant, and the potential solutions will require a substantial commitment of resources by individuals and organizations. This source data is the key first step in a critical path. If the source data is incomplete and inaccurate, no level of technology or statistical analysis can overcome that obstacle.

Programming Note:

Listen to Dr. Nichols report this story live today during Talk Ten Tuesday, 10-10:30 a.m. EDT.

