Predicting coding patterns using the HCC risk scores can be a valuable endeavor.
EDITOR’S NOTE: Longtime RACmonitor contributing correspondent Frank Cohen, a senior healthcare analyst, is sharing his thoughts on a recent study he conducted of Hierarchical Condition Categories (HCCs) that revealed for the first time the likelihood of over- and under-coding by providers.
You can use the following information to help mitigate the risks of engaging in overzealous clinical documentation integrity (CDI) projects for providers who may already be optimizing their diagnosis coding – while at the same time, identifying those physicians who most likely need your assistance.
Since 2000, the Centers for Medicare & Medicaid Services (CMS) has used a health-based risk adjustment model. The purpose is to estimate the relative health status of a given beneficiary. This is done by factoring in the diagnosis codes associated with care received and certain demographic information about each beneficiary. In 2004, CMS created a new risk model, the Hierarchical Condition Categories (HCC) model, which adjusts Medicare capitation payments to Medicare Advantage (MA) organizations for the variation in health expenditure risk of enrollees in their plans.
In general, the higher the HCC risk score, the “sicker” the patient, the higher the estimated costs, and subsequently, the higher the payment rate to the MA organization. Because a significant portion of the risk score was calculated based on the number of diseases or conditions (excluding ESRD, end-stage renal disease) reported for a patient in the form of ICD codes, proper coding became a core concern among MA organizations and providers. Basically, the higher the score, the more the MA organization gets paid, and so, all of a sudden, payors became interested in how providers were documenting and coding for patient visits – and not for the reasons one might think. This wasn’t to audit the provider, or to see whether their documentation supported the payment (as in a recoupment audit), but to see whether the provider documented and coded such that they were maximizing (or maybe optimizing) the risk score. Remember, higher risk score, higher payments. It’s not difficult to see the incentive here.
Out of this was born a niche market called CDI. According to the American Academy of Professional Coders (AAPC), “clinical documentation is the information a person responsible for a patient’s medical care enters in a medical record, which is a repository for an individual’s health information. The entries contained in the medical record may be authored by a physician, dentist, chiropractor, or other healthcare professional. Regulations, accreditation requirements, internal policies, and other rules may define who is allowed to document in the medical record in specific cases.”
They go on to define CDI as: “the process of reviewing medical record documentation for completeness and accuracy. CDI includes a review of disease process, diagnostic findings, and what the documentation might be missing.”
Reading the above paragraph, one could easily conclude, then, that CDI is a good thing, and CDI consultants are helping to improve the overall quality of care that a patient receives. But the reality is that there are some CDI consultants and programs that are more focused on improving payment than the quality of care. I know you are getting tired of reading this, but higher scores, higher payment. And a major component of the risk score calculation is the number of diseases or conditions reported in the patient chart. The reason I know this is a problem is the number of lawsuits the government has filed against providers, consultants, and MA organizations for abusing the HCC risk adjustment program. I did a cursory search and found 27 active lawsuits over this issue.
One of those involves a whistleblower lawsuit wherein the whistleblower (Kathy Ormsby) claimed that the organization (Palo Alto Medical Foundation and Sutter Health) was inflating the number of ICD codes reported in a patient’s record in order to increase the amount of payment under their MA plan. The Government stated that their investigation confirmed what Ormsby had claimed; the organization systematically added false diagnosis codes to the records of their patients. In fact, one audit showed that some 90 percent of all cancer diagnoses were invalid. That same audit also found that 96 percent of stroke diagnoses and 66 percent of diagnoses for fractures were invalid (or falsified).
Another big one was brought against UnitedHealth Group (UHG) by the Government for basically the same complaint: falsifying patient records to increase the risk score and subsequently increasing payments. In this September 2017 case, the Government alleged the following: “in particular, the lawsuit contends that UHG funded chart reviews conducted by HealthCare Partners (HCP), one of the largest providers of services to UHG beneficiaries in California, to increase the risk adjustment payments received from the Medicare program for beneficiaries under HCP’s care. According to the case laid forth by the DOJ (U.S. Department of Justice), UHG allegedly ignored information from these chart reviews about invalid diagnoses, and thus avoided repaying Medicare monies to which it was not entitled.”
So, how do you know whether a provider (or at least that provider’s patient charts) leans towards over-coding? The ideal method would be an audit. The problem is that you can’t audit all of a provider’s charts. One alternative is to draw a statistically valid random sample and then extrapolate the results to the universe of claims submitted by that provider. I’m all for that. But how do you know which providers to audit?
And this brings us back to the realm of risk-based auditing. Except in this case, rather than trying to determine the risk of an audit, we are trying to predict whether the provider’s charts accurately reflect the reality of the composite visits. The real question, then, is this: is there some way to estimate (or predict) whether a provider may be under-coding or over-coding, based on some benchmark? The answer is “probably.” The following is what I did to test this.
First, I imported the data from the most recent Public Use File (PUF), which contains National Provider Identifier (NPI) numbers for over a million physicians in the United States (including Puerto Rico, Guam, and the Federated States of Micronesia). Then I did a whole bunch of filtering, and what I ended up with was a pretty solid database that contained, among other things, unique beneficiaries, total Medicare payments, and the average HCC score by provider. In total, I used these data for around 780,000 providers in 57 different specialties.
From these data, I created a table that reported, by specialty, the average risk score, along with the median, standard deviation, and inter-quartile range. I then created both a high- and a low-risk indicator for each of those specialties. While what I did was a bit more complicated, in general, I multiplied the standard deviation by two and then subtracted it from the mean to get the low-risk indicator, and added it to the mean to get the high-risk indicator. Going back to the data set with all of the providers, I would simply test their average risk score (based on their specialty) against the high and low thresholds I created in the summary table. If that provider’s average risk score was below the lower threshold, I would tag that provider as a potential under-coder. If their average risk score was above the upper threshold, I would tag that provider as a potential over-coder. For example, for cardiology, the lower threshold is 1.139 and the upper threshold is 2.837. Picking a cardiologist at random from the data, I see that their average risk score (for their entire patient population) is 1.097. Since this is below the lower threshold (1.139), I would label this provider as a potential under-coder. For another provider, I get an average risk score of 3.235, which is higher than the upper threshold, and as such, I would label this provider a potential over-coder.
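The simplified rule described above (mean plus or minus two standard deviations, by specialty) can be sketched in a few lines of Python. This is only an illustration of that simplified rule, not the full method, which was more involved. The function names and tag labels are hypothetical, and the use of the sample standard deviation is an assumption; the cardiology thresholds and the two example scores are the ones quoted above.

```python
from statistics import mean, stdev


def thresholds(scores, k=2.0):
    """Return (low, high) = mean -/+ k * sample standard deviation.

    `scores` is the list of average HCC risk scores for every provider
    in ONE specialty. k=2.0 mirrors the simplified rule in the text;
    sample (n-1) standard deviation is an assumption.
    """
    m = mean(scores)
    s = stdev(scores)
    return m - k * s, m + k * s


def tag_provider(avg_score, low, high):
    """Tag one provider against their specialty's thresholds."""
    if avg_score < low:
        return "potential under-coder"
    if avg_score > high:
        return "potential over-coder"
    return "within range"


# Cardiology thresholds as quoted in the article.
CARDIOLOGY_LOW, CARDIOLOGY_HIGH = 1.139, 2.837

print(tag_provider(1.097, CARDIOLOGY_LOW, CARDIOLOGY_HIGH))
print(tag_provider(3.235, CARDIOLOGY_LOW, CARDIOLOGY_HIGH))
```

In practice, thresholds would be computed once per specialty and then applied to each provider in that specialty; a tag is a reason to look closer, not a finding.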
To confirm the validity of my findings, I compared those indicators against the percentile ranking for each of those providers. I did this because the above method is most accurate for a normal distribution, but the data points were not normally distributed. Note that the percentile rankings were also specialty-specific. I was actually a bit surprised at the results of this test, as those that I tagged as potential over-coders had percentiles that ranged from 85th to 100th, meaning that it would be reasonable to investigate those providers that met either criterion. For the lower threshold, however, the percentile range was far narrower: from 1st to 9th. My conclusion is that the upper test is likely more accurate than the lower test.
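A specialty-specific percentile ranking of the kind used for this cross-check might look like the following sketch. The definition of percentile rank used here (the share of providers in the specialty with a score at or below the given value) is an assumption, since the article does not say which definition was applied, and `percentile_rank` is a hypothetical helper name.

```python
from bisect import bisect_right


def percentile_rank(value, population):
    """Percent of scores in `population` that are <= `value`.

    `population` holds the average risk scores of all providers in the
    same specialty as the provider being ranked.
    """
    ordered = sorted(population)
    return 100.0 * bisect_right(ordered, value) / len(ordered)


# Made-up scores for illustration only; not taken from the PUF.
specialty_scores = [1.0, 2.0, 3.0, 4.0]
print(percentile_rank(3.0, specialty_scores))
```

Because percentiles depend only on rank order, they do not assume a normal distribution, which is why they serve as a useful check on the standard-deviation thresholds.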
What I didn’t do was conduct actual chart audits on these providers, so I can’t say with any degree of certainty that a provider I call an over-coder is actually over-coding (and the same for what I call an under-coder). The value I see in this type of analysis is the ability to prioritize work effort by looking at those that are most likely (predicted) to be either under- or over-coding. Under-coders have the opportunity to improve their clinical documentation to increase their risk scores (legitimately), thereby increasing payments. For over-coders, a risk-based approach is needed to determine if they are, in fact, over-coding.
Whatever the approach, I continue to work toward means and methods that will help to improve the efficiency and accuracy with which healthcare organizations document, code, and bill for their services. And ultimately, I would think that our goals would align with the opportunity to not only improve under-coding docs, but to mitigate the risk that potential over-coding docs face with respect to Government and third-party audits. We have the technology (and sometimes the data) to achieve amazing results.
I guess in the end, it’s just a matter of how much effort we want to expend in the front end to mitigate damages in the back end. According to Benjamin Franklin, “an ounce of prevention is worth a pound of cure.”
And that’s the world according to Frank.