A Study of Contractor Consistency in Reviewing Extrapolated Overpayments

By Frank Cohen, MPA
January 28, 2021

CMS levies billions of dollars in overpayments a year against healthcare providers, based on the use of extrapolation audits.

The use of extrapolation in Medicare and private payer audits has been around for quite some time now. And lest you be of the opinion that extrapolation is not appropriate for claims-based audits, there are many, many court cases that have supported its use, both specifically and in general. Arguing that extrapolation should not have been used in a given audit, unless that argument is supported by specific statistical challenges, is mostly a waste of time.

For background purposes, extrapolation, as it is used in statistics, is a “statistical technique aimed at inferring the unknown from the known. It attempts to predict future data by relying on historical data, such as estimating the size of a population a few years in the future on the basis of the current population size and its rate of growth,” according to a definition created by Eurostat, a component of the European Union. For our purposes, extrapolation is used to estimate what the actual overpayment amount might likely be for a population of claims, based on auditing a smaller sample of that population. For example, say a Uniform Program Integrity Contractor (UPIC) pulls 30 claims from a medical practice from a population of 10,000 claims. The audit finds that 10 of those claims had some type of coding error, resulting in an overpayment of $500. To extrapolate this to the entire population of claims, one might take the average overpayment, which is the $500 divided by the 30 claims ($16.67 per claim) and multiply this by the total number of claims in the population. In this case, we would multiply the $16.67 per claim by 10,000 for an extrapolated overpayment estimate of $166,667.

The big question that normally crops up around extrapolation is this: how accurate are the estimates? And the answer is (wait for it …), it depends. It depends on just how well the sample was created, meaning: was the sample size appropriate, were the units pulled properly from the population, was the sample truly random, and was it representative of the population? The last point is particularly important, because if the sample is not representative of the population (in other words, if the sample data does not look like the population data), then it is likely that the extrapolated estimate will be anything but accurate.

To account for this issue, referred to as “sample error,” statisticians will calculate something called a confidence interval (CI), which is a range within which there is some acceptable amount of error. The higher the confidence value, the larger the potential range of error. For example, in the hypothetical audit outlined above, maybe the real average for a 90-percent confidence interval is somewhere between $15 and $18, while, for a 95-percent confidence interval, the true average is somewhere between $14 and $19. And if we were to calculate for a 99-percent confidence interval, the range might be somewhere between $12 and $21. So, the greater the range, the more confident I feel about my average estimate. Some express the confidence interval as a sense of true confidence, like “I am 90 percent confident the real average is somewhere between $15 and $18,” and while this is not necessarily wrong, per se, it does not communicate the real value of the CI. I have found that the best way to define it would be more like “if I were to pull 100 random samples of 30 claims and audit all of them, 90 percent would have a true average of somewhere between $15 and $18,” meaning that the true average for some 1 out of 10 would fall outside of that range – either below the lower boundary or above the upper boundary. The main reason that auditors use this technique is to avoid challenges based on sample error.

To the crux of the issue, the Centers for Medicare & Medicaid Services (CMS) levies billions of dollars in overpayments a year against healthcare providers, based on the use of extrapolation audits. And while the use of extrapolation is well-established and well-accepted, its use in an audit is not an automatic, and depends upon the creation of a statistically valid and representative sample. Thousands of extrapolation audits are completed each year, and for many of these, the targeted provider or organization will appeal the use of extrapolation. In most cases, the appeal is focused on one or more flaws in the methodology used to create the sample and calculate the extrapolated overpayment estimate. For government audits, such as with UPICs, there is a specific appeal process, as outlined in their Medical Learning Network booklet, titled “Medicare Parts A & B Appeals Process.”

On Aug.0 20, 2020, the U.S. Department of Health and Human Services Office of Inspector General (HHS OIG) released a report titled “Medicare Contractors Were Not Consistent in How They Reviewed Extrapolated Overpayments in the Provider Appeals Process.” This report opens with the following statement: “although MACs (Medicare Administrative Contractors) and QICs (Qualified Independent Contractors) generally reviewed appealed extrapolated overpayments in a manner that conforms with existing CMS requirements, CMS did not always provide sufficient guidance and oversight to ensure that these reviews were performed in a consistent manner.” These inconsistencies were associated with $42 million in extrapolated payments from fiscal years 2017 and 2018 that were overturned in favor of the provider. It’s important to note that at this point, we are only talking about appeal determinations at the first and second level, known as redetermination and reconsideration, respectively.

Redetermination is the first level of appeal, and is adjudicated by the MAC. And while the staff that review the appeals at this level are supposed to have not been involved in the initial claim determination, I believe that most would agree that this step is mostly a rubber stamp of approval for the extrapolation results. In fact, of the hundreds of post-audit extrapolation mitigation cases in which I have been the statistical expert, not a single one was ever overturned at redetermination.

The second level of appeal, reconsideration, is handled by a QIC. In theory, the QIC is supposed to independently review the administrative records, including the appeal results of redetermination. Continuing with the prior paragraph, I have to date had only several extrapolation appeals reversed at reconsideration; however, all were due to the fact that the auditor failed to provide the practice with the requisite data, and not due to any specific issues with the statistical methodology. In two of those cases, the QIC notified the auditor that if they were to get the required information to them, they would reconsider their decision. And in two other cases, the auditor appealed the decision, and it was reversed again. Only the fifth case held without objection and was adjudicated in favor of the provider.

Maybe this is a good place to note that the entire process for conducting extrapolations in government audits is covered under Chapter 8 of the Medicare Program Integrity Manual (PIM). Altogether, there are only 12 pages within the entire Manual that actually deal with the statistical methodology behind sampling and extrapolation; this is certainly not enough to provide the degree of guidance required to ensure consistency among the different government contractors that perform such audits. And this is what the OIG report is talking about.

Back to the $42 million that was overturned at either redetermination or reconsideration: the OIG report found that this was due to a “type of simulation testing that was performed only by a subset of contractors.” The report goes on to say that “CMS did not intend that the contractors use this procedure, (so) these extrapolations should not have been overturned. Conversely, if CMS intended that contractors use this procedure, it is possible that other extrapolations should have been overturned but were not.” This was quite confusing for me at first, because this “simulation” testing was not well-defined, and also because it seemed to say that if this procedure was appropriate to use, then more contractors should have used it, which would have resulted in more reversals in favor of the provider.

Interestingly, CMS seems to have written itself an out in Chapter 8, section 8.4.1.1 of the PIM, which states that “[f]ailure by a contractor to follow one or more of the requirements contained herein does not necessarily affect the validity of the statistical sampling that was conducted or the projection of the overpayment.” The use of the term “does not necessarily” leaves wide open the fact that the failure by a contractor to follow one or more of the requirements may affect the validity of the statistical sample, which will affect the validity of the extrapolated overpayment estimate.

Regarding the simulation testing, the report stated that “one MAC performed this type of simulation testing for all extrapolation reviews, and two MACs recently changed their policies to include simulation testing for sample designs that are not well-supported by the program integrity contractor. In contrast, both QICs and three MACs did not perform simulation testing and had no plans to start using it in the future.” And even though it was referenced some 20 times, with the exception of an example given as Figure 2 on page 10, the report never did describe in any detail the type of simulation testing that went on. From the example, it was evident to me that the MACs and QICs involved were using what is known as a Monte Carlo simulation. In statistics, simulation is used to assess the performance of a method, typically when there is a lack of theoretical background. With simulations, the statistician knows and controls the truth. Simulation is used advantageously in a number of situations, including providing the empirical estimation of sampling distributions. Footnote 10 in the report stated that ”reviewers used the specific simulation test referenced here to provide information about whether the lower limit for a given sampling design was likely to achieve the target confidence level.” If you are really interested in learning more about it, there is a great paper called
“The design of simulation studies in medical statistics” by Burton et al. (2006).

Its application in these types of audits is to “simulate” the audit many thousands of times to see if the mean audit results fall within the expected confidence interval range, thereby validating the audit results within what is known as the Central Limit Theorem (CLT).

Often, the sample sizes used in recoupment-type audits are too small, and this is usually due to a conflict between the sample size calculations and the distributions of the data. For example, in RAT-STATS, the statistical program maintained by the OIG, and a favorite of government auditors, sample size estimates are based on an assumption that the data are normally (or near normally) distributed. A normal distribution is defined by the mean and the standard deviation, and includes a bunch of characteristics that make sample size calculations relatively straightforward. But the truth is, because most auditors use the paid amount as the variable of interest, population data are rarely, if ever, normally distributed. Unfortunately, there is simply not enough room or time to get into the details of distributions, but suffice it to say that, because paid data are bounded on the left with zero (meaning that payments are never less than zero), paid data sets are almost always right-skewed. This means that the distribution tail continues on to the right for a very long distance.

In these types of skewed situations, sample size normally has to be much larger in order to meet the CLT requirements. So, what one can do is simulate the random sample over and over again to see whether the sampling results ever end up reporting a normal distribution – and if not, it means that the results of that sample should not be used for extrapolation. And this seems to be what the OIG was talking about in this report. Basically, they said that some but not all of the appeals entities (MACs and QICs) did this type of simulation testing, and others did not. But for those that did perform the tests, the report stated that $41.5 million of the $42 million involved in the reversals of the extrapolations were due to the use of this simulation testing. The OIG seems to be saying this: if this was an unintended consequence, meaning that there wasn’t any guidance in place authorizing this type of testing, then it should not have been done, and those extrapolations should not have been overturned. But if it should have been done, meaning that there should have been some written guidance to authorize that type of testing, then it means that there are likely many other extrapolations that should have been reversed in favor of the provider. A sticky wicket, at best.

Under the heading “Opportunity To Improve Contractor Understanding of Policy Updates,” the report also stated that ”the MACs and QICs have interpreted these requirements differently. The MAC that previously used simulation testing to identify the coverage of the lower limit stated that it planned to continue to use that approach. Two MACs that previously did not perform simulation testing indicated that they would start using such testing if they had concerns about a program integrity contractor’s sample design. Two other MACs, which did not use simulation testing, did not plan to change their review procedures.” One QIC indicated that it would defer to the administrative QIC (AdQIC, the central manager for all Medicare fee-for-service claim case files appealed to the QIC) regarding any changes. But it ended this paragraph by stating that “AdQIC did not plan to change the QIC Manual in response to the updated PIM.”

With respect to this issue and this issue alone, the OIG submitted two specific recommendations, as follows:

Provide additional guidance to MACs and QICs to ensure reasonable consistency in procedures used to review extrapolated overpayments during the first two levels of the Medicare Parts A and B appeals process; and
Take steps to identify and resolve discrepancies in the procedures that MACs and QICs use to review extrapolations during the appeals process.

In the end, I am not encouraged that we will see any degree of consistency between and within the QIC and MAC appeals in the near future.

Basically, it would appear that the OIG, while having some oversight in the area of recommendations, doesn’t really have any teeth when it comes to enforcing change. I expect that while some reviewers may respond appropriately to the use of simulation testing, most will not, if it means a reversal of the extrapolated findings. In these cases, it is incumbent upon the provider to ensure that these issues are brought up during the Administrative Law Judge (ALJ) appeal.

Programming Note: Listen to Frank Cohen report this story live during the next edition of Monitor Mondays, 10 a.m. Eastern.

TAGS: Auditing

Frank Cohen, MPA

Frank D. Cohen is Senior Director of Analytics and Business Intelligence at VMG Health, LLC, and is Chief Statistician for Advanced Healthcare Analytics. He has served as a testifying expert witness in more than 300 healthcare compliance litigation matters spanning nearly five decades in computational statistics and predictive analytics.

Featured Webcasts

2027 IPPS Masterclass: Final Rule Update with Expert Insights and Analysis

Prepare for FY 2027 IPPS changes with a comprehensive 3-part masterclass covering ICD-10-CM/PCS updates, MS-DRG shifts, NTAPs, compliance risks, and reimbursement strategies.

August 18, 2026

2027 IPPS Masterclass Day 3: MS-DRG Shifts and NTAPs

Stay ahead of FY 2027 reimbursement changes with expert analysis of MS-DRG shifts, NTAP updates, Medicare Code Edits, and emerging technologies impacting inpatient payment accuracy.

August 20, 2026

2027 IPPS Masterclass Day 2: Master ICD-10-PCS Changes

Stay ahead of FY 2027 ICD-10-PCS changes with expert analysis of new procedure codes, revised guidelines, and high-impact updates affecting reimbursement, compliance, and inpatient coding accuracy.

August 19, 2026

2027 IPPS Masterclass Day 1: Master ICD-10-CM Changes

Master the FY 2027 ICD-10-CM changes, including new diagnosis codes, CC/MCC updates, and coding guideline revisions, with practical insights from nationally recognized coding and CDI experts.

August 18, 2026

Featured Webcasts

Ask Dr. Hirsch: Clarifying Medicare’s Most Misunderstood Rules – Part 2

Medicare regulations are complex and even seasoned professionals struggle to apply them consistently. Due to overwhelming demand, Dr. Hirsch returns for Part 2 of Ask Dr. Hirsch: Clarifying Medicare’s Most Misunderstood Rules to answer even more of Medicare’s most misunderstood questions, covering inpatient status, observation, SNF access, Medicare Advantage denials, and more. Join Dr. Hirsch as he provides clear, referenced answers to real-world questions submitted by your peers, helping you navigate Medicare compliance with confidence and clarity.

June 18, 2026

Reengineering Utilization Management: Building an Adaptive Model for the New Payer Era

Traditional utilization management models can no longer keep pace with regulatory shifts, payer scrutiny, and operational pressures. In this webcast, Tiffany Ferguson, LMSW, CMAC, ACM, ACPA-C, introduces an Adaptive Model strategy that modernizes UM through role specialization, technology-driven workflows, and proactive, team-based processes. Attendees will learn how to restructure programs to improve efficiency, strengthen clinical collaboration, and enhance financial performance in a rapidly changing healthcare environment.

May 20, 2026

Compliance for the Inpatient Psychiatric Facility (IPF-PPS): Minimizing Federal Audit Findings by Strengthening Best Practices

Federal auditors are intensifying their focus on inpatient psychiatric facilities, using advanced data analytics to spotlight outliers and pursue high‑dollar repayments. In this high‑impact webcast, Michael Calahan, PA, MBA, Compliance Officer and V.P., Hospital & Physician Compliance, breaks down what regulators are really targeting in IPF-PPS admissions, documentation, treatment and discharge planning. Attendees will learn practical steps to tighten processes, avoid common audit triggers and protect reimbursement and reduce the risk of multimillion-dollar repayment demands.

April 9, 2026

Mastering MDM for Accurate Professional Fee Coding

In this timely session, Stacey Shillito, CDIP, CPMA, CCS, CCS-P, CPEDC, COPC, breaks down the complexities of Medical Decision Making (MDM) documentation so providers can confidently capture the true complexity of their care. Attendees will learn practical, efficient strategies to ensure documentation aligns with current E/M guidelines, supports accurate coding, and reduces audit risk, all without adding to charting time.

March 31, 2026

A Study of Contractor Consistency in Reviewing Extrapolated Overpayments

Frank Cohen, MPA

Related Stories

The Medical Record: How Artificial Intelligence (AI) Impacts Coding

The Risk of Declining Reimbursement

Leave a Reply

Featured Webcasts

2027 IPPS Masterclass: Final Rule Update with Expert Insights and Analysis

2027 IPPS Masterclass Day 3: MS-DRG Shifts and NTAPs

2027 IPPS Masterclass Day 2: Master ICD-10-PCS Changes

2027 IPPS Masterclass Day 1: Master ICD-10-CM Changes

Trending News

Just Because You Can Screen Doesn’t Necessarily Mean You Should

Key Insights on the Use of AI in the Audit and Compliance Industry

Is the Medical Record a Clinical Tool or an Invoice?

An IPPS Proposed Rule Preview – Body Mass Index (BMI)

Featured Webcasts

Ask Dr. Hirsch: Clarifying Medicare’s Most Misunderstood Rules – Part 2

Reengineering Utilization Management: Building an Adaptive Model for the New Payer Era

Compliance for the Inpatient Psychiatric Facility (IPF-PPS): Minimizing Federal Audit Findings by Strengthening Best Practices

Mastering MDM for Accurate Professional Fee Coding

Trending News

URGENT: Hospitals Should Start Preparing for the FY 2027 IPPS Proposed Rule Now

DOJ Tells Data-Mining Whistleblowers to Up Their Game

Workplace Stress: When Burnout Becomes an Organizational Issue

Healthcare Compliance in the Machine-Learning Era

Stay Connected

News

Account

Info