What We Don’t Know About Why Incentives “Didn’t Work”: A Commentary by Alliance Medical Director Dr. Sharon Eloranta
In this time of innovation in health care, many organizations are developing, testing and evaluating models intended to foster the delivery of high-value care – that is, better quality care at lower cost. The Centers for Medicare & Medicaid Services (CMS), the nation’s largest purchaser/payer, is no exception. A recent article commented on the results of a study published in Health Affairs titled “The Medicare Advantage Quality Bonus Program Has Not Improved Plan Quality.” I was immediately intrigued – what did they find? And what could be learned?
The quality bonus program was applied at the Medicare Advantage (MA) health plan level, meaning that CMS offered plans monetary bonuses based on their performance on certain quality metrics. The evaluators found no consistent difference in quality improvement for Medicare Advantage enrollees overall compared with commercial enrollees over the same period (2009–2018). It was a very large study, involving nearly nine million individuals – so sample size was not a problem; it was conducted at a reputable institution (the University of Michigan); and it asked a much-needed question, given the large amount of taxpayer money being spent on the incentives.
After reading the report, I had more questions than answers. For the record, I am neither a statistician nor a researcher. My background is in quality improvement – and the "Iron Law" of evaluation tells us that large improvement initiatives nearly always produce NO results – no net change in their quality indicators – when studied at the initiative level. According to Rocco Perla, "This phenomenon was described in the 1980s by American program evaluator Peter Rossi as the 'Iron Law' of evaluation in studies of the impact of social programs, arguing that as a new model is implemented widely across a broad range of settings, the effect will tend toward zero."
So what’s a researcher to do when wide implementation yields a net effect of zero? One needs to take the data apart and look at the range of performance – that is, some organizations within the program DID make improvements, some stayed the same, and some likely did worse. Our job is to find the success stories and identify what those participants did differently within the same model – why did they succeed – rather than throwing out the entire program.
Quoting Perla again, “we argue that just because a pilot does not work everywhere does not mean it should be wholly abandoned. Instead, data should be reviewed at the individual site level. Evaluators and policymakers should understand where and why a model has worked and use that information to guide the adaptation and spread of a model.”
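As a rough illustration of what reviewing data "at the individual site level" might look like in practice, here is a minimal sketch in Python (assuming pandas is available). The plan identifiers, column names, and scores below are hypothetical, not drawn from the Health Affairs study; the point is simply that a program-wide average near zero can hide plans that improved substantially.

```python
# A minimal sketch of the site-level view: instead of one program-wide
# average, examine the distribution of per-plan change and flag the
# plans that improved most. All names and numbers here are hypothetical.

import pandas as pd


def summarize_plan_level_change(scores: pd.DataFrame) -> pd.DataFrame:
    """Return per-plan change in a quality score, sorted best to worst."""
    out = scores.copy()
    out["change"] = out["score_2018"] - out["score_2009"]
    return out.sort_values("change", ascending=False)


if __name__ == "__main__":
    # Toy illustration only -- not data from the study discussed above.
    plans = pd.DataFrame(
        {
            "plan_id": ["A", "B", "C", "D"],
            "score_2009": [0.62, 0.70, 0.55, 0.68],
            "score_2018": [0.75, 0.69, 0.54, 0.80],
        }
    )
    ranked = summarize_plan_level_change(plans)
    print(ranked)  # the full distribution, not just the mean
    print("Average change:", ranked["change"].mean())  # can sit near zero
    improvers = ranked[ranked["change"] > 0.05]  # even when some plans improved a lot
    print("Plans worth studying further:", list(improvers["plan_id"]))
```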
This is just the first of the intriguing issues that could be pursued regarding the study. Others are:
- Why did they pick these metrics? Some, such as medications for rheumatoid arthritis, have low denominators.
- Why compare with commercially insured populations, when the MA population is older and its members may carry a greater illness burden?
- Were there plans that did better on the overall measure set, or were there measures on which no plan did well?
- Does this report show that incentives don’t work when applied at the plan level? A body of work in psychology suggests that paying people to do things can have the opposite of the intended effect.
So, what does this tell us? It tells us we need to pursue more answers and look more deeply. Others will look at this and call the ongoing program a failure – which it may be – but more investigation needs to be done. Health care is extremely complex, and the science of using incentives to change behavior further complicates the business of determining “what works.” This is an important effort – let’s keep asking the questions!