Friday, March 30, 2007

New Data, More Doubts About Pay-for-Performance (P4P)

Two recent health care research articles and their accompanying editorials once again question the premises which undergird the currently fashionable pay-for-performance (P4P) movement.

We have posted skeptically here, here, and here about P4P. Briefly, our concerns were that P4P could lead to perverse incentives (by rewarding apparently better rates of good outcomes which could be created by avoiding the sickest patients, or emphasizing a few measured processes and thus distracting physicians from doing anything else, or being based on inaccurate or irrelevant data), could emphasize cost cutting over quality, and could emphasize processes which have been better studied, potentially penalizing the specialties that did the most research about quality.

Rapid Antibiotic Treatment for Patients with Pneumonia

The first article(1), currently only available online, addressed a performance measure already in wide use: whether patients with a provisional diagnosis of community-acquired pneumonia receive an antibiotic within four hours of hospital arrival. This is a core quality measure per the Centers for Medicare and Medicaid Services (CMS) and the Joint Commission on Accreditation of Healthcare Organizations (JCAHO). The University HealthSystem Consortium's goal is 90% achievement of this measure. The authors' hospital already bases physician payments on the achievement of this measure.

Fee and Weber constructed a retrospective cohort of patients eligible for this measure according to JCAHO and CMS standards. They then examined the 34.9% of them who failed to meet the standard, i.e., who did not get antibiotics within four hours. They found that 58.5% of these outliers, or 20.4% of the total patient cohort, had not been diagnosed with community-acquired pneumonia at the time they left the Emergency Department (ED). Thus the performance measure for treatment of pneumonia was being applied to patients who did not clearly have pneumonia at the time the treatment decisions had to be made.
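For readers who want to check how the two percentages combine into the share of the whole cohort, here is a minimal sketch of the arithmetic. The variable and function names are hypothetical, chosen only for illustration; the only inputs are the 34.9% and 58.5% figures reported above.

```python
# Figures as reported in this post for the Fee and Weber cohort.
# Names below are illustrative assumptions, not taken from the article.
missed_four_hour_fraction = 0.349   # eligible patients not treated within four hours
undiagnosed_among_missed = 0.585    # of those, not diagnosed with pneumonia on leaving the ED


def share_of_total(outer_fraction, inner_fraction):
    """Return a subgroup of a subgroup as a fraction of the whole cohort."""
    return outer_fraction * inner_fraction


undiagnosed_share_of_cohort = share_of_total(
    missed_four_hour_fraction, undiagnosed_among_missed
)
print(f"{undiagnosed_share_of_cohort:.1%}")  # prints 20.4%
```

In other words, roughly one in five of all patients counted under the measure had no pneumonia diagnosis when they left the ED.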

The accompanying editorial(2) noted many problems with the "four hour rule," in part based on these data. Most important is that the diagnosis of pneumonia is not always obvious in patients presenting to the ED. Yet the four hour rule as currently applied requires physicians to give antibiotics promptly to patients who do not obviously have pneumonia at the time the physicians see them, and whose diagnosis is made only later. It seems obviously unfair to require physicians to be clairvoyant. The risk is that emphasis on the four hour rule, including payments made to physicians who fulfill the standard more often, will induce physicians to treat many patients who only possibly have pneumonia with antibiotics, in the hope that some will turn out to have pneumonia, thereby raising costs, causing side effects, and creating more antibiotic resistance. Thus, these data suggest how this particular performance measure is likely to have perverse effects.

Aggressive Glucose Lowering Treatment for Diabetes

The second article(3), published this month, addressed another well-known performance measure: the proportion of patients with diabetes mellitus who have hemoglobin A1C values less than 7%, as endorsed by the National Quality Forum and the National Committee for Quality Assurance.

Pogach et al retrospectively identified a cohort of patients with diabetes, then assessed the proportion of patients in this cohort who might not benefit from the application of the standard, because the results of the clinical research on which the standard was based might not apply to them. In particular, they identified patients with decreased life expectancies or multiple or severe co-morbid conditions. Their criteria for identifying these patients were based on the exclusion criteria used by the major controlled trial of aggressive glucose reduction in patients with adult-onset diabetes, the UKPDS study. They found that 21.7% of their cohort had major co-morbid illnesses, 7.9% had major mental health problems, and another 4.4% had multiple co-morbid conditions, suggesting that 34.1% of the patients to whom the standard might usually be applied might not benefit from it. This is important because the aggressive control of diabetes needed to achieve a low hemoglobin A1C value at best benefits patients only by decreasing their long-term risk of certain diabetic complications, but raises the risk of hypoglycemia (low blood sugar), which can be dangerous or fatal, for as long as treatment remains aggressive. The results of this study suggest that the performance measure could be applied to many patients who may never benefit from it, but who would be at continuous risk from aggressive lowering of blood sugar. Again, this suggests how this performance measure might have perverse effects.

This article was accompanied by a pithy editorial by Rodney Hayward(4). Hayward dissected problems with outcome-based performance measures that can lead to perverse results. Some of his observations particularly deserve quoting:

This editorial ... will discuss the broader question of why this intuitively appealing approach - using 'optimal' treatment goals as performance measures - will almost always require more sophisticated approaches ... or else risk generating performance measures that are inaccurate, promote waste, and perhaps cause substantial patient harm.

Readers may be perplexed as to why 2 new outcome measures lacking any risk adjustment were adopted. In truth, these new measures were a compromise between advocates of optimal goals (disease advocates) and advocates of simple, inexpensive performance measures (health plan leadership). Experts in evidence were not included in the compromise, which is part of the problem.

Payers, disease advocates, consumer groups, and political leaders are often dismissive of the complex reality of measuring care....

Wishful thinking will not transform poor performance measures into useful ones, and that well-meaning people have a profound aptitude for letting their desires and ideology blind them to unwanted facts and complexities that are so vexingly common in the real world.

Promoting optimal care using performance measures requires considering the very real tensions among treatment-related benefits and treatment-related burdens, risks, and costs. HL Mencken once said, 'for every problem, there is a solution that is simple, neat, and wrong,' and using unadjusted, 'all-or-nothing' optimal treatment targets as performance measures is such an example.

Some leaders in performance measurement have asked me, 'do you really think that these measures will lead clinicians and health systems to overtreat?' I am frankly amazed by this question. Spiraling healthcare costs and overtreatment are probably the defining feature of the US healthcare system. Industry-funded 'experts' and disease advocates have been effectively promoting overtreatment for decades, and performance measurement was supposed to be a tool to bring better value to healthcare spending. Although performance measurement has proved to be a very powerful tool, like all tools it provides opportunities for both benefit and harm.

Again, as we have noted before, developing performance measures that will truly benefit patients will require detailed understanding of the clinical context, keen skeptical analysis of the available relevant research data, and careful balancing of benefits, harms and costs. All this would be very hard under the best of circumstances. But the continual attempts by those with vested ideological and financial interests to influence performance measures to advance their own interests make it unlikely that the whole P4P movement will have any good effects on patients.

The first improvement needed in the P4P movement is clear, detailed disclosure of all conflicts of interest affecting those involved in the movement at any stage.

At this point, patients and physicians should be very skeptical about who is likely to benefit from any new performance measure, particularly measures that are lavishly promoted.


1. Fee C, Weber EJ. Identification of 90% of patients ultimately diagnosed with community-acquired pneumonia within four hours of emergency department arrival may not be feasible. Ann Emerg Med 2007. [now available on-line only]
2. Pines JM. Measuring antibiotic timing for pneumonia in the Emergency Department: another nail in the coffin. Ann Emerg Med 2007. [now available on-line only]
3. Pogach LM, Tiwari A, Maney M et al. Should mitigating comorbidities be considered in assessing healthcare plan performance in achieving optimal glycemic control? Am J Managed Care 2007; 13: 133-140. [link here]
4. Hayward RA. All-or-nothing treatment targets make bad performance measures. Am J Managed Care 2007; 13: 126-128. [link here]

1 comment:

Anonymous said...

As a patient I feel you have grossly understated the problem. Reviewing my last physical, I found that a number of the acceptable ranges listed were lower than in years past. As a consumer I am well aware of the not-so-subtle influence of those with a financial interest in creating a larger customer base for drugs, testing supplies, medical devices, etc. at the expense of the general public's health.

P4P will only accelerate the cookie cutter approach to medicine, as well as increasing cost and negative outcomes.

Steve Lucas