The study in the Annals of Internal Medicine [Choudhry NK, Fletcher RH, Soumerai SB. Systematic review: the relationship between clinical experience and quality of health care. Ann Intern Med 2005; 142: 260-273] searched the literature and found 62 articles that analyzed physician knowledge or performance according to the physicians' age. Summaries of the 62 studies were broken down by study purpose: 12 involved written tests of knowledge; 17, adherence to guidelines or practice standards for diagnosis, screening, or prevention assessed by self-report, e.g., by surveys or interviews; 7, adherence to such standards assessed by chart audit; 5, adherence to guidelines or practice standards for treatment assessed by self-report; 13, adherence to guidelines or practice standards for treatment assessed by chart audit; and 7, directly measured patient outcomes.
The review did not screen out articles of poor methodologic quality, or rate the methodologic quality of any article. It therefore did not eliminate articles whose specific standards for physician performance were not evidence-based, such as tests of knowledge not related to the physicians' practices. Furthermore, it included articles regardless of their study architecture, age, sample size, patient selection criteria, whether and how they controlled for patients' characteristics, and effect size and its precision. Thus, this review's results could well have been biased by poorly designed or performed studies, and by studies that are unlikely to generalize to modern physicians.
I did not have time to re-review every article, but a quick perusal made me more concerned that the most striking results showing older physicians performing worse were contributed by the methodologically weakest articles. For example, of the 13 articles that looked at adherence to standards for treatment by chart audit, only 6 showed what the authors called a consistently negative effect of increasing age. Of these,
- one was published 34 years ago, and included only 37 physicians;
- one, on the treatment of depression, did not account for the severity of the patients' symptoms, and had a very small effect size (OR = 1.12; CI 1.01 to 1.24);
- one used a standard of care for inappropriate drug selection that might be debated;
- one used that same standard, did not adjust for patients' clinical characteristics, and had a very small effect size (OR=1.14);
- one was published 21 years ago, and used practice standards defined by consensus, not evidence; and
- one was published 20 years ago, included only 66 physicians, and again used practice standards defined by a panel, not evidence.
Yet an accompanying commentary [Weinberger SE, Duffy FD, Cassel CK. "Practice makes perfect" ... or does it? Ann Intern Med 2005; 142: 302-303] hailed the article as showing that physicians "must embrace the concepts behind maintenance of certification, which provides an opportunity to prevent the outcomes demonstrated...." Since the review by Choudhry and colleagues did not include any studies of recertification, I think this conclusion goes even farther beyond the data than the review itself does.
Even though physicians seem beset on all sides by powerful organizations, some of which stand to profit by reducing physician autonomy, I believe that our professional values mandate serious, ongoing examination of our own performance. (I have actually published a few studies which do just that.) However, the principles of clinical epidemiology apply to such studies just as they apply to studies of patients. We do no one any favors by rushing to negative conclusions about physician performance without examining the strength of the relevant evidence.