Thursday, February 24, 2011

Windows 7 Service Pack 1 "Glitches": Why Personal Computers are Problematic, and Perhaps Should Not Be Mission Critical Components in Hospitals

A technical note on computer unreliability, and a series of follow-up critical questions relevant to health IT:

I run Windows 7 Professional on one of my computers, a very unspecial 4-5-year-old Micro Center machine, the PowerSpec 6001, using conventional components. The machine was upgraded with 2 GB of RAM and an ATI Radeon 9600 series video card, to run the Aero "eye candy."

It has run satisfactorily since I installed Windows 7 Professional (32-bit version) on it last year.

I am not a computer amateur. [I do, however, admit to being a Radio Amateur - Extra class - ed.] Further, I meticulously keep the machine current with Microsoft security patches, use Symantec anti-virus which I also keep updated, check my disk for errors, and only visit major well-known, nationally prominent websites using the machine.

So yesterday, Windows 7 Service Pack 1 appeared in my "Software updates" list from Microsoft, after being released to the general public.

It was explained that this Service Pack improves performance, reliability and security [i.e., it is intended to fix the many, many bugs, unreliabilities and security holes found in the operating system since its inception - ed.]

I confirmed my machine met the specs for it, and allowed the computer to download and install Service Pack 1.

That was my mistake.

After installation, the machine could no longer reboot Windows.

The flying color patches of the first screen appeared ... and then the machine suffered a hard reboot (going back to the BIOS-initiated memory checks and screens, etc.), as if I'd pressed the front panel hardware RESET button this machine has.

An automated "wizard" came up and tried to figure out why the machine would not restart, but failed to do so.

I asked the machine, therefore, to roll back to the state it was in prior to the Service Pack 1 (SP1) "upgrade" via a "Restore Point", a feature Windows permits using the "system restore" capability. The SP1 installation automatically creates a Restore Point (image of the prior state of the OS) on the machine just before installing itself.

The machine works again, but ... (and that is a big "but"):

  • The Service Pack is not installed. Therefore I am not running the latest protections, and do not know what will happen in the future with respect to patches and upgrades. Maybe the Service Pack will get its own Service Pack at some point to fix it, so it can fix my computer;
  • The Service Pack installation, without warning and rather rudely, erased my prior computer Restore Points of the past few weeks, leaving only the Restore Point it created just before inflicting itself on my machine. Just to thumb its nose at me, it also left two Restore Points from back in November, which would require me to then re-install a lot of software and updates at the very least. I cannot even try a Restore Point of, say, last week after other updates;
  • I attempted to view the system error logs to determine what caused the failed Service Pack Install. Surprise! The Event Viewer and Task Scheduler management consoles I use to review system operation no longer operated, instead producing this lovely, extremely explanatory message: "MMC cannot initialize the snap-in", followed by hexadecimal gibberish that any doctor or citizen can easily decipher:

Such explanatory error messages! "MMC cannot initialize the snap-in. The snap-in might not have been installed correctly. Name: Event viewer. CLSID: FX: {b05566ad-fe9c-4363-be05-7a4cbb7cb510}."

  • Attempts to look up the error on the Web produce gobs and gobs of amateurish "legible gibberish", indirection, misdirection, guesswork, and speculation, some of it from Microsoft itself;
  • To add insult to injury, typical of poor user interaction design, I could not copy-and-paste the error, but had to type it (partially, fortunately, thanks to Google);
  • Much of the material was in very broken English (where are those language translators promised to us for some 50 years now by computer scientists?);
  • My attempts at repairing the damage by running the command "SFC /scannow" (system file check) to check and repair critical Windows files showed that the Service Pack "Upgrade" also corrupted a number of critical Windows system files - despite the "Restore Point" rollback. Great Scott!
  • SFC repaired the files and produced a log of gibberish that's thousands of pages long for me to ferret out what got damaged (ironically, just like the records from a few weeks of a relative's EMR-error-related hospitalization, at approx. 2,900 pages of legible gibberish);
  • The repair did not restore the missing functionality;
  • Attempts to reinstall supporting packages such as the .NET Framework also did not restore the functionality;
  • Attempts to reinstall the Service Pack produced the same results: a crash on initial restart after the installation, the need to roll back, erasure of several Restore Points I manually created, and re-corruption of the critical system files previously repaired by the SFC /scannow command;
  • I have no way of knowing what else is broken or may malfunction.
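
For anyone facing the same thousands of pages: on Windows 7, SFC writes its details into the general servicing log, CBS.log (typically under %windir%\Logs\CBS), and tags its own entries with "[SR]". What follows is only a minimal Python sketch, assuming that log location and tag; the function name sfc_entries and its keyword filter are illustrative and not part of any Microsoft tool.

```python
from typing import Iterable, List


def sfc_entries(lines: Iterable[str], keyword: str = "") -> List[str]:
    """Return only the System File Checker lines (tagged "[SR]") from a CBS log.

    An optional keyword narrows the result further (case-insensitive),
    e.g. keyword="repair" to look only for lines mentioning repairs.
    """
    needle = keyword.lower()
    return [
        line.rstrip("\n")
        for line in lines
        if "[SR]" in line and needle in line.lower()
    ]


def read_sfc_entries(path: str, keyword: str = "") -> List[str]:
    """Read a CBS.log file from disk and filter it with sfc_entries.

    The encoding is an assumption; undecodable bytes are replaced
    rather than aborting the read.
    """
    with open(path, encoding="utf-8", errors="replace") as handle:
        return sfc_entries(handle, keyword)
```

Calling read_sfc_entries(r"C:\Windows\Logs\CBS\CBS.log", "repair") on a copy of the log should reduce it to the SFC lines that mention a repair. It will not say why the files were damaged, only which entries are worth reading.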

Russian Roulette, anyone?

All this was after many months of Microsoft "Beta testing" the Service Pack. (Perhaps it was really "Alpha testing?")

Similar issues occurred with the former Microsoft OS, Windows XP (now in its third major service pack since its release in 2002, with patches still coming on an almost weekly basis).

Fortunately, I have backup images of my entire disk, but the inconvenience and time wasted are quite irritating - and I will still not have the latest security patches after I roll back my machine to my latest disk image.


One should note that these "glitches" are just in the Operating System (OS) itself. Third-party applications (such as EMR and CPOE systems, middleware, interfaces, etc.) suffer the same type of problems ... for instance, the life-and-limb-threatening "glitches" that occurred at Trinity Healthcare after an EMR "upgrade."

Further, OS "glitches" can cause unexpected application "glitches", and vice versa. Complexity on top of complexity...

Note that similar software runs on the "servers" that are the heart of major enterprise systems such as EMRs and CPOE systems, and that communicate with end-user workstations.

Now, several simple questions:

  • Who knows what other "glitches" the Service Pack introduced to my machine, that will "bite me" (or patients) later?
  • Are these the machines we want our doctors and nurses to depend upon, since they increasingly regulate every medical transaction that occurs?
  • Has the software become too complex to be entirely reliable, maintainable and secure?
  • Does the average hospital have the staff to effectively deal with issues such as the above?
  • Do these "glitches" raise the risk and the cost - therefore reducing the ROI, already low (see reading list) - of experimental health IT to even more unsatisfactory levels?

Finally:

  • Would the average person tolerate such behavior from their car? In their aircraft? (Oops, the brakes don't work properly in 13.5% of cars after the parts upgrade, and that altimeter is simply crazy ...)
-- SS

Feb. 24 late night addendum:

Deciding to play with this mayhem, and knowing I was going to be wasting a lot of time, I first backed up my deranged machine to an external disk (~ half an hour) to preserve my files. I then restored my machine from an external disk backup image to its condition in mid-Nov. 2010 [thinking perhaps something more recent caused the SP1 to fail]. That took another half an hour. I then attempted to install the SP1 again. That took another hour or more.

Same results - machine crash after the "circling window panes" display.

I let the "Startup Repair" wizard run. It failed with the following informative messages. In a superb example of poor user design, I had to jot the messages down on paper, as it made no offer to print them, or save them to a thumb drive, etc. - although it did offer to send the error messages to Microsoft, a neat trick as the software components to drive the computer's wireless network adapter were not loaded:

Problem details - System Repair
Problem signature:

1 - 6.1.7600.16385
2 - 6.1.7600.16385
3 - unknown
4 - 21201077
5 - AutoFailure
6 - 3
7 - BadPatch

OS version - 6.1.7600.2.0.0
Local ID - 1033256.1
Root cause - a patch is preventing the system from starting [no fooling - ed.]
Repair Action
System file integrity check and repair
Result = Failed.
Error code = 0xa

Then for added fun, I started the machine up in 'Safe Mode' (using the F8 key at startup). It came up, but told me it was doing a System Restore due to the failure to configure the Service Pack. After about 20 minutes of frantic disk activity, the machine rebooted - and immediately crashed as before.

I am running the Startup Repair wizard again, asking it to restore my system, but I predict it will do so with the original remaining problems of non-functioning components that started this whole mess - if it works at all.

This is all absurd. It is a massive waste of time, a result of poor programming, uninformative, cryptic error messages (what? computers don't have enough storage for useful error messages?), poor (nonexistent) documentation, inadequate attention to the user experience, condescension toward the user, inability to report the problems back to HQ automatically due to lack of forethought about a compromised machine's ability to access the network, software unreliability, and probably a host of other issues I haven't thought of yet because I'm tired after all this fritter.

Not to mention, it is potentially destructive of data to those who suffer this problem but did not keep backups. They warn you beforehand - but the installation agreement you "sign" is of the Ross Koppel/David Kreda "hold the vendor harmless" variety.

This experience is a metaphor for the state of health IT (with "glitches", "workarounds", unexplained errors, etc.), and of the dangers of computer worship.

-- SS

Feb. 25 addendum - further experimentation based on web comments about SP1 (running a pre-SP1 readiness checking utility by Microsoft, emptying the /temp folders, renaming the "software distribution" folder, clean booting, etc.) produced the same result every time: crash of the machine on reboot.

And there's no computer doctor to call for an appointment to fix the problem.

-- SS

12 comments:

Nick said...

I'm 3/3 in upgrading my personal/laptop/test machines at work for SP1 and all have been fine. Sorry for the problem with SP1. But you're right about the analogies to EMRs. 100% correct.

InformaticsMD said...

Let's hope I have better luck on my other machines.

-- SS

Anonymous said...

On this week's NCIS, Gibbs was faced with a computer counting down to hack all of the Pentagon and the need to input several commands to stop this action. His solution was to shoot the computer, several times.

Today I helped my wife’s aunt get her 99-year-old uncle to the big high-tech hospital for a minor procedure. First they found the appointment, but could not find the paperwork. After finding the paperwork she was told to wait and then see a person at a desk around the corner. After that she had a copy of the paperwork and we waited some more until a person came to take the uncle back.

So the computers have not eliminated the paperwork. The paperwork’s major function is for billing and this hospital was passing out CD’s to patients with their records. So much for this great integrated system.

To top it off there was confusion about the appointment time so we were 30 minutes early and then had to wait 30 minutes past the new appointment time before the uncle was seen.

I am liking Gibbs' solution more and more.

Steve Lucas

Anonymous said...

It is about sales, sales, and more sales while the deaths, deaths, and strokes from neglect are covered up by the vendors who have partnered with the hospitals.

Anonymous said...

Why does one machine upgrade OK and another one does not?

Matt in Western PA
Solo FP, EMR user (but I print all my notes out same day -- go figure!)

InformaticsMD said...

Matt writes:

Why does one machine upgrade OK and another one does not?

Because (presuming no good-old-fashioned outright carelessness-caused bugs), the hardware and software configurations of the machines are often highly different (even in the Mac world, but to a lesser extent), and the interactions of these with the "upgrade" create "problems."

I presume some patch or hardware driver is causing my computer, with its specific motherboard and support chips by various companies in various "steppings" (silicon error-fix versions), its CPU "stepping" (CPUs have bugs too, corrected over time as documented in errata sheets), etc., to crash on reboot after the SP1 install.

As computers and software become more complex and more diverse, maintaining them and keeping them reliable does not get easier.

I have these lines in my email .sig:

"Critical thinking always, or your patient's dead." - Victor P. Satinsky, M.D., Hahnemann Medical College.

and, most appropriately:

"Either you're in control of your information systems, or they're in control of you." - Anonymous

-- SS

InformaticsMD said...

Steve Lucas writes:

I am liking Gibbs' solution more and more.

My computer teacher of yore had a saying:

Hitting the computer with a baseball bat won't work; that only works for TV sets.

-- SS

Live it or live with IT said...

I denied my computer's pleadings to install the service pack today.

It seemed sad. Methinks Watson is planning revenge by denying me health care someday.

My bet: Watson's kin are already hard at work in the health insurance business, dictating care decisions to be carried out by the subservient MD drones whose payments are already controlled by that son-of-a-watson.

Anonymous said...

On the 32-bit machines the upgrade does not have as much of an issue. It is the 64-bit machines that do. Heaven forbid it is a netbook or laptop, though, given the numerous complaints about the performance reduction on those. There are network issues that arise, false battery readings / warnings, video issues, processor utilizing more threads and creating more heat than normal, and more. Most of this is from personal experience of having 7 computers between me and my wife. 2 very powerful self-built desktops with 64-bit installed: one took the SP1 and one did not. 2 mid-range laptops with 64-bit: each took it, but they are slower, and more. And two netbooks, which I tried to roll back: the local network functioned but not the internet, and Java and ActiveX were corrupt after the rollback. Needless to say, we did not keep much on the netbooks, so it is an easy wipe. In the long run, I feel for such emergency systems that are using Microsoft products. Most of them should utilize Unix or Linux on the servers, which can host Windows apps and provide a more stable environment. (This brought nightmares of Vista and ME back).

InformaticsMD said...

Re: Anonymous February 26, 2011 6:56:00 AM EST

there are network issues that arise, false battery readings / warnings, video issues, processor utilizing more threads and creating more heat than normal, and more.

I still have several machines on which to attempt the upgrade, most 64-bit but one 32-bit of a relative.

I think I'll hold off awhile.

-- SS

Anonymous said...

I could not get my Lenovo T60P to upgrade to Windows 7 SP1 and tried a bunch of fixes and then gave up. It works fine in regular Windows 7 though and I have been installing regular updates so the service pack probably adds little for me.

In any case, SP1 is *officially incompatible* with a key piece of software that I use, my Symantec endpoint virus and malware protection distributed by my university.

While I agree that it is important to regularly update your computer, I also think sometimes it is better not to try and install updates as soon as they come out because it is valuable to see other users' experience and wait for any further fixes to the fixes to arrive. Especially with Windows, where software developers often need time after a new Microsoft release to update their applications accordingly.

Corporations and universities *do not* run the most recent versions of Windows, Office, etc. at the same time that these upgrades are first made available to consumers, because their systems are managed by network professionals and upgrades are done in a more sophisticated manner than just having your store-bought Windows attempt to update itself online.

The problems with Windows 7 SP1 consumer updates mean little for electronic medical records and other types of health information technology. I would like to see an example where a failure to install Windows 7 SP1 has compromised the security of a medical record.

InformaticsMD said...

Anonymous March 9, 2011 12:52:00 AM EST:

Your comment typifies the difference between those who think abstractly and in broad terms, such as about the cause of disease, and those who think concretely, narrowly and at a lower level of abstraction, such as at the level of a symptom.

My post is conceptually about the inherent dangers of sloppy software engineering, testing and quality control. I use the Win 7 SP1 case as merely an example.

The dangers of poor software engineering and quality in healthcare are most clearly demonstrated by a recent research report by Dr. Jon Patrick in Australia as I wrote about here.

Finally, while it may be true that Windows 7 hasn't affected patients - yet - upgrades to HIT like this one can certainly kill.

I'd enjoy hearing your explanation of how these examples should not be considered hair-raising by government, physicians and patients alike.

-- SS