January 2009 Archives
January 19, 2009
Funny you should ask; I was asking myself that very thing this weekend. Here's how I answered myself ....
January 18, 2009
Servers and systems are all responding again as of 3:00 pm Sunday following the hardware problem this morning. The disk responsible for the failures today and on Dec. 30th has been replaced.
I was unable to finish the preventative maintenance on Friday evening and had planned to reschedule it for next week. The hardware in question decided otherwise and died this morning. All services should be up by mid-afternoon; web sites are up as of 11:30.
January 17, 2009
The workstations did not come up properly after the power outage and server work of Friday evening. Things were working again by noon.
January 16, 2009
The server and workstations are going down at 4:30 pm today. Web sites will remain readable through the upgrade (4:30 - 6:30) and power outage (6:00 - 9:00).
January 15, 2009
A disk failed in one of the main file-server arrays this morning. Until it is fixed, the server will be under performance strain and vulnerable to data loss. The files on that array all belong to the Linux workstations, not to mail, web or home directories. Normally we could fix such a failure by swapping out disks on the fly, but in this case the failed disk is the one the server boots from.
A primary disk in one of the main file-server arrays failed Thursday morning. I will be shutting the file server down at 4:30 pm Friday (prior to the power outage) in order to replace the disk and perform some upgrades.
Facility Services will cut power in HH and BSB between 6 pm and 9 pm Friday evening. In order to perform some server maintenance at the same time, I will be taking down the file server at 4:30 pm. Web sites hosted by the departmental server will remain accessible during the downtime and power outage, but workstations, email and file-server access will not be available.
January 12, 2009
There are some hangovers resulting from the overheating and shutdowns: some mail will have bounced back to senders; mail service will be slow while a backlog of mail (mostly spam) is processed; workstations will be slow while the file server corrects some disk errors caused by the heat-related crash.
As of 9:50 am today, most systems are running again; web service was restored at 9:30. The servers were shut down Saturday morning after an air-conditioning failure in the server room. We are investigating ways of preventing or ameliorating such problems.