September 2010 Archives
September 30, 2010
We had spotty/wacky internet access between campus and the rest of the world from about 3:30 pm to 9:30 pm today - spotty in that sometimes one could get out from on campus, wacky in that some web sites and servers were accessible from off-campus the whole time ... unfortunately, www.math.mcmaster.ca and mathmail.mcmaster.ca were not among the latter*.
* Interestingly, one could ssh to mathserv2 the whole while, and from there to ms in order to read mail via pine.
UTS says ...
Our primary ISP, Cogent, is down; they are aware of the problem and are working on it (no estimate yet). There is some traffic through our backup supply, but that has only ~20% of primary capacity.
September 29, 2010
The HH-214 printer is jamming repeatedly and cannot be used. The department administration will consider the options.
September 28, 2010
I will be moving user home directories from mathserv to our new (borrowed, really) file server on Wednesday and Thursday mornings.
Between 6:00 and 7:30 on Wednesday and Thursday mornings, the following will be unavailable:
- access from mail clients
- ssh login (and thus pine, etc.)
- incoming mail (delivery will simply be deferred)
- workstation access
- access to network printers
Web access will not be affected except briefly for sites in ~/public_html folders (while individual folders are in the process of being moved).
September 23, 2010
Access to mail via webmail and mail clients will wink out for up to ten minutes now and again Thursday evening as we bring a new file server up.
Mathserv is not gone yet, but the doors are closing one by one. SSH/SFTP to mathserv are now blocked; instead, please use ms.mcmaster.ca.
September 21, 2010
Mail is spotty this evening as we work out some mail-load issues on our new server. Web mail at mathmail.mcmaster.ca is particularly slow and pine on ms is laggy. Mail clients (Thunderbird, OS X Mail, Outlook, etc.) are fine for the most part.
There will be occasional periods (of a few seconds or minutes) when inbound and outbound mail will be stopped.
September 20, 2010
Yikes. Our spam filter wasn't running for a few hours this evening - you may have a spate of spam to deal with. Consider it a taste of what you normally miss :)
I will be rebooting the new main server (ms.mcmaster.ca) today at 5 pm in order to implement a performance tweak. Web and email will be unavailable for two to five minutes.
Workstations may pause during this period.
September 18, 2010
The new server is now handling most of the services formerly handled by mathserv.
Most people have either switched to using ms.mcmaster.ca or are using aliases which now point to the new server. But a few people are connecting directly to mathserv.mcmaster.ca for mail, printing or file access. If you are one of those people, I'll be emailing you directly, asking you to change your configurations (or habits) as described in the earlier blog entry, "Server Upgrades: Things You Need to Change".
September 17, 2010
One of our two file servers - one which was to be taken off-line next week - failed rather spectacularly this afternoon. About half of our home directories (starting with m - z, mostly) were unavailable from 12:50 pm to 3:00 pm.
What Was Affected
Because I've not yet recovered the failed file system and may not be able to do so anytime soon, I've reverted to last night's backups. If your home directory was on that disk, you will have lost changes/additions to your files and email from ca. 2:30 am to 12:50 pm.
All users whose files might have been affected will receive email from me with more information.
When (and if) I recover the failed file system, I will make files and mail boxes updated during that period available to you.
What Was Not Affected
Note that almost all of the web site was unaffected. Email sent between 12:50 pm and 3:00 pm will not have been lost but simply queued for later delivery. MS workstations still running the previous OS were down between 12:50 pm and 1:20 pm; systems running the new OS should not have been affected (except that some users could not login).
So we're going through all of the grief of the upgrade to get off of an unstable server ... and that server crashed. The problem primarily affects people whose last name starts with m - z, though other people might see problems, too; e.g.
- workstations which have not been upgraded will likely need to be rebooted
- mail will be turned off periodically
We are working on the problem and will bring things back up ASAP.
I see that while printing is working from the standard ms workstations and most other RHPCS-managed hosts, printing via Windows printer sharing is not (whether you use Windows, OS X or linux). I'll be working on this after lunch.
September 16, 2010
We might have missed installing your favourite application during the workstations upgrade. If you can't find something you need or if something appears to be not working right, please email firstname.lastname@example.org.
We are upgrading the workstation operating systems to Mandriva 2010.1 over the next few days.
During the upgrade process - which takes about 30 minutes - your computer will reboot and spend most of its time sitting on a black login screen. Don't login at this point.
Once the upgrade is complete, your computer will reboot a second time and come up with a plain, blue login screen (i.e. without the DNA graphic which was there before). At this point you can login.
Following the upgrade, you should find that your workstation is more responsive and slightly cuter.
At this point, we are only upgrading systems which don't have anyone logged into them. We'll announce a plan to deal with stragglers next week.
The HH-303 printer was off line for a few hours due to a hung network switch. It's back on-line. The queued print jobs are in the left-hand printout racks (underneath the one which reads "Scrap Paper"). I deleted most duplicate (and triplicate and n-plicate) copies from the queue before printing.
September 15, 2010
Mathserv will be going down for extensive upgrades on the morning of Tuesday, September 21st ... after which it will no longer be mathserv. Please make sure that you are using ms.mcmaster.ca - the new server - in place of mathserv.mcmaster.ca for ssh/sftp, pine, mail clients, etc. before then.
The wikis at wiki.math.mcmaster.ca will be moving to the new server today (Thursday). There will be several interruptions of a few seconds to a few minutes. I recommend that you avoid making updates today until I announce (on this blog and at wiki.math.mcmaster.ca) that the move is complete.
Email and the post-doc/grad-student linux workstations will be unavailable between 5:00 pm and 6:00 pm Thursday, September 16th and between 7:00 am and 8:00 am on Friday, September 17th while we move over to the new file server. Web sites will stay up except for very brief interruptions.
The new mail server is using a new self-signed security certificate. Your mail client or browser may throw up a warning to the effect that the certificate cannot be verified or has changed. In either case, you may accept/confirm/authorize the security exception.
The printer is still jamming, even after having an apparently faulty part removed. The service technician did not find the source of the problem and is going to consult with HP.
If you find that the printer is jammed ...
- read the message on the LCD screen
- follow the directions; usually the jammed paper is near the output tray
- tug gently on the output tray on the left-hand side of the printer and slide it back
- you should see paper wrapped around the mechanism; remove it
- push the tray back in
- the latch is on the right-hand side of the printer, near the number "1"
UTS has confirmed that McMaster's link to the Internet is down. Do not panic. I repeat, DO NOT PANIC. Grab a coffee; read a book; call a friend. IT WILL BE OK.
The HH-303 scanner/printer had parts replaced yesterday, which is meant to stop the frequent paper jams we've been dealing with. We've had two paper jams since the repair; if they persist, I'll have the service technician come back.
For the most part, you don't need to do anything differently in order to use the new main Math & Stats server - aside from exercising a little bit of patience now and then when services go off line or surprises crop up.
But there are a few changes ...
- use ms.mcmaster.ca instead of mathserv.mcmaster.ca for ssh/sftp, pine, smb (Windows file sharing)
- make sure that your mail client no longer uses mathserv.mcmaster.ca for the inbound (POP/IMAP) and outbound (SMTP) servers
- for IMAP/POP, use mathmail.mcmaster.ca
- for SMTP, use smtp1.mcmaster.ca (the campus mail gateway)
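If you have several machines or mail clients to check, the changes above can be sketched as a small lookup. This is only an illustration in Python: the function name and the table layout are made up for the example, and the hostnames are simply the ones listed above.

```python
# Replacement hosts per service, taken from the list above.
# The dictionary and function names are hypothetical, for illustration only.
REPLACEMENTS = {
    "ssh": "ms.mcmaster.ca",
    "sftp": "ms.mcmaster.ca",
    "smb": "ms.mcmaster.ca",
    "imap": "mathmail.mcmaster.ca",
    "pop": "mathmail.mcmaster.ca",
    "smtp": "smtp1.mcmaster.ca",
}

# Names under which the old server may appear in a configuration.
OLD_HOSTS = {"mathserv", "mathserv.mcmaster.ca"}


def suggested_host(service, current_host):
    """Return the host to use for `service`.

    If `current_host` is the old server and a replacement is known,
    return the replacement; otherwise return `current_host` unchanged.
    """
    if current_host.strip().lower() in OLD_HOSTS:
        return REPLACEMENTS.get(service.strip().lower(), current_host)
    return current_host


if __name__ == "__main__":
    for service in ("ssh", "imap", "smtp"):
        print(service, "->", suggested_host(service, "mathserv.mcmaster.ca"))
```

A setting which already points at one of the new hosts comes back unchanged, so the check is safe to run on a configuration that has already been updated.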
Today: In-bound mail to @math.mcmaster.ca addresses will be paused from 3:30 pm to 4:00 pm today (Wednesday). You will still be able to access your inbox and mail folders via mail clients, pine, and web mail. If the out-bound/SMTP address of your mail client is mathmail or smtp1 (and not mail.math.mcmaster.ca or mathserv), then you will still be able to send mail, too.
Tomorrow: The workstations, email and access to home directories will be down from 4:30 pm to 6:00 pm tomorrow (Thursday). Web sites will still be up, though with very brief interruptions.
After 4:00 today, all mail/spam processing will be handled by our new, faster server. After tomorrow at 6:00, we will be using a borrowed (and faster) file server so that we can upgrade our own file servers.
September 14, 2010
The new server, ms.mcmaster.ca, will be rebooted at noon. Web and email will be down for a few minutes.
September 13, 2010
We are in the process of putting our new admin server into production: web, email, wiki, file sharing, etc. will be moving from the current server, mathserv.mcmaster.ca, to the new server, ms.mcmaster.ca. The new server - together with some configuration and file-server changes - will speed some things up immediately and allow us to expand and improve other things in the coming months (things = web, wiki, mail, workstations, etc.).
Over the next week, there will be a number of brief interruptions to individual services (mail, web, wiki, file-server access) as well as a one- to two-hour shutdown of email and workstation access. There will be a few more brief and short-term interruptions over the next two months as we increase the size and speed of our file servers.
The brief interruptions - that is, between a few seconds and a few minutes - will not, in general, be announced; I will post/email announcements about extended downtime.
Mathserv runs dozens of web sites and other services. We tested the major components on the new server ahead of time, but we're certain to have missed something. Please email email@example.com if you come across anything weird or wonky.
We're still waiting for parts so that we can put Tray 2 back in place (currently, Tray 2 jams frequently).
The printer has begun a new weird thing: some jobs cause the printer to just sit there with the Data light flashing and the control screen reading "Processing ...".
If you find that the printer is sitting there with the Data light flashing for more than 30 seconds ...
- press the orange Stop button on the printer
- "Resume" will be selected
- press OK on the control screen
You will still need to press OK twice (on the printer control screen) for print jobs which insist on trying to print from Tray 2.
September 10, 2010
Our main server blew a disk this morning and is struggling while a spare is built into the main storage array. In order to allow the array to rebuild more quickly, I will be turning off mail services for up to half an hour at a time. Other services (workstations, Windows file sharing) may also be interrupted.
I will probably leave the interface at mail.math.mcmaster.ca up all the while, though.
September 9, 2010
Some people are having problems logging into their linux workstations as of yesterday: after logging in, the desktop is blank and there are no menus or icons. Not everyone is affected and I don't know the source of the problem yet.
You can work around the problem in the meantime by choosing the KDE desktop from the Session menu on the login screen.
The jamminess of the printer is apparently related to a problem with tray 2. Parts have been ordered and for now tray 2 is removed; the printer will still work and should simply take paper from tray 3 instead.
In some cases, it may be necessary to press OK twice in order to force the job to use the second tray.