I walked into the office today to receive wonderful news: our email server was down. The good news was that the problem was with the equipment, not DNS, so we were still receiving email because the SMTP server was up and queuing our mail.
I did some preliminary troubleshooting to see if I could spot the problem. I ran a traceroute to the box and telnetted to port 110 to see if I could open a connection, but everything was dead. Our hosting company has many different POP server IP addresses, so I thought maybe ours was misconfigured. Luckily I didn't go too far down that road, and I stopped before making any fatal mistakes (like changing DNS records to point to what I thought was the correct, working server).
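The telnet test on port 110 can be scripted so it is repeatable. What follows is only a minimal sketch, assuming a plain (non-TLS) POP3 server that sends the standard `+OK` greeting on connect. The function names and the host in the comment are made up for illustration.

```python
import socket


def pop3_greeting(host, port=110, timeout=5.0):
    """Return the first line the server sends, or None if we cannot connect."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            data = conn.recv(512)
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return None
    return data.decode("ascii", errors="replace").strip()


def pop3_is_up(host, port=110, timeout=5.0):
    """True only if the server answered with a POP3 '+OK' greeting."""
    greeting = pop3_greeting(host, port, timeout)
    return greeting is not None and greeting.startswith("+OK")


# Example call (hypothetical host name, not our real server):
# pop3_is_up("mail.example.com")
```

A dead box like ours would come back as `False` from this check, the same answer telnet gave me by hand.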
I called our web developer and told him about the issue. He told me to contact our hosting company (superb.net) directly to see if they were working on the problem. Rather than go through phone support, I went to their website and used their myCP control panel, which gives a link to support. When I clicked on the support link I noticed this message:
We are experiencing a networking problem with this server. It may be a problem with the network card or the access switch that the server is connected to. We are working urgently to fix this problem. Thank you for your patience.
When I see something like this, I rest easier because I know they're working on the problem. As someone who has worked in Tech Support on issues such as these, I know how important it is to let them do their job rather than bug them. If it had been down for another hour or so I probably would have called to get an ETA, but luckily it came back up within seconds of my pasting this text to everyone in our corporate IM client, Vypress Messenger.
Pro-active Monitoring
I have a Linux machine that serves as the firewall for our network, but it also doubles as a box where I can run services and applications that help with maintenance. I recently installed Nagios to monitor for situations such as this. The only problem is that even if Nagios could tell me that POP services on our mail server were down, I'd have no way of knowing whether the hosting company knows. I will use Nagios as much as I can, but there are some things a network monitoring tool just can't tell you, like the text above.
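For what it's worth, a Nagios check is just a program that prints one status line and exits with 0 (OK), 1 (WARNING), or 2 (CRITICAL). Here is a rough sketch of how a POP check could map onto that convention. The thresholds, function names, and message wording are my own assumptions, not anything taken from my actual Nagios config.

```python
import socket
import time

# Exit codes Nagios expects from a plugin.
OK, WARNING, CRITICAL = 0, 1, 2


def time_connect(host, port=110, timeout=5.0):
    """Return seconds taken to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None


def classify(connect_seconds, warn=1.0, crit=3.0):
    """Map a connect time to (exit_code, status_line).

    The warn/crit thresholds are illustrative defaults, in seconds.
    """
    if connect_seconds is None:
        return CRITICAL, "POP CRITICAL - connection failed"
    if connect_seconds >= crit:
        return CRITICAL, f"POP CRITICAL - connected in {connect_seconds:.2f}s"
    if connect_seconds >= warn:
        return WARNING, f"POP WARNING - connected in {connect_seconds:.2f}s"
    return OK, f"POP OK - connected in {connect_seconds:.2f}s"
```

A wrapper script would call `classify(time_connect(host))`, print the status line, and pass the code to `sys.exit()`. Note that all this can ever report is that the port is unreachable, never why.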
How RSS comes into this
It doesn't have to be RSS, but any feed-based service would be ideal in this situation. Rather than logging into their site and going through the motions I went through, I could simply subscribe to a feed of their support notices, if they offered one. I could then use RSS Bandit or any other aggregator to periodically check for support issues that directly affect our service. Rather than bug tech support with phone calls or use the web, I could know of a problem almost as quickly as it happened. I would most likely poll once every hour, but I could also refresh the feed whenever Nagios reported a problem, so that I could tell what they're doing about it before I take the time to stop what I'm doing and find their phone numbers or go through their website.
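To make the idea concrete, here is a small sketch of what that refresh step could look like, assuming the host published an ordinary RSS 2.0 feed of support notices. As far as I know no such feed exists, so any URL passed to `fetch_feed` is hypothetical, and the function names are mine.

```python
import urllib.request
import xml.etree.ElementTree as ET


def fetch_feed(url, timeout=10):
    """Download the raw feed. The URL is hypothetical; the host offers no such feed."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def parse_items(feed_xml):
    """Pull title, description, and pubDate out of every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "description": (item.findtext("description") or "").strip(),
            "pubDate": (item.findtext("pubDate") or "").strip(),
        })
    return items


def notices_for(items, keyword):
    """Keep only the notices that mention our server (case-insensitive match)."""
    needle = keyword.lower()
    return [
        entry for entry in items
        if needle in (entry["title"] + " " + entry["description"]).lower()
    ]
```

An hourly cron job could call `notices_for(parse_items(fetch_feed(url)), "our-server-name")`, and the same call could be triggered whenever Nagios flags the POP service as down. Matching on the server name is a guess at how such a feed would identify affected machines.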
I think RSS, or feeds in general, would fit this type of situation well, but only if the feed itself is reachable. This is a hosting company, and their entire network could be down, which would negate a feed-based support service. If they were smart, they would host the feed off site in the event of a global network failure.
I would have loved this when I worked for Bellsouth.net. I would have told people where to point their browser or aggregator in the event of a failure, and made sure the feed was constantly updated as more information came in. It would have drastically reduced the volume of calls and hits on the website from people checking on service in their area. They could even customize it for each city so that users from, say, Atlanta could tell whether there were immediate problems in their area. It wouldn't have helped me when our phones were completely dead, but nothing would have, really.