Short story: 8.5.1 upgrade was fast and easy but Windows Update killed my server.

The long version: On Saturday afternoon, I decided to upgrade one of my development servers from 7.04 to 8.5.1. I am pleased to say that, as far as I can tell, the upgrade was easy and fast. Once I had backed up my data, it took only about 10 minutes to install 8.5.1 and let the server do its thing at startup. I then restarted the Domino service and all appeared fine. How's that for a seamless upgrade? Sweet!

The last step was to reboot the server to confirm that the Domino service would restart automatically. As I went to do this, I noticed the pop-up from Microsoft, telling me I had Windows updates pending. I decided to go ahead and apply these and reboot.

That's when the fun began.

Apparently the server boots part way and hangs, saying that it is at step 3 of 10 for the Windows update.

This dev server is a dedicated server at a major hosting provider. Unlike some of the fine hosting service providers, my provider just rents boxes and it's up to you to manage your own box. Their support is pretty much limited to a power cycle and, if all else fails, a reprovision (a complete wipe of the OS and your data, followed by a reload of the box). While they make the reprovision process as easy as one click, I did not want to have to start over and lose a current 30GB backup. So, I have been working with tech support. It looks like I may need to reprovision the server and start over. Ouch!

The good news in all of this is that I have Lotus Notes/Domino.

Thanks to Notes and Domino, things continue to run. Mail routing and replication have failed over to the stand-by server, and I am the only one a little stressed. If I do need to reprovision the server and start over, I will restore an old backup (any will do) and wait for the data to replicate across from the live server. This means that I could be running within 30 minutes of the reprovision and have my data back in 3-4 hours. Try that with any other system.

I cannot even begin to imagine what this would have been like if this were another brand of collaboration server that had crashed. In any case, it wasn't the collaboration server that crashed, it was Windows Server.

Lessons learned:

1. The Domino 8.5.1 upgrade process was easy.
2. Lotus Notes/Domino can't be beat for redundancy, failover, and ease of recovery.
3. From now on, I will take the time to FTP data to another server before I do an upgrade, rather than keep temporary backups on the same server, even for a 10-minute upgrade. A Windows update could wipe out your OS and you could lose everything.
4. I will explore another OS for Domino. I know Windows well, though, and I'm not convinced that an update hiccup couldn't take down a Linux box in the same way.
5. I will think about moving this either in-house or to an ISP that is in the business of hosting Domino and can take care of something like this for me.
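
As an illustration of lesson 3, here is a minimal Python sketch of copying a backup directory to a second machine over FTP before starting an upgrade. It is an example under stated assumptions, not the procedure used that weekend: the host name, login, and directory in the closing comment are hypothetical placeholders, sub-directories are flattened into the remote file name to keep the example short, and the size check assumes the remote FTP server supports the SIZE command. It uses only Python's standard ftplib module.

```python
import os
from ftplib import FTP


def list_backup_files(backup_dir):
    """Return sorted (local_path, remote_name) pairs for every file under backup_dir."""
    pairs = []
    for root, _dirs, files in os.walk(backup_dir):
        for name in files:
            local_path = os.path.join(root, name)
            # Flatten sub-directories into the remote name so no remote mkdir
            # is needed. Simplification: "a/b.nsf" and "a_b.nsf" would collide.
            rel = os.path.relpath(local_path, backup_dir)
            pairs.append((local_path, rel.replace(os.sep, "_")))
    return sorted(pairs)


def upload_backup(ftp, backup_dir):
    """Upload every backup file over an open FTP connection.

    Raises IOError if the size reported by the server differs from the local
    file size. Returns the list of remote names that were sent.
    """
    sent = []
    for local_path, remote_name in list_backup_files(backup_dir):
        with open(local_path, "rb") as handle:
            ftp.storbinary("STOR " + remote_name, handle)
        # Assumption: the server answers the SIZE command.
        if ftp.size(remote_name) != os.path.getsize(local_path):
            raise IOError("size mismatch for " + remote_name)
        sent.append(remote_name)
    return sent


# Example use (hypothetical host, credentials, and path):
#
#     with FTP("backup.example.com") as ftp:
#         ftp.login("backupuser", "secret")
#         upload_backup(ftp, r"D:\backup\domino")
```

The point of the size check is to confirm that the copy really landed on the other machine before anything is touched on the server being upgraded.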


Update 01:55 AM PST:
The tech just called me back. The server OS is hosed. They cannot even log in. They offered to pull the drives and move them to another server. I requested a reprovision instead - I'd rather start with a clean box. When I wake up, I'll FTP my backup and install directory to the server and reload Domino. The good news is that I don't need to worry about my data. I should not have lost anything. Thanks to Domino and replication, I'll be back up and running with no work on my part. I'll just need to wait for replication to finish.

Update 11:30 PM PST:
Once the ISP reprovisioned the server, it took me just over 30 minutes to download an endless stream of Windows updates and tweak the settings the way I like them. It took 3 hours to restore an old backup to the server via FTP, another 10 minutes to download Domino 8.5.1, and 10 minutes to deploy Domino 8.5.1, which was the easiest part of my weekend. With Domino up and running (using a week-old backup, mind you), I watched as the servers replicated and all of my data was restored without my involvement. Now that's fantastic. So, all in all, my experience is that the Domino upgrade itself was painless and took me about 10 minutes.

Thank you, Lotus.

Discussion/Comments (15):

Flemming Riis (): 10/19/2009 1:14:40 AM
10 Min to upgrade to Domino 8.5.1; 20+ hours to recover

Yikes.

I don't recall the last time I saw an update that killed the OS, but as mentioned here, sadly it can occur.

Personally, I run everything in a hypervisor; that way at least I can troubleshoot if the OS dies. Not sure if it's an option if the box is hosted like here.


Darren Duke (http://blog.darrenduke.net): 10/19/2009 5:05:57 AM
10 Min to upgrade to Domino 8.5.1; 20+ hours to recover

ESXi or ESX is your friend. That way you can snapshot before Windows crash-dates.

If it fails, simply revert the snapshot. If it works fine, merge the snapshots.


Henning Heinz (): 10/19/2009 5:29:51 AM
10 Min to upgrade to Domino 8.5.1; 20+ hours to recover

Since I went from Suse(Novell) to Debian I have never seen an update that killed the OS. But IBM Lotus still is a Windows shop.


Bill Greenberg - Good Computer Guy (http://blog.goodcomputerguy.com): 10/19/2009 6:34:49 AM
10 Min to upgrade to Domino 8.5.1; 20+ hours to recover

Ouch. I also am moving everything (mine and my clients') to virtual for ease of management and disaster recovery. I'm also going to stop doing regular OS updates. I had both Windows Server and Ubuntu problems recently. From now on, if it works, don't touch it! Similar experiences upgrading Domino, BTW. There's just enough time to enjoy a cup of coffee and then it's done. Why doesn't IBM market this??


Keith Brooks (http://www.vanessabrooks.com): 10/19/2009 6:35:56 AM
10 Min to upgrade to Domino 8.5.1; 20+ hours to recover

Eric,

You missed the #1 rule of updates.

Never mix them.

Backup first which you did do.

Update Java first on the server. Reboot if required, all good? go to step 2

Check for OS updates and decide what you need or don't need, install, reboot, all good? go to step 3

Update Domino, reboot, all good?

I have seen MS updates kill machines, sometimes IE8 did it, but usually the error can be reverted if you are running the system restore points.

Sorry to hear it went south, but you recovered well and maybe your ISP will have a better view of Domino because of this as well.


Eric Mack (www.ica.com): 10/19/2009 9:25:08 AM
re: 10 Min to upgrade to Domino 8.5.1; 20+ hours to recover

A hypervisor sounds like a good way to go. I could do that if the box were hosted in house. Someday, I expect that ISPs will offer hosted VMware servers. (Lots of exciting development in that arena.) Then, you could move a process from server to server in the time it takes to transfer the data.


Eric Mack (www.ica.com): 10/19/2009 9:26:35 AM
re: 10 Min to upgrade to Domino 8.5.1; 20+ hours to recover

Hi Darren, ESX/ESXi are not options in a hosted environment... yet.


Eric Mack (www.ica.com): 10/19/2009 9:27:24 AM
re: 10 Min to upgrade to Domino 8.5.1; 20+ hours to recover

I've generally had great success with Windows Updates. As one commenter remarked, it could have been the combination of the Domino install and the Windows update. Normally I do these independently. Thankfully, I had Domino to recover.


Eric Mack (www.ica.com): 10/19/2009 9:28:35 AM
re: 10 Min to upgrade to Domino 8.5.1; 20+ hours to recover

Bill, I agree. IBM SHOULD market how this is a value to customers. In their defence, I have seen Ed blog about this frequently. Domino upgrades are generally quite seamless. I have not seen this formally marketed, though.


Eric Mack (www.ica.com): 10/19/2009 9:29:44 AM
re: 10 Min to upgrade to Domino 8.5.1; 20+ hours to recover

Hi Keith, I generally follow the steps as you have outlined. System restore points are not available on a hosted server - once you run into a problem there's no way to boot into safe mode. Now, if the ISPs would offer a hosted KVM then all kinds of things would be possible, like hypervisors and restore points, etc.


Bill Greenberg - Good Computer Guy (http://blog.goodcomputerguy.com): 10/19/2009 9:36:12 AM
10 Min to upgrade to Domino 8.5.1; 20+ hours to recover

Eric, true - but Ed is preaching to the choir. How many non-Notes shops read Ed Brill's blog? Actually, you didn't even mention Ed's last name in your reply - how many non-Notes people knew who you were talking about? :) I want to see network commercials showing this stuff! (Good ones, like the iPhone commercials, not the dry cryptic stuff like the usual IBM commercials.) Sorry I hijacked your article though - got a little off topic. I think I'll go get another cup of coffee and upgrade another Domino server. :)


Fred (): 10/19/2009 9:54:09 AM
10 Min to upgrade to Domino 8.5.1; 20+ hours to recover

Hi, we had exactly the same problem on one of our hosted servers. We installed 11 pending Windows updates before restarting the server, and the server did not start afterwards. Managed to fix it, but that was by the support team on site (that I had to mobilize on Saturday)...


Henning Heinz (): 10/19/2009 10:33:11 AM
10 Min to upgrade to Domino 8.5.1; 20+ hours to recover

I am surprised about your hosting providers. I have my own rack (so I have reboot interfaces and a KVM over IP) but in general you can get everything in Germany for about 100$/month (including dedicated hardware and 1 TB traffic). I use Xen for virtualization and it works quite well too (although I just use it for Windows guests).


Eric Mack (www.ica.com): 10/19/2009 11:41:31 AM
Lotus Knows how easy it is to upgrade my Domino server

Bill, I think you bring up a good point. I think Ed does a great job evangelizing within the Notes community of believers. I don't know that his message works its way through IBM Lotus marketing to the public. Perhaps this is a topic for the Lotus Knows Campaign: "Lotus knows how easy it is to upgrade my server" ?


Ove Størholt (http://www.inforte.no): 2/17/2010 8:19:56 AM
10 Min to upgrade to Domino 8.5.1; 20+ hours to recover

I just had a strange problem:

I upgraded from 8.5FP1 on xLinux (SuSE) to 8.5.1. When the server starts for the first time, it still reports 8.5FP1??

Has anyone seen this?

