Upgrading to vSphere on ESXi March 22, 2010

Posted by vneil in ESXi, upgrade, VMware.

We’ve recently gone through the process of upgrading our complete VMware infrastructure to vSphere and this is a short summary of how the process went. We needed to upgrade 2 vCenter Servers, 2 Update Managers and 30 ESXi host servers, not including any test clusters. We had been running vSphere 4 on our test servers for a while and didn’t come across any problems, so we planned to upgrade our production environment. Luckily vSphere 4.0 Update 1 came out shortly before we were due to deploy in production, so after testing it on the test servers we were able to go straight to Update 1 for our production servers.
The plan was:

  • Upgrade vCenter

    which included the following:

    • stop vCenter services
    • back up the vCenter DB
    • back up the SSL certs
    • snapshot the vCenter VM
    • upgrade vCenter Server
    • run the vCenter upgrade
  • Upgrade VUM

    which included:

    • stop the VUM service
    • back up the VUM DB
    • upgrade Update Manager
  • Upgrade Converter/Guided Consolidation
  • Configure new licenses in vCenter
  • Configure VUM with the host upgrade ISO/zip file for ESXi 4.0 U1 (more on this later)
  • Back up the ESXi host configurations with vicfg-cfgbackup on the vMA server (see the sketch after this list)
  • Put each host into maintenance mode and reboot into a network-boot rescue Linux image to update the BIOS, NIC firmware and FC firmware
  • Reboot the host and remediate it with VUM to upgrade ESXi
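
As an aside, the configuration backup step is easy to script from the vMA appliance rather than running it by hand for 30 hosts. A minimal sketch, run as vi-admin on vMA; the host names and backup directory are placeholders for illustration, and vicfg-cfgbackup will prompt for credentials unless a fastpass target is already set:

    #!/bin/bash
    # Sketch: save each ESXi host's configuration with vicfg-cfgbackup
    # before the firmware updates and the VUM remediation.
    # HOSTS and BACKUP_DIR are placeholders for illustration.
    HOSTS="esx01.example.com esx02.example.com"
    BACKUP_DIR=/home/vi-admin/cfgbackups

    mkdir -p "$BACKUP_DIR"

    for host in $HOSTS; do
        # -s saves the host configuration to the named file;
        # a host can be restored later with -l if it needs rebuilding
        vicfg-cfgbackup --server "$host" \
            -s "$BACKUP_DIR/${host}_$(date +%Y%m%d).cfg"
    done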

This was mostly planned using the upgrade guide PDF and on the whole it went very well; there are a few things I would probably add to the plan if I were to do it again.

vCenter upgrade
This went very smoothly and we didn’t come across any big problems. The only issue was with the added roles and permissions for datastores and networks, which meant that our Windows and Linux admins couldn’t provision new disks or add networks to VMs. This was quickly resolved by cloning and adding the new Datastore Consumer and Network Consumer roles for each of the VM admin groups to the folders of the respective datastores and networks which were created as part of the upgrade. This is documented in the PDF (which I must have missed) on page 65.

New Roles

I really like this new feature of permissions for datastores; it gives us much finer control when granting different VM admins access to different datastores.

VUM / Hardware
Since upgrading all the hosts we have started to see various strange hardware error alarms in vCenter. The errors are things like “Host Battery Status: I2C errors” or “System Memory” errors on no particular hardware device. Looking into it, I thought this might have something to do with the updated CIM implementation, and realised that we had loaded the standard ESXi 4.0 U1 ISO image from the VMware site into VUM. Perhaps we should have used an IBM-specific image, as all our servers are the same xSeries models from IBM and came with ESXi installed on internal USB sticks. The errors come and go on different servers, which makes it difficult to see whether there are any real problems.

I opened a service call with IBM (who provide our VMware support) to ask if there was a customised version of ESXi for xSeries servers or if there were any known problems with ESXi 4 on IBM hardware. After quite a bit of internal communication within IBM, it seems there is an “IBM recovery CD” for ESXi, but it doesn’t appear to come with any customisation apart from a message somewhere saying to report problems to IBM support rather than VMware support.

Now I am waiting for the CD to be sent, just to see if it makes any difference, while I also have open service calls with IBM for the individual problems we see. According to the service call, the problem is down to a “handshake problem between VMware and the IPMI”; they are working on fixing the issue, but it basically means we are stuck with these false alarms for now.

This is what we see:

[Screenshot: the System Memory alarm in vCenter]

[Screenshot: the Host Battery alarm in vCenter]

Upgrading Firmware on ESXi

As ESXi does not have the Service Console, it is harder to do things like upgrading the BIOS or changing settings on FC adapters. For this I set up our SLES AutoYaST server to network-boot the SLES rescue image on the ESXi server. This allowed me to NFS-mount a share with Linux versions of the BIOS update, the Broadcom adapter firmware and the scli program to update the QLogic cards and check their settings.
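
For reference, the rescue-image session itself is only a handful of commands. A rough sketch; the NFS server, share and update script names below are placeholders, not the real ones from our environment:

    # Rough sketch of the rescue-image session after PXE-booting the host
    # (server, share and file names are placeholders).
    mkdir /mnt/firmware
    mount -t nfs nfsserver:/export/firmware /mnt/firmware

    # Run the vendor's Linux BIOS updater
    cd /mnt/firmware/bios
    ./bios_update.sh                 # placeholder for the IBM update script

    # Launch the QLogic scli tool to check/adjust the FC adapter settings
    /mnt/firmware/scli/scli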

Virtual Machine Upgrading
You may have noticed this is missing from the upgrade plan, which is deliberate: all of the previous steps were done without any outage of applications or services (apart from vCenter itself, but that affects a controlled set of users). The next step of upgrading VMware Tools and the virtual machine hardware will require further planning by the respective Windows and Linux admins (and will follow in another post :-))

Logging in ESXi March 3, 2010

Posted by vneil in ESXi, logging, VMware.

After listening to last week’s VMware Communities Roundtable podcast I thought I would create a short post on ESXi logging. It was episode #83, which focused on ESXi and was very interesting; it was geared towards administrators who are familiar with ESX and the pros and cons of moving to ESXi.

One thing mentioned by the main guest, VMware’s Charu, which I hadn’t tried before, was setting up persistent logging for an ESXi host. As he mentioned, although there is logging in ESXi which gets stored in the /var/log/ directory, it is all in memory, so if the host crashes all logs are lost. While I already had remote syslog set up, this local persistent logging is a different log file.

On one of our test servers I went into Configuration -> Advanced Settings (Software) -> Syslog -> Local and entered a directory and filename on a VMFS datastore, and it started logging immediately to the specified file.
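
With more than a couple of hosts, the same settings can be scripted from the vMA server instead of clicking through each one. A sketch, assuming the advanced option is called Syslog.Local.DatastorePath as on our ESXi 4 hosts; the host name, datastore path and syslog server are made up:

    # Sketch: set persistent local logging and remote syslog from vMA
    # (host, datastore path and syslog server below are placeholders).
    vicfg-advcfg --server esx01.example.com \
        -s /vmfs/volumes/datastore1/logs/esx01-hostd.log Syslog.Local.DatastorePath

    # Remote syslog destination and port
    vicfg-syslog --server esx01.example.com -s syslog.example.com -p 514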

I then tried to see if it was a copy of a log already being created somewhere on the host; while it looks similar to the /var/log/vmware/hostd.log file, it seems to be more verbose.

The other logs that I am aware of that are useful are:
/var/log/messages
/var/log/vmware/hostd.log
/var/log/vmware/vpx/vpxa.log

These are available via the browser interface (http://hostname/host). The messages file is what is sent to the remote syslog if configured. There are also logs in /var/log/vmware/aam/ that relate to the HA agent on the host, and if there are problems with a host enabling the agent it is sometimes fruitful to look at the latest files in there.
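
Since the logs are exposed over that web interface, they can also be fetched from a script rather than the browser. A quick sketch with curl; the host name is a placeholder, and -k skips the certificate check, which is fine for a quick look but not something to bake into production scripts:

    # Sketch: pull /var/log/messages through the ESXi /host interface
    # (host name is a placeholder; you are prompted for the root password).
    curl -k -u root https://esx01.example.com/host/messages -o esx01-messages.log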

I would say that setting up both local persistent and remote syslog logging is a very good recommendation when configuring ESXi hosts, especially in a production environment.