ESXi management network issues when using EtherChannel and NIC teaming

ESXi behavior with NIC trunking

Sometimes very challenging problems arise. Things that make you scratch your head, want to hurl your coffee cup, or just have a nice cold adult beverage. Customers can change a project's requirements midway through, a vendor's storage array code upgrade can go awry, or a two can creep into the ones and zeros.

In this section, we present examples of those crazy situations with the hopes of helping out our fellow engineers in the field before they become as frustrated as we have!

Recently, while working with a customer, the request was for a new cluster of ESXi 4.1 hosts. They would be using just two onboard NICs for the VMkernel and virtual machine traffic. These two NICs would feed into a pair of Cisco Nexus 5020s using virtual port channel (vPC).

Because of the use of vPC, the virtual switch load balancing needs to be set to Route based on IP hash for the NIC teaming policy. Okay, no sweat! After installing ESXi and completing the initial configuration on the hosts, it was time to add the second NIC to vSwitch0 and plug it in. (Note that this configuration was all being done directly on the hosts, as no vCenter Server had been built yet.) After adding the second adapter to the active section of vSwitch0 and changing the NIC teaming policy to IP hash, we plugged in the second cable.

The host immediately dropped its connection to our vSphere Client and became completely unreachable. No ping, no nothing! This was most puzzling indeed: we unplugged the second cable and the host started to ping again. We thought maybe there was something wrong with the NIC itself, so we set up a separate NIC to take its place. This had the same result, and we then thought to look at the switch. After discussing the current configuration with the network engineer, we felt that his configuration was correct. The configuration (and more!) can be found in the white paper put out by Cisco and VMware: “Deploying 10 Gigabit Ethernet on VMware vSphere 4.0 with Cisco Nexus 1000V and VMware vNetwork Standard and Distributed Switches – Version 1.0.” This doc has been very helpful during the implementation of this project.

So! With the network deemed not the problem, and wearing a sheepish smile after the network guy commented “it’s always the network, isn’t it?”, I returned to the host. I then tried setting up both NICs on a non-Nexus switch used for out-of-band management, and they worked just fine using Route based on originating virtual port ID for NIC teaming. At that point, I fired up the googalizer and did some checking. I came across this KB article from VMware:

VMware KB 1022751:  NIC teaming using EtherChannel leads to intermittent network connectivity in ESXi

Details:

When trying to team NICs using EtherChannel, the network connectivity is disrupted on an ESXi host. This issue occurs because NIC teaming properties do not propagate to the Management Network portgroup in ESXi.
When you configure the ESXi host for NIC teaming by setting the load balancing to Route based on IP hash, this configuration is not propagated to the Management Network portgroup.

So, based on this very helpful information, I followed the instructions listed in the KB (in short, the Route based on IP hash policy has to be set on the Management Network portgroup itself, not just on the vSwitch) and had great success. Now my ESXi hosts are talking on both NICs via IP hash and life is good.
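As a quick sanity check once everything is back up (my own habit, not part of the KB), the Tech Support Mode shell on an ESXi 4.1 host can confirm that both uplinks are linked and attached to vSwitch0:

# List physical NICs with link state and speed: both uplinks should show "Up"
esxcfg-nics -l

# List vSwitches and port groups: vSwitch0 should show both vmnics as uplinks
esxcfg-vswitch -l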


How to balance VMware ESX host paths on HP EVA arrays

Here at 64k, in our smaller cube near the vending machines, we storage-oriented folks like to mull over ideas big and small, 4k at a time. We also deal in a great number of puns, so consider yourself warned. Today, in our maiden voyage, I’d like to talk about some of my experience with HP’s line of EVA storage arrays. As many of our readers know, the EVA line is a mid-tier offering from HP. Though likely to be usurped in the near future by 3PAR’s goodies, I am not here to begin that debate. Rather, let us delve into a few common gotchas that can be overlooked in environments where EVAs live.

ONE]

The tightrope act begins with the storage array, our bright and shiny EVA. At a fundamental level, an EVA is composed of two controllers. The EVA’s operating environment can, in a semi-intelligent fashion, manage vdisk ownership between the two controllers itself. By default, vdisks are set to no preference for the failover/mode setting at the time of creation. This means the EVA decides which controller gets which vdisks when it (the EVA itself) boots. Every vdisk is assigned to one, and only one, controller. If the non-owning controller is receiving the IO for the server(s) talking to a vdisk, the array will, after a period of time, change the ownership of the vdisk. This reduces the load crossing the mirror ports. While the EVA can run in this fashion, it is sub-optimal.

The other side of this balancing act’s tightrope is the hosts. IO can walk many paths from host to array, some optimal and others not. The path starts at the host’s adapter. If it is a dual-port (or multiple single-port) host, then you have all the more paths to choose from. Even in the case of a single-port host, you can still have multiple paths to arrive at the vdisk. Handling the proper path is the job of multipathing software. For Microsoft operating systems, HP provides a Device Specific Module (DSM), which uses Microsoft’s MPIO stack as its basis; HP makes a specific DSM for each of its array lines. Without the MPIO stack, the host will see a drive presented once for each host port. In an 8×00 series array, that is 8! So clearly the MPIO software and HP’s DSM are needed for correct operation. The default install does not enable Adaptive Load Balance (ALB). This hampers read operations by not passing them through the correct controller for a vdisk. Note that non-Microsoft operating systems (like VMware) have their own multipathing stacks. In the case of VMware ESX(i) 3.x, the options are Fixed and MRU. In the case of vSphere, Round Robin is added to the mix. In pre-vSphere environments, the Fixed policy does not by default balance load across the host ports. You can end up with all your VM traffic running over one host port! Yikes!
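As a side note, if you want to see where a vSphere 4.x host currently stands before touching anything, the nmp namespace of esxcli shows the path selection policy per device. A quick sketch of the commands I reach for:

# Show every SAN device with its current Path Selection Policy (Fixed, MRU, or Round Robin)
esxcli nmp device list

# Show the individual paths, handy for spotting IO piled onto a single host port
esxcli nmp path list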

TWO]

Now, to balance things out, let me start with the array. A good habit to get into is understanding your environment from an IO perspective. You need to understand the profile, or workload, of your IO so that you can balance between the controllers (among other things!). Make sure to capture your performance data using evaperf (or other tools) so you have a view of each controller’s current load. As you add new vdisks, you can balance them by setting the failover/mode setting to a specific controller with failover/failback. This keeps the balance in place should you lose and then regain a controller, and it explicitly defines which controller masters the vdisk, which helps on the host side because the controller the host needs to talk through is clearly defined. One thing to keep in mind is that a single controller must be able to absorb the entire load should the other fail; this is something you should be aware of via your performance data. A good rule of thumb (at least in my experience) is that a controller should ideally run at no more than about 30% load. And as always, have the latest Command View and XCS code. One other thing to check for balance is that the host ports are set to their top speed (4 Gb/s, except on the very old EVA models) and are properly balanced on the fabrics (equal ports on both sides). One customer I came across had all ports from controller A on fabric A and all ports from controller B on fabric B! Definitely a big problem there!
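One quick way I like to eyeball how vdisks are spread across the controllers is to pull the preferred path setting out of an SSSU dump. This is just a sketch, and I am assuming the property shows up as preferredpath in the LS VDISK FULL output, so check your own dump for the exact field name:

# After dumping vdisk details from SSSU (LS VDISK FULL > vdisk.txt),
# pair each vdisk name with its preferred path/mode setting
grep -E "familyname|preferredpath" vdisk.txt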

For the host side, there is a bit more that can be done (there is some work to be done on the array as well, which I will address). The hosts should have the latest firmware, drivers, and software for their HBAs. Additionally, make sure you have the latest HP DSM software. Within the DSM software, you will want to enable ALB; as I stated before, this is not enabled by default. To enable it, just right-click each LUN (listed by WWN) and choose Enable ALB.

So, as a quick explanation: write requests from hosts will hit the controller that owns the vdisk in question, but the write is propagated over the mirror link into both controllers’ cache. That way, if a controller is lost, the write can still be committed. Read requests will hit whichever controller the host sends them to, and if it is the non-owning controller, they have to travel over the mirror ports to the correct controller. This is sub-optimal, but it is alleviated by enabling ALB. ALB communicates with the array and always sends its read requests through the owning controller. Very handy!

Now, from a VMware standpoint, let’s talk about Fixed and then Round Robin (the two most common multipathing situations found today). For Fixed, you will need to balance IO to your datastores over the host ports of the controllers, keeping in mind which controller you selected at the array. As an example, if I have 8 datastores of average IO (no heavy virtualized apps), then I would want 4 datastores on each controller. To balance further, I would have each datastore talking over one of the host ports on its controller (4 ports per controller x 2 controllers), so the IO is evenly spread. To set this, simply go into each datastore’s properties (via the VI Client) and pick the WWN for the corresponding host port. Under heavy IO you may not be able to move your traffic to a different host port; just try again at a later date. When it comes to Round Robin, the IO works a bit differently: Round Robin sends IO to each host port in turn after a certain number of IOs. The HP best practices for vSphere on the EVA say to change this value to 1 (thus pushing IO evenly over every visible host port). There was a bug that would reset this value to a very high number after a reboot of the ESX(i) host. In my experience, leaving it at the default works fairly well, and I would guess HP had good reason for coming up with that figure, so at this point, with vSphere 4.1, I suspect you could set it without issue.
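For the Round Robin case on vSphere 4.x, those knobs live in esxcli. Here is a sketch of what setting the policy and the IO count looks like, with naa.xxxx standing in for your EVA vdisk's device ID and the value of 1 being the figure from the HP paper:

# Switch the device to the Round Robin path selection policy
esxcli nmp device setpolicy --device naa.xxxx --psp VMW_PSP_RR

# Have Round Robin rotate to the next path after every 1 IO (the HP-suggested value)
esxcli nmp roundrobin setconfig --device naa.xxxx --type iops --iops 1

# Confirm the change took
esxcli nmp roundrobin getconfig --device naa.xxxx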

Summary

Presented here are some of the findings I have come across in working with different customers. I figure that having these kinds of storage discussions can make for a very engaging conversation. Let me know what you think (and if I make any errors, which, being human, I am prone to!).


VM Zombie Survival Guide (Part 1)

Administrators unite against the great VM Zombie menace! Long have we toiled to create the pillars of virtual infrastructure! We plan, and overcommit, and squeeze every last resource out of our designs, our environments, our data centers. And yet, we still face a considerable foe in wasted resources. Zombies!

How many times have you stood up a new environment and migrated VMs, only to come across an old crusty Windows 2000 Advanced Server with Pervasive SQL (oh god, Btrieve!) still lurking in the lower regions of your VM sprawl? Yes, this lower denizen, or Zombie, had its roots in good intentions. You see, in the olden days, VMs sprang into consciousness for all sorts of development duties. However, due to a lack of regulation and attention, they began to languish. Further, these zombies can be multi-headed (from decaying snapshots) and fiendishly hungry (4 vCPUs for your ColdFusion VM, or 8 GB of RAM for your 3 SQL instances that just have to mimic production). Before we dig out the shotgun and gas up the chainsaw, let’s look at the characteristics of a VM Zombie.

Know your Enemy: ZOMBIES!

First, it’s important to understand the distinction between a VM Zombie and a zombie process. Think of them as greater and lesser zombies. A VM Zombie is a virtual machine that has been left to suck resources yet perform no real task. Its idle hands merely wish to devour tasty RAM and CPU cycles. Let it also be known that VM Zombies can drop chunks of body parts (folders/VMDKs/files) on your storage array as they stumble through your environment. Gross! A zombie process on an ESX host is a process that is dead, and you cannot kill the dead. We will be focusing on the VM Zombies, so grab your laptop (blunt object trauma!), survival rations (beer), and let’s learn about virtualization’s Great Menace!

To kill zombies, we need tools. Big, sharp, loud, gunpowder-based tools. We’ve got two great tools on tap, one paid and one free, that I turn to in times of the undead feasting on the living flesh of my hosts. These tools are well known in the community and lots of information can be found about them.

VKERNEL – Optimization Pack

VKernel makes a number of excellent tools, both paid and free. When it comes to decapitating the Zombie Menace, I have to say I am very pleased with what this application brings to the table. First and foremost, as your virtual infrastructure scales (and sometimes sprawls), it can lead to lots of Zombie action. Wastefinder (a part of this pack) is absolutely brilliant. Not only does it help you find the roving Zombie horde (snapshots included), it provides the empirical and historical data to back it up. This goes hand in hand with rightsizing VMs, which can be troublesome if you get pushback from application owners, management, etc. Virtualizetips.com has had the privilege of previewing these tools, and I agree that their Gold award from VMworld 2010 is well deserved. Take that, you brain-slurping bastards!

http://www.vkernel.com/products/optimization-pack

RVTOOLS

I have been using this tool in my home lab for quite some time, and also at my various locales of employ. Being free, this is a very solid app for tracking down zombie bits (both hunks of dead flesh on your storage array, and also snapshots and unregistered, rogue VMs). It will also export in a nice CSV format so you Excel nerds can get your game on. I really like the quick-and-dirty view it gives of an environment; it lets me zoom in and get the info I want quickly. At a glance you also get so much more: build numbers, VMware Tools levels on your VMs, and all the hardware details for your VMs. This is a great keep-on-the-laptop tool if you have many clients to visit and need a quick assessment of their hotspots.

http://www.robware.net/

Coming in Part 2: more tools, tips, and survival tactics to bring the fight to the Zombie Horde! Groovy!


War declared on VM Zombie Nation!

I’ll teach you VM Zombies what for! Just wait for me to assemble the VMnomicon (unholy book of the Virtual)! Soon you will know the weapons we use to thwart you (and it’s not just virgins!)

As you can see from the image below, our first-line protector is ready to use the full power of our monitoring and chargeback tools to take down the Zombies. She is already waiting in the datacenter to blast the offenders.

Also, big thanks for posting my email retort!

http://www.zombievm.org/contact.html


SSSU and You!

Now, in a previous post, I mentioned that I would talk more about SSSU, especially about how to export its information and put it into a human-readable format. SSSU can output XML, but that requires some type of XML parsing tool. I did mention that Microsoft’s Log Parser tool could be used, but really, I’m lazy, and it’s a bit cumbersome. And truth be told, I never got it to work just right.

So last night I sat down and did some old-fashioned thinking about how I can get the information I need, but easier and with less effort. I started toying with PowerShell, as it’s a good friend in the VMware space. I have to confess, though, that I have a Mac. Yes, I enjoy monotasking. Nothing wrong with that 😉 I then realized (as I sometimes forget) that I had a whole delicious CLI on my Mac. This includes great tools like grep and diff! Well, and a lot more, but I’ll stick to those for now. (Note that you can get grep and diff for Windows, either via native Windows ports of those tools or by using Cygwin.) Fear not, Win32 folks, you are covered!

Let me take a step back and cover the premise for wanting to gather information from SSSU. As we know, Command View is a web-based interface for communicating with the EVAs. While it does provide lots of information, it is troublesome to navigate around and pull that information out easily. One-off kinds of things, certainly, but gathering lots of info easily, I think not. One big flaw, in my mind, is that CV talks to the EVAs via the SAN, not via IP. Why is this a flaw? Well, for instance, SSSU can’t talk directly to the EVAs themselves. Rather, it has to talk to the CV server (which is why it prompts you for a “manager” when you fire it up). This also means you can’t use SSSU to do anything if your CV server has bitten the dust. But I digress.

From the arrays, I want to gather information on my vdisks, my controllers, snapshots, disks, and my disk groups. I want to gather some information once, some monthly, and some on a more regular basis.

For the vdisks, I run this command via SSSU: LS VDISK FULL > vdisk.txt (this outputs the information to a text file in the directory where sssu.exe is located). Then I fire up my command line and grep that sucker for some info:

grep -E "familyname|allocatedcapacity|redundancy|diskgroupname" vdisk.txt > date_vdisk.txt

This gives me a date-stamped file containing just the information I am looking for (the -E flag is what makes grep treat the | as an “or”).

As stated before, I am quite lazy, so I could use (or you could use) awk (another great command-line text processor) to generate the output in a better format; I’ll sketch that idea a bit further down, but for now I keep it like this. Note that allocatedcapacity is the vdisk size in GB. Now, since I’m generating these files monthly, I can use the diff command to compare two months and see what has changed (disk growth, adds, deletes, etc.):

diff -y date_vdisk1.txt date_vdisk2.txt | grep -E "familyname|allocatedcapacity"

Note the pipe into grep there. The older date is on the left and the newer date is on the right, so it’s easy to see what has changed and by how much. Arguably you could make this even easier, but again, lazy. And this works for me, so your mileage may vary.

Since these are simple text files, it’s easy and pain-free to keep them around. Overall, I use this vdisk information to track growth, to see at a glance what size and RAID level each vdisk is, and to pull out which controller owns which vdisks.
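If you do want the tidier format I keep threatening, here is roughly what that awk idea might look like. It assumes each property in the SSSU output appears as a name : value pair, with familyname starting each vdisk's block, so adjust the field separator to match what your SSSU version actually prints:

# Collapse the grep output into one comma-separated line per vdisk:
# familyname,allocatedcapacity,redundancy,diskgroupname
grep -E "familyname|allocatedcapacity|redundancy|diskgroupname" vdisk.txt |
awk -F':' '
  { val = $2; gsub(/^[ \t.]+|[ \t]+$/, "", val) }    # trim dots and whitespace around the value
  /familyname/ { if (line != "") print line; line = val; next }
               { line = line "," val }
  END          { if (line != "") print line }
'

One line per vdisk makes the monthly diff output a lot easier to scan.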

This leads me into what information I grab from my controllers. Now, one thing to note: the EVA4400 series only lists one controller (in this case Controller 1). This is because of how it is designed: both controllers are in the same housing, sharing a backplane. We have three 8100-series arrays, each with two physically separate controllers, listed as Controller 1 and Controller 2.

First, to find out ALL the info on your controllers, do LS CONTROLLER FULL in SSSU. The output will be big and full of interesting details. One other thing to note: SSSU denotes them as Controllers 1 and 2, while Command View denotes them as Controllers A and B. Lame! I don’t need to keep controller info the way I do vdisk info; I just do an initial grab after an XCS code update to keep handy.
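For that initial post-XCS-upgrade grab, I just grep the controller dump for the handful of fields I care about. This is a sketch only; the field names here (controllername, serialnumber, firmwareversion) are my guesses at the LS CONTROLLER FULL output, so check your own dump first:

# In SSSU: LS CONTROLLER FULL > controllers.txt
# Keep a dated copy of the fields worth remembering after an XCS upgrade
grep -E "controllername|serialnumber|firmwareversion" controllers.txt > date_controllers.txt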

One pretty handy way to find out what snapshots you have on any given EVA at a point in time is to use LS SNAPSHOT. You can also do an LS SNAPSHOT FULL if you want the full info per snapshot (like the vdisk info). The key difference between a snapshot and a normal vdisk is the line item sharingrelationship: a normal vdisk will show none, but a snapshot will say snapshot.
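So a quick way to list just the snapshots out of a full dump is to key off that field, using the same grep trick as before (assuming the snapshot output carries a familyname field like the vdisks do):

# In SSSU: LS SNAPSHOT FULL > snaps.txt
# Then pair each snapshot's name with its sharing relationship
grep -E "familyname|sharingrelationship" snaps.txt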

When it comes to gathering information on disks, I use this primarily to check firmware levels. If you are an EVA owner, you know that part of the care and feeding of an EVA is making sure all drives are at the most current firmware level. Drive firmware updates are usually bundled with XCS updates. One thing to be aware of is that with drive failures, replacements may not always arrive with the latest firmware; they should, but I have not always seen that. Thankfully, firmware updates are non-invasive (for the most part). I will cover an XCS code upgrade in a future blog post (our EVA4400 is due).

So, if you do LS DISK FULL from SSSU, you will get all the info from each disk. You can then just grep for fun and profit!

grep -E "diskname|firmware" disks.txt

So you are saying, “Hey, that’s great, but I have multiple kinds of disks in my EVA.” You need to know the drive model so you can keep straight which firmware goes with which drive. The easy way is to just sort by disk group. Since you built your EVA correctly, you know only to put drives of the same type and speed into the same group, right? 😉

grep -E "diskname|diskgroupname|firmware" disks.txt

You can also grab the model number for the drives by tossing modelnumber into the grep.
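To get a quick count of how many drives of each model are on each firmware revision, push the same grep through a little awk and uniq. A sketch, assuming modelnumber shows up before the firmware line for each disk in the LS DISK FULL output:

# Pair each disk's model with its firmware revision, then count each combination
grep -E "modelnumber|firmware" disks.txt |
awk -F':' '
  { val = $2; gsub(/^[ \t.]+|[ \t]+$/, "", val) }
  /modelnumber/ { model = val; next }
  /firmware/    { print model, val }
' | sort | uniq -c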

And finally, since you are all probably bored to tears by now, I grab the size of each disk group to see what kind of overall growth or change occurs in each group. I can also use this information to plan out new server storage requests and to manage to 90% capacity. It’s easier to give management a shiny graph saying, “Look, almost out of disk. Checkbook time!”
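Since all of this boils down to a handful of LS commands run on a schedule, the capture itself can be scripted. A rough sketch of a monthly grab, assuming your SSSU build accepts a command file (the SELECT MANAGER / SELECT SYSTEM syntax and the FILE argument are worth checking against HELP in your version), with myCVserver and myEVA standing in for your Command View host and array name:

# monthly_grab.txt -- commands for SSSU to run in order:
#   SELECT MANAGER myCVserver USERNAME=admin PASSWORD=secret
#   SELECT SYSTEM myEVA
#   LS VDISK FULL > vdisk.txt
#   LS DISK FULL > disks.txt
#   LS DISK_GROUP FULL > diskgroups.txt
#   LS CONTROLLER FULL > controllers.txt

# Run the command file non-interactively, then date-stamp the output files
sssu "FILE monthly_grab.txt"
for f in vdisk disks diskgroups controllers; do
  mv "$f.txt" "$(date +%Y%m)_$f.txt"
done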

Okay, so that about wraps up what I use SSSU for. If I think of anything else neat that I do, I’ll be sure to blog about it. The next blog topic will be WEBES: its use, the install, and the fact that it actually works pretty well.
