Cisco, VMware: Nexus 1000v Tracking VM Interface Errors

So I have found the first use for our new Cisco Nexus 1000v virtual switch. The counters (stats) were reset about a week ago, and this is the first time they have been reviewed since. Almost all virtual machines show no errors, but a few had high counts.

After connecting to the console, I ran the following command.
nexus1kv# sh interface counters errors

--------------------------------------------------------------------------------
Port       Align-Err     FCS-Err    Xmit-Err     Rcv-Err   UnderSize OutDiscards
--------------------------------------------------------------------------------
Veth30            --          --          --          --          --     1127136
Veth69            --          --          --          --          --       29654
Veth70            --          --          --          --          --       31966
Veth71            --          --          --          --          --       39625

The output above is only a subset of the entries; most of the remaining interfaces had few or no errors. The main problem interface is Veth30, which turns out to be w2k_serv01 (Windows).
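If you would rather not eyeball the table, a small script can pick out the noisy ports. This is only a rough sketch: it assumes the output of "sh interface counters errors" has been saved as text in exactly the column layout shown above, and the names parse_error_counters, noisy_ports, and the threshold of 1000 are my own illustrative choices, not anything provided by the Nexus 1000v.

```python
SAMPLE = """\
nexus1kv# sh interface counters errors

--------------------------------------------------------------------------------
Port       Align-Err     FCS-Err    Xmit-Err     Rcv-Err   UnderSize OutDiscards
--------------------------------------------------------------------------------
Veth30            --          --          --          --          --     1127136
Veth69            --          --          --          --          --       29654
Veth70            --          --          --          --          --       31966
Veth71            --          --          --          --          --       39625
"""


def parse_error_counters(text):
    """Parse saved 'sh interface counters errors' output into {port: {column: count}}."""
    columns = None
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or set(line.strip()) == {"-"}:
            continue  # skip blank lines and dashed separator rows
        if fields[0] == "Port":
            columns = fields[1:]  # header row names the counter columns
            continue
        if columns is None or len(fields) != len(columns) + 1:
            continue  # skip the prompt line and anything that is not a data row
        # "--" (or any non-numeric value) is treated as 0 in this sketch
        counters[fields[0]] = {
            name: int(value) if value.isdigit() else 0
            for name, value in zip(columns, fields[1:])
        }
    return counters


def noisy_ports(counters, column="OutDiscards", threshold=1000):
    """Return (port, count) pairs whose counter exceeds threshold, highest first."""
    hits = [(port, row[column]) for port, row in counters.items()
            if row[column] > threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for port, count in noisy_ports(parse_error_counters(SAMPLE)):
        print(port, count)
```

Sorting highest first puts the worst offender (Veth30 here) at the top, which is usually the one you want to chase first.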

Verifying that the VM is indeed “w2k_serv01”
nexus1kv# sh interface Veth30 description

-------------------------------------------------------------------------------
Interface                Description
-------------------------------------------------------------------------------
Vethernet30              w2k_serv01, Network Adapter 1

The following just verifies what “sh interface counters errors” reported.
nexus1kv# sh interface Veth30 | inc Drops

63 Input Packet Drops 1127136 Output Packet Drops
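The same cross-check can be scripted. The sketch below assumes the drops line looks exactly like the one above; parse_drops is a hypothetical helper name of mine, not a Cisco tool.

```python
import re

# Matches a line such as "63 Input Packet Drops 1127136 Output Packet Drops"
DROPS_RE = re.compile(r"(\d+)\s+Input Packet Drops\s+(\d+)\s+Output Packet Drops")


def parse_drops(line):
    """Return {'input': n, 'output': m} from a drops line, or None if it does not match."""
    match = DROPS_RE.search(line)
    if match is None:
        return None
    return {"input": int(match.group(1)), "output": int(match.group(2))}


if __name__ == "__main__":
    drops = parse_drops("63 Input Packet Drops 1127136 Output Packet Drops")
    out_discards = 1127136  # OutDiscards value reported for Veth30 in the summary table
    print("counters agree:", drops["output"] == out_discards)
```

In this case the Output Packet Drops value matches the OutDiscards column from the summary table, so both commands are reporting the same counter.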

The other 3 ports are all assigned to linux_host01 (Linux).
nexus1kv# sh interface description | inc linux_host01

Veth69                   linux_host01, Network Adapter 1
Veth70                   linux_host01, Network Adapter 2
Veth71                   linux_host01, Network Adapter 3
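With more than a handful of VMs, it can help to group the Veth ports by VM name so that one misbehaving guest with several adapters stands out. The sketch below assumes descriptions follow the "vm_name, Network Adapter N" pattern seen in the output above; ports_by_vm is a hypothetical helper name, and shortening "Vethernet" to "Veth" is just my way of lining the names up with the counters table.

```python
from collections import defaultdict

SAMPLE_DESC = """\
-------------------------------------------------------------------------------
Interface                Description
-------------------------------------------------------------------------------
Vethernet30              w2k_serv01, Network Adapter 1
Veth69                   linux_host01, Network Adapter 1
Veth70                   linux_host01, Network Adapter 2
Veth71                   linux_host01, Network Adapter 3
"""


def ports_by_vm(text):
    """Group Veth ports by the VM name found before the first comma of the description."""
    vms = defaultdict(list)
    for line in text.splitlines():
        fields = line.split(None, 1)  # split into port and the rest of the line
        if len(fields) != 2 or not fields[0].startswith("Veth"):
            continue  # skip headers, separators, and prompt lines
        port, description = fields
        port = port.replace("Vethernet", "Veth", 1)  # use the short form from the counters table
        vm_name = description.split(",", 1)[0].strip()
        vms[vm_name].append(port)
    return dict(vms)


if __name__ == "__main__":
    for vm, ports in ports_by_vm(SAMPLE_DESC).items():
        print(vm, ports)
```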

After a little more investigation, it turned out that “w2k_serv01” was deployed from a legacy template that was created on our old AMD cluster. In Windows 2003, the network interface showed up as an “AMD …” network adapter.

To try to correct these errors, I removed the current network adapter through VMware vCenter and added a new one. This forced the OS to see the network interface as a new device and install fresh drivers. Now it shows up in Windows as an Intel network interface.

Now that the new network adapter has been installed, I reset the statistics (and errors) on all the interfaces.
nexus1kv# clear counters

This command will clear "show interface" counters on all interfaces
Do you want to continue? (y/n)  [n] y

In the Nexus 1000v, the interface still shows up as Veth30. It has been a few days since the new network adapter was added and no errors are present.
nexus1kv# sh interface counters errors | inc Veth30

Veth30            --          --          --          --          --           0

I wrote this up just to show that even virtual network interfaces can have errors. Most likely the cause is the VM's OS or driver. Previously, I would have had to use esxcfg-info from the ESX host CLI, try to grep out/sort through to find a VM's network interface, and hope to find an error field there. Yeah, that was a run-on sentence. Sorry.

~ by Kevin Goodman on June 7, 2010.

3 Responses to “Cisco, VMware: Nexus 1000v Tracking VM Interface Errors”

  1. [...] This post was mentioned on Twitter by Kevin Goodman, unix player. unix player said: RT @colovirt: Cisco, VMware: Nexus 1000v Tracking VM Interface Errors – http://wp.me/pm3nc-dx [...]

  2. That’s really cool! I didn’t realize the 1000v kept track of interface errors like the physical switches do. Thanks for posting about it.

  3. [...] is the original post: Cisco, VMware: Nexus 1000v Tracking VM Interface Errors … Tags: cisco, interface, nexus, realize-the-, really-cool, topsy-com, tracking, [...]
