Cisco UCS 6120 Startup Times

•July 30, 2010 • Leave a Comment

I like to time almost everything, so naturally during my last maintenance window in our UCS environment I wanted to determine how long it took the 6120 to reboot. I wound up having to restart the 6120 multiple times and found the following times to be pretty consistent: Continue reading ‘Cisco UCS 6120 Startup Times’

VMware: ESX Restart A Hung Virtual Machine

•February 9, 2009 • 7 Comments

This is a last-resort post. I recommend always starting with the GUI.  If that doesn’t work, use vmware-cmd to try to stop the VM.  If all else fails, follow the steps below.  First you need to know which ESX server the VM is running on. The easiest way is to click on the VM you are looking for and go to the summary tab. The ESX server it resides on will be in the “General” section under “Host”. The other method is to connect to each of the ESX hosts and run the following:

You have to su to root for this command to work

[root@esxserverroot]# vmware-cmd -l
/vmfs/volumes/4844ace-b448s7d4-74e-000055858ed/exchangeserv/exchangeserv.vmx
/vmfs/volumes/4844ace-b448s7d4-74e-000055858ed/sql01/sql01.vmx
/vmfs/volumes/4844ace-b448s7d4-74e-000055858ed/domaincontrol01/domaincontrol01.vmx
/vmfs/volumes/4844ace-b448s7d4-74e-000055858ed/tstxp04/tstxp04.vmx
/vmfs/volumes/4844ace-b448s7d4-74e-000055858ed/monitor5/monitor5.vmx

This command lists all the currently running virtual machines on the individual ESX host. It also displays the full path to each vmx file, which is very useful when using vmware-cmd.
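As a rough sketch of how that listing could be put to use, the snippet below turns "vmware-cmd -l" style output into a lookup from VM name to vmx path. It assumes the output has already been captured as text (the sample is copied from the listing above), and the function name `vmx_paths_by_name` is made up for illustration; it is not part of any VMware tool.

```python
from pathlib import PurePosixPath

# Two lines copied from the sample "vmware-cmd -l" output above.
SAMPLE_OUTPUT = """\
/vmfs/volumes/4844ace-b448s7d4-74e-000055858ed/exchangeserv/exchangeserv.vmx
/vmfs/volumes/4844ace-b448s7d4-74e-000055858ed/sql01/sql01.vmx
"""


def vmx_paths_by_name(listing):
    """Map each VM name (the vmx file name without extension) to its full path.

    Hypothetical helper: it only parses text, it does not talk to ESX.
    """
    vms = {}
    for line in listing.splitlines():
        line = line.strip()
        if line.endswith(".vmx"):
            vms[PurePosixPath(line).stem] = line
    return vms


vms = vmx_paths_by_name(SAMPLE_OUTPUT)
print(vms["sql01"])
```

With the name-to-path mapping in hand, the full vmx path for a hung VM can be looked up by name instead of being copied by eye.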

Now let’s pass -axfww to ps. This will give us the

Continue reading ‘VMware: ESX Restart A Hung Virtual Machine’

SAN/Networking/Linux: Using Multipath To Verify and Troubleshoot Connectivity To FC LUNs

•May 11, 2009 • Leave a Comment

A few days ago there was an error on one of our Cisco MDS 9120 fiber switches. The current environment at this datacenter consists of two Cisco MDS 9120 SAN switches with servers redundantly connected between the two. These switches are used to connect the servers to our fiber channel (FC) storage systems. In this case, the servers are generally mapped to an EMC, a NetApp, and two RamSans. Below is a basic example of what can be expected from multipathing in a Linux environment when there is a loss of connectivity to one leg of the fiber network.

Continue reading ‘SAN/Networking/Linux: Using Multipath To Verify and Troubleshoot Connectivity To FC LUNs’

SAN / EMC: Clariion CX4 Solid State DAEs (Shelves)

•June 3, 2009 • Leave a Comment

While going over the solid state offerings for the EMC Clariion lines, Texas Memory RamSans came into the conversation.  This was because we currently run 2 RamSans in our environment and consider them the highest tier storage in our datacenters.  One is 128 gigs of solid state DRAM storage and the other is 2 terabytes of solid state Flash storage with a 64 gig DRAM cache.

Per the title, this is really about the EMC Clariion, not RamSans.  Since the RamSan 500 was fronted with the DRAM cache, and the EMC CX4 series contains cache as well, I was curious.  I already knew that each Service Processor (SP) in the EMC has 4 gig of cache, and that a LUN can only be active on one SP at a time.  Also, per a previous blog post, each DAE has a theoretical max throughput of 8 gigabit per second, 4 gigabit if a single LUN stripes across the whole shelf.

Continue reading ‘SAN / EMC: Clariion CX4 Solid State DAEs (Shelves)’

Networking / SAN: Cisco MDS 9000 Serial Number (Licensing)

•August 7, 2009 • Leave a Comment

So you need to find the serial number on your Cisco MDS 9000 series fiber switch? This is easy enough, although “show serial number” would have been better.

Quick way to find your serial number.

tstSwitch01# show license host-id
License hostid: VDH=SOZ115568P9
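If the host ID needs to be collected from several switches, the line can be split apart in a few lines of Python. This is only a sketch: it assumes the output looks exactly like the sample above and, following the post, treats the value after the "=" as the serial number. The function name `serial_from_hostid` is invented for this example.

```python
def serial_from_hostid(line):
    """Return the value after '=' in a 'License hostid: XXX=SERIAL' line.

    Hypothetical helper; assumes the format shown in the sample output.
    """
    label, colon, value = line.partition(":")
    if not colon or label.strip().lower() != "license hostid":
        raise ValueError("not a 'License hostid' line: %r" % line)
    _prefix, equals, serial = value.strip().partition("=")
    if not equals or not serial:
        raise ValueError("no '=' separated value in: %r" % line)
    return serial


print(serial_from_hostid("License hostid: VDH=SOZ115568P9"))
```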

Continue reading ‘Networking / SAN: Cisco MDS 9000 Serial Number (Licensing)’

Linux / Security: Sudo ‘sudo su -’ vs ‘sudo -s’

•August 18, 2009 • 6 Comments

I always use ‘sudo su -’ when I need to get to a root shell. I have seen a few people before, and recently a new co-worker, use ‘sudo -s’. Since I could not remember offhand the actual differences between the two, I had to check. The following will run through the actual limitations.

The big differences when using ‘-s’ are listed below
Continue reading ‘Linux / Security: Sudo ‘sudo su -’ vs ‘sudo -s’’

Networking, SAN: Cisco MDS Switch Scheduled Backups

•November 10, 2009 • Leave a Comment

Most people are good about making backup copies of their configuration before changes, but everyone makes mistakes eventually. To me the risk is not worth it, so this will be dedicated to automating Cisco TFTP backups of configurations. Most server administrators have automated tasks using either Cron (Linux/Unix) or Windows Scheduler. Cisco IOS also has the ability to schedule tasks.

I am very picky when it comes to my Cisco devices. A lot of the information I read on this had the schedule execute “copy running-config startup” and would only back up one configuration. This is not a good thing, especially when there are multiple device managers. Below I will go through setting up two jobs that back up both the running and saved configurations to different files daily. Continue reading ‘Networking, SAN: Cisco MDS Switch Scheduled Backups’

VMware, Linux: Install VMware Tools On RedHat Based Systems

•January 12, 2010 • 1 Comment

The following is a quick overview of installing VMware Tools on RedHat, CentOS, and Fedora systems, specifically for VMware ESX, ESXi, and vSphere systems.

First, go into the VMware console and right-click on the VM (Virtual Machine) that you are going to install VMware Tools on.  Select the “Install/Upgrade VMware Tools” option from the list.  Below is a screen shot of the menu.

VMware Tools Menu

Continue reading ‘VMware, Linux: Install VMware Tools On RedHat Based Systems’

Storage, Network: What I have Been Doing (EMC,Cisco UCS)

•March 2, 2010 • 1 Comment
This is more of an informational update of things that I have going on right now.  I normally do not publish day-to-day type of things, but here we go.

Storage
We have received both replacement drives for our EMC Clariion CX3-40 and four new DAEs (disk shelves) for our CX4-240.

Clariion CX3
The CX3 was originally bought to speed up our Oracle implementation.  This was accomplished by ordering lots of fast disks (spindles) that were small.  We wound up with 6 DAEs filled with 73gig 15kRPM disks, totalling 90 dedicated drives for Oracle.
This was great for the original purpose but the unit was replaced a year after initial deployment with a RamSan and EMC CX4.  Having been decommissioned from production and moved to the tier 2 site, the need for space over IOPS (speed) drastically increased.  Trying to keep performance and space requirements in balance, the decision has been made to go with a smaller RamSan for Oracle at the tier 2 site.  This gives us the ability to replace the small 73 gigabyte drives with bigger 600 gigabyte 10kRPM disks.  Replacing those disks with the same quantity of 600 gig ones will give us ~8 times as much space.
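The "~8 times" figure can be checked with some quick arithmetic. The sketch below uses only the numbers from the post (90 drives, 73 gig versus 600 gig) and compares raw capacity; it deliberately ignores RAID overhead, hot spares, and formatting loss, so usable space will be lower in both cases.

```python
DRIVE_COUNT = 90      # dedicated Oracle drives mentioned above
OLD_DRIVE_GB = 73     # original 15kRPM disks
NEW_DRIVE_GB = 600    # replacement 10kRPM disks

# Raw capacity only: no RAID, hot spare, or formatting overhead is modeled.
old_raw_gb = DRIVE_COUNT * OLD_DRIVE_GB
new_raw_gb = DRIVE_COUNT * NEW_DRIVE_GB
ratio = new_raw_gb / old_raw_gb

print(f"{old_raw_gb} GB -> {new_raw_gb} GB raw, about {ratio:.1f}x")
```

Because the drive count stays the same, the ratio is simply 600 / 73, which is where the roughly 8x comes from.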

Networking, UCS: Cisco Nexus License Installation

•April 8, 2010 • 2 Comments

The following is a quick run through of installing a license file for a Cisco Nexus device.

First check and see if there are any current licenses installed. This being a new switch, there aren’t any.

nexus_5000# sh license
nexus_5000#

Licenses are normally tied to “host IDs”. Below shows the host ID of the switch.

nexus_5000# sh license host-id
License hostid: VDH=3WI246599Z

The license file is not local to the switch yet, so below I fetch it via TFTP. Continue reading ‘Networking, UCS: Cisco Nexus License Installation’

Hardware, Linux, Networking: Cisco UCS Time Problem

•April 28, 2010 • 3 Comments

So we have our new Cisco UCS system installed and a weird problem is showing up. The Cisco UCS Manager console shows the correct date (2010), but when setting up a new server, the date is incorrect, even though a working NTP server is configured.  Since we mainly run Linux here, the NTP service will not update the date/time from the NTP server because the difference between the system clock and NTP time is too large.  On a Windows 2008 install we had to manually adjust the time/date as well.
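The Linux behavior described above is consistent with ntpd's default "panic threshold" of 1000 seconds: when the offset is larger than that, ntpd gives up rather than stepping the clock (unless it is started with -g). The sketch below just models that check. The two timestamps are made-up examples, since the post does not say what the wrong date actually was, and the function name is invented for illustration.

```python
from datetime import datetime, timezone

# ntpd's default panic threshold, in seconds.
PANIC_THRESHOLD_SECONDS = 1000


def ntpd_would_refuse(system_time, ntp_time, threshold=PANIC_THRESHOLD_SECONDS):
    """Return True if the clock offset exceeds the panic threshold.

    Hypothetical helper that only models the comparison; it does not query NTP.
    """
    offset = abs((ntp_time - system_time).total_seconds())
    return offset > threshold


# Hypothetical values: a blade that booted with a stale clock vs. real time.
bad_clock = datetime(2009, 1, 1, tzinfo=timezone.utc)
real_time = datetime(2010, 4, 28, tzinfo=timezone.utc)

print(ntpd_would_refuse(bad_clock, real_time))
```

In that situation the clock has to be set by hand (or stepped once with a tool such as ntpdate) before ntpd will keep it in sync.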

This is not a major issue, just an annoyance.  We also use RedHat Satellite server and cannot join the patch management system with the incorrect date. So my question is: where does the OS get its bad time from? I figured that the OS gets the time from the BIOS and that the BIOS would obtain the information from UCS Manager. That does not appear to be the case.

Continue reading ‘Hardware, Linux, Networking: Cisco UCS Time Problem’

Storage, SAN, Linux: EMC PowerPath Configuration On Cisco UCS

•May 4, 2010 • Leave a Comment

The following is a walk through of installing EMC PowerPath software on RedHat based Linux hosts (CentOS/Fedora). This is required to fully utilize multiple paths to EMC SANs. The test server used here is a Cisco UCS B250-M1 blade running FCoE over 10gb Ethernet. The configuration steps work for iSCSI, Fiber Channel, and FCoE connectivity to Clariion systems.

First, copy the RPM installation package over to the server. Below shows the package to be installed.

[root@test_server01 user01]# ll
total 7036
-rw-r--r-- 1 user01 user01 7191661 Apr 27 09:24 EMCpower.LINUX-5.3.1.00.00-111.rhel5.x86_64.rpm

Install the package via “rpm -i”. Continue reading ‘Storage, SAN, Linux: EMC PowerPath Configuration On Cisco UCS’

Cisco, Networking: VMware Cisco Nexus 1000v Increasing Default Max Ports

•May 5, 2010 • 2 Comments

Earlier today we ran into an issue with exceeding the default max number of (virtual) ports on a port profile.  The default is set to 32 and the max is 1024 per vethernet port profile.  I really can’t believe how low the default setting is.  Most VMware people I know consolidate waaayyyy more than 32 hosts per datacenter.

Below shows the error in VMware Virtual Center.

Continue reading ‘Cisco, Networking: VMware Cisco Nexus 1000v Increasing Default Max Ports’

Cisco UCS Implementation In Pictures: Part 1 Nexus 5010 And 2148 Fabric Extenders

•May 6, 2010 • 2 Comments

Below is a collection of photos from phase 1 of our Cisco UCS implementation.  This consisted of installing Nexus 5010s with 2148 gigabit Ethernet fabric extenders to replace a few existing switches in the datacenter.  In doing so, we were able to move our existing physical (IBM 3850M2) VMware cluster to 10gE and FCoE.

Doing so allowed us to free up fiber ports on our Cisco MDS 9124 switches needed later on in the UCS blade chassis implementation.  I love pictures of equipment so I figured I would share.

Floor Jack Loader

Continue reading ‘Cisco UCS Implementation In Pictures: Part 1 Nexus 5010 And 2148 Fabric Extenders’

Cisco UCS Implementation In Pictures: Part 2 Nexus B250 And B200 External/Internals

•May 10, 2010 • 1 Comment

Below is a collection of photos from phase 2 of our Cisco UCS implementation.  This consisted of installing the Cisco UCS Blade Chassis, blade switches and server blades.

Unpacked B200-M1 Blades

Continue reading ‘Cisco UCS Implementation In Pictures: Part 2 Nexus B250 And B200 External/Internals’

Storage, Networking, EMC: DataDomain Replication

•May 26, 2010 • 1 Comment

We have a total of 3 DataDomains currently in production: one at our “Tier 1” site, one at “Tier 2” (DR), and one in Europe. All DataDomain appliances have the ability to replicate data among themselves. This will be a general overview of how to set up replication between two DataDomains.

On the source, I have already set up a directory tree for “/backup/europe_data”. All files destined for our Europe office will be placed here. On the DataDomain devices, replicated folders are added manually. By default none are replicated. Continue reading ‘Storage, Networking, EMC: DataDomain Replication’

Cisco, VMware: Cisco UCS B250-M1 VMware Consolidation Ratio (Oracle DBs)

•June 1, 2010 • 5 Comments

So a few days ago, I put out a post on the VMware Virtual Machine consolidation ratio I saw on our Cisco B200-M1 blades. This post will go over the same for full width B250-M1 blades.

The server itself is running mainly Oracle database VMs.  Blade specs are as follows:

  • Dual – Quad core Intel Xeon X5570 2.93 GHz CPUs
  • 98 gigs of RAM

Since the Oracle VMs are running semi-intensive databases, the RAM allocated to the heavy hitters is between 8 and 10 gigs.
Continue reading ‘Cisco, VMware: Cisco UCS B250-M1 VMware Consolidation Ratio (Oracle DBs)’

    Cisco UCS Implementation In Pictures: Part 3 The Finished Product

    •June 3, 2010 • 1 Comment

    We finished up our Cisco UCS installation about 2 months ago, but have been too busy to get the last set of pictures up.  Sorry for the delay!  Just in case you don’t follow me on Twitter, UCS is working out nicely.

    Below are pictures of the full rack that includes:

    • Two Cisco UCS Chassis fully populated
    • Two Cisco Nexus 5010 switches
    • Two Cisco 6120XP switches
    • Two horizontal PDUs (Circuit 1)
    • Two vertical PDUs (Circuit 2)

    Continue reading ‘Cisco UCS Implementation In Pictures: Part 3 The Finished Product’

    Cisco, VMware: Nexus 1000v Tracking VM Interface Errors

    •June 7, 2010 • 3 Comments

    So I have found the first use of our new Cisco Nexus 1000v virtual switch. The counters (stats) were reset about a week ago and this is the first time they have been reviewed since. Almost all virtual machines show no errors, but a few had high error counts.

    After connecting to the console, the following command was run. Continue reading ‘Cisco, VMware: Nexus 1000v Tracking VM Interface Errors’

    Storage, SAN: EMC Clariion LUN Trespass

    •June 14, 2010 • 1 Comment

    So we are in the process of working with EMC to have 10gb iSCSI interfaces installed in our CX4. I knew that we had a lot of trespassed LUNs and that those had to be corrected before the install could take place. This post will go through the process of using the CLI (Command Line Interface) to find out which LUNs are trespassed and to move them back.

    I will be using “naviseccli” for this, but the same syntax should work with “navicli”. Both are available through the EMC Powerlink site.

    Below will query the Clariion and return a list of all LUNs that are trespassed. The -h specifies which host (SP) to connect to. You only need to run this on one SP. The other would report back the exact same results. Continue reading ‘Storage, SAN: EMC Clariion LUN Trespass’

    Hardware: Cisco UCS Memory Bug B250 Blades

    •June 16, 2010 • 5 Comments

    We started receiving DIMM (RAM) errors in our UCS environment about a week ago. This has only occurred on our B250-M1 blades. The error could be found both in the UCS System Manager (Java GUI) and from within the CLI (Command Line Interface).

    Initial RAM Error - Server 1

    Continue reading ‘Hardware: Cisco UCS Memory Bug B250 Blades’

    Cisco UCS IO Module Fabric Port Pinning

    •June 30, 2010 • Leave a Comment

    Recently we have had issues with port 1 on IOM (IO Module) 1 in chassis 2. In the process of troubleshooting, Cisco sent out a replacement IO module hoping it would solve the issue. One thing I did not know is that once the “fabric ports” on the IOM have been wired, the IOM must keep the same cabling.

    I had the great (not really) idea to swap cable 1 and 2 on the IOM while I was replacing the module. I figured that if it was a port issue on the 6120 blade switch that the problem would move to the new port. Turns out the IOM freaked out when this happened and downed the ports that were not where they had been.

    The part that made it more difficult to track down was that the “fabric ports” (links from chassis to switch) all showed green. It was 3 of the 8 backend ports on the IOM that were errored.

    Here is a server interface that was pinned to a removed port from the UCS Manager CLI. Below shows the downed interfaces as “Error disabled”.

    ucsm01-A(nxos)# sh interface brief | inc "Error disabled"
    Eth2/1/3      1      eth  access down    Error disabled              10G(D) --

    Once the 10g cables (SFPs) were swapped back to their original locations, these ports came back. Here the status now shows “up”.

    ucsm01-A(nxos)# sh interface brief | inc Eth2/1/3
    Eth2/1/3      1      eth  access up      none                        10G(D) --

    Again, the resolution here was to put the cables back in their original locations. Even with this being a new IO module, UCS loaded it with the configuration from the original card. While talking with Cisco TAC, they said that the development group is working on making the IO module switch uplinks swappable (dynamic) and not statically pinned.

    This really wasn’t a major issue. Because UCS has redundancy built in, we never lost connectivity to a server. It’s just one of those “gotchas” that I ran into while troubleshooting a port.
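For anyone collecting this output from several fabric interconnects, the same "Error disabled" filter used above can be sketched in Python. The two sample lines are copied from the output in this post; the function name `error_disabled_interfaces` is made up for illustration, and the parsing assumes the interface name is always the first column.

```python
# Lines copied from the "sh interface brief" output above.
BEFORE = "Eth2/1/3      1      eth  access down    Error disabled              10G(D) --"
AFTER = "Eth2/1/3      1      eth  access up      none                        10G(D) --"


def error_disabled_interfaces(output):
    """Return the interface names on lines that contain 'Error disabled'.

    Hypothetical helper: a text filter only, equivalent in spirit to
    piping the command through 'inc "Error disabled"'.
    """
    return [
        line.split()[0]
        for line in output.splitlines()
        if "Error disabled" in line
    ]


print(error_disabled_interfaces(BEFORE))  # with the cables swapped
print(error_disabled_interfaces(AFTER))   # with the cables back in place
```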

    Cisco UCS and Nexus 5000 DataCenter – Our Implementation

    •July 7, 2010 • 3 Comments

    Our Cisco 10gbE implementation consists of 2 fully populated UCS chassis with a mix of full and half width blades. The servers all boot from SAN with no local disks. “PALO” cards are used in all servers, which allows us to do FCoE. 7 of the blades are running VMware ESX 4 (vSphere) and the rest are a mix of RedHat Linux and Windows 2008.
    Continue reading ‘Cisco UCS and Nexus 5000 DataCenter – Our Implementation’

    EMC Celerra NX4 NAS Install In Pictures

    •July 27, 2010 • Leave a Comment

    So we received our 12tb raw EMC Celerra NX4 system(s) about 3 weeks ago.  Eager to get going, I went ahead and racked the units.  Below are pictures showing the different pieces that make up the NX4.

    Continue reading ‘EMC Celerra NX4 NAS Install In Pictures’

    Cisco UCS B200-M1 vSphere Virtual Machine Density – Upgrade Time!

    •July 28, 2010 • Leave a Comment

    Our Cisco UCS system has been running in production for a while now. Since we have continued to grow, we are now at a point where a RAM upgrade has to be pushed through our purchasing process.

    Having limited funds during our initial UCS purchase, we were only able to fill our B200-M1 blades with 4gb RAM modules. Below is an image showing the current RAM to CPU utilization.

    vSphere RAM Usage

    Continue reading ‘Cisco UCS B200-M1 vSphere Virtual Machine Density – Upgrade Time!’