Wednesday, May 4, 2011

Mini book review: "The Book of XEN - A Practical Guide For The System Administrator"

Mini book review: "The Book of XEN - A Practical Guide For The System Administrator" by Chris Takemura and Luke S. Crawford, published by No Starch Press, copyright 2010, ISBN-10 1-59327-186-7, ISBN-13 978-1-59327-186-2

I would first like to begin with some credit as to why I started down the path of Xen. My co-worker Chakri Girda was excellent enough to be my introduction into the real world of using Xen for virtualization. We had been long-time users of various VMware products to get the virtualization job done. He was the lead on our revamped virtualization project and chose Xen. Good move all around I'd say!

The quest for enlightenment can take many paths. My quest generally starts down the "path-of-google" when it comes to technology learning. That is, I google the item of interest, read some links/pages/source/whatever and make it work. This time, for some reason, I just wanted a book. I had an urge to understand Xen virtualization in a greater way. Don't misunderstand, I currently manage 7 Dell PowerEdge Xen servers with 30+ domU's. Not huge by any standard, but enough to make me want to learn some of the "best practices" and not just "it's running and has fail-over". So, a quick google led me to this book.

Chris and Luke are very clear in the beginning on the (slight) CentOS/Red Hat Enterprise Linux slant of the book. But face it, Xen works on everything. And the xm command args do not change from distro to distro. The sidebars throughout the book provide welcome distractions and helpful tidbits as a general rule.

The authors set the practical and pragmatic tone from the start. The introduction clearly sets the stage for the rest of the book with a down to earth and (mercifully) short background and summary of Xen. No extra fluff. As a bonus, there is an actual "But I Am Impatient!" section that tells the reader which chapter to turn to for "step-by-step instructions", or which chapters to read depending on the type of deployment being considered.

Chapter 1 is the real introduction to virtualization. A well thought out overview with just enough background. As an overview, you are told that some general concepts are a bit over-simplified to speed you along.

Chapter 2 has to be the most foundational portion of the book. This begins the "Getting Started" with Xen. Well written and laid out for reading. This is in spite of it being referred to as "step-by-step instructions". It turns out the entire book is written in a very readable manner that goes beyond the regurgitation of man pages or help files. Quite interesting how some of the paragraphs flow more like a novel and less like a how-to. I say this in a good way! As part of the title states, this is "A Practical Guide". So, chapter 2 is the short version of follow-the-Xen-leader.

Chapter 3 is "Provisioning DomUs". It seems to go out of its way to be distro inclusive when setting up domU's. There is a tar install section, a Redhat/CentOS section and even a Debian install section. The information even offers insight into large scale deployment options: the quick and dirty version of SystemImager or Cobbler/koan. The array of choices for the newbie is quite large in this chapter.

Chapter 4 is "Storage With Xen" and begins the "I know I have shown you the easy way, but that may not be the best way" approach to the storage setup spelled out in chapter 2. What you get here is the blktap -vs- LVM information. The chapter is still presented in a very readable way. However, the spoiler of the chapter, in my opinion, is, "blktap is easy to set up and good for testing, while LVM is scalable and good for production". This chapter not only presents some basic commands, but also provides the foundation of *why*. The "why" something is done is often as important as what is done. Knowing both pieces allows a sysadmin to actually make the best decisions. In this chapter, the "why" deals with storage allocation choices.

Chapter 5 is simply "Networking". Networking is really the whole focus of this chapter. The authors begin with the startup script hierarchy. They then explain when Xen performs certain actions - on xend startup or domU startup. This is followed by a well laid out how and why of naming virtual interfaces. I had to dog-ear one of the pages (page 65) as it perfectly explained an odd interface issue being reported by our MAC address monitoring. Nice, very nice.

Chapter 6 is "DomU Management: Tools and Frontends". Again, this chapter steps out to be more distro independent. The authors make sure to say that there are choices. They even provide installation instructions for the frontend tools. Really a continuation of the "step-by-step" approach taken throughout the book. Suffice it to say that there are frontend options and you will need to choose one.

Chapter 7 is "Hosting Untrusted Users under Xen: Lessons from the Trenches". This chapter could have been entitled, "Cover your A$$" or even "Share". The focus of this chapter is sharing (well, maybe imposing restrictions is another way to put it). The reader is gently stepped through the why and how of restricting CPU, IO and even networking. I love the whole system approach to handling DomU's taken in this chapter. The authors clearly provide the basics on how to generally handle each type of resource in a fundamental way. This means using system tools most likely in use already. I suspect this chapter is built from "college of hard knocks" style learning. The authors are unusually nice when referring to the general user base that led to this chapter. I would not have been so nice... I consider this chapter a must read whether you are an external (public) or internal (company) Xen administrator.

Chapter 8 is "Beyond Linux: Using Xen with other Unix-Like OS's". Honestly, I found this the least interesting. I was led to this book with the general belief that it was CentOS/Redhat centric, since these are my OS platforms of choice. Needless to say I do not need to know about Solaris or NetBSD.

Chapter 9 is "Xen Migration". This chapter does a good job of exploring the foundation of the migration options ("live" and "cold"). There are command examples for each option provided and well thought out reasoning. A nice addition was the inclusion of storage options in the migration mix. This included basic setup and configuration of ATA over Ethernet (AoE) as well as iSCSI. Again, both informational context and actual implementation are provided.

Chapter 10 is "Profiling and Benchmarking under Xen". I was almost expecting a "this is my kickass benchmark score for my servers" chapter. Luckily, I was wrong. Instead, the chapter takes a well balanced look at testing in general and how it can relate to Xen. Care is taken at the beginning to state that no benchmark is a substitute for the production workload of a server. As is common throughout the book, several options are provided and discussed. I do like the inclusion of actual results and a real world application. The OProfile section is a nice addition and goes beyond the usual benchmarking spew. This is the section that provides the real world experience on how a problem was solved. My only issue with this chapter is the need to make all auxiliary program additions a download, configure, make and make install treadmill. Not a good idea for any production server to have 1.) build tools and 2.) "dead programs" that will never get updated and 3.) it's a bad idea.

Chapter 11 is "Citrix XenServer: Xen For The Enterprise" - What?! A whole chapter devoted to non-Xen. Maybe a longer Introduction that included some of the XenServer stuff would have done. In spite of the stray from Xen at its core, the chapter does a good job of balancing the open source -vs- commercial aspects of Xen core virtualization. I like the "gladiator combat" reference. It's good to have a sense of humor.

Chapter 12 is "HVM: Beyond Paravirtualization". This chapter represents a brief introduction into the HVM install process and setup. The only part that seems out of place is the 5 paragraph "Xen HVM vs. KVM" section in the middle of the chapter.

Chapter 13 is "Xen and Windows". This short chapter is another nice plunge into a specific implementation area for Xen. Well laid out chapter with good examples and a very handy discussion of options. I managed to dog-ear a couple of the pages for later quick reference. The small section on HALs is a good addition in this chapter.

Chapter 14 is "Tips". I always like the real world problem -> resolution sections and this is the pinnacle for this book. This chapter has tidbits of knowledge for all to enjoy. I found the "PCI Forwarding" (passthrough) and a couple of other sections to be of great interest. Good chapter overall and one that *ALL* people should read IMHO.

Chapter 15 is "Troubleshooting". This chapter has the most dog-ears for me with 4. I will likely return to this chapter a couple of times in the near future. I have had (and ignore today) some of the issues noted in this chapter. Issues like the DomU interface number incrementing have just been an annoyance, while the VM restarting too fast has been a major battle on occasion. Another must read for the Xen (want-to-be) server administrator as far as I am concerned.

In summary, a well written book with lots of tidbits of wisdom and years of experience. This was actually an easy read. I would find no issue in recommending this book to a new -> intermediate Xen admin.

Friday, April 29, 2011

Dell E6410 Fedora 13 32-bit Linux install and mini review

This is a catch up article. More of a documentation piece of (past) success than a current Fedora 13 review.

This Fedora install is a stock 32-bit DVD install (the PXE install failed because of a flashing video problem). The target system is a nice Dell E6410 with 4GB RAM, Broadcom BCM4313 802.11b/g, nVidia GT218 [NVS 3100M] rev 162 dedicated video card and Intel i7 CPU. Let me just add the note here and state, for the record, that the Intel i7 ROCKS! The E6410 is a physically solid chassis with a precise feel. No cheap feeling plastic on this bad boy. The LCD hinge is firm and smooth with no lateral wiggle or bounce issues. The 14.1 inch screen is bright with no dead pixels. Screen glare is fairly minimal and the function key brightness control is handy. I have been a big fan of the Dell Latitude laptops for a long time. The product line continues to excel in many ways. However, please don't mistake this unit for any ultra-portable! Especially with the longer runtime extended battery.

Install with the Fedora DVD is best accomplished with the "Basic Video Driver" install selection. Otherwise the flashing technicolor screen makes it impossible to install. I choose to install with whole disk encryption as a general rule for any laptop. Let me just say that it is easier to type in a pass-phrase every once in a while than it is to explain why all of the confidential emails or data from a boss are floating around the internet. I will file whole drive encryption under the "CYA" category. Basically, install is click-button-simple. Besides my video issue and encryption desire, there was no real need to do anything but click the "next" button when presented. All of the Linux distributions have made HUGE progress in ease of installation over the years. And Fedora is no exception!

Reboot after install leads to a rather disappointing 800x600 resolution login screen. At least there is a login screen at all. First, a quick key sequence dumps me to a console so I can log in as root, do a full update and reboot, just to see if that will help my screen resolution issue with the *default* install. I do need to note, I have a standard practice that on any NVIDIA installation, I get the kmod-nvidia drivers or the official NVIDIA driver installed as quickly as possible. I wanted to see how the "stock" install would do at this point ... but no luck... Sadly, in order to get the 1280x800 resolution this 14.1 inch WXGA (not WXGA+!) LCD panel is capable of, I install the kmod-nvidia-PAE drivers (I'm running the PAE kernel). However, I still end up with the long dreaded invalid pointer problem and need to remove the "InputDevice" lines in the /etc/X11/xorg.conf file. A reboot and all is better now with video.
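For reference, the xorg.conf cleanup can be scripted. The following is only a sketch under an assumption: that the "InputDevice" lines in question are the reference lines inside the ServerLayout section (for example `InputDevice "Keyboard0" "CoreKeyboard"`), not the whole `Section "InputDevice"` blocks. The function name `strip_inputdevice_refs` is made up for this example.

```shell
#!/bin/sh
# Print the given xorg.conf with InputDevice reference lines removed.
# Only lines that START with InputDevice (after optional whitespace) are dropped,
# so 'Section "InputDevice"' headers and their bodies are left untouched.
strip_inputdevice_refs() {
    sed '/^[[:space:]]*InputDevice[[:space:]]/d' "$1"
}

# Example use (as root), keeping a backup of the original file first:
#   cp /etc/X11/xorg.conf /etc/X11/xorg.conf.bak
#   strip_inputdevice_refs /etc/X11/xorg.conf.bak > /etc/X11/xorg.conf
```

Working from a backup copy means the original file is still around if X refuses to start afterwards.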

The Broadcom BCM4313 is not working out of the box. A quick bit of google'n and an install from the rpmfusion-nonfree-updates repository used for the kmod-nvidia above:
yum -y install broadcom-wl kmod-wl-`uname -r`
modprobe wl
makes the wifi issue go away without the need to reboot. Cutting the ethernet cord and goin' mobile now.

The bluetooth functions without issue. I could pair with my Droidx in seconds. Nice.

Helpful links:
E-Family Reimage “How-To” Guide
E6410 Manuals

Wednesday, April 27, 2011

Fedora 15 beta first impressions - aka mini review

First and foremost please remember THIS IS A BETA. There are going to be issues. However, I find myself more fixated on my issues with the GNOME 3.0 interface than anything else. I know Fedora HAD to move on and stay with the latest and greatest GNOME release. After all, that is what Fedora is known for (and the reason I prefer CentOS/RHEL for servers!). More on GNOME shortly.

My Fedora 15 beta test attempt is happening on real hardware. No virtual container. It's a Dell D630 laptop with 2GB RAM, a dual core Intel T7500 and dedicated nVidia G86M Quadro NVS 135M video card. WIFI is handled by the Broadcom BCM4312 802.11a/b/g chipset. Modest by today's standards.

Normally for laptops, I will do whole disk encryption using a kickstart file. This is just a plain "select some defaults" and install test. Install worked without issue (as expected). In fact, installs have been very easy for a long time, to the point of being almost boring. Really just a few clicks and I was done.

Start-up for Fedora 15 beta is very fast. Though not timed, the perceived boot time difference from Fedora 13 to Fedora 15 seemed significant. More specifically, Fedora 15 boot time is faster. If I get bored, I may consider actually installing Fedora 13 or 14 again just to time the boot-to-login screen time. Many people dig that.

Initially, I was interested in how the nouveau video driver was going to handle my Quadro NVS 135M. Not a real common chipset. The former nv driver was woefully inadequate to say the least. The install and/or nouveau found the WXGA+ 1440x900 LCD without any problems. For any other NVIDIA installation, I would get the kmod-nvidia drivers or the official NVIDIA driver installed as quickly as possible. I'm going to stick it out with nouveau for now. No urgent video need to bail just yet. Not a big compiz fan, however, so that's off.

The install found my wireless card (Broadcom model BCM4312 a/b/g) without any issue. I was able to see AP's in the area. It's nice to see this "just work". However, the wifi passphrase didn't seem to work for some reason. I had to put in the hex key instead in order to connect to the test wifi router. In addition, getting connected to an access point seems to take several seconds longer than with Fedora 13 and GNOME 2.30.

Now for the BIG CHANGE... GNOME 3.0! Not sure if it's Fedora's rough edges on this beta or my sheer newness with GNOME 3.0, but wow it's hard to get used to. Don't get me wrong, a window manager that stays out of your way is good! But one that you can't figure out how to change (yet)?! I was kinda overwhelmed with the Activities and trying to guess what category a program would be under. Select all and just keep looking. All I wanted to do was fix my touchpad issues. Need to middle click paste with the double mouse button touch pad on the Dell? Can only use the bottom set of buttons... Not sure why yet. I've gotta tell you I use middle mouse/double mouse button paste A LOT as my typing speed and skill is not the best. Also, spent 10 minutes trying to figure out if I could get System Monitor back on my screen or CPU scaling info... and never did...

My solution, at the moment, to GNOME 3.0 is to run LXDE instead! Thank goodness for choice at this time. My frustration rate with GNOME is high and back to basics is good. I will likely make more attempts to use the new GNOME, just not tonight.

Tuesday, April 19, 2011

CentOS 5.x bugzilla 3.2 blank page...

This may be more of a rant... I mean the whole purpose of having a "checksetup.pl" in bugzilla is to, well, check the bugzilla setup... It looks like they should add at least one more check (or I should take better care to read the instructions)...

So, after a basic/minimal install of CentOS 5.6 and bugzilla, I'm greeted with a non-helpful "failed to connect" page and a /var/log/httpd/error_log message of:
[Tue Apr 19 13:10:15 2011] [error] [client xx.xx.xx.xx] [Tue Apr 19 13:10:15 2011] index.cgi: Use of uninitialized value in substitution (s///) at (eval 42) line 44.
[Tue Apr 19 13:10:15 2011] [error] [client xx.xx.xx.xx] [Tue Apr 19 13:10:15 2011] index.cgi: Use of uninitialized value in concatenation (.) or string at Bugzilla/CGI.pm line 312.

Following google search links proves useless (which is not often the case).

I added all of the interesting perl modules that bugzilla may want. Modules like perl-DateTime, perl-List-MoreUtils, perl-PatchReader, perl-HTML-Scrubber, perl-Template-GD, perl-GD-Graph, perl-Chart and ImageMagick-perl (plus their dependencies). /usr/share/bugzilla/checksetup.pl ran without issue, produced no error, fixed and rebuilt etc... still the error... paying a tiny bit more attention to the area around line 312 in Bugzilla/CGI.pm, I finally notice https redirection code... Wait, bugzilla defaults to requiring https but does not check for https... finally problem solved. The solution is just to install "mod_ssl".
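A quick way to spot this condition before digging through CGI.pm is to ask Apache whether the SSL module is loaded at all. This is just a sketch: the helper name `has_ssl_module` is invented here, and it assumes `httpd -M` (available on the Apache 2.2 that CentOS 5 ships) lists mod_ssl as `ssl_module`.

```shell
#!/bin/sh
# Read a module list (as printed by "httpd -M") on stdin.
# Succeeds if the SSL module appears in the list.
has_ssl_module() {
    grep -q 'ssl_module'
}

# Only meaningful on a host that actually has Apache installed.
if command -v httpd >/dev/null 2>&1; then
    if httpd -M 2>&1 | has_ssl_module; then
        echo "mod_ssl is loaded"
    else
        echo "mod_ssl is missing; try: yum -y install mod_ssl && service httpd restart"
    fi
fi
```

The check only reports; it deliberately leaves the actual install to you.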

Saturday, April 16, 2011

CentOS 5.6 GitCO xen 3.4.3 kernel, libvirt-client and xen-libs update issues

A recent CentOS 5.5 to 5.6 update has caused one of my xen servers to stop functioning... This has happened a few times in the past... and I keep forgetting to look before I reboot. Maybe documenting it here will help me remember and help others QUICKLY get past xend failing to start with the error of:
[2011-04-16 15:34:19 5082] INFO (SrvDaemon:336) Xend changeset: unavailable.
[2011-04-16 15:34:19 5082] ERROR (SrvDaemon:349) Exception starting xend ((13, 'Permission denied'))
Traceback (most recent call last):
  File "/usr/lib64/python2.4/site-packages/xen/xend/server/SrvDaemon.py", line 341, in run
    servers = SrvServer.create()
  File "/usr/lib64/python2.4/site-packages/xen/xend/server/SrvServer.py", line 251, in create
    root.putChild('xend', SrvRoot())
  File "/usr/lib64/python2.4/site-packages/xen/xend/server/SrvRoot.py", line 40, in __init__
    self.get(name)
  File "/usr/lib64/python2.4/site-packages/xen/web/SrvDir.py", line 84, in get
    val = val.getobj()
  File "/usr/lib64/python2.4/site-packages/xen/web/SrvDir.py", line 52, in getobj
    self.obj = klassobj()
  File "/usr/lib64/python2.4/site-packages/xen/xend/server/SrvNode.py", line 30, in __init__

Here is the deal. We have been using the Xen 3.4.3 repo from Gitco. For some reason yum will not keep the "kernel" line that matches up with the xen install from xen-3.4.3. So, the VERY simple fix (till the next kernel update) is just to make sure that your grub.conf has a line of "kernel /xen.gz-3.4.3" and not whatever gets dumped there by yum. Hope this helps people with this error.
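Since I keep forgetting to look before rebooting, a small pre-reboot check helps. This is a sketch, not a polished tool: the function name `grub_has_xen_kernel` is my own, and the default path /boot/grub/grub.conf is an assumption based on the usual CentOS 5 layout.

```shell
#!/bin/sh
# Succeed if grub.conf contains an uncommented "kernel <image>" line.
# $1 = grub.conf path (assumed default: /boot/grub/grub.conf)
# $2 = hypervisor image (default: /xen.gz-3.4.3, as described above)
grub_has_xen_kernel() {
    conf=${1:-/boot/grub/grub.conf}
    image=${2:-/xen.gz-3.4.3}
    # Exact match on the first two fields, so "#kernel ..." and other images fail.
    awk -v img="$image" '$1 == "kernel" && $2 == img { found = 1 }
                         END { exit !found }' "$conf"
}

# Example use before a reboot:
#   grub_has_xen_kernel || echo "WARNING: grub.conf does not boot /xen.gz-3.4.3" >&2
```

Note that this only checks that such a line exists somewhere in the file; it does not verify that it belongs to the default boot entry.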

The second issue with the (finally) released CentOS 5.6 update related to using the xen 3.4.3 from GitCO is the inclusion of the libvirt-client-0.7.0-6.el5. This is now an older version with the 5.6 update now out. The error you see from trying a yum update is:
Transaction Check Error:
file /usr/bin/virsh from install of libvirt-0.8.2-15.el5.3.x86_64 conflicts with file from package libvirt-client-0.7.0-6.el5.x86_64
file /usr/bin/virt-xml-validate from install of libvirt-0.8.2-15.el5.3.x86_64 conflicts with file from package libvirt-client-0.7.0-6.el5.x86_64
file /usr/lib64/libvirt.so.0 from install of libvirt-0.8.2-15.el5.3.x86_64 conflicts with file from package libvirt-client-0.7.0-6.el5.x86_64
file /usr/share/libvirt/schemas/capability.rng from install of libvirt-0.8.2-15.el5.3.x86_64 conflicts with file from package libvirt-client-0.7.0-6.el5.x86_64

The fix is:
rpm -e --nodeps libvirt-client
yum update

If you accidentally got the old i386 xen-libs, just run:
rpm -e xen-libs.i386
yum update

Thursday, March 24, 2011

CentOS 5.x and Firefox 4.0 howto!

Don't be confused. I did not come up with this! I just want to spread the word as much as possible that one very enlightened fellow referenced as "rkl" at forums.mozillazine.org has made it possible for the rest of us to hit the "easy button" and get Firefox 4.0 working on CentOS 5.x (CentOS 5.5 at this time). HERE is the original link that I lucked upon while trying to google it on my own.

This is a quote of the general (FF4beta version) steps if you are too lazy to click another link:
1. Unpack the Firefox 4.0b9.tar.bz2 somewhere (e.g. /usr/local/firefox). With the "en_GB" release, I throw in a "dictionaries" sub-dir under there with en-GB.aff and en-GB.dic in there (and en-US.aff/.dic soft-linked to the en-GB ones) otherwise, sadly, Firefox 4 uses US spellings on what's supposed to be an en_GB release :-(

2. Download this 32-bit Fedora 9 libstdc++ RPM and unpack it with this command:

rpm2cpio libstdc++-4.3.0-8.i386.rpm | cpio -i --make-directories

3. Move the unpacked shared library into /usr/local/firefox thus:

mv usr/lib/libstdc++.so.6.0.10 /usr/local/firefox/libstdc++.so.6

Note: It's "usr/lib/libstdc++.so.6.0.10" above (i.e. the unpacked tree from the RPM, not the system /usr/lib tree) - do NOT put a leading slash there!

4. Run Firefox 4.0b9 with:

/usr/local/firefox/firefox

rkl references the beta version, but all steps still apply for official release.

I will say that FF4 is very nice! I am very impressed with what I have seen so far!

Tuesday, March 15, 2011

Basic OMSA install and usage with RHEL or CentOS 5.x

I am a big fan of Dell PowerEdge hardware. It's well designed. It runs CentOS rock solid. It just works well. Dell must have figured out that many people like me feel this way (companies really). To their credit, Dell has a very full featured Linux support offering in the OpenManage Server Administrator (OMSA) for free.

To be fair, OMSA has historically had some speed bumps. Full 64-bit install has only recently been possible. And some past upgrades have not been smooth and actually required removing packages and manually removing directories and files in order to upgrade.

Note: make sure "plugins=1" is in the /etc/yum.conf
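If you would rather check than eyeball it, something like the following works. It is a minimal sketch; the function name `yum_plugins_enabled` is made up for the example.

```shell
#!/bin/sh
# Succeed if the given yum.conf (default /etc/yum.conf) has an
# uncommented "plugins=1" line, allowing spaces around the "=".
yum_plugins_enabled() {
    conf=${1:-/etc/yum.conf}
    grep -Eq '^[[:space:]]*plugins[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$conf"
}

# Example use:
#   yum_plugins_enabled || echo "add plugins=1 to /etc/yum.conf first" >&2
```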

As root from a terminal prompt:
wget -q -O - http://linux.dell.com/repo/hardware/latest/bootstrap.cgi | bash
wget -q -O - http://linux.dell.com/repo/community/bootstrap.cgi | bash
wget -q -O - http://linux.dell.com/repo/firmware/bootstrap.cgi | bash
yum install srvadmin-all OpenIPMI OpenIPMI-tools dell_ft_install

Install BIOS and firmware files
yum -y install $(bootstrap_firmware)
update_firmware --yes
Reboot if needed/requested. If you don't reboot, make sure that you exit and log back in to have the srvadmin bin and sbin directories set for you.

Just to make sure some startup services will be started on reboot.
chkconfig dsm_sa_ipmi on
chkconfig ipmi on
srvadmin-services.sh stop && service ipmi start && srvadmin-services.sh start

The web GUI is good. Just point your browser to http://yourhostname_or_IP:1311 and login as root and check it out from there.

The command line tools, however, can offer some very good information when you don't have that option or need some scripting magic. For example, "omreport" can provide MUCH information about your server and its state. Here is a quickie on controller:
omreport storage controller
Will spew lots of data like:
Controller PERC 5/i Integrated (Embedded)

Controllers
ID : 0
Status : Ok
Name : PERC 5/i Integrated
Slot ID : Embedded
State : Ready
Firmware Version : 5.2.2-0072
Minimum Required Firmware Version : Not Applicable
Driver Version : 00.00.04.17-4.31.z-RH1

Minimum Required Driver Version : Not Applicable
Storport Driver Version : Not Applicable
Minimum Required Storport Driver Version : Not Applicable
Number of Connectors : 2
Rebuild Rate : 30%
BGI Rate : 30%
Check Consistency Rate : 30%
Reconstruct Rate : 30%
Alarm State : Not Applicable
Cluster Mode : Not Applicable
SCSI Initiator ID : Not Applicable
Cache Memory Size : 256 MB
Patrol Read Mode : Auto
Patrol Read State : Stopped
Patrol Read Rate : 30%
Patrol Read Iterations : 41
Abort Check Consistency on Error : Not Applicable
Allow Revertible Hot Spare and Replace Member : Not Applicable
Load Balance : Not Applicable
Auto Replace Member on Predictive Failure : Not Applicable
Redundant Path view : Not Applicable
CacheCade Capable : Not Applicable
Persistent Hot Spare : Not Applicable
Encryption Capable : Not Applicable
Encryption Key Present : Not Applicable
Encryption Mode : Not Applicable
Spin Down Unconfigured Drives : Not Applicable
Spin Down Hot Spares : Not Applicable

To see information on the status of the virtual disk created on your PERC RAID controllers:
omreport storage vdisk
List of Virtual Disks in the System

Controller PERC 5/i Integrated (Embedded)
ID : 0
Status : Ok
Name : vd0
State : Ready
Encrypted : Not Applicable
Layout : RAID-1
Size : 67.75 GB (72746008576 bytes)
Device Name : /dev/sda
Bus Protocol : SAS
Media : HDD
Read Policy : Adaptive Read Ahead
Write Policy : Write Back
Cache Policy : Not Applicable
Stripe Element Size : 64 KB
Disk Cache Policy : Enabled

ID : 1
Status : Ok
Name : vd1
State : Ready
Encrypted : Not Applicable
Layout : RAID-10
Size : 1,396.25 GB (1499212021760 bytes)
Device Name : /dev/sdb
Bus Protocol : SATA
Media : HDD
Read Policy : Adaptive Read Ahead
Write Policy : Write Back
Cache Policy : Not Applicable
Stripe Element Size : 128 KB
Disk Cache Policy : Enabled

The parameter changing mate to omreport is "omconfig". You get to set or change options and can even trigger events like MAKING SURE YOU HAVE VIRTUAL DRIVE PARITY (AKA parity scrub)!

So to verify/validate/fix drive parity issues, add a cron job (and/or run the command now) to make sure that the RAID groups on your PERC RAID controllers are in perfect shape. You don't want any failed disk rebuild surprises!
crontab -e
00 17 * * 0 /opt/dell/srvadmin/bin/omconfig storage vdisk action=checkconsistency controller=0 vdisk=0 > /tmp/omconfig-vdisk0.out 2>&1 || cat /tmp/omconfig-vdisk0.out |mail -s "omconfig issue on `hostname`" admin
30 17 * * 0 /opt/dell/srvadmin/bin/omconfig storage vdisk action=checkconsistency controller=0 vdisk=1 > /tmp/omconfig-vdisk1.out 2>&1 || cat /tmp/omconfig-vdisk1.out |mail -s "omconfig issue on `hostname`" admin

Or better yet, get crazy with a shell script to dynamically ask for *each* controller and virtual disk to be checked. I will leave that up to you to figure out.
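If you want a starting point, here is one rough way to do it. Treat it as a sketch under a few assumptions: the tools live in /opt/dell/srvadmin/bin as in the cron lines above, the IDs show up as "ID : N" lines as in the sample output above, and "omreport storage vdisk controller=N" lists the virtual disks of one controller. The names `OMBIN`, `list_ids` and `check_all_vdisks` are made up for this example.

```shell
#!/bin/sh
# Sketch: request a consistency check on every virtual disk of every controller.
OMBIN=${OMBIN:-/opt/dell/srvadmin/bin}

# Read omreport output on stdin and print the value of each "ID : N" line.
# Keys such as "Slot ID" or "SCSI Initiator ID" are ignored.
list_ids() {
    awk -F: '{ key = $1; gsub(/^[ \t]+|[ \t]+$/, "", key) }
             key == "ID" { val = $2; gsub(/[ \t]/, "", val); print val }'
}

# Walk controllers, then the virtual disks on each controller.
check_all_vdisks() {
    for c in $("$OMBIN/omreport" storage controller | list_ids); do
        for v in $("$OMBIN/omreport" storage vdisk controller="$c" | list_ids); do
            "$OMBIN/omconfig" storage vdisk action=checkconsistency \
                controller="$c" vdisk="$v"
        done
    done
}

# Only do anything when OMSA is actually installed on this host.
if [ -x "$OMBIN/omreport" ]; then
    check_all_vdisks
fi
```

Saved as a script, this can replace the per-vdisk cron lines with a single entry. Keep in mind it asks for all checks back to back rather than staggering them 30 minutes apart like the cron example does.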

Again, just a basic install and use for OMSA. Take the time and go through the links, get the OMSA manual and enjoy.