nytefyre/net3.0

15 Nov 2011

puppet tricks: debugging

Posted by xaeth

Update: (2012/9/30) I came up with this around the time I was using 0.25.  Apparently now you can do something similar using the --debug switch on the client along with debug() calls. I thought the function was only part of Puppet Labs' stdlib, but apparently it's in base, at least in 2.7+. I'll probably do a part 2 to this with more info, although there isn't much more.

Update: (2012/12/20) So the debug() function from stdlib is lame. I spent a while troubleshooting why my new environment was not getting messages and realized that rolling back to notice() worked. I could have sworn I tested it when I posted that. I also ran into an issue where naming the fact debug is actually a bad idea, so I have updated this post accordingly.

Update: Found this bug that talks about the facts not returning as the appropriate types.

Disclaimer: I am not a ruby programmer... so there might be "easier" or "shorter" ways to do some of the things I do with ruby, but my aim is for readability, comprehensibility by non-programmers, and consistency.

In my time playing with puppet I have had to do a few things I was not pleased with.  Mainly, I had to write several hundred lines of custom facts and functions.  Debugging was one of the biggest pains, until I found a wonderful blog post that helped me out with that.  Actually, by the time it helped me with debugging I had already been to the site once, because I had run into a bug related to the actual topic of his post, "calling custom functions from inside other custom functions".  Back to the matter at hand... when I first started working on custom functions I would leave exceptions all over my code and use them to step through the functions during debugging sessions.  While the code itself was short, this was a tedious process, as I would have to comment out each exception to move on to the next one and then re-run the test.  It looked like this:

checkval = someaction(var)
#raise Puppet::ParseError, "checkval = #{checkval}"
result = anotheraction(checkval)
raise Puppet::ParseError, "result = #{result}"

Then I found function_notice, which did away with all the commenting and uncommenting of exceptions by allowing me to log debug statements instead.  So I replaced all of my exceptions with if-wrapped function_notice calls, resulting in:

debug = true
checkval = someaction(var)
if debug
   function_notice(["checkval = #{checkval}"])
end
result = anotheraction(checkval)
if debug
   function_notice(["result = #{result}"])
end

An important thing to remember about function_notice in a custom function is that the argument you pass to function_notice must be a list.  I have not done anything other than send a single string inside a single list, so I cannot speak to its other behaviors.  The length of the code increases greatly, and I do not actually add a debug statement for everything, but overall this is a much better place to be.  However, to enable debug I now have to edit the custom functions on the puppet master, which requires restarting the service (puppetmasterd, apache, etc), and logs are generated for every client.  That is still a pain.  This is when I had a "supposed to be sleeping" late-night revelation: you can look up facts and variables inside your custom functions!  So I created a very simple fact, in a file named debug.rb, that looks like this:

Facter.add('puppet_debug') do
    debug = false
    if File.exists?('/etc/puppet/debug')
        debug = true
    end
    setcode do
        debug
    end
end

So what that means is that on any of my puppet clients I can enable debugging of my puppet setup by touching the file /etc/puppet/debug, and disable it by deleting that file.  To take advantage of this in my custom functions, I change how debug gets set:

debug = false
if lookup('puppet_debug') == 'true'
    debug = true
end
checkval = someaction(var)
if debug
    function_notice(["checkval = #{checkval}"])
end
result = anotheraction(checkval)
if debug
   function_notice(["result = #{result}"])
end
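As a quick aside, flipping that flag on a client is just a matter of creating or removing the file; roughly like this (facter -p, or setting FACTERLIB, is one way to see custom facts from the command line):

# enable debug logging for this client's puppet runs
touch /etc/puppet/debug
facter -p puppet_debug     # should now report "true"

# and turn it back off when done
rm /etc/puppet/debug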

Now, the way debug gets set in that function may seem kinda odd, but while the code in the custom fact works with the boolean values true/false, when looked up as a fact it returns the string "true" or "false".  Since the string "false" is true in a boolean sense, you could end up getting flooded with logs if you do a simple true/false check against the lookup() result.  Thus we default to false, as that should be our normal working mode, and only if the fact returns the string "true" do we set debug to true.  Now there is a custom fact providing debug, and a custom function utilizing it to log messages on the puppet server. Yay!  But wait, there is more!  Now that you have the custom fact defined, you can utilize it inside your puppet manifests in the same way!  Let's take a look:

class resolver {
    $nameservers = $::gateway ? {
        /^192\.168\.1\./ => ['192.168.1.25', '192.168.2.25'],
        /^192\.168\.2\./ => ['192.168.2.25', '192.168.1.25'],
    }
    define print() { notify { "The value is: '${name}'": } }
    if $::puppet_debug == 'true' {
        # On the server
        notice("${::hostname} is in ${::gateway} network")
        # On the client
        print { $nameservers: }
    }
}

Wait, what? Sorry... I threw a few curve balls at you. The notify resource, unlike the notice function, logs on the client side. Then I wrapped it in a define called print, because I was going to pass an array to it. By wrapping it in the define, it takes the array and performs the notify call on each item in the array. You can read more about this on this page, under the sections What is the value of a variable? and What's in an array?.  The article has some nice explanations of a few other things as well.

Also, if you'd rather check for $::debug than $::puppet_debug then add the following to your site.pp:

$debug = $::puppet_debug

 

11 Nov 2011

puppet tricks: staging puppet

Posted by xaeth

As I have been learning puppet @dayjob, one of the things I have been striving to deal with is order of operations.  Puppet supports a few relationship metaparameters, such as before, require, notify, and subscribe. But defining all of these in my classes was quickly becoming slightly painful, when the reality was that there were not always hard dependencies so much as a preferred order.  After having issues with this for a while, and while researching other parts of puppet, I stumbled across some mention of run stages, which were added in the 2.6.0 release of puppet.  If you read through the language guide they are mentioned.  There has always been a single default stage, main, but now you can add as many as you want.  To define a stage you go into a manifest such as your site.pp and declare the stages, like so:

stage { [pre, post]: }

That defines the existence of two stages: a pre stage for before main, and a post stage for after main.  But I have not yet defined any ordering.  To do that we can add the following, still in site.pp:

Stage[pre] -> Stage[main] -> Stage[post]

Thus telling puppet how to order these stages.  An alternate way would be:

stage { 'pre': before => Stage['main'] }
stage { 'post': require => Stage['main'] }

It all depends on your style. So now that we have created the alternate stages and told puppet how they are ordered, how do we associate our classes with them?  It is fairly simple: when you are declaring a class you pass the stage in as a class parameter.  To support this they introduced an alternate method of "including" a class.  Before, you would use one of these two methods:

class base {
    require users
    include packages
}

In this, the base class requires that the users class is applied before it, and then includes the packages class. It's fairly basic. Transitioning this to stages comes out like this:

class base {
    class { users: stage => pre }
    include packages
}

It is very similar to calling a define.  In production I ended up adding my base class to the pre stage in a lot of classes, which became kinda burdensome. I knew that there were universal bits that belonged in the pre stage, and universal bits that did not. To simplify, I settled on the following:

class pre-base {
    include users
}
class base {
    class { pre-base: stage => pre }
    include packages
}

With this setup I do not have to worry about assigning classes to the stages in multiple places. I even took it further by applying the same concept to the different groups that are also applied to systems, so the universal base and the group base are both configured as in the last example. I have not tried it with the post stage, as I do not use one yet, but I would imagine it would work just as above. Here is an untested example:

class pre-base {
    include users
}
class post-base {
    include monitoring
}
class base {
    class { pre-base: stage => pre }
    class { post-base: stage => post }
    include packages
}

Maybe this seems fairly obvious to people already using stages, but it took me a bit to arrive here, so hopefully it helps you out.

 

UPDATE: Puppet Labs' stdlib module provides a 'deeper' staging setup.  Here is the manifest on GitHub.

11 Nov 2011

fedora 16

Posted by xaeth

So... I installed Fedora 16 on the home desktop today, coming from Fedora 14.  I am running it on an HP Pavilion Elite... I would tell you what kind, but they have garbage written all over the thing, just not the model!  Suffice it to say: AMD 3-core processor, 6GB RAM, and an nVidia vga+dvi video card with two Samsung 20" monitors attached.  My first attempt to upgrade was to use the preupgrade feature.  Ran the software, it said reboot, I did, and came right back to Fedora 14.  Meh, it failed last time so no big surprise... while the preupgrade software was running I had burned the Fedora 16 install DVD.  Booted to the installer, which was only on the vga-fed monitor, and too big to see the buttons (yay for alt+b for back and alt+n for next).  I went through the process telling it to upgrade my current Fedora 14 to 16.  Post installation I could not get X running.  I assume this was because on Fedora 14 I was using the RPMFusion-packaged nvidia drivers.  At this point I sighed and remembered why I have always believed in the concept of fresh installs.  And why not this time? There is btrfs, new userids, the new systemd, and Gnome 3.2.  Why wouldn't I want a clean slate?

So I rebooted to the installer.  Oddly enough, this time the installer display "worked".  I could see everything, it was just spanned over both screens, which for an installer is more annoying than helpful, BUT it was better than the first install so... moving on.  I was able to easily add my wireless network during the install to grab updates at install time, which is great. Post install we come to the Firstboot screen, which unfortunately suffered from some screen freakiness: it was skinny and too tall for the monitor.  You can select Forward using alt+f easily enough, unless you want to provide smolt data, which requires a few trial-and-error tabs and spaces.  A short time after starting I was presented with the new and prettier GDM login prompt.  I logged in and was presented with... pretty much nothing.  But that is the point, right?  So there is a small non-descript panel across the top with Activities, the date, a few icons and my name.  I had heard that Gnome 3.2 has Online Account integration, so I clicked on my name in the top right corner and clicked on Online Accounts.  I added my Google account, which is the only type it supports currently, and... well, nothing happened, at least not noticeably.  So I went and read about Online Account integration.  It says Contacts are integrated, so I clicked on Activities and typed a contact name in, and voila, there it was.  It says Calendar is integrated, so I clicked on the date in the top panel.  My calendar was not there, so I clicked Open Calendar. Evolution Calendar popped up and it had my Google calendars in it.  I checked the boxes and was prompted for my password, I provided it, and my calendar integrated with the top panel.  Yay.  It says Documents are integrated, and ya, well, I never figured that one out.

So I did a couple of my quick, normal post-install tasks.  I added HotSSH, which is an amazing SSH GUI for Linux; added RPMFusion; added the printer (even easier than last time, ridiculously so); and created the wifey's account.  At that point I decided to try to add "profile icons" to our local accounts.  So first I have to say that it is silly to integrate Online Accounts and not just use my associated profile picture from that account, or at least let me choose which account's picture to use.  Second, IT DOES NOT WORK.  I tried clicking on the blank icon and selecting an existing image to scale down, nothing happened.  I tried making a smaller image and selecting that, nothing.  I googled how to set it and found the 'manual' way, nothing.  I found a more specific manual way, nothing.  I rebooted just to verify, still nothing.  I even tried using one of their icons, still nothing!! So that is very annoying.

Aside from an occasional sluggishness, which I am currently (and perhaps naively) attributing to the nouveau driver, it's pretty good.  Before this I had been using Gnome 3 a bit on my laptop, so I am not completely thrown off or anything.  I have two primary complaints with Gnome 3.  The first is the Alt+Tab behavior.  It's not that I don't like how they tried to improve it; their concept is decent.  But there is one significant flaw.  A quick single hit of alt+tab has historically always taken you back and forth between your current and last window, even if they were the same app.  Now it appears that this functionality comes from Alt+Esc.  Which is weird, and which I only just now accidentally discovered while typing this complaint.  I thought Alt+Esc was Activities, but that seems to be mapped to the Windows key, which is kewl.  Maybe this complaint is now void.  My second complaint is that I've always used the right-click menu on the desktop to open terminals, and now I've got to learn a new workflow.  Is that really a big deal? I guess not.

The real question now is, how will the wife take the change?

Update 1: I am sad to report that HotSSH, while installable, does not work without installing an undeclared dependency, the 'vte' package.  A bug was already filed.

Update 2: Got the Gnome profile images working.  I still had to do it the more manual way of placing the image as /var/lib/AccountsService/icons/${userid} and adding "Icon=/var/lib/AccountsService/icons/${userid}" to /var/lib/AccountsService/users/${userid}.  The problem was that the images were not tagged appropriately for SELinux.  So a quick restorecon -R /var/lib/AccountsService and it's fixed.  However, this does not explain why doing it the easy way through the GUI did not work.
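For anyone else fighting this, the manual fix boils down to roughly the following (a sketch; the username and image file are just placeholders):

# 1. place the image (path and username here are examples)
cp face.png /var/lib/AccountsService/icons/username

# 2. edit /var/lib/AccountsService/users/username and add, under [User]:
#      Icon=/var/lib/AccountsService/icons/username

# 3. fix the SELinux labels so the change actually sticks
restorecon -R /var/lib/AccountsService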

4 Nov 2011

Editing image metadata with exiv2

Posted by xaeth

My wife just had some maternity photos taken, and they came out wonderfully. I went to import them into Shotwell so we could upload them to her Picasa web albums (sorry, no link to her pictures), and all the images imported into the year 2001 folder. I looked at the data, and while their time stamps seemed reasonably correct, their date stamps were well over 10 years off.

I use Linux for my operating system, and a few quick searches turned up the man page for the program exiv2. And while a man page got me there, I figured it would not hurt to provide an example (both for myself and any of you lucky people who stumble upon it).

So, first let's look at the metadata of image.jpg:

[xaeth@nytefyre Maternity]$ exiv2 image.jpg
 File name       : image.jpg
 File size       : 1454212 Bytes
 MIME type       : image/jpeg
 Image size      : 1672 x 2264
 Camera make     : Canon
 Camera model    : Canon EOS-1Ds Mark II
 Image timestamp : 2001:03:10 03:13:05
 Image number    :
 Exposure time   : 1/125 s
 Aperture        : F10
 Exposure bias   : 0 EV
 Flash           : No, compulsory
 Flash bias      :
 Focal length    : 59.0 mm
 Subject distance: Unknown
 ISO speed       : 100
 Exposure mode   : Manual
 Metering mode   : Multi-segment
 Macro mode      :
 Image quality   :
 Exif Resolution : 4992 x 3328
 White balance   : Auto
 Thumbnail       : image/jpeg, 6822 Bytes
 Copyright       :
 Exif comment    :

So it says up there that this image was taken on March 10th, 2001 at 03:13:05 am (it is a 24h clock).  I am not specifically worried about the time, but the date will be nice several years from now.  So we need to adjust several things, which I am going to follow in parentheses with the cli switch the man page says we need.  What needs adjusting is the year (-Y), month (-O), day (-D), AND time (-a).  The trick with the adjustment is that it is a time shift, not a direct set.  You cannot tell it the specific date, only to move the setting backwards or forwards, kinda like using the buttons on an alarm clock.  For all of the fields you can provide a negative or positive number, but the time field is in HH:MM:SS format with the minutes and seconds being optional (though if you want to change seconds you need all 3). We want to move from March 10th, 2001 at 03:13:05am to October 15th, 2011 at about 3pm-ish. To do this we are going to have to adjust forward 10 years, 7 months, 5 days, and 12 hours.  So this is how the command should look:

[xaeth@nytefyre Maternity]$ exiv2 -Y 10 -O 7 -D 5 -a 12:00:00 ad image.jpg

The command does not return a visible success, which is very common for *nix applications.  But before we verify it you might be asking yourself, "self, what the heck is that ad in the middle of his command?".  Well self, that is an action, telling exiv2 that it has to do something.  It is optional if the switches you are providing imply it.  All the switches I am using do, so my use was just overly explicit.  Anyway, as I was saying, there are other ways to verify that the command ran successfully that I won't go into here, but what better way than to check the metadata again!

[xaeth@nytefyre Maternity]$ exiv2 image.jpg
 File name       : image.jpg
 File size       : 1454152 Bytes
 MIME type       : image/jpeg
 Image size      : 1672 x 2264
 Camera make     : Canon
 Camera model    : Canon EOS-1Ds Mark II
 Image timestamp : 2011:10:15 15:13:05
 Image number    :
 Exposure time   : 1/125 s
 Aperture        : F10
 Exposure bias   : 0 EV
 Flash           : No, compulsory
 Flash bias      :
 Focal length    : 59.0 mm
 Subject distance: Unknown
 ISO speed       : 100
 Exposure mode   : Manual
 Metering mode   : Multi-segment
 Macro mode      :
 Image quality   :
 Exif Resolution : 4992 x 3328
 White balance   : Auto
 Thumbnail       : image/jpeg, 6822 Bytes
 Copyright       :
 Exif comment    :

Yay!  Now I just do a quick loop through all the pictures in the directory and I have an updated set.

[xaeth@nytefyre Maternity]$ for image in *; do exiv2 -Y 10 -O 7 -D 5 -a 12 ${image}; done

and voila... several dozen reasonably corrected dates in the metadata of those images.  If you noticed the command was slightly different, good for you.  For the actual loop I used the shortened time adjustment I mentioned earlier and left off the ad.

24 Oct 2011

jumping out the window

Posted by xaeth

So, it's time to feed the troll.  I do usually try to avoid this, but I was channeling some xkcd when I wrote this.

While my first exposure to computers was on a Commodore 64, then later with SystemV and Red Hat Linux (RHL), I started my professional career as a Windows desktop support lackey.  In that role I learned a bit more about Windows, began maintaining an Exchange 5.5 system, and then later moved on to IIS and DNS management (backwards, I know).  Within a few short years I was ecstatic to return to the world of Linux sans Windows.  One of the big things for me was that things I wanted to do were always a pain to accomplish, whether on Windows or Linux, but Linux allowed me to do them faster and more easily.

The deepest development I do is python and some other scripting and web languages.  I have never built Linux from scratch.  I installed Gentoo once, but thought it was way too much effort.  Slackware was nice, but I have always been a fan of my first Linux, Red Hat.  My first personal server ran RHL 7.1 and was maintained with updates until they stopped coming.  In fact, said server is sitting under my desk at home powered off, but still functional.  It has not gotten an update in years, but that is because none were available.  I ran RHL 7.2 on my Toshiba Portege for about 4 years.

I concede frustration in the early days of Fedora, because it was a rocky start, very bleeding edge, and not prone to stability.  I even strayed to spend a year or so in the arms of another, ahh Ubuntu.  Such a nice, approachable mistress, but high maintenance between releases due to all the non-upstream modifications.  (Not that Fedora was better in upgrades, mind you; it didn't support release upgrades till Fedora 9.)  I did come back to Fedora around version 9 and have been staying up to date as time allows.  I do prefer to stay completely current... but time is not always in my favor.  My current desktop is Fedora 14; I am waiting for 16's release to introduce the wifey to Gnome 3.

I am Red Hat certified and at work I am a Red Hat Enterprise Linux (RHEL) advocate.  One of the largest reasons is that I feel it is the best way for companies that have zero interest in participating in open source to actually contribute back (by paying Red Hat to do it).  I have been utilizing RHEL since it was released as version 2.1 back in 2002.  I keep my systems updated all the time.  However, I have been singed a few times by updates.  They can be counted on one hand:

  1. Way back in the day the bind-chroot package would blow away your named.conf on an update.  But now that I think about that... it was not even RHEL.  It may have been RHL.
  2. In RHEL5 there was an openssh update that introduced dynamic tcp window scaling.  We had a phantom network issue and thus started having stalled ssh data transfers.  Not really the update's fault.
  3. In RHEL5 they changed the tzdata package from noarch to arch-specific and you could end up with a bad old tzdata package installed.  Did not actually break anything.
  4. At one point kmod-nvidia blew up on a kernel update during the Fedora 13 time frame.  The kmod was from RPMFusion and built from nvidia's proprietary binaries... kind of an issue waiting to happen in the first place.  I moved to the akmod (self-rebuilding kernel module) package and haven't had an issue since.

I do have one habit that is mildly irritating to both myself and others: I am a big fan of playing with software that is supposed to do task X or have feature Y, but is not quite there yet.  Sometimes this is due to it being bleeding edge, other times it's just poorly maintained software, or maybe the vendor was just a liar.  I have never had this habit destroy a system; usually it just causes project timeline delays or the need to find a better solution.

So what is the point of this?  I was forwarded a link to a rant on ZDNet this last Friday (2011.10.21) entitled Why I've finally had it with my Linux server and I'm moving back to Windows by David Gewirtz.

I am not going to delve too deep into his background since it is readily available on his site, but based on his advertised background this guy should be beyond my skill set in understanding how computers work. Sadly, understanding and development skills do not a skilled administrator make. Here is how he describes himself in the aforementioned article:

"... I’m no tech babe in the woods. I’ve been a UNIX product manager, I’ve written kernel code, and I’ve taught programming at the college level. "

So, my quick synopsis on me:  I am a fairly competent sysadmin with a background of being in the trenches.  I took a C class in high school that used C/C++ for Dummies as the course material, and that is practically the last time I touched it. I did list my scripting experience above.  I have never touched kernel code.  I never went to college.

Now that context has been set, here are quotes from his article followed by my responses.  I am trying to consider that his rant was written while he was angry and (hopefully) just being melodramatic, but it is difficult.

"I’ve had it with all the patched together pieces and parts that all have to be just the right versions, with just the right dependencies, compiled in just the right way, during just the right phase of the moon, with just the right number of people tilting left at just the right time. "

And what exactly are you doing? Unless you are grabbing several repositories from random places and enabling them all at the same time, or installing everything from scratch, this should not be an issue these days.  3-5 years ago? Maybe. 5-10 years ago? Okay, ya... probably.

"I’ve had it with all the different package managers. With some code distributed with one package manager and other code distributed with other package managers. With modules that can be downloaded on Ubuntu just by typing the sequence in the anemic how-to, but won’t work at all on CentOS or Fedora, because the repositories weren’t specified in just, exactly, EXACTLY, the right frickin’ order on the third Wednesday of the month. "

Okay... so apt (Debian and Ubuntu) and yum (CentOS, Fedora, RHEL) are not the same software, so their commands are a touch different.  They are also used in different distributions, so there might be different package names. I can see where that can be annoying; it has annoyed me at times.  But this is like complaining that your Windows box and OS X box do not use the exact same programs and syntax. There is software that exists on both of those platforms that requires different installation and execution procedures.

"With builds and distros that won’t even launch into a UI until you’ve established a solid SSH connection, "

Umm... so I personally prefer my remote access to my servers to go over an encrypted channel, and SSH is a great medium for that.  You are a security advocate, right? This is a server environment, right?  You need a GUI, why?  I am not against GUIs, they have their place.  However, most server components in Linux do not have a native GUI tool.  It is usually just configuration files, and sometimes a web interface.  Furthermore, if this is the administrative interface of a backup program, why would you run it on lots of machines?  There should be a central administrative interface, and RDP to Windows is a nice feature for that purpose.  Even if it is on a Linux server, if this is a centralized interface, what is wrong with just exporting the GUI over X via your SSH session? It's secure and easy.  It requires almost no setup (install a few programs on your Windows desktop, establish the connection, run the program).  VNC is usually rather insecure...
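For the curious, exporting a GUI over SSH is roughly this simple (a sketch; the host and program names are made up):

ssh -X admin@backup-server
# then, inside that session, launch the admin GUI and it renders on your local display
backup-admin-gui &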

"I’ve had it with the fact that this stuff doesn’t work reliably."

Ahh, reliability... such a subjectively quantifiable term.  I like how you do not explain how it is not reliable; you just barge right on to knowledge and understanding.  In fact, the closest you come to saying anything about a lack of reliability is the update issue resulting in a crash, but that statement is completely separate from these statements in your rant.

"Oh, sure, if you work with Linux every hour of every day, if this is all you do, and all you love, if you’ve never had a date since you grew that one facial hair, if you’ve never had any other responsibility in your entire life, then you know every bit of every undocumented piece of folklore. You know which forums and which forum posters have the very long and bizarre command line that only. That. One. Guy. Knows. "

"and THAT command line sequence can be gotten by getting on just the right IRC channel, at just the right time of night, and talking just the right way, to that one incredibly self-absorbed luser who happens to know that you need to put the undocumented"

Okay... so it is my day job and thus I do spend a significant part of every day doing the work, I will give you that.  But I have had plenty of dates (and am now happily married with a child on the way), lots of other responsibilities, and do not know lots of undocumented folklore (documented, sure).  I do not frequent forums except as the result of searches, and I do almost all of my help searches strictly at google.com.  There are times when I need more help than a search provides, and I use things like the mailing lists or IRC channels for that software.  I do not always get the help I need, but I usually get it figured out.  That being said... I rarely have those types of scenarios, even with the bleeding edge things I play with all the time.

"Can you imagine my rank naivety here? I actually said Okay to a Linux update. I know I should have known better. ... But I didn’t. I figured that after all these years, Linux was finally robust enough to not rip me a new one because I just wanted to run a server and keep it up to date. Silly me! Silly, silly me!"

So unless you are pulling in packages from all kinds of unreliable repositories, or letting manually installed software override package-installed software, this should not be an issue.  I would love to know what the root of this update issue was, because user error is the number one cause of package management problems on any of the systems in my organization.

"Sure, Linux machines can make great servers. But they require a dedicated group of Linux groupies who know all the folklore, all the secret handshakes, and where all the bodies are buried. "

"That’s how you survive with a Linux distro apparently. Once it’s installed and works, never, ever update it."

I install machines, turn on updating, and walk away.  They run.  I really do not know why you are having such an issue and would actually love to know the truth behind your problems. It boggles my mind to the point where I felt the need to write this blog entry.

Take into consideration that I have a fairly general philosophy about running software on systems I am responsible for either installing or administering.

  1. Install software from trusted repositories.  IE: the distribution + one (two is pushing it, but doable) external repository.
  2. Do not configure repositories that have conflicting packages. (Do not turn on DAG and EPEL)
  3. Avoid unpackaged software. Only install packaged software if you can.  If it is not packaged, can you package it? It is not that hard, and other people benefit from your work.

With those 3 things in mind I usually have no issue with my systems.

"Oh, and one last point. Don’t go telling me I don’t know what I’m doing, because that proves my case against Linux. I know quite well what I’m doing, but not to the level that is apparently required to keep a simple LAMP machine running. " (emphasis his)

What I love about this quote is that he attempts to deflect any possibility that he is at fault by saying that requiring entry-level junior system administrator skills is too much to ask of someone who wants to be a system administrator.

Now, all that being said: if you have had a bad experience with Linux and are done with it, then fine.  Enjoy Windows, or try to.  Just remember that your experience is not the norm.  Linux has greatly improved over the years, and from what I hear Windows is starting to get to the same point with updates and usability that Linux has been at in my experience. Are Windows admins still waiting for SP1 or 2 before applying updates? I wish you luck.

On a final note, it does worry me that this is the type of person advising Washington on technical issues and using such a public forum to spread FUD.  Nothing in his background suggests that he is a competent system administrator.  Product management and development? While development and system administration tend to overlap, in my experience most developers turned system administrators are more likely to have all kinds of funky behavior and configuration patterns on their systems.

17 Apr 2011

call to software vendors… package it right

Posted by xaeth

One of the tasks that I have been responsible for performing over the last several years is packaging software into RPM Package Manager (RPM) packages.  All of our internal RPMs are fairly simple; the tricky part is the 3rd-party software.  There are several problems with the distribution of commercial off-the-shelf (COTS) software in the Linux ecosystem:

  • They are rarely RPMs
  • Sometimes they use InstallAnywhere installers (more on that later)
  • When they are RPMs, they either think they know better than RPM (which can also be explained as a lack of understanding) or they try too hard to make a single RPM that works on all RPM-based distributions.

Before I go further I would like to say, if you are packaging your software as a real RPM (or any other native Linux packaging system), even if it is not perfect, THANK YOU. We appreciate it.  Please take my commentary as constructive criticism.  I am not angry with you, and will gladly help you with packaging issues if I can.  I am not the best either, but I have a fair bit of practice.  I am sure others would gladly assist as well.

Moving on... So today I got stuck attempting to automate the installation and configuration of some of an unnamed vendor's system management RPMs via puppet.  I made the mistake of looking into the scriptlets and was frustrated by some of their practices.  I started to write a package-by-package evaluation of the scriptlets, but one package in particular would have taken forever (it attempted to cover every possible RPM-based Linux distribution via the scriptlets).  I recalled a conversation I had a year or two ago with the individual who heads up this company's Linux packaging group, and they had expressed interest in feedback.  At the time I had a few points to provide, but I did not have time for a more in-depth analysis.  So I decided to finally write up a general set of bullet points to pass their way; if I am not trying to help then I am part of the problem, right?  I figured it would not hurt to put it out here as well, so I am re-wording it a touch to make it less specific to just them and more of a general call to all software vendors.  Also, I would imagine most of what is stated directly translates to Debian, conary, and other native packaging systems; but it is not intended to be a definitive guide.

Things to keep in mind when packaging software for native Linux distribution:

  • It helps to build (compile) your software from source using the packaging tools (RPM), instead of just packaging up the binaries.  You do not have to distribute the source (SRPM, tarball, etc), but the build process can potentially be cleaner and more manageable.  Yes, I know you paid all kinds of money for some fancy handle-everything build system.  Are your customers (the system administrators) happy with the output?  I know I am not.
  • With the modern build processes and systems available there really is no good reason to build a mangled cluster of scripts that attempts to make one platform-independent RPM instead of building distribution-specific RPMs.  They can even still come from the same consolidated spec files, thus allowing you to reduce duplicate work.
  • Scriptlets
    • Should be short and concise
    • Almost the entire build and installation of the software should occur in the %build and %install sections, respectively.
    • Any setup or file layout should be handled in the %build and %install sections; if it is host-specific it should be a documented post-install exercise for the admin.
    • User/group creation, symlinking files, chkconfig and service commands are all acceptable (a short example follows this list).
    • Should never touch a file owned by another package.  If it needs a setting or switch flipped, document it for the administrator. At worst, include a "check" or "setup" script that the administrator can run manually if they want you to do the work for them, cause the majority of us don't.
  • Files should only be placed on the system via RPM's built in methods.
    • Symlinks can be a caveat of this, but should not be abused. UPDATE 2014.03.19 - Retracting since I just built a package that managed symlinks right. No excuses. :)
    • File ownership and permissions should be set by the RPM, not in the scriptlets.
    • Do not provide library files that the OS already provides.  Get past the "I need this specific version, and no other will do" or "well, we used that one but slightly modified" mentality, and when you can't, then require the correct compatible library.  Most distributions do provide them already.  If you needed to modify it, are you properly following the licensing? Wouldn't it be better to just submit a patch and stop having to maintain it and not worry about licensing issues?
  • You should not be adjusting security settings on the system for the administrator.  You can provide them (e.g. SELinux policy files, default firewall rules, a file for /etc/sudoers.d/, etc), but implementing them for me is bad security at its worst.
  • If you provide an SELinux policy that does not change any existing policies on the system directly, you can implement that.  But if you change something existing, let us do the work so that we are aware.
  • Do not flip SELinux booleans related to other components; let the admin do it, or find the right way to handle it in your own policy.
  • Get help... it is out there.  There are SELinux mailing lists, and ya know what? Call Red Hat.  They helped get SELinux going, they know what to do.  You are not in charge of my system's security, I am.  If you need a change to an existing policy, talk to that policy's maintainer or implement it in your own policy.
  • Users and groups - for the most part this hasn't looked bad except:
    • Deleting and re-creating a user is an annoying thing to do.  If the admin changed something about the user and everything still functions, don't touch it.  If I need to fix it I can delete the user myself and then let you create it by re-installing, or just look at your newly clean scriptlets to discover the exact syntax.
    • Technically, you should not be deleting any user you create on the system... another exercise for the admin to avoid stranded files.
  • If your company decides not to supply good RPMs, consider a tarball (or tarballs) with a simple install script; we can do the rest.  I was avoiding naming names, and I hate to give this software credit, but IBM's DB2 has got to be the easiest 3rd-party software I have ever packaged, and it deserves credit for the fact that there was just a bunch of tarballs and a few commands to install.  Setup was another matter. heh.
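To make the user/group point from the scriptlet section concrete, here is a minimal sketch of the kind of short, idempotent %pre scriptlet I have in mind (the account and package names are made up):

# %pre scriptlet (plain shell) -- create the service account only if it is missing
getent group examplegrp >/dev/null || groupadd -r examplegrp
getent passwd exampled >/dev/null || \
    useradd -r -g examplegrp -d /var/lib/exampled -s /sbin/nologin \
            -c "example service account" exampled
exit 0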

To summarize, write the RPMs according to a public packaging guideline, such as the Fedora Packaging Guidelines.  If Fedora would accept the package into the EPEL repositories, then you have succeeded.  I realize you want your packages to support other distributions, but odds are that if the spec file for the RPMs is cleaned up to meet Fedora's guidelines, the other distributions should be easy to support, and we administrators would be ecstatic.  Plus, the use of tools like mock, koji, the openSUSE Build Service, etc can greatly ease build and distribution issues.
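For example, building per-distribution packages from a single spec file is mostly a matter of pointing the build tools at different targets; roughly like this (the spec and package names are placeholders):

rpmbuild -bs example.spec                              # produce a source RPM
mock -r epel-5-x86_64 --rebuild example-1.0-1.src.rpm
mock -r epel-6-x86_64 --rebuild example-1.0-1.src.rpm
mock -r fedora-14-x86_64 --rebuild example-1.0-1.src.rpm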

Since I already named one name, I might as well point out a negative one: Flexera's InstallAnywhere... So InstallAnywhere is a universal installer with an interesting feature.  Their site claims: "Install native packages, such as RPM, on Linux, Solaris, and HP-UX".  This is inaccurate, at least with respect to RPM.  What they produce is a Java-based installer that injects RPM metadata into your system's RPM database.  This is not an RPM.  We cannot distribute this software via Yum or Spacewalk or Red Hat Network Satellite.  They should be ashamed. :(

So some useful reading:

UPDATE: Some basic grammar fixes. 2011.10.19