2010-05-17

Zenoss Swap Threshold Fixes

One of the recurring problems I have with Zenoss is fixing the swap threshold issue. Basically, if your swap space is less than 1G, you’re stuck with an alarm informing you that there’s less than 1G of swap total. The options are to hack the threshold by lowering the minimum-free value, or to make the alert use a percentage of the total.

Posting it here since it’s the second time I’ve had to find this…
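The percentage approach can be sketched outside Zenoss as a small shell check. This is only an illustration of the arithmetic, not Zenoss configuration: the function name swap_below_pct and the 10% cutoff are assumptions made up for the example.

```shell
#!/bin/bash
# Hypothetical helper: succeeds (exit 0) when free swap is below PCT percent of total.
# Usage: swap_below_pct TOTAL_KB FREE_KB PCT
swap_below_pct() {
    local total_kb=$1 free_kb=$2 pct=$3
    # No swap configured at all: nothing sensible to alert on.
    [ "$total_kb" -gt 0 ] || return 1
    # Integer math: free/total < pct/100  is the same as  free*100 < total*pct
    [ $(( free_kb * 100 )) -lt $(( total_kb * pct )) ]
}

# On Linux, read the real values from /proc/meminfo (reported in kB).
if [ -r /proc/meminfo ]; then
    total=$(awk '/^SwapTotal:/ {print $2}' /proc/meminfo)
    free=$(awk '/^SwapFree:/ {print $2}' /proc/meminfo)
    # 10 is an assumed cutoff; pick whatever percentage suits the box.
    if swap_below_pct "${total:-0}" "${free:-0}" 10; then
        echo "swap alert: ${free} kB free of ${total} kB"
    fi
fi
```

A relative check like this behaves the same on a 512M swap partition as on a 4G one, which is the whole point of dropping the fixed 1G figure.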



2009-10-05

Customer Service Fail

One of the bad ideas currently infecting companies in the technology field is the LivePerson “chat with a support person now” thing. This is a bad idea for multiple reasons:

  1. It’s a gigantic floating piece of garbage distracting me from whatever it is I’m trying to learn about your company or its products.
  2. The representatives on the other side are idiots who wouldn’t know what customer service is were it [insert your preferred-gendered joke here].

Case in point: Limelight Networks. The website lists about 15 “Services” they offer. None of them are immediately obvious, and I don’t want to spend the next hour parsing their marketese trying to find the specific niche product I’m looking for. So I (like a fool) click the stupid chat thing obscuring their page. This started at 8:30am and went on until 9:30am.

Jon: Yes what company or site are you from so I can better help you?
You: [redacted]
Jon: Thank you for waiting. I’ll be with you in just a moment.
Jon: I’m sorry for the delay. I’ll be right with you.
Jon: I’ll be right with you.
You: Why don’t you have a salesperson contact me and just ask me if our website is, in fact, a blank page?
Jon: I’m sorry for the delay. I’ll be right with you.
Jon: Thank you for waiting. I’ll be with you in just a moment.
Jon: I’ll be right with you.
Jon: I’ll be right with you.
Jon: Thank you for waiting. I’ll be with you in just a moment.
You: So what’s the point of this chat, exactly? For me to wait on iHold for an hour while canned messages about your imminent reply scroll past?
You: Until I get bored and find another vendor?

After another couple minutes I just closed the window, and now I’m at Akamai’s site, looking at their “solutions finder”, which is what I was looking for on Limelight’s website to begin with.



2009-08-22

In The Clouds

I’ve spent the last couple weeks moving off of my existing server(s) and into the cloud. Previously, I had been using my own Zimbra server, my own SVN/Trac install, and websites, albeit virtualized on a shared XEN server. The physical server all this was running on was some ancient second-hand single-core i386 Dell PowerEdge which never had enough RAM, cycles, or bandwidth.

For this, I paid a friend of mine $30/mo. Recently, the fourth person in our arrangement dropped out, so our costs went up to $40/mo. Now, I had 768MB of memory across the two virtual machines I had, of which I was only actually using one.

So I was paying $40/mo for a single VM instance that ran my SCM, my website, and my e-mail. That’s dumb, since you can use private repos on GitHub for $7/mo, use Gmail for free (all things equal, webmail is webmail), and get a private VM instance on Linode for $17/mo.

So that’s what I did: I cut my costs nearly in half over the self-hosted solution ($24/mo instead of $40/mo) by putting shit online.

Now, if my friends and I had kicked in a lot more $ and gotten a real server, and split that 1RU up more aggressively, then it would have been cheaper to do that ourselves. But nobody cared enough about that to make it work, so putting it elsewhere is cheaper.

Which is a generalized conclusion I’m willing to draw: if nobody cares, it’s cheaper to pay someone to do it than to muddle through yourself. If someone does care, then it’s invariably going to be cheaper to DIY.



2009-07-17

Distributing Static Routes with DHCP

I’m setting up an isolated network for people to test internal applications on, since the developers all have Sun workstations with a dual-port Gigabit NIC on the motherboard, and we’ve got a bunch of older network equipment that we haven’t gotten around to eBaying yet. What I’m doing is linking the second NICs together with some virtual machines and the older network equipment to create a separate development network.

The development network is a full Layer-3 network running an IGP between multiple nodes with attached client boxes. This allows me to play around with a decent lab network, and provides developers with a way to discover that Linux sets the TTL of multicast packets to “1” well before they are called to explain why their application didn’t work even after loads of testing. That beats them spending 8 hours playing head-desk and finally questioning me about firewalls on our internal network, which forces me to claw it out of them that they are driving multicast without a license and then explain how to use tcpdump.

Not that I’ve had to do that a dozen times now, or anything…


2009-05-26

My First JBOD, Part 2: Irony

After unpacking, racking, and mounting the JBOD, I waited until the weekend had started before powering down the server and installing the RAID card. Connected it all up, rebooted into the Adaptec BIOS, and configured the 6x 1TB drives into a RAID6 array. After that, I installed the RAID StorageManager off of Sun’s website, and then the “Common Array Manager” software. CAM is supposed to provide a web GUI to an organization’s worth of Sun JBODs, so you can update JBOD firmware and query status and whatnot from a single interface. There’s client and server bits written in Java that run on the various boxes, so the data path was going to look like this:

JBOD -> XEN dom0 running remote proxy tool -> XEN domU running web GUI

I say “was going” and “supposed to” because all the remote proxy tool in CAM ended up doing was consistently triggering a kernel panic in the aacraid driver whenever its detection code fired up.

Take a long drag off the irony of driver and firmware issues, and download the latest-n-greatest aacraid driver and firmware from Intel via Sun, and update. Same results. Repeat in various configurations, and before throwing in the towel, get a basic dump and file a bug. I didn’t put any more serious thought into debugging it simply because this whole thing has to be up and running yesterday, and the last time I asked for documentation on the topic, I was rebuffed with a variant of this classic: “If you were smart enough to debug the kernel, you wouldn’t need documentation on how to debug the kernel.”

Take a moment to stand in awe of the massive poisonous cobaggery involved in that statement being offered to someone who wants to help fix a crasher. I’ll wait.

That kind of shit would never fly in any GNOME venue, which is why GNOME kicks so much ass.

Update: The cobaggery about kernel development did not come from Sun or any representative of any company involved in open-source, and was unrelated to this situation at all. I relate it simply as it pertains to debugging kernel issues, and why I don’t do it.



2009-05-23

My First JBOD: Introduction

This is me setting up a JBOD for use by one or more XEN hosts, using professional hardware. It’s not a hack, not throwing a shitload of drives into a PC with some “prosumer” SATA RAID cards that require you to spend weeks fussing with drivers and firmware to get even a minimal write performance out of their underpowered hardware RAID.

A former roommate of mine once set up such a beast using a 12-port SATA card which ended up delivering a whopping 1 MB/s of write speed in a RAID 5 configuration. I simply don’t have time to play around like that these days, so this is me trading capital for time.

The host machine is a Sun Fire X4200M2 server with an internal RAID10, running a RHEL 5.3 XEN installation. None of the services currently running on this box are critical, which means I can take them down for an hour at the end of the day without trouble, provided I can get them back up again. I also have the (Memorial Day) weekend to get the new JBOD up and running on this box.

After it’s up, however, I will be hosting important business-ey things on various virtual machines using this JBOD: e-mail, website(s), internal wiki, NAS, along with primary Kerberos, LDAP, Cobbler, and Puppet on the internal RAID. So it’s fairly important that this get up and working, and be stable once it’s going…

The JBOD itself is a Sun StorageTek J4200 array with a single IO module and a PCIe SAS RAID card, running 6x 1TB SATA disks in (eventually) a RAID6 array. I’d like to play around with interesting things like redundant SATA multipathing, but I’m pretty new to the whole storage admin area, so I’m not going to be playing around with those things on *this* setup…



2008-11-30

New Books

Latest on the “done” pile are Rule The Freakin’ Markets and IS-IS Network Design Solutions. Summaries/reviews of both are up.



2008-10-16

Daemonizing Processes

Update: Commenters have pointed out a few things:

  1. This post is incomplete/incorrect. What I’m doing now is having the daemon function call a script that looks like this:
    #!/bin/bash
    # Close the inherited stdout, stderr, and fd 3 so nothing points at the terminal
    exec 1>&-
    exec 2>&-
    exec 3>&-
    # Redirections must come before the trailing &, otherwise they apply to an empty command
    nohup myPropApp > thelog.txt 2>&1 &

    That code came from another website whose URL I lost; the solution I posted below was based on an alternate method that I hadn’t tried but that sounded simpler.

  2. There are other options, like daemonize(1), setsid(1), and the bash builtin disown (which I had prematurely rejected as ksh-only).

Back when I was using Debian, one of the nicer things about it was their helper tool for startup scripts: start-stop-daemon. Particularly, its ability to daemonize any process with the -b flag. You notice how handy things like that end up being when you’ve got an in-house or otherwise proprietary app that can’t daemonize itself properly (e.g. Java-based services).

Somehow I’ve managed to get away with not having to write a script that daemonizes a normally-foreground process on an RH-based distribution yet, mainly because I’ve been using Debian almost exclusively for servers, and have only worked for tiny startups, where luxuries like init scripts are the last thing on anyone’s mind.

Everyone is familiar with the nohup & trick, but that still leaves the process associated with a terminal, so after you log out, your terminal/ssh session will just hang because stdin is still open. As it turns out, you can close standard input from bash first by redirecting it from nil (e.g. someapp <&-), and that will let it just work.

Very sweet for writing initscripts.
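A minimal sketch of how that could be wrapped up for an initscript follows. The function name start_detached is made up for the example, and it points stdin at /dev/null instead of closing it with <&-. That is a swapped-in variant, chosen because some programs misbehave when fd 0 is closed outright; either way nothing is left attached to the terminal.

```shell
#!/bin/bash
# Hypothetical helper: start a foreground-only command detached from the terminal.
# Usage: start_detached LOGFILE COMMAND [ARGS...]
start_detached() {
    local log=$1
    shift
    # stdin comes from /dev/null; stdout and stderr both go to the log,
    # so no file descriptor still points at the tty or ssh session.
    # All redirections sit before the trailing &, which backgrounds the command.
    nohup "$@" < /dev/null > "$log" 2>&1 &
    # $! is the PID of the background job, handy for writing a pidfile.
    echo $!
}
```

Something like `start_detached /var/log/myPropApp.log myPropApp > /var/run/myPropApp.pid` would then start the app and record its PID; both paths are placeholders, not anything from the original setup.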



2008-07-04

More Help Wanted

As it turns out, I have need for another Systems Administrator, this time in Washington, DC. This job is for a local administrator to handle the day-to-day support and activities in the Washington office (complete with AD domain, Asterisk server, NAS, and a dozen users), as well as the four branch locations in the DC Metro area and the (future) datacenter. You would work together via IM, mail, and phone with the existing tech team in Chicago to plan and implement improvements and resolve problems. The technological environment is 80% Windows, but the remaining 20% is RHEL5; the branch locations are 100% RHEL5.

So, the requirements are Linux and Windows desktop support, a desire to teach yourself Asterisk, Windows domains, and Cisco networking, and the ability to pass a Federal security check. Experience with Apache and open-source web software (e.g. WordPress, Joomla!, etc.) is great, but not required.

As before, send your resumé to me.



2008-06-22

Help Wanted

I’m looking to hire a Linux Administrator for a position in downtown Chicago. It’s a high-demand, high-stress environment with lots of things going on at any one time: We play with high-end Sun servers on an international private network, use Amazon EC2, and have a slew of Asterisk servers forming the joints of a wide-area VoIP infrastructure. Success and failure are often measured in terms of milliseconds. On the downside, we also do Windows, must support the desktop users (most desktops are Linux, though), and the company isn’t large enough to justify a division of labor yet.

You must be familiar with the basics: remote administration techniques, MySQL, Apache, VCS, and RPM-based distributions. Familiarity with bind, dhcpd, ddns, and basic networking is also recommended (at the very least you should be able to figure it out without handholding).

If this still sounds like something you’d like to participate in, send your resumé to me and I’ll forward it on to our HR people for processing. Act today and you’ll get your very own number!
