26 May: My First JBOD, Part 2: Irony

[Image: J4200]

After unpacking, racking, and mounting the JBOD, I waited until the weekend had started before powering down the server and installing the RAID card. Connected it all up, rebooted into the Adaptec BIOS, and configured the 6x 1TB drives into a RAID6 array. After that, I installed the RAID StorageManager from Sun’s website, and then the “Common Array Manager” software. CAM is supposed to provide a web GUI to an organization’s worth of Sun JBODs, so you can update JBOD firmware and query status and whatnot from a single interface. There are client and server bits written in Java that run on the various boxes, so the data path was going to look like this:

JBOD -> XEN dom0 running remote proxy tool -> XEN domU running web GUI

I say “was going” and “supposed to” because all the remote proxy tool in CAM ended up doing was consistently triggering a kernel panic in the aacraid driver whenever its detection code fired up.

Take a long drag off the irony of driver and firmware issues, and download the latest-n-greatest aacraid driver and firmware from Intel via Sun, and update. Same results. Repeat in various configurations, and before throwing in the towel, get a basic dump and file a bug. I didn’t put any more serious thought into debugging it simply because this whole thing has to be up and running yesterday, and the last time I asked for documentation on the topic, I was rebuffed with a variant of this classic: “If you were smart enough to debug the kernel, you wouldn’t need documentation on how to debug the kernel.”

Take a moment to stand in awe of the massive poisonous cobaggery involved in that statement being offered to someone who wants to help fix a crasher. I’ll wait.

That kind of shit would never fly in any GNOME venue, which is why GNOME kicks so much ass.

Update: The cobaggery about kernel development did not come from Sun or any representative of any company involved in open-source, and was unrelated to this situation at all. I relate it simply as it pertains to debugging kernel issues, and why I don’t do it.

4 Comments

  1. Without wishing to defend the response from Sun, it’s not the first time I’ve been on the receiving end of a Bad Case of Attitude from members of the GNOME community as well. There are good people and bad people in every community; don’t kid yourself that just because GNOME isn’t a business, there aren’t assholes in our midst.

    numpty

    From United States 2009-05-26 05:56

  2. James, you describe the CAM proxy running in dom0 and the CAM BUI running in domU. Is the panic occurring in the dom0 or domU RHE instance? If the latter, the BUI needs to be told where the proxy is. The registration wizard will search for you, but I wonder if, when the search is being done in the domU (which will be fruitless since it is a virtual machine with virtual drivers), the aacraid driver is choking on the virtual drivers. A possible workaround is to specify the IP address of the dom0 host in the registration wizard. The BUI and the proxy communicate via TCP/IP. As long as there is network connectivity between the domU and dom0 (which is required for CAM to work in your setup), specifying the IP address of the dom0 host in the registration wizard will prevent the discovery from happening in the domU, so that it occurs only in the dom0 (which is what you want).

    Paul McDonnell

    From United States 2009-05-26 08:54

  3. numpty: As noted in the update, the cobag in question isn’t a known employee of any company.

    Paul: The actual array is attached to dom0, and it’s dom0 that’s crashing when the register process on dom0 starts talking to the RAID controller. Inside the domUs, they only get the generic Xen device (disk image on a logical vol, on a different array/controller) that’s presented to them.

    James Cape

    From United States 2009-05-26 09:19

  4. Irony is having all this beautiful Sun gear and struggling with drivers and “array management” software on Linux when you could be done with a simple “zpool create tank raidz ” on OpenSolaris. Xen is just as easy in OpenSolaris http://opensolaris.org/os/community/xen/docs/2008_11_dom0/ :3

    James

    From United States 2009-05-27 07:01
