Category Archives: IT

Not all HBA adapters are created alike

When I moved my backup NAS to a Dell R620, the version I ended up purchasing had three low-profile PCIe slots (riser 1 + riser 3). I hadn’t spent time considering it, but this meant that I could no longer fit my full-height SAS9201-16e adapter into the chassis. I searched for a replacement and found the SAS9207-8e for a reasonable price.
What I later learned is that I was relying on a feature that the 9207 doesn’t seem to have: support for redundant SFF-8088 connections to the same enclosure. When I installed the 9201, I ran two cables between the adapter and my external JBOD storage (the NetApp DS4243 previously discussed). Doing the same thing with the 9207 results in a lot of conflicts: instead of detecting that the two cables are connected to the same system, it sees two copies of the same system. You can quickly see how FreeNAS would get confused when it sees two copies of the drives and tries to mount the same drives twice. I tried updating the 9207 to the latest firmware, but never found a way to enable this redundancy with this adapter.
So I was stuck using a single cable. Not a big deal, right? Well, occasionally I would get communication errors, and once during a lightning storm the single link died completely, resulting in some (minor) data corruption on the JBOD array.
While I can’t guarantee the redundant connection would have saved me in this case, I believe it is a more robust solution. I ended up purchasing a different R620 chassis enclosure (riser 2 + riser 3) and moving the electronics over to this so I can swap back to my 9201-16e. I will continue testing to see if this does in fact improve reliability.
What’s most annoying is that I can’t find this feature listed in the documentation for either product. So without trying it, it would be hard to know which HBA provides this feature.
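One way to spot the "two copies of the same drives" symptom is to compare serial numbers: if the same serial shows up under two device names, the adapter is presenting both cable paths as separate disks. The following is only a minimal sketch. It assumes you have already collected "device serial" pairs into text by some means, and the device names and serials in the sample are made up.

```shell
#!/bin/sh
# Report serial numbers that appear under more than one device name.
# Input: lines of "<device> <serial>" on stdin.
find_duplicate_serials() {
    awk '{ count[$2]++; devs[$2] = devs[$2] " " $1 }
         END { for (s in count) if (count[s] > 1) print s ":" devs[s] }' | sort
}

# Hypothetical inventory: da0 and da2 are the same physical drive seen twice.
sample_inventory='da0 WD-AAA111
da1 WD-BBB222
da2 WD-AAA111'

printf '%s\n' "$sample_inventory" | find_duplicate_serials
```

An empty result means every serial was seen only once, which is what you would expect with a single cable or with an adapter that handles the second path properly.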

Quanta Winterfell RAM stability and RAPL Mode

As I previously posted, I got my hands on a few inexpensive Quanta Winterfell compute nodes.  The systems have been running well except for occasional RAM errors that I’ve been trying to debug.

I’ve been using PC3-10600R of 2GB, 4GB, and 8GB sizes, single and dual rank.  The system would seem to be more stable when RAM was only inserted into the first slot of each bank (with the white tabs), but produce more errors once the second slots were populated.  This wasn’t 100% consistent either though.

In FreeNAS this would appear as read errors on the console.  Under VMWare it would appear as a complete system lockup and/or pink screen of death with a message about memory errors.

Both systems would run more reliably if I were to force the RAM speed to 1066MHz (instead of the default 1333MHz), but this of course reduced system performance and also wasn’t 100% consistent (some RAM combinations would still cause failures).
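To put a rough number on that performance hit: the peak theoretical bandwidth of a DDR3 channel is the transfer rate times 8 bytes (a 64-bit bus), which is where the PC3-10600 name comes from. The sketch below uses the nominal 1333 and 1066 figures and only covers peak bandwidth per channel. The real-world impact depends on the workload.

```shell
#!/bin/sh
# Peak theoretical bandwidth of one 64-bit DDR3 channel, in MB/s:
# transfer rate (nominal, in millions of transfers per second) x 8 bytes.
peak_mb_per_s() { echo $(( $1 * 8 )); }

fast=$(peak_mb_per_s 1333)   # default speed
slow=$(peak_mb_per_s 1066)   # forced lower speed

# Integer percentage of peak bandwidth given up by running at the lower speed.
loss_pct=$(( (fast - slow) * 100 / fast ))

echo "1333: ${fast} MB/s per channel"
echo "1066: ${slow} MB/s per channel"
echo "peak bandwidth lost: about ${loss_pct}%"
```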

The BIOS is more of a developer-level BIOS, which means there are lots of settings available to tune all sorts of parameters, including RAM voltage and timing.  At first I tried increasing the timing values, thinking maybe there wasn’t enough timing margin.  This really didn’t help, so then I reduced the clock rate, which improved things but was still flaky.

After some further testing I believe I’ve found a solution in the BIOS that results in stable performance.  Longer-term testing will be necessary, but initial results look good.

There is a setting called RAPL Mode.  RAPL stands for Running Average Power Limit, which essentially is a way to save power by limiting how much power the RAM gets to use.  There are different modes, which appear to be different algorithms for determining how much power the RAM should be using.  Each mode is a newer algorithm which is supposed to result in better performance.  By default the system is set to Mode 1.  My best guess is that it is poorly implemented on these systems, which results in the RAM being starved of power and then getting corrupted.

On one system I have 56GB of RAM and am running FreeNAS.  In this case I could adjust the RAM speed to 1333MHz and use RAPL Mode 0.  To test I copied data to the disk, which fills the ARC.  After 24hrs there have been no reported errors on the console (typically would start within a few minutes at 1333MHz previously).

On another system I have 112GB of RAM and am running VMWare ESXi.  At RAPL Mode 0 and 1333MHz the system hung immediately after a VM was migrated to it.  At RAPL disabled and 1333MHz the system has been up with no issues.  To test, I ran a 3D EM field solver that consumes 62GB of RAM when active on a task for about 45 minutes.

So in summary, RAPL might save power, but it could corrupt your data.  I will be disabling this feature on these systems and continue monitoring their performance.

Quanta Winterfell FreeNAS Server

I recently acquired what is known as a Quanta Winterfell Open Compute blade.  Quanta seems to make a number of OEM solutions for large companies.  In this case, Open Compute is a standard for designing no-frills high-density server systems, utilized at least by Facebook.  So what I have here could have been processing my posts, likes, etc.

Quanta Winterfell with “cover” off

The no-frills means you get a very basic chassis, which doesn’t technically even have a front, or completely enclose the entire system!  Strange looking, but when it’s sitting in a datacenter by itself, who is going to care as long as it is doing its job.

The blade itself takes a 12V input (actually 12.5V nominal) from a large spade connector.  Since I didn’t have the mating part, for now I just jammed some large-gauge wire into the spades and used one of my Agilent System DC power supplies, so I could also monitor current consumption.

Horrible connection for testing (don’t try this at home).
With 2GB RAM, it idles around 50W.

The barebones blade was $90 (free shipping), which included the heatsinks and a 10Gb SFP+ NIC.  The NIC alone runs about $50, so it wasn’t a bad deal overall.

All that was left to add was a CPU, RAM, and a video card.  The system can output the console over a built-in serial port and I believe serial over LAN, but for ease of bringup I opted for the video card for now.

I wanted to see if FreeNAS would boot, so I plugged a bootable USB key into a hidden USB port (there are only 2 total), and used the other port for a keyboard.  By default the hidden USB port is disabled, so after enabling this in the BIOS it booted right up!

FreeNAS boots.

It is actually very quiet too, at least when not heavily loaded.  The fans are large, so there isn’t any of that loud whirring sound you would associate with a datacenter.

Next steps:

  • Replace Agilent supply with HP server supply (modify connector)
  • Add more RAM
  • Test external SAS card and hard drive shelf
  • Get 10Gb adapter up
  • Investigate headless boot (remove video card)
  • Order second CPU

Converting a NetApp DS4243 drive shelf into a vendor-generic JBOD array

NetApp makes some nice hardware that you can occasionally find for a low price on eBay.  Unfortunately, it is typically hard to reuse since NetApp tends to require specific firmware on the hard drives in their drive shelves.  So you are then locked into their harder to find and higher-priced drives.

With a bit of experimenting, I found a method to get around this for at least one family of hardware.

NetApp’s DS4243 is a 24-bay SAS 6Gbps drive shelf.  It typically is configured with a pair of supplies (can support up to 4) and two IOM3 modules (which only support 3Gbps, but other versions exist).  I managed to pick one of these up off eBay for just under $100 with the pair of supplies and IOM3 modules mentioned.

Note I didn’t even try to use the IOM3 modules.  There might be other ways around the limitations I read about online, but I found a simple and inexpensive option that allows the disk shelf to be used as a generic JBOD array.

I also had a Dell Compellent HB-1235 12-bay SAS 6Gbps drive shelf.  This drive shelf comes with a pair of controllers with a much longer name (HB-SBB2-E601-COMP) that already present the drive shelf as a JBOD array to FreeNAS.  It turns out these were manufactured by a company called Xyratex, which just happens to also manufacture the NetApp DS4243.

So what would the chances be that a Dell controller would work in the NetApp drive shelf?

I did some research, and the form and fit of the controllers matched perfectly.  The connectors are identical and placed in the same locations.

Front of the modules.
Rear of the modules, showing identical connector types and placement.

Now there is a chance that the pinout could have changed, power rails could be different, or some other issue might exist due to the fact that these weren’t specified to be connected together, but I was willing to take that chance for the sake of research.  Designing hardware in a similar industry, I took a bet that they were at least close enough to do something without blowing up.  The only question for me was how well it would work.

So all there was left to do was to plug it in and power it up!

Status is green and SAS link lights all good to go!
Drives powered up and show activity.
The NetApp drive shelf even identifies properly!

I was curious if possibly the HB-1235 controller would only see half of the NetApp drive shelf, since it was specifically used in a 12-bay drive shelf.  I purposely inserted a 500GB drive in bay 24 to test if it would work, and it identified properly with no issues at all.

So they identified, but would there be any stability issues?  To at least get a first-order estimate of this, I copied roughly 250GB of data to the array of 6x 3TB drives and had no issues. This was done over a 1Gb link.  After the copy was complete, a scrub of the volume was also successful.

The HB-1235 with two modules and two supplies cost me $120, and the NetApp was around $80.  Each unit only needs one module to run (though the HB-1235 seems to want to run the power supply fans on high when only one module is inserted).  A separate module runs about $50, so on a good day on eBay you can have a full 24-bay generic disk shelf for less than $200.

Dell T620 Power Interface Board that won’t power up

I recently acquired a Dell T620 chassis that included everything except a motherboard and power supply.  I had an extra motherboard already, so I installed it only to find that it would not turn on.  The 12V_AUX LED on the motherboard would light up, but when pressing the power button it wouldn’t do anything.

I started debugging by swapping components from another chassis I had, at first thinking it was the front panel, switch, or maybe a bad cable.  It turns out that it was the power interface board (PIB) itself.

The board appeared to be in good shape, with no obvious scratches or parts missing.

It was time to get out the microscope and do a closer inspection.  Since there aren’t a lot of parts on the board it didn’t take long to find a suspect problem.  One of the parts appeared to have a solder bridge between two of the pins (6 and 7).

A closer look at the pins:

All it took was removing this solder bridge, and the system then powered up without any further problems.

I have no clue how this would have ever worked in this state, so I’m not sure how it even made it out of Dell’s factory.  It didn’t appear to be reworked based on my experience, so this is a very strange escape.

Either way, I was able to rather quickly find the issue and fix it, saving the need to purchase a replacement.

Converting a FreeNAS stripe to mirrored drive

RAIDZ is a powerful tool, and FreeNAS makes it easier to use via the GUI, but there are a number of advanced things you can’t do in the GUI alone.

In this case, I had an expanded RAIDZ-1 volume with a single striped disk (by accident), and I wanted to convert that striped disk to a mirrored one.  You could also use the same steps to convert a single striped disk (yes a single disk is still labeled as striped) to a mirrored disk.

There are a few other sites that provide the commands, but they don’t do a good job of explaining what the commands are doing.  I’ll try to clear that up!

Determine the name of the new drive.  You can do this within the GUI.  Disks are usually named /dev/ada0, /dev/ada1, etc. In my case (using a PERC H310 with Avago firmware) they show up as /dev/da0, /dev/da1, etc.  The newest drive will have the highest number.  You can also use smartctl to verify the drive by its serial number by running:
smartctl -a /dev/<yournewdrive> | more
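If you would rather not page through the full report, the serial number line can be pulled out on its own. This is a rough sketch that works on captured smartctl-style text. The sample below is made up, and the label differs slightly between ATA and SAS drives, so the match is case-insensitive.

```shell
#!/bin/sh
# Print the serial number from `smartctl -a` style output read on stdin.
# Matches the "Serial Number:" / "Serial number:" label case-insensitively.
extract_serial() {
    awk -F': *' 'tolower($1) == "serial number" { print $2; exit }'
}

# Hypothetical excerpt of a report, for illustration only.
sample_smart='Device Model:     WDC WD40EFRX-68WT0N0
Serial Number:    WD-WCC4E1234567
User Capacity:    4,000,787,030,016 bytes [4.00 TB]'

printf '%s\n' "$sample_smart" | extract_serial
```

On a live system you would pipe the real smartctl output into extract_serial instead of the sample text, then compare the result against the label on the drive.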

Once you determine the correct new drive, remove any existing partitions with:
gpart destroy -F /dev/<yournewdrive>

When all partitions are deleted you can then create the new partitions.  First start by creating a partition table of type GPT with:
gpart create -s gpt /dev/<yournewdrive>

With the partition table created, you can now add the partitions. All drives get 2GB reserved for swap, and the rest can be used for ZFS data.  First create the 2GB swap partition (partition 1), which starts at an offset of 128 sectors (leaving room for the partition table at the start of the disk):
gpart add -i 1 -b 128 -t freebsd-swap -s 2g /dev/<yournewdrive>

Now you can create the ZFS partition (partition 2), which takes essentially the rest of the disk space:
gpart add -i 2 -t freebsd-zfs /dev/<yournewdrive>

Next check your work:
gpart show /dev/<yournewdrive>

If you did everything correctly, you should end up with a table like this (in my case for /dev/da5, which is a 4TB drive):

[root@nas ~]# gpart show /dev/da5                                               
=>        34  7814037101  da5  GPT  (3.6T)                                      
          34          94       - free -  (47K)                                  
         128     4194304    1  freebsd-swap  (2.0G)                             
     4194432  7809842696    2  freebsd-zfs  (3.6T)                              
  7814037128           7       - free -  (3.5K)    

If you find you’ve done something wrong, you can always destroy the partition table and start over again.
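The sector numbers in the table above can be sanity-checked with a little arithmetic. This sketch assumes 512-byte logical sectors, which is consistent with a 2GB swap partition spanning 4194304 sectors. The ZFS partition size is copied from my gpart output, so substitute your own.

```shell
#!/bin/sh
SECTOR_BYTES=512                          # assumed logical sector size
SWAP_START=128                            # "-b 128" from the gpart add command
SWAP_BYTES=$((2 * 1024 * 1024 * 1024))    # "-s 2g"
ZFS_SECTORS=7809842696                    # freebsd-zfs size from gpart show

# The swap partition length in sectors, and where the ZFS partition begins.
SWAP_SECTORS=$((SWAP_BYTES / SECTOR_BYTES))
ZFS_START=$((SWAP_START + SWAP_SECTORS))
ZFS_BYTES=$((ZFS_SECTORS * SECTOR_BYTES))

echo "swap sectors: $SWAP_SECTORS"
echo "zfs start:    $ZFS_START"
# Convert bytes to TiB for comparison with the size column in gpart show.
awk -v b="$ZFS_BYTES" 'BEGIN { printf "zfs size:     %.1f TiB\n", b / (1024 ^ 4) }'
```

If the computed start sector does not match the second partition's start in your own gpart show output, the swap partition was probably created with a different size or offset.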

With the drive partitioned, you can now add it to the pool.  This part is tricky, because you need to do it by GPTID, which is a long hexadecimal code.  If possible, do this using an actual SSH session to your server, because you can’t copy and paste (to my knowledge) into the shell available in the web browser.  To determine the GPTID of each disk, run:
zpool status

You will see something like this:

  pool: VOL1                                                                                                                        
 state: ONLINE                                                                                                                      
  scan: scrub repaired 0 in 0h17m with 0 errors on Sun Mar 19 00:17:55 2017                                                         
        NAME                                          STATE     READ WRITE CKSUM                                                    
        VOL1                                          ONLINE       0     0     0                                                    
          gptid/c3b13e97-ef38-11e6-9be5-74867ad1a828  ONLINE       0     0     0                                                    
          gptid/f36a38d3-1744-11e7-8856-0090fa79871a  ONLINE       0     0     0                                                    
errors: No known data errors 

Those gptid/<GPTID> values are the descriptors you’ll need.  Find the one for the existing drive and copy it to a text editor (notepad).  Then run:
glabel status

Here you will see a list of every drive in the system with the gptid/<GPTID> names for each.  Find the one that matches the new disk (each disk is listed once per partition, so make sure you select the GPTID that matches the freebsd-zfs partition, which is partition 2, not the swap partition) and copy that to notepad as well.
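Picking the right line out of a long glabel listing by eye is error-prone, so here is a small sketch that filters it by partition name. It assumes the usual three-column layout (Name, Status, Components). The GPTIDs in the sample are invented placeholders, not values from my system.

```shell
#!/bin/sh
# Print the gptid label whose component matches the given partition name.
# Input: `glabel status` style text on stdin. Argument: partition, e.g. da5p2.
gptid_for() {
    awk -v part="$1" '$1 ~ /^gptid\// && $3 == part { print $1 }'
}

# Hypothetical listing: p1 is the swap partition, p2 is the ZFS partition.
sample_glabel='                                      Name  Status  Components
gptid/aaaaaaaa-0000-11e7-0000-000000000001     N/A  da5p1
gptid/aaaaaaaa-0000-11e7-0000-000000000002     N/A  da5p2'

printf '%s\n' "$sample_glabel" | gptid_for da5p2
```

On the server itself, the equivalent would be to pipe the real glabel status output into gptid_for with your new drive's second partition as the argument.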

Now you will need to add the disk to the zpool:
zpool attach <volumename> /dev/gptid/<existing disk GPTID> /dev/gptid/<new disk GPTID>
where volumename = the name of your zpool volume.

If all went well, the disk will now be added as a mirror, and the system will begin to resilver (copy all data over to create the mirror).  You can check this in a number of places, but one of the easiest is the GUI.  Go to the storage pane, click on the volume, then click on the Volume Status button at the bottom.  You will then see status like this:

Resilver of an array after the mirror was added.

You can also run zpool status again, which will now show the disk in the list and indicate a resilvering status until complete.  The status light will also go critical in the GUI until the resilver is complete, but there is no reason to worry and all your data will be available during this process.
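If you want to check on the resilver from a script rather than by reading the output, the scan line is the part to look at. This is a sketch only: it assumes the scan line contains "resilver in progress" while running and "resilvered" once finished, and the exact wording may differ between ZFS versions. The sample lines other than the scrub line are illustrative.

```shell
#!/bin/sh
# Classify the "scan:" line of `zpool status` output read on stdin.
# Prints one of: in-progress, done, none.
resilver_state() {
    awk '/^ *scan:/ {
             if ($0 ~ /resilver in progress/) print "in-progress"
             else if ($0 ~ /resilvered/)      print "done"
             else                             print "none"
             found = 1; exit
         }
         END { if (!found) print "none" }'
}

# Illustrative scan lines (timestamps and sizes are placeholders).
echo "  scan: resilver in progress since Sun Mar 19 10:00:00 2017" | resilver_state
echo "  scan: resilvered 1.2T in 3h12m with 0 errors on Sun Mar 19 13:12:00 2017" | resilver_state
echo "  scan: scrub repaired 0 in 0h17m with 0 errors on Sun Mar 19 00:17:55 2017" | resilver_state
```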

That’s it!  You’ve successfully created a mirrored disk array without having to wipe the original disk and start from scratch.

A summary of commands:

Get drive name: smartctl -a /dev/<yournewdrive> | more
Clear partitions: gpart destroy -F /dev/<yournewdrive>
Create partition table: gpart create -s gpt /dev/<yournewdrive>
Create swap partition: gpart add -i 1 -b 128 -t freebsd-swap -s 2g /dev/<yournewdrive>
Create ZFS partition: gpart add -i 2 -t freebsd-zfs /dev/<yournewdrive>
Check your work: gpart show /dev/<yournewdrive>
Get GPTID of existing drive: zpool status
Get GPTID of new drive: glabel status
Add disk mirror: zpool attach <volumename> /dev/gptid/<existing disk GPTID> /dev/gptid/<new disk GPTID>
Check for resilver: zpool status

The Dead OBi200

After some thunderstorms came through, my OBi200 VoIP adapter stopped working.  The network end worked fine, but there was no longer any dial tone, and the device status page for the PHONE port no longer showed any information.

I only had the device for 11 months, so I promptly contacted support.  Since it was probably broken due to a surge/overstress event on the phone line, I decided to open it up and take a look.  I wanted to see if there was anything obviously broken that I could just replace to bring it back online, and I was also curious what parts they had used in the design (and whether there actually was any protection on the ports).

OBi200 Board

As expected, there really isn’t much inside.  There are three primary ICs:

  • Marvell MCU which provides the Ethernet interface, system control, config pages, etc.
  • RAM for the Marvell MCU
  • A Silicon Labs Si32260-FM1 ProSLIC telephone interface IC

The rest of the board is power supplies and a few components required by the primary ICs.  For what it’s worth there does appear to be an ESD protection IC on the USB port, but that’s good general practice for USB anyways.

I couldn’t find a full datasheet online for the Si32260-FM1.  The best I found was a couple-page datashort with a block diagram, pinout, and a summary of what the device does.  It essentially is everything necessary to provide a VoIP interface, including phone line voltage generation, DSP to encode and decode analog and FAX data, and a simple SPI interface for digital data transport (in this case to/from the MCU for transfer over Ethernet).

The SI3226x block diagram

Note that the block diagram shows two channels, but the OBi200 only provides one phone channel.  It appears that the second channel is connected and populated, so getting a second phone port is probably just a firmware change.

Unfortunately the datashort available didn’t have the reference circuit that vendors usually provide, which more than likely is what Obihai used in this design.  From what I can tell (without a bunch of probing to determine the actual connections) there appears to be a simple analog filter on the frontend made of 0805- and 0603-sized components.  The resistors appear to be either thick- or thin-film and the caps all MLCC.  A quick check with my DMM didn’t find anything that was obviously open, shorted, or different from a neighboring part with a matched circuit shape.

I did not see anything in the way of TVS diodes, spark gaps, or any other component that would provide significant protection from a high-voltage transient event, which is somewhat unfortunate.  Part of this is probably due to the small size, and the other due to there not being an actual ground lug anywhere on the product (the GND of the power port through a wall wart isn’t a true GND).

If I were to redesign this, knowing it’s probably going to connect to a set of phone lines that might be connected to a network of phone cable where lightning could possibly couple in, I would have probably added at least a couple TVS diodes and a GND lug.  Most customers probably wouldn’t connect the GND lug, but it’s better than nothing.

Fortunately, there are surge suppressors for phone lines available, but of course it’s a separate product that needs to be purchased.  I decided to go with a Tripp Lite DTEL2 suppressor, which connects between the OBi200 and the phone network in my home.

Tripp Lite DTEL2 Surge Suppressor

In the end, Obihai honored their 12-month warranty, and I sent the broken device back before doing any additional debugging.  I can only hope that adding an external suppressor will avoid another failure in the future.

Setting up CrashPlan using VMWare and FreeNAS

I’ve had CrashPlan on my list of apps to get installed in order to better back up my system data that isn’t already stored on a NAS.  The FreeNAS forums allude to the difficulty involved in properly setting up and upgrading CrashPlan, as it isn’t extremely straightforward, plus getting clients to connect to a FreeNAS-based CrashPlan instance also seemed hacky (mentions of changing config files, etc).

I gave the server install a try and dealt with a number of missing driver dependencies, issues with bash properly running, and a bunch of other annoying things.  Then I decided to try a different way.

Crashplan has clients for Windows, Linux and Mac.  In addition to FreeNAS, I also have a VMWare server that is already running an instance of Windows that is providing a few services that weren’t available in other OSes.  I decided this might be the easiest way to go.

One downside to this is I didn’t want to store the backup data on the VMWare server.  The volumes on that server aren’t as large, and while it is set up with RAID volumes, they don’t currently have the 2x redundant backup that my volumes on FreeNAS currently take advantage of.

The easiest solution would be to mount a network drive on the FreeNAS server, but for some reason CrashPlan doesn’t allow a network-mapped drive to be a backup destination.

Fortunately we are running this in VMWare, and there’s more than one way to mount a drive.  What I ended up doing was creating another VMWare disk volume that is stored via NFS on the FreeNAS server.  This is then mounted as a “local” hard drive within the Windows VMWare instance.  The disk was created with 250GB space and thin provisioning, which should be more than enough for my needs.  And if I ever run out of space it’s easy to grow drives.

After this I was worried that the disk image might be replicated as a full copy in FreeNAS each time it detects that the file has changed, but it appears that snapshots and replication are able to keep track of differences and only use space required to replicate the changes.

So in summary, to avoid issues with running CrashPlan in a FreeNAS jail:

  • Run a Windows installation in a VM (I used VMWare)
  • Create a second virtual disk using the network storage as the disk location (if not on the same machine)
  • Mount this disk, partition, and format it like any other disk in Windows.
  • In the CrashPlan app within the Windows VM, set the new disk as the default location for CrashPlan backups
  • Enjoy!

How NOT to setup a RAIDZ Volume

I’m relatively new to FreeNAS, which I somehow missed out on for many years.  Now that I found it I’ve moved over to using it as much as possible for my storage needs.

Based on a number of articles it seems as though it’s easy to add storage any time you like.  This is true, BUT there are some HUGE caveats to this.  The FreeNAS GUI makes it somewhat hard to make a mistake, but if you think you know what you’re doing you can override these features and actually put your data at risk.

I started out with three drives and created a RAIDZ-1 volume.  For those unaware, this means that if any ONE drive fails then all the data will still be safe.  To be even more safe, it’s recommended to use a RAIDZ-2 configuration and/or backup your data to yet another location.  I decided on the latter, and have backed up my data to a second FreeNAS server (via snapshot replication).

So far I’ve followed generally-accepted good data integrity practices here, but now for my mistake.  I decided to buy another drive of the same size, and I wanted to add that to my server to increase the space available.  The FreeNAS GUI wouldn’t let me add it with the basic Volume Manager utility, so I went to advanced mode and added it to the volume.  What I ended up with was this:

Adding an additional drive to VOL0 resulted in a striped disk configuration, which is bad news here.

The three original disks (da0-da2) maintain their RAIDZ-1 state, but now a portion of the data is shared with the striped disk da3.  The striped disk has NO redundancy.

Note that the cache drive ada0 is a SSD providing a L2ARC to the volume.  Cache is usually a single striped drive, and if the data is lost there then no harm will be done to the system.

What wasn’t explained very well (or I failed to read earlier), is that RAIDZ volumes can be added to, but each group of disks, once configured, remains in the configuration that it was initially created with.  What I had done was EXPANDED the storage available, but STRIPED my original raidz1-0 array with a single disk.  Because the expanded storage is a SINGLE disk, if that one disk fails then the entire volume will be destroyed.

So if any single one of da0-da2 fails then the array continues to operate in a degraded state. But if da3 were to fail, then the entire array goes down!  Kind of kills the point of RAID, right?

The reason for this is that RAIDZ is a high-performance datacenter-quality product.  Datacenters aren’t usually adding single drives, but rather entire arrays of disks when they need to increase storage.  If you decide to use FreeNAS and RAIDZ, then you must keep this in mind.  It does take more planning (and money for additional disks), but that’s the tradeoff of using this tool.

So how do I solve it?  Well right now da3 needs some redundancy.  I ordered another drive of the same size, which will then be a mirror for this disk.  Note that the mirror must be added via the console, not through the GUI.  There is no GUI-based way to convert a striped disk to a mirrored disk, even though the FreeBSD tools support it.

So then I will have a RAIDZ-1 array + expanded storage of a mirrored array, all sharing data of VOL0.  This still has some limitations on redundancy though (assuming disk 5 is named da4):

  • If any one of da0-2 fails, then the array is OK.
  • If any one of da3-4 fails, then the array is OK.
  • If da3 and da4 fail then the array is dead (the mirrored array has completely failed).
  • If any two of da0-2 fail, then the array is dead (the RAIDZ-1 array has completely failed).

The likelihood of this happening is low, plus all my data is still backed up on a second server.
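The failure cases listed above can be checked mechanically. The sketch below hard-codes the layout described in this post (da0-da2 in the RAIDZ-1 group, da3-da4 in the mirror) and applies the rule that each group tolerates one failed disk, and the pool is lost as soon as either group is lost.

```shell
#!/bin/sh
# Return 0 (true) if the pool survives the given set of failed disks.
# Assumed layout: RAIDZ-1 group = da0 da1 da2, mirror group = da3 da4.
pool_survives() {
    raidz_failed=0
    mirror_failed=0
    for disk in "$@"; do
        case "$disk" in
            da0|da1|da2) raidz_failed=$((raidz_failed + 1)) ;;
            da3|da4)     mirror_failed=$((mirror_failed + 1)) ;;
        esac
    done
    # RAIDZ-1 tolerates one failure; a two-way mirror also tolerates one.
    [ "$raidz_failed" -le 1 ] && [ "$mirror_failed" -le 1 ]
}

for failed in "da1" "da3" "da1 da3" "da3 da4" "da0 da2"; do
    # Left unquoted on purpose so the string splits into separate disk names.
    if pool_survives $failed; then
        echo "$failed: pool OK"
    else
        echo "$failed: pool lost"
    fi
done
```

Note the "da1 da3" case: losing one disk from each group at the same time is still survivable, since each group is only down one member.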

Another option (since all my data is backed up) would be to destroy the array, and rebuild it with all 5 disks (da0-3 + the one that will arrive soon) in a RAIDZ-2 configuration.  That way if up to TWO of ANY of the disks fail then the array would continue to operate.
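For comparing the two layouts, a back-of-the-envelope capacity count helps. This sketch counts usable space in whole disks, assuming all five drives are the same size and ignoring ZFS metadata and padding overhead, so treat the result as approximate.

```shell
#!/bin/sh
# Usable capacity of a RAIDZ group in whole disks: disks minus parity level.
# $1 = number of disks in the group, $2 = parity level (1 for RAIDZ-1, 2 for RAIDZ-2).
raidz_usable() { echo $(( $1 - $2 )); }

# Current plan: RAIDZ-1 of 3 disks, plus a two-way mirror (one disk usable).
current=$(( $(raidz_usable 3 1) + 1 ))
# Alternative: a single RAIDZ-2 group of all 5 disks.
rebuilt=$(raidz_usable 5 2)

echo "RAIDZ-1 (3 disks) + mirror (2 disks): $current disks usable"
echo "RAIDZ-2 (5 disks):                    $rebuilt disks usable"
```

Under these assumptions both layouts give the same usable space, so the choice comes down to which failure combinations you want to survive.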

I haven’t yet decided if I will do this, but for now adding the mirrored disk fixes my mistake.  Don’t do what I did.

Build your own wooden open-frame rack

My “networking” closet was a mess, so I decided it was time to build a rack of sorts to better organize everything.  I could buy a rack, but I decided to build one out of some scrap wood I had lying around instead.  This way I can size it for exactly what I need, plus customizing it just involves screwing stuff into the wood wherever I want.

Wood isn’t perfectly straight, so you need to leave some room for warping.  I used cheap-grade 2×4’s for the rear posts and higher-grade 2×4’s for the front (I only had 2 lying around).  For the sides and top/bottom I used some 1×6.  The frame opening is 17.5″ wide, and I decided a 20″ depth worked well for the equipment and space I had available.  For putting it all together I used 2″ drywall screws.

My main piece of equipment is a Cisco Catalyst 4503-E chassis with a GBIC fiber card and POE gigabit card. Air flows right to left through this chassis, so having the sides open is nice.  I plan on adding more venting to this as I have time.  Also the fans are pretty loud, so adding vents should help muffle the noise too.

Rackmount ears would probably hold in wood, but I decided to build a makeshift shelf by adding two more 2×4’s along the sides where I wanted to mount the chassis.  This way the chassis just sits on these 2×4’s at each end.

Wire routing involved using some open frame panels with keystone jacks, plus some wire conduits of various types I found recycled at work.  The photo shows what I currently have complete.

Wire routing in open wooden frame rack.

There is a vent loosely attached to the intake side that runs to the ground. This helps somewhat with circulating air from bottom to top of the closet.  Eventually I’m going to add more permanent vents.

There are also a few other pieces of equipment that will be added once ventilation is improved.