Trip Report for Engineering Installation Ridealong
Cingular, Seattle, WA -- Thu Jun 7 2007

SE - Eric Crutchlow
DEV - Andy Sharp

The Story

Thursday morning at 8:30, we (Eric and I) arrived at Cingular -- née AT&T -- and met our main customer contact, Patrick. After a couple of minutes of chatting and introductions, we headed for the machine room, which is actually a separate building: the machine room occupies the entire thing. We were there to install a two-node Bobcat cluster, with the storage purchased separately.

We found ourselves in a building roughly the size of ours but turned over entirely to being a "machine room." It was filled with the usual stuff, except, interestingly, none of the Intel-ish servers from Dell, IBM, and the like that I would have expected to see there. It was all Sun equipment, and most of the newer servers were Sun's AMD Opteron-based systems.

The customer told us later that the reason ONStor piqued their interest was that they already had the storage; they just wanted to ditch the obsolete and featureless Novell NFS servers that fronted it. We found the two 2260 Bobcats mounted together at the bottom of a rack, completely wired up and powered on. Apparently they had been that way, otherwise untouched, since they were bought, possibly as far back as September 2006. Yeesh.

Eric quickly whipped out his laptop and connected via serial cable to the front console port, all the while keeping up an easy banter with Patrick. Privately, he expressed concern to me about the machines having sat there unconfigured for so long -- something about how things crash when that is allowed to happen. This is exactly the kind of lore the SEs carry around to get their job done as thoroughly as possible, and I find it very annoying that such lore is necessary. We [Dev] need to make the product far less fragile so that this kind of thing can be relegated to the trash heap of distant memories.

Despite Eric's concern, and despite the fact that the filers were running version 1.3.1.1, he went through the initial config menu on the first filer -- leaning heavily on the customer's pre-filled configuration worksheet -- and replaced the secondary flash with a pristine 2.2.2.7 flash he had made himself. He then copied the newly created config information to the secondary flash with the system copy config command, followed by a system reboot -s, and repeated the process on the second filer.*  Both rebooted without problems. I was impressed with the processes CS has built and runs around the customer configuration worksheet and related materials. Very smooth.

During this time Eric and I asked Patrick a few questions here and there -- we had at least an hour to kill while Eric went through the initial config menus on both nodes and rebooted them. Especially so since each filer's boot stalled for several minutes while some DNS queries associated with EMRS timed out. Since DNS had not been configured, there was no way those queries could ever be resolved, and the boot process shouldn't be made to wait for them anyway.  *Bug 19557 filed.
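
For reference, the fix here is just a bounded lookup. Here is a rough sketch in Python of the shape I have in mind -- it is not the filer's actual EMRS code path, and the hostname and timeout are made up -- showing a lookup that gives up quickly and lets the boot continue:

    # Sketch only: bound a boot-time hostname lookup instead of letting it stall the boot.
    import socket
    from concurrent.futures import ThreadPoolExecutor, TimeoutError as LookupTimeout

    def resolve_or_skip(hostname, timeout=3.0):
        """Return an address if DNS answers quickly, otherwise None so the boot can go on."""
        pool = ThreadPoolExecutor(max_workers=1)
        try:
            return pool.submit(socket.gethostbyname, hostname).result(timeout=timeout)
        except (LookupTimeout, socket.gaierror):
            return None                   # no DNS yet: defer the EMRS work, don't block the boot
        finally:
            pool.shutdown(wait=False)     # leave any stuck lookup behind

    # addr = resolve_or_skip("emrs.onstor.example")    # hypothetical hostname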

There are two things not handled in the initial config menu that make life unpleasant for the installer: DNS and NTP should both absolutely be handled there. Because they aren't, the installer has to bring the system up and then configure them, and there are multiple infelicities associated with this. One problem is that NTP needs to be started correctly so that all the filers in a cluster have synchronised time; otherwise clusterdb and other cluster-related operations will screw up because of a heartbeat/time mismatch. Configuring NTP through the CLI does not do this -- the correct NTP startup sequence is only performed at boot time -- so currently the installer either sits and waits for NTP to slowly sync itself, or reboots (again) so that the correct NTP startup sequence runs. Beyond that, any service that has a hostname as part of its configuration is trouble until DNS is configured; that likely includes NIS, email (sendmail), automount, NTP, and so on. The problem with EMRS has already been mentioned. Bug 19558 filed against NTP.
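
On the NTP half, what the installer really needs is a quick way to confirm that the clock has converged before doing anything cluster-related. A rough sketch of that check, assuming the ntplib Python package and a reachable NTP server (neither of which is part of the filer's actual startup path; the thresholds and server name are made up too):

    # Sketch only: poll an NTP server until the local clock offset is small enough
    # for cluster heartbeats.
    import time
    import ntplib

    def wait_for_ntp_sync(server, max_offset=0.5, poll=10, attempts=30):
        client = ntplib.NTPClient()
        for _ in range(attempts):
            try:
                resp = client.request(server, version=3)
                if abs(resp.offset) <= max_offset:
                    return True            # clocks close enough; safe to proceed
                print("clock offset %.3fs, still waiting" % resp.offset)
            except (ntplib.NTPException, OSError):
                print("NTP server not answering yet")
            time.sleep(poll)
        return False

    # wait_for_ntp_sync("ntp.example.com")    # hypothetical server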

After the initial config phase was over, we returned to the original building and went to the desk of the sys admin who would drive the WebUI for the rest of the configuration: vservers, volumes, shares, oh my. Thierry -- who, judging by the eight-foot-high Zidane poster on the wall behind him, is a grumpy expatriate Frenchman -- was nevertheless a very competent and knowledgeable sys admin. After some initial confusion involving vservers, he ripped through the config and had NFS and CIFS shares up in relatively short order. The phrase "oh -- sweet" was heard to pass his lips more than once during the process.

Patrick mentioned that he wasn't sure all the network routes and such had been set up correctly, but that he guessed we would find out soon enough. Apparently there are multiple IT-related groups that sometimes have trouble talking to each other -- storage, networking, the lab manager(s), and then of course the end user. This often results in the "ya can't get thar from hyar" syndrome: the average user or filer administrator can't actually reach the management interface of the filer directly from his or her desktop. Our product design and our processes need to take this kind of thing into consideration wherever possible, because I strongly suspect it is quite common, especially at very large companies. For example, Thierry could not reach the WebUI directly from his desktop -- the two firewalls in between are run by two different groups, and coordinating the routes and pinholes would be a multi-day task at the very least. He worked around it by logging into a machine that was halfway there and using a remote desktop tool. That had a very definite effect on how the WebUI behaved: it looked more sluggish than it really is, and certain components, like the busy-mouse icon meant to indicate "processing...", didn't render properly -- it looked like a large fat hook that followed the mouse around. The ONStor employees knew what it was, of course, but the customer had no idea at first.

Stuff that could use some fixin'....

Several things have already been mentioned in nauseating detail. Here is a laundry list of other things:
sssccc daemon very fragile: we need to move this daemon behind inetd (we can run a separate inetd just for sssccc if need be). Currently, a sys admin merely poking at that port out of curiosity sets off a storm of fork/exec thrashing by the sssccc daemon, which consumes most of the resources of the system. Using inetd as a frontman would smooth this out somewhat: incomplete TCP connections and the like would not cause a fork or be passed on to child processes (see the first sketch after this list).
Who's your daddy? Cluster auto-config at initial install time. Ninety percent of the information entered into the initial config menu, one filer at a time, is identical, and the rest can easily be generated with simple iteration -- sequential IP addresses and so forth. We could use Ian's bonjovi protocol idea to auto-detect the other filers connected on the second management interface, connect to them, and lay down the initial configuration for every filer in a cluster in one step (the second sketch after this list shows the general shape). This small thing alone would be a real boon to the installers, who are almost always SEs. More complicated and intricate cluster config and upgrade schemes can come later. One step at a time.
NFS initial default configuration should be no_root_squash. When a volume is newly created and exported over NFS, the default should be no_root_squash, because the filesystem is empty and some root user has to get into it and create the initial directories and whatnot. If it isn't empty, this obviously does not apply.
Vserver wizard "atime updates" should be Access Time Updates.
Box Management IP, Vserver Management IP, FP IP, oh my!  At the point where vservers are to be created, the customer gets confused. What are all these vservers I have to create? Core? Management? Perhaps the wizard covers this. I don't think the customer used it the first time through.
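
To make the sssccc item above concrete, here is roughly the shape of inetd-style fronting, sketched in Python with a made-up port and command line (the real sssccc values would differ). Nothing is spawned until a TCP connection has fully completed, so port scans and half-open connections cost nothing:

    # Sketch only: inetd-style front for a fragile service. Port and command
    # line are hypothetical placeholders, not the real sssccc values.
    import socket
    import subprocess

    SSSCCC_PORT = 9123                               # placeholder
    SSSCCC_CMD = ["/usr/sbin/sssccc", "-inetd"]      # placeholder invocation

    def serve():
        lsock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        lsock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        lsock.bind(("0.0.0.0", SSSCCC_PORT))
        lsock.listen(5)
        while True:
            conn, _peer = lsock.accept()             # only completed connections get this far
            subprocess.Popen(SSSCCC_CMD,
                             stdin=conn.fileno(),    # hand the socket to the child,
                             stdout=conn.fileno())   # much as inetd would
            conn.close()                             # parent keeps nothing open

    if __name__ == "__main__":
        serve()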
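
And for the cluster auto-config item, the discovery half could be as simple as a broadcast beacon on the second management interface. The sketch below only shows the general shape; the port, payload, and protocol are invented for illustration and are not Ian's actual proposal:

    # Sketch only: find other unconfigured filers on the management network via a
    # UDP broadcast. Port and payload are invented for illustration.
    import json
    import socket

    DISCOVERY_PORT = 9124                            # placeholder
    BEACON = json.dumps({"role": "unconfigured-filer"}).encode()

    def find_peers(timeout=5.0):
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.settimeout(timeout)
        s.sendto(BEACON, ("255.255.255.255", DISCOVERY_PORT))
        peers = []
        try:
            while True:
                data, addr = s.recvfrom(1024)        # each responding node identifies itself
                peers.append((addr[0], json.loads(data)))
        except socket.timeout:
            return peers

    # for ip, info in find_peers(): push the generated per-node config to ip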

Conclusion

I found this trip very helpful and enlightening. First, I got insight into how the product's strengths and weaknesses affect the daily job of your friendly neighborhood SEs. Second, I gained a lot of valuable perspective on how this product currently fits into the fabric of a customer's needs and expectations. Despite a pretty decent understanding of what the product is about, it was an eye-opening experience to see something I participated in creating in the hands of people who have real-world expectations and uses for it.

One of the things I realized while noticing the three NetApps in the surrounding racks that seemed unused or rarely used (thanks, NetApp, for putting an ops/sec meter on the front bezel!) is that the Bobcat is a very decent product whose full potential, in terms of ease of use and reliability, we have not yet realized. That sounds like a back-handed compliment, but the good news in it is that if we can execute on incrementally improving ease of use and reliability over time, without backsliding, this product -- and future products based on the same software -- should be extremely tough competitors in the NAS market. That's an exciting feeling indeed.