AF:
NF:0
PS:10
SRH:1
SFN:
DSR:
MID:
CFG:
PT:0
S:andy.sharp@lsi.com
RQ:
SSV:mhbs.lsil.com
NSV:
SSH:
R:<Chuck.Nichols@lsi.com>
MAID:2
X-Sylpheed-Privacy-System:
X-Sylpheed-Sign:0
SCF:#mh/Mailbox/sent
X-Sylpheed-End-Special-Headers: 1
Date: Mon, 25 Jan 2010 11:42:05 -0800
From: Andrew Sharp <andy.sharp@lsi.com>
To: Chuck Nichols <Chuck.Nichols@lsi.com>
Subject: Some notes from thursday
Message-ID: <20100125114205.5792dbd9@ripper.onstor.net>
Organization: LSI
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary=MP_SE+z_QWeAuw3U8b2OFkoxXz

--MP_SE+z_QWeAuw3U8b2OFkoxXz
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Hi Chuck,

I'm sure you're quite busy with all the action items you have to mete
out, but I wrote up some design notes after that meeting while sitting
on the plane, and I thought I would send them to you.

The upgrade design is way off the mark, and has to be re-done.  I don't
know how to express it any other way.  If Aaron is working on that, I
can mentor him as he often works in Campbell.  Some of the stuff he
presented about serviceability was a bit off -- I'll be gently bringing
him along on the things I noted down.  But the upgrade is not going to
work as it is partially designed.  We should talk more later this week,
as I have some immediate deliverables at the moment which I need to
knock out.

Cheers,

a

--MP_SE+z_QWeAuw3U8b2OFkoxXz
Content-Type: text/plain; name=orion-design-notes.txt
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename=orion-design-notes.txt

flash partitioning:

	forget all this nonsense of a separate partition for each
	domU, that isn't gonna cut it.  each domU's file system
	is nothing but a file in the dom0.  This maximizes free
	flash space and allows the design and implementation
	flexibility that will absolutely be required for this
	project to have any hope of working.

	there are two flash partitions, one active and one standby.
	those labels/terms are LOGICAL, not physical.  the active
	is merely the current live partition.
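
	a minimal sketch of what LOGICAL means here: the role is
	just a label pointing at one of two physical partitions, and
	switching roles moves the label, never the data.  the
	partition names and the BootSelector class are made up for
	illustration, not part of any existing code.

```python
from dataclasses import dataclass

# Hypothetical names for the two physical flash partitions.
PARTITIONS = ("flash_a", "flash_b")


@dataclass
class BootSelector:
    """Tracks which physical partition currently holds the 'active' role."""

    active: str = PARTITIONS[0]

    @property
    def standby(self) -> str:
        # The standby is simply whichever partition is not active.
        return PARTITIONS[1] if self.active == PARTITIONS[0] else PARTITIONS[0]

    def swap(self) -> None:
        # Switching roles changes only the label, not the partition contents.
        self.active = self.standby
```

	after swap(), what used to be the standby is the live
	partition and the old live partition is the fallback.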

upgrade:

	the upgrade needs to be done from dom0, and the domUs are
	shut down.  the best way to do it is to have the 'blob' be
	a tarball of compressed tarballs: one for each domU,
	identified by file name, and one for dom0 of course.

	the tarball for each dom contains files for that dom's
	filesystem.  dom0 upgrade code simply untars each one in its
	appropriate place:
		+ (simplified): untar each tarball in the appropriate place
		+ predefined file in each 'place' (domX filesystem)
		  is executed by that environment, so..., in dom0,
		  that file is executed to complete its upgrade,
		  and then when each domU is instantiated, it boots
		  up and runs that same file in its environment,
		  completing the upgrade for that domU.
		+ the standby partition is the one that's upgraded.
		+ when upgrade has completed w/o any detected errors,
		  then that node/controller is rebooted, switching
		  the status of the active and standby partitions.
		+ if the new upgraded software isn't working out,
		  the customer can reboot to the standby partition
		  on all controllers/nodes, thereby rolling back to
		  the previously installed, and working, version.
		  Quickly, and w/o fuss.
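
	a rough sketch of the untar step, under some assumptions
	that are mine and not settled design: inner tarballs are
	named '<dom>.tar.gz', each dom's 'place' is modelled as a
	plain directory under the standby partition (for a domU the
	real place would be its filesystem file), and the predefined
	file is called 'post-upgrade.sh'.  the sketch only locates
	that file; running it is left to each environment.

```python
import io
import tarfile
from pathlib import Path

# Hypothetical name of the predefined per-dom file that completes the upgrade.
POST_UPGRADE = "post-upgrade.sh"


def unpack_blob(blob_path: str, standby_root: str) -> dict:
    """Untar each inner tarball of the blob under the standby partition.

    Assumes inner members are named '<dom>.tar.gz' (e.g. 'dom0.tar.gz').
    Returns {dom name: path of that dom's post-upgrade file, or None}.
    """
    scripts = {}
    with tarfile.open(blob_path, "r") as blob:
        for member in blob.getmembers():
            if not member.isfile() or not member.name.endswith(".tar.gz"):
                continue
            dom = Path(member.name).name[: -len(".tar.gz")]
            dest = Path(standby_root) / dom
            dest.mkdir(parents=True, exist_ok=True)
            # Read the compressed inner tarball and unpack it in place.
            inner_bytes = blob.extractfile(member).read()
            with tarfile.open(fileobj=io.BytesIO(inner_bytes), mode="r:gz") as inner:
                inner.extractall(dest)  # assumes a trusted, vendor-built archive
            script = dest / POST_UPGRADE
            scripts[dom] = script if script.exists() else None
    return scripts
```

	the caller would point standby_root at the standby partition
	only, so a failed upgrade never touches the live one.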

bios upgrade:

	BIOS upgrade: a *completely* separate operation, TBD.
	remember, the BIOS is just the boot PROM, and should rarely,
	if ever, change.  The BIOS can be upgraded on either controller,
	regardless of which is active or standby, and is done by a
	process on dom0.  A simple /sys Linux driver is all that's
	needed to accomplish the BIOS upgrade.

PXE boot/ACS:

	a special kernel with integral initrd filesystem will be
	used.  this will be in a file residing on each dom0, and
	served by the tftp server to the PXE boot client.  the code
	on the initrd filesystem will perform the ACS.

	the ACS will be accomplished in multiple steps:
		+ first, it will format the flash
		+ second, it will copy the code from the active
		  controller to the installing controller via tarnet.
		  i just made that word up.  essentially tar'ing the
		  files from the active controller's active partition
		  to a socket, which is read by tar on the other
		  controller and written to the flash.
		+ then it will boot the installing controller into
		  first-install-mode after setting a flag that will
		  tell the first install code to copy the config info
		  from the other controller (oversimplified, but you
		  get the basic idea)
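
	the 'tarnet' step can be sketched roughly like this: one
	side tars a directory tree straight into a socket, the other
	side untars from the socket onto its own storage.  function
	names are hypothetical, and connection setup, authentication
	and error handling are all left out.

```python
import os
import socket
import tarfile


def send_tree(sock: socket.socket, src_dir: str) -> None:
    """Tar the contents of src_dir into the socket (active controller side)."""
    with sock.makefile("wb") as stream:
        # "w|" writes an uncompressed tar as a forward-only stream.
        with tarfile.open(fileobj=stream, mode="w|") as tar:
            for name in sorted(os.listdir(src_dir)):
                tar.add(os.path.join(src_dir, name), arcname=name)
    sock.shutdown(socket.SHUT_WR)  # tell the reader the stream is finished


def recv_tree(sock: socket.socket, dest_dir: str) -> None:
    """Untar from the socket into dest_dir (installing controller side)."""
    with sock.makefile("rb") as stream:
        # "r|" reads the tar as a forward-only stream, no seeking needed.
        with tarfile.open(fileobj=stream, mode="r|") as tar:
            tar.extractall(dest_dir)
```

	streaming mode means nothing has to be staged as an
	intermediate file on either controller.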

BIOS/PXE/USB/FLASH boot image magic:

	a 'magic' number can be used to prevent arbitrary software from
	being PXE or USB booted.  our BIOS vendor can be instructed
	to incorporate this into their code: such a mod will be
	trivial.  in fact, it's a certainty that they have done it
	for other customers before.  not very hard to hack, but
	better than nothing.
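
	to illustrate the idea only (the real check would live in
	the vendor's BIOS code, not in Python): the image carries a
	fixed marker at the front, and anything without it is
	refused.  the marker value and function names here are
	invented.  it is a plain marker, not a signature, which is
	why it is easy to hack.

```python
# Hypothetical magic value; the real one would be agreed with the BIOS vendor.
BOOT_MAGIC = b"\x7fMAGIC01"


def stamp_image(image: bytes) -> bytes:
    """Prefix a boot image with the magic marker (done at build time)."""
    return BOOT_MAGIC + image


def is_bootable(image: bytes) -> bool:
    """Accept only images that start with the magic marker."""
    return image.startswith(BOOT_MAGIC)
```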


--MP_SE+z_QWeAuw3U8b2OFkoxXz--
