X-MimeOLE: Produced By Microsoft Exchange V6.5
Received: by onstor-exch02.onstor.net 
	id <01C752E1.3AE770CC@onstor-exch02.onstor.net>; Sat, 17 Feb 2007 15:16:11 -0700
MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----_=_NextPart_001_01C752E1.3AE770CC"
References: <20070216142730.10a7902b@ripper.onstor.net><BB375AF679D4A34E9CA8DFA650E2B04E0138C465@onstor-exch02.onstor.net> <20070216181346.14f6cbe5@ripper.onstor.net> <BB375AF679D4A34E9CA8DFA650E2B04E023B314D@onstor-exch02.onstor.net>
Content-class: urn:content-classes:message
Subject: RE: corruption and upgrade workflow for Lambo [and 1.3.3.?]
Date: Sat, 17 Feb 2007 15:14:13 -0700
Message-ID: <BB375AF679D4A34E9CA8DFA650E2B04E02176022@onstor-exch02.onstor.net>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: corruption and upgrade workflow for Lambo [and 1.3.3.?]
Thread-Index: AcdSOUGBTFM/IMgPRAWU1tGYFef91wAiyNLDAAcj9qA=
From: "Sandrine Boulanger" <sandrine.boulanger@onstor.com>
To: "Paul Hammer" <paul.hammer@onstor.com>,
	"Andy Sharp" <andy.sharp@onstor.com>,
	"Chris Vandever" <chris.vandever@onstor.com>
Cc: "Tim Gardner" <tim.gardner@onstor.com>,
	"Caeli Collins" <caeli.collins@onstor.com>,
	"Eric Barrett" <eric.barrett@onstor.com>,
	"Ed Kwan" <ed.kwan@onstor.com>,
	"Jay Michlin" <jay.michlin@onstor.com>,
	"Larry Scheer" <larry.scheer@onstor.com>,
	"dl-Software" <dl-software@onstor.com>,
	"Raj Kumar" <raj.kumar@onstor.com>

This is a multi-part message in MIME format.

------_=_NextPart_001_01C752E1.3AE770CC
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

I think the current upgrade procedure is fine. If support finds that =
customers encounter the discrepancy too often, then we can consider a =
double upgrade. But then they would completely lose the ability to go =
back, since we would overwrite the original flashes.


-----Original Message-----
From: Paul Hammer
Sent: Sat 2/17/2007 10:49 AM
To: Andy Sharp; Chris Vandever
Cc: Tim Gardner; Caeli Collins; Eric Barrett; Ed Kwan; Jay Michlin; =
Larry Scheer; dl-Software; Raj Kumar; Sandrine Boulanger
Subject: RE: corruption and upgrade workflow for Lambo [and 1.3.3.?]
=20
Adding Raj and Sandrine to the thread in case we want to consider this =
in Delorean.

________________________________

From: Andy Sharp
Sent: Fri 2/16/2007 6:13 PM
To: Chris Vandever
Cc: Tim Gardner; Caeli Collins; Eric Barrett; Ed Kwan; Jay Michlin; =
Larry Scheer; Paul Hammer; dl-Software
Subject: Re: corruption and upgrade workflow for Lambo [and 1.3.3.?]




On Fri, 16 Feb 2007 18:01:35 -0800 "Chris Vandever"
<chris.vandever@onstor.com> wrote:

> My understanding was that the number of files that fail the compare is
> small in comparison with the total number of files that need to be
> upgraded, thus the second upgrade should get everything remaining
> without any problem.

Based on what I've been seeing, I would characterize it as "the number
of files corrupted is small" but doesn't have any relation to the
number being upgraded.  The max number of upgrade iterations using the
method I describe below is 2.  The max number using the corruption
prone method is ... ?

I'm just talkin' 'bout what I bin seen.

> ChrisV
>
> -----Original Message-----
> From: Andy Sharp
> Sent: Friday, February 16, 2007 2:28 PM
> To: Tim Gardner
> Cc: Caeli Collins; Eric Barrett; Ed Kwan; Jay Michlin; Larry Scheer;
> Paul Hammer; dl-Software
> Subject: Re: corruption and upgrade workflow for Lambo [and 1.3.3.?]
>
>
> On Fri, 16 Feb 2007 14:19:32 -0800 "Tim Gardner"
> <tim.gardner@onstor.com> wrote:
>
> > The documented procedure is to upgrade the secondary flash, run a
> > system compare, and if corrupted files are found, upgrade again.
> > Once you have a successful compare, reboot from the secondary flash.
>
> What I'm concerned about is that the 'upgrade again' is still the
> corruption prone upgrade process.  It is quite possible, I might even
> hazard a 'likely', that a user will have to execute that loop many
> times before chancing on a lucky upgrade that doesn't corrupt
> anything.
>
> > -----Original Message-----
> > From: Andy Sharp
> > Sent: Friday, February 16, 2007 1:19 PM
> > To: Caeli Collins; Eric Barrett; Ed Kwan; Jay Michlin; Tim Gardner;
> > Larry Scheer; Paul Hammer; dl-Software
> > Subject: corruption and upgrade workflow for Lambo [and 1.3.3.?]
> >
> > Howdy,
> >
> > Since I've been messing about with the upgrade code a bunch for
> > Delorean, I've been doing a lot of upgrades in the past several days
> > in the process of doing unit testing, and one thing I've noticed is
> > that upgrades from 1.3.3 to 2.2 or later always find several files
> > that are corrupted after the upgrade.
> >
> > This is because the upgrade process has a corruption problem, as we
> > all know, which was fixed in 2.2 (and possibly some version of
> > 1.3.3?). However, when you upgrade to 2.2 you use the old,
> > corruption prone, upgrade process.
> >
> > Therefore, I believe the workflow for upgrading from a
> > non-upgrade-fixed release to a fixed release requires that you
> > actually upgrade twice.  You must be running the new version when
> > you upgrade the second time.  So, for the sake of brevity, I will
> > just mention 1.3.3 -> 2.2+ in the following:
> >
> > 1.  Upgrade from 1.3.3 or 2.1 to 2.2
> > 2.  Boot 2.2
> >     Note: you may have problems at this point, since any file
> >     could conceivably be corrupted, including one of the .bin
> >     boot images for the TXRX or FP processors.  If necessary,
> >     log in quickly after rebooting and kill pm in order to keep
> >     the system from rebooting itself before you can execute the
> >     next step.
> > 3.  Upgrade to 2.2 again.  You may use the same tar ball you did in
> >     step 1.
> >
> > Please set aside a decent amount of time for this: upgrades in 2.2
> > are not fast.  It downloads the tarball twice and verifies the
> > entire system twice for each upgrade.  I am fixing these issues in
> > Delorean so we won't have to live with this for too terribly long.
> >
> > Cheers,
> >
> > a
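
A rough sketch of the iteration-count argument in the thread, not anything
from the product code. It assumes each pass of the old upgrade process
corrupts at least one file independently with some probability p_corrupt;
the thread gives no such figure, so p_corrupt and every name below are
hypothetical. It also assumes, as the thread does, that the fixed process
does not corrupt files.

```python
import random

# Proposed workflow: one pass with the old process, boot the new
# release, one pass with the fixed process. Assumed to be exactly 2.
DOUBLE_UPGRADE_PASSES = 2


def loop_until_clean(p_corrupt, rng, max_tries=1000):
    """Documented workflow: upgrade, compare, repeat until the compare is clean.

    Returns the number of upgrade passes used, or None if no clean
    compare was seen within max_tries passes.
    """
    for attempt in range(1, max_tries + 1):
        # A pass is clean when the (hypothetical) corruption event does not fire.
        if rng.random() >= p_corrupt:
            return attempt
    return None


def expected_tries(p_corrupt):
    """Mean of a geometric distribution with success probability 1 - p_corrupt."""
    return 1.0 / (1.0 - p_corrupt)


if __name__ == "__main__":
    rng = random.Random(2007)
    for p in (0.5, 0.9, 0.99):
        print(p, expected_tries(p), loop_until_clean(p, rng))
```

Under these assumptions the documented loop needs 1 / (1 - p_corrupt) passes
on average, which grows without bound as p_corrupt approaches 1, while the
double upgrade stays at 2 passes.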




------_=_NextPart_001_01C752E1.3AE770CC--
