AF:
NF:0
PS:10
SRH:1
SFN:
DSR:
MID:
CFG:
PT:0
S:andy.sharp@lsi.com
RQ:
SSV:mhbs.lsil.com
NSV:
SSH:
R:<Brian.Stark@lsi.com>
MAID:2
X-Sylpheed-Privacy-System:
X-Sylpheed-Sign:0
SCF:#mh/Mailbox/sent
RMID:#imap/LSI/INBOX	0	E1EC65251D4B3D46BBC0AAA3C0629222B2392E58@cosmail02.lsi.com
X-Sylpheed-End-Special-Headers: 1
Date: Wed, 20 Jan 2010 10:06:24 -0800
From: Andrew Sharp <andy.sharp@lsi.com>
To: "Stark, Brian" <Brian.Stark@lsi.com>
Subject: Re: CF Cards from Migration
Message-ID: <20100120100624.1ac3d8a3@ripper.onstor.net>
In-Reply-To: <E1EC65251D4B3D46BBC0AAA3C0629222B2392E58@cosmail02.lsi.com>
References: <B50ED5C0A7967343B07C1764EE1D7BD1F4A0962D@cosmail01.lsi.com>
	<20100114185615.25e76b9a@ripper.onstor.net>
	<E1EC65251D4B3D46BBC0AAA3C0629222B2392CF9@cosmail02.lsi.com>
	<20100120095342.6b080ed1@ripper.onstor.net>
	<E1EC65251D4B3D46BBC0AAA3C0629222B2392E58@cosmail02.lsi.com>
Organization: LSI
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

On Wed, 20 Jan 2010 10:54:55 -0700 "Stark, Brian" <Brian.Stark@lsi.com>
wrote:

> OK, I'll light a fire.
> 
> Where are you?  In Denver?

Houston.

> 
> -----Original Message-----
> From: Andrew Sharp [mailto:andy.sharp@lsi.com] 
> Sent: Wednesday, January 20, 2010 9:54 AM
> To: Stark, Brian
> Subject: Re: CF Cards from Migration
> 
> On Tue, 19 Jan 2010 22:07:17 -0700 "Stark, Brian"
> <Brian.Stark@lsi.com> wrote:
> 
> > Have you received any feedback on this?  I think you hit the nail on
> > the head...
> 
> 
> You're the first.
> 
> > 
> > -----Original Message-----
> > From: Andrew Sharp [mailto:andy.sharp@lsi.com] 
> > Sent: Thursday, January 14, 2010 6:56 PM
> > To: Duffy, Bill
> > Cc: Kwan, Ed; Jin, Danqing; Limato, Dave; Thiessen, Joachim;
> > Collins, Caeli; Keiffer, John; Scheer, Larry; Boulanger, Sandrine;
> > Currin, Shawn; Kumar, Raj; Stark, Brian; Vandever, Chris
> > Subject: Re: CF Cards from Migration
> > 
> > Keep in mind that we're all LSI.  4.0.2.10 is our official release
> > right now, no?  We don't have the resources to support a half-dozen
> > different releases; it's already hurting us with just the small
> > number of different releases involved in this schemozzle.
> > 
> > Given that, I would say that, being sensitive to [internal] customer
> > fears, we need to make it clear that there isn't necessarily a
> > "choice" for them to make, beyond the choice to migrate or not.
> > Just as in LSI storage arrays, the NAS gateway isn't 100% bug free,
> > and we have to be able to advance the releases without hindrance
> > from the [internal] customer.  Put another way, it's really our
> > decision; we are the experts.  If problems crop up, we support,
> > fix, and work around them as usual.  If we the experts believe
> > 4.0.2.10 is the correct thing to do, then that's it.
> > 
> > Thoughts?
> > 
> > 
> > On Thu, 14 Jan 2010 18:10:12 -0700 "Duffy, Bill"
> > <Bill.Duffy@lsi.com> wrote:
> > 
> > > LSI said no way.
> > > 
> > > ________________________________
> > > From: Kwan, Ed
> > > To: Duffy, Bill; Jin, Danqing; Limato, Dave; Thiessen, Joachim;
> > > Collins, Caeli; Keiffer, John; Scheer, Larry; Sharp, Andy;
> > > Boulanger, Sandrine
> > > Cc: Currin, Shawn; Kumar, Raj; Stark, Brian; Vandever, Chris
> > > Sent: Thu Jan 14 18:09:43 2010
> > > Subject: RE: CF Cards from Migration
> > > 
> > > Why wasn't 4.0.2.10 installed in the first place?
> > > 
> > > From: Duffy, Bill
> > > Sent: Thursday, January 14, 2010 5:07 PM
> > > To: Jin, Danqing; Limato, Dave; Thiessen, Joachim; Kwan, Ed;
> > > Collins, Caeli; Keiffer, John; Scheer, Larry; Sharp, Andy;
> > > Boulanger, Sandrine
> > > Cc: Currin, Shawn; Kumar, Raj; Stark, Brian; Vandever, Chris
> > > Subject: Re: CF Cards from Migration
> > > 
> > > 
> > > So 4.0.2.10 logged in as root should be the only release used for
> > > transition going forward. Correct?
> > > 
> > > ________________________________
> > > From: Jin, Danqing
> > > To: Limato, Dave; Thiessen, Joachim; Kwan, Ed; Collins, Caeli;
> > > Keiffer, John; Scheer, Larry; Sharp, Andy; Boulanger, Sandrine
> > > Cc: Currin, Shawn; Kumar, Raj; Stark, Brian; Vandever, Chris;
> > > Duffy, Bill
> > > Sent: Thu Jan 14 17:58:30 2010
> > > Subject: RE: CF Cards from Migration
> > > 
> > > Yes, "system version -s" and "system copy all" both failed on
> > > minonstor1.
> > > 
> > > From: Limato, Dave
> > > Sent: Thursday, January 14, 2010 4:58 PM
> > > To: Thiessen, Joachim; Kwan, Ed; Collins, Caeli; Jin, Danqing;
> > > Keiffer, John; Scheer, Larry; Sharp, Andy; Boulanger, Sandrine
> > > Cc: Currin, Shawn; Kumar, Raj; Stark, Brian; Vandever, Chris;
> > > Duffy, Bill
> > > Subject: RE: CF Cards from Migration
> > > 
> > > It looks like "system version -s" and "system copy all -I" failed
> > > around 14:23, which I think indicates trouble mounting the other
> > > flash.
> > > 
> > > From: Thiessen, Joachim
> > > Sent: Thursday, January 14, 2010 4:56 PM
> > > To: Kwan, Ed; Collins, Caeli; Jin, Danqing; Keiffer, John; Limato,
> > > Dave; Scheer, Larry; Sharp, Andy; Boulanger, Sandrine
> > > Cc: Currin, Shawn; Kumar, Raj; Stark, Brian; Vandever, Chris;
> > > Duffy, Bill
> > > Subject: RE: CF Cards from Migration
> > > 
> > > Would a "system copy all -i"  or "system config copy" expose the
> > > problem that we booted from the wrong CF card?
> > > 
> > > From: Kwan, Ed
> > > Sent: Thursday, January 14, 2010 4:39 PM
> > > To: Collins, Caeli; Jin, Danqing; Keiffer, John; Limato, Dave;
> > > Scheer, Larry; Sharp, Andy; Boulanger, Sandrine
> > > Cc: Currin, Shawn; Kumar, Raj; Stark, Brian; Vandever, Chris;
> > > Duffy, Bill; Thiessen, Joachim
> > > Subject: RE: CF Cards from Migration
> > > 
> > > 4.0.2.8 and 4.0.2.4
> > > 
> > > From: Collins, Caeli
> > > Sent: Thursday, January 14, 2010 4:37 PM
> > > To: Kwan, Ed; Jin, Danqing; Keiffer, John; Limato, Dave; Scheer,
> > > Larry; Sharp, Andy; Boulanger, Sandrine
> > > Cc: Currin, Shawn; Kumar, Raj; Stark, Brian; Vandever, Chris;
> > > Duffy, Bill; Thiessen, Joachim
> > > Subject: RE: CF Cards from Migration
> > > 
> > > What release were they running?
> > > 
> > > Caeli
> > > 
> > > From: Kwan, Ed
> > > Sent: Thursday, January 14, 2010 4:27 PM
> > > To: Jin, Danqing; Keiffer, John; Limato, Dave; Scheer, Larry;
> > > Sharp, Andy; Boulanger, Sandrine
> > > Cc: Currin, Shawn; Kumar, Raj; Stark, Brian; Vandever, Chris;
> > > Collins, Caeli
> > > Subject: RE: CF Cards from Migration
> > > 
> > > CC'ing Caeli.
> > > 
> > > From: Jin, Danqing
> > > Sent: Thursday, January 14, 2010 4:21 PM
> > > To: Keiffer, John; Limato, Dave; Scheer, Larry; Sharp, Andy;
> > > Boulanger, Sandrine
> > > Cc: Currin, Shawn; Kumar, Raj; Stark, Brian; Vandever, Chris;
> > > Kwan, Ed
> > > Subject: RE: CF Cards from Migration
> > > 
> > > John pointed out that the filer booted off the wrong flash card
> > > and logged the following messages:
> > > 
> > > Jan  9 13:55:22 minonstor1 kernel: prom_init: env[9] = 'bootdev=/dev/sda1'
> > > ...
> > > Jan  9 13:55:22 minonstor1 kernel: irq 56: nobody cared (try booting with the "irqpoll" option)
> > > Jan  9 13:55:22 minonstor1 kernel: Call Trace:
> > > Jan  9 13:55:22 minonstor1 kernel: [<ffffffff82007888>] dump_stack+0x8/0x38
> > > Jan  9 13:55:22 minonstor1 kernel: [<ffffffff82050f90>] __report_bad_irq+0x40/0xd8
> > > ...
> > > 
> > > So this really smells like defect 27788 which Andy already fixed
> > > (included in patch 4.0.2.10 and later)?
> > > 
> > > From: Keiffer, John
> > > Sent: Thursday, January 14, 2010 2:57 PM
> > > To: Keiffer, John; Limato, Dave; Scheer, Larry; Sharp, Andy;
> > > Boulanger, Sandrine
> > > Cc: Currin, Shawn; Kumar, Raj; Stark, Brian; Vandever, Chris;
> > > Kwan, Ed; Jin, Danqing
> > > Subject: RE: CF Cards from Migration
> > > 
> > > From minonstor1:
> > > 
> > > Jan  9 16:39:34 minonstor1 : 0:0:ncm:WARNING: ncmd : ncm_filer_rsp_complete: rpc_rsp[0x52ca48] flags[1004] sz[3176] len[3176] dest_appid[39] status[-19] failed
> > > Jan  9 16:39:35 minonstor1 : 0:0:ncm:WARNING: ncmd : ncm_filer_rsp_complete: rpc_rsp[0x569b78] flags[1004] sz[3176] len[3176] dest_appid[39] status[-19] failed
> > > Jan  9 16:39:35 minonstor1 : 0:0:ncm:WARNING: ncmd : ncm_filer_rsp_complete: rpc_rsp[0x5136c0] flags[1004] sz[3176] len[3176] dest_appid[39] status[-19] failed
> > > Jan  9 16:39:35 minonstor1 : 0:0:ncm:WARNING: ncmd : ncm_filer_rsp_complete: rpc_rsp[0x521e80] flags[1004] sz[1816] len[1816] dest_appid[39] status[-19] failed
> > > Jan  9 16:39:35 minonstor1 : 0:0:ncm:WARNING: ncmd : ncm_filer_rsp_complete: rpc_rsp[0x521fe8] flags[1004] sz[3176] len[3176] dest_appid[39] status[-19] failed
> > > Jan  9 16:39:35 minonstor1 : 0:0:ncm:WARNING: ncmd : ncm_filer_rsp_complete: rpc_rsp[0x5680b0] flags[1004] sz[960] len[960] dest_appid[39] status[-19] failed
> > > Jan  9 16:39:35 minonstor1 : 0:0:ncm:WARNING: ncmd : ncm_filer_rsp_complete: rpc_rsp[0x521b60] flags[1004] sz[8048] len[8048] dest_appid[39] status[-19] failed
> > > Jan  9 16:39:35 minonstor1 : 0:0:ncm:WARNING: ncmd : ncm_filer_rsp_complete: rpc_rsp[0x521da0] flags[1004] sz[8048] len[8048] dest_appid[39] status[-19] failed
> > > Jan  9 16:39:35 minonstor1 : 0:0:ncm:WARNING: ncmd : ncm_filer_rsp_complete: rpc_rsp[0x52c808] flags[1004] sz[3176] len[3176] dest_appid[39] status[-19] failed
> > > Jan  9 16:39:35 minonstor1 : 0:0:ncm:WARNING: ncmd : ncm_filer_rsp_complete: rpc_rsp[0x52c8c8] flags[1004] sz[3176] len[3176] dest_appid[39] status[-19] failed
> > > Jan  9 16:39:35 minonstor1 : 0:0:ncm:WARNING: ncmd : ncm_filer_rsp_complete: rpc_rsp[0x56ad60] flags[1004] sz[8048] len[8048] dest_appid[39] status[-19] failed
> > > Jan  9 16:39:35 minonstor1 : 0:0:ncm:WARNING: ncmd : ncm_filer_rsp_complete: rpc_rsp[0x567e28] flags[4] sz[8048] len[8048] dest_appid[39] status[-19] failed
> > > Jan  9 16:39:35 minonstor1 : 0:0:ncm:WARNING: ncmd : ncm_filer_rsp_complete: rpc_rsp[0x52c988] flags[4] sz[960] len[960] dest_appid[39] status[-19] failed
> > > Jan  9 16:39:36 minonstor1 : 0:0:nfxsh:NOTICE: cmd[0]: vsvr set "VS_MGMT_67686" : status[0]
> > > Jan  9 16:39:36 minonstor1 : 0:0:nfxsh:NOTICE: cmd[1]: interface show interface : status[0]
> > > Jan  9 16:39:36 minonstor1 : 0:0:snmpd:INFO: getVolumeSummary: got rsp status error (0)
> > > Jan  9 16:39:36 minonstor1 : 0:0:snmpd:INFO: read_volume_info: Can't get vol summary info (rc=-4)
> > > 
> > > From: Keiffer, John
> > > Sent: Thursday, January 14, 2010 2:55 PM
> > > To: Limato, Dave; Scheer, Larry; Sharp, Andy; Boulanger, Sandrine
> > > Cc: Currin, Shawn; Kumar, Raj; Stark, Brian; Vandever, Chris;
> > > Kwan, Ed; Jin, Danqing
> > > Subject: RE: CF Cards from Migration
> > > 
> > > This is hard to sift through.
> > > 
> > > Anybody else agree that it appears to have gone wonky around 16:39
> > > on 1/9?
> > > 
> > > At that time I believe we were on Top-CF0 and Bottom-CF1...
> > > 
> > > Seems to me like after they added the trunk1 interface things got
> > > messed up? Seems to have led to ncm warnings and ea issues etc...
> > > 
> > > Jan  9 16:38:45 minonstor2 : 0:0:nfxsh:NOTICE: cmd[0]: vsvr set "MINFSV06" : status[0]
> > > Jan  9 16:38:46 minonstor2 : 0:0:nfxsh:NOTICE: cmd[1]: vsvr stats -i 1 -c 1 : status[0]
> > > Jan  9 16:38:49 minonstor2 : 0:0:ea:INFO: nfxnis_resRcv[3050]: DNS[192.19.189.10] closed connection, VS=6. err=0
> > > Jan  9 16:39:00 minonstor2 : 0:0:nfxsh:NOTICE: cmd[6]: interface create trunk1 -l trunk1 : status[0]
> > > Jan  9 16:39:11 minonstor2 : 0:0:ncm:WARNING: ncmd : ncm_local_rpc_received: ncm_forward_to_filer failed - -9
> > > Jan  9 16:39:15 minonstor2 last message repeated 3 times
> > > Jan  9 16:39:17 minonstor2 : 0:0:nfxsh:NOTICE: cmd[7]: interface show  : status[4]
> > > Jan  9 16:39:18 minonstor2 : 0:0:ncm:WARNING: ncmd : ncm_local_rpc_received: ncm_forward_to_filer failed - -9
> > > Jan  9 16:39:21 minonstor2 last message repeated 2 times
> > > Jan  9 16:39:23 minonstor2 : 0:0:vsd:INFO: vsd_ipStack_initCtxt : There is no IP interface configured for vs 1
> > > 
> > > From: Limato, Dave
> > > Sent: Thursday, January 14, 2010 1:14 PM
> > > To: Keiffer, John; Scheer, Larry; Sharp, Andy; Boulanger, Sandrine
> > > Cc: Currin, Shawn; Kumar, Raj; Stark, Brian; Vandever, Chris;
> > > Kwan, Ed; Jin, Danqing
> > > Subject: RE: CF Cards from Migration
> > > 
> > > I heard Ed ask Danqing to construct a timeline of what happened.
> > > In the meantime, see if you can figure out which flash/node was
> > > the PCC and which was the second node of the cluster. This will
> > > help others with diagnosis.
> > > 
> > > From: Keiffer, John
> > > Sent: Thursday, January 14, 2010 1:11 PM
> > > To: Limato, Dave; Scheer, Larry; Sharp, Andy; Boulanger, Sandrine
> > > Cc: Currin, Shawn; Kumar, Raj; Stark, Brian; Vandever, Chris;
> > > Kwan, Ed; Jin, Danqing
> > > Subject: RE: CF Cards from Migration
> > > 
> > > This will take a while to sift through. Would be nice if we had a
> > > timeline for when it supposedly went bad, and on which system it
> > > first was reported against.
> > > 
> > > I see that when the top blade initially booted it booted to CF1,
> > > which it is not supposed to... but that's probably nothing at this
> > > point.
> > > 
> > > Jan  9 13:23:55 localhost kernel: prom_init: env[9] = 'bootdev=/dev/sdb1'
> > > 
> > > From: Limato, Dave
> > > Sent: Thursday, January 14, 2010 1:00 PM
> > > To: Limato, Dave; Scheer, Larry; Sharp, Andy; Keiffer, John;
> > > Boulanger, Sandrine
> > > Cc: Currin, Shawn; Kumar, Raj; Stark, Brian; Vandever, Chris;
> > > Kwan, Ed; Jin, Danqing
> > > Subject: RE: CF Cards from Migration
> > > 
> > > I have copied all the data from the flash cards to
> > > 
> > > 10.0.0.222:/nx_corevol/defect_27946
> > > 
> > > From: Limato, Dave
> > > Sent: Thursday, January 14, 2010 11:23 AM
> > > To: Scheer, Larry; Sharp, Andy; Keiffer, John; Boulanger, Sandrine
> > > Cc: Currin, Shawn; Kumar, Raj; Stark, Brian; Vandever, Chris;
> > > Kwan, Ed; Jin, Danqing
> > > Subject: CF Cards from Migration
> > > 
> > > I have the CF Cards from that migration. I am going to pull all
> > > of /var and /onstor/conf. Does anyone think we need anything else
> > > to debug this issue?  Let me know. I will also try to copy all
> > > of /, but I'm not sure how long that will take.
> > > 
> > > 
> > > 
> > > Dave Limato - Sr. QA Engineer - LSI Corporation - ONStor Product Test
> > > - desk 408-433-8742  - cell 510.329.9994 -- dave.limato@lsi.com
> > > 
