X-MimeOLE: Produced By Microsoft Exchange V6.5
Received: by onstor-exch02.onstor.net 
	id <01C88CA5.59770C80@onstor-exch02.onstor.net>; Sat, 22 Mar 2008 22:18:37 -0700
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Content-class: urn:content-classes:message
Subject: RE: sub13 issues
Date: Sat, 22 Mar 2008 22:18:36 -0700
Message-ID: <BB375AF679D4A34E9CA8DFA650E2B04E08FC3973@onstor-exch02.onstor.net>
In-Reply-To: <BB375AF679D4A34E9CA8DFA650E2B04E08FC3945@onstor-exch02.onstor.net>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: sub13 issues
Thread-Index: AciL5vhUs+/0FYfqQtu0SHCR0dTVzwAU5CZCAAqm0DAAD3Kb4A==
References: <BB375AF679D4A34E9CA8DFA650E2B04E03B5B70E@onstor-exch02.onstor.net> <BB375AF679D4A34E9CA8DFA650E2B04E03B5B712@onstor-exch02.onstor.net> <BB375AF679D4A34E9CA8DFA650E2B04E08FC3945@onstor-exch02.onstor.net>
From: "Jonathan Goldick" <jonathan.goldick@onstor.com>
To: "John Keiffer" <john.keiffer@onstor.com>,
	"Vikas Saini" <vikas.saini@onstor.com>,
	"dl-QA" <dl-qa@onstor.com>,
	"dl-hcl-qa" <dl-hcl-qa@onstor.com>,
	"dl-Cougar" <dl-Cougar@onstor.com>


I was hoping that #1 was addressed by changelist 28463 but there must be
something more going on.  We will need to get a machine that reliably
reproduces this and do the following (in defect 22921 now):
1. Take all volumes offline.
2. Run eeepoll off on all FP cores except one.
3. On the FP console where we didn't run eeepoll off, run the following:
   scsiadmin dbg_level 3
   scsiadmin trace all func
4. Retry the volume create.
5. See if scsi replied to the READ/WRITE requests generated by SDM on
behalf of evm_cfgd.
6. If it did, then we have to move to SDM and see if it's sending the
responses to evm_cfgd by attaching a debugger there.
7. If SDM is, then we have to move to evm_cfgd and see if it's
processing the responses by attaching a debugger there.


------------------
#2 looks like defect 22925.  The following changelist missed the submittal:
Change 28485 by jong@jong-jong-cifs on 2008/03/21 22:05:51

	Fix memory leak in EVM
	       Reviewed by JobiA
	       This addresses defect 22925

Affected files ...

... //depot/dev/nfx-tree/code/sm-evm-srvr/evm-io.c#14 edit

-----------------------
#3 I have a tentative fix for this.  I need to run spec to see if it
breaks anything, and then get it through the review cycle.  Given the
holiday I don't expect this to be in the tree until Monday afternoon at
the earliest.
-----------------------



-----Original Message-----
From: John Keiffer
Sent: Saturday, March 22, 2008 2:41 PM
To: Vikas Saini; dl-QA; dl-hcl-qa; dl-Cougar
Subject: RE: sub13 issues


Regarding 22821: Both systems CAN ping each other and the network. The
issue is that they think they can't sync up the clusterDB or something.
I'm not sure.

-----Original Message-----
From: Vikas Saini
Sent: Saturday, March 22, 2008 9:43 AM
To: Vikas Saini; dl-QA; dl-hcl-qa; dl-Cougar
Subject: sub13 issues
Importance: High

Hi All,
   So far we have seen the following issues on sub13:

1) An issue where "vol create" failed with a timeout error message. The
elogs displayed a LUN label read problem. Defects 22921 and 22923.

2) The OPS dropping to zero problem resurfaced on John K's systems
(g6r10, g5r10) and on g11r204 (that system is still in that state in
case someone wants to have a look).

3) Manny also saw an issue where OPS drop to zero for a second or two
(on g10r204).

4) The TXRX crash causing an FP crash is still happening. This needs to
be fixed ASAP. Defect 22448.

5) A couple of issues with system upgrade: defect 22919 and a few
others.

Apart from these, there is 22821, where John K's cluster is still messed
up and the nodes are not able to ping each other. We need some
resolution on that.

Thanks
Vikas





