X-MimeOLE: Produced By Microsoft Exchange V6.5
Received: by onstor-exch02.onstor.net 
	id <01C881C9.0A7A86F0@onstor-exch02.onstor.net>; Sun, 9 Mar 2008 02:36:24 -0700
MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----_=_NextPart_001_01C881C9.0A7A86F0"
Content-class: urn:content-classes:message
Subject: RE: in-branch testing status for fb-jong-perf2
Date: Sun, 9 Mar 2008 02:36:24 -0700
Message-ID: <BB375AF679D4A34E9CA8DFA650E2B04E08C0F5DF@onstor-exch02.onstor.net>
In-Reply-To: <BB375AF679D4A34E9CA8DFA650E2B04E08C0F542@onstor-exch02.onstor.net>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: in-branch testing status for fb-jong-perf2
Thread-Index: AciA0GA6mppkfybfRe2ItgQQzYJ9lwA9gTKQ
References: <BB375AF679D4A34E9CA8DFA650E2B04E08C0F542@onstor-exch02.onstor.net>
From: "Jonathan Goldick" <jonathan.goldick@onstor.com>
To: "dl-Cougar" <dl-Cougar@onstor.com>

This is a multi-part message in MIME format.

------_=_NextPart_001_01C881C9.0A7A86F0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Tim and I made a lot of progress today.
I think that the QA folks should be able to try again to get some
testing done with my branch.  I have updated all targets.


The dbg build seems pretty stable now, but it's pretty slow.  We have
done a lot of Cougar and Bobcat dbg testing and there are no outstanding
crashes.

Note that I have not really done any opt testing yet; we just did a
smoke test using dump/restore via ndmpc.

Since yesterday:
1.	NDMP dump and restore work now.
2.	The FS and scsi sanity checks now pass.
3.	I found the Cougar-specific problems in my resource leak
detection logic but have left it ifdef'd out.
4.	I have passed the 4000 parallel client read/write test in
nfsperftest, with the caveat that a random I/O size has a problem I need
to resolve later: NFS seems to return more data than was requested.  This
shouldn't be related to my changes but we'll see.

What doesn't work yet in decreasing priority:
1.	The dbg build is too slow.  I will look into it via kpi(s) later
on.
2.	The I/O coalescing knob for log writes does not work when set
beyond 32; it starts sending smaller chunks.  I can reproduce this
without Cougar.  I have set the defaults to be 32 since that works
properly on all platforms.
3.	My scsi resource leak detection logic doesn't work for Cougar so
I've ifdef'd it out for that platform.  I will revisit it later on.



_____________________________________________
From: Jonathan Goldick=20
Sent: Friday, March 07, 2008 7:56 PM
To: dl-Cougar
Subject: in-branch testing status for fb-jong-perf2

Raj and Sandrine have helped me make some real progress in shaking out
the Cougar-specific bugs.

It would appear that our LUN labeling and missing LUN problems are
resolved, as well as the device id mismatch that Tim was working on.
Cougar soak seems to work reliably.

We can run nfsperftest with a variety of load profiles and I/O(s) are
being coalesced.

What doesn't work yet in decreasing priority:
1.	NDMP dump is failing early on so I broke tape I/O on Bobcat.
Raj has a scsi trace but I will likely need help from Tim.
2.	I fail one of my underflow sanity checks when we take a file
system exception.  I'm not sure if it's a true double free or a bug in
my stats.
3.	The I/O coalescing knob for log writes does not work when set
beyond 32; it starts sending smaller chunks.  I can reproduce this
without Cougar.  I have set the defaults to be 32 since that works
properly on all platforms.
4.	My scsi resource leak detection logic doesn't work for Cougar so
I've ifdef'd it out for that platform.  I will revisit it later on.



------_=_NextPart_001_01C881C9.0A7A86F0
Content-Type: text/html;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV=3D"Content-Type" CONTENT=3D"text/html; =
charset=3Dus-ascii">
<META NAME=3D"Generator" CONTENT=3D"MS Exchange Server version =
6.5.7653.38">
<TITLE>RE: in-branch testing status for fb-jong-perf2</TITLE>
</HEAD>
<BODY>
<!-- Converted from text/rtf format -->

<P DIR=3DLTR><SPAN LANG=3D"en-us"></SPAN><SPAN LANG=3D"en-us"><FONT =
SIZE=3D2 FACE=3D"Arial">Tim and I made a lot of progress =
today.</FONT></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"></SPAN><SPAN LANG=3D"en-us"><FONT =
SIZE=3D2 FACE=3D"Arial">I think that the QA folks should be able to try =
again to get some testing done with my branch.&nbsp; I have updated all =
targets.</FONT></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"></SPAN><SPAN =
LANG=3D"en-us"></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"><FONT SIZE=3D2 FACE=3D"Arial">The dbg =
build seems pretty stable now, but it</FONT></SPAN><SPAN =
LANG=3D"en-us"></SPAN><SPAN LANG=3D"en-us"><FONT SIZE=3D2 =
FACE=3D"Arial">&#8217;</FONT></SPAN><SPAN LANG=3D"en-us"></SPAN><SPAN =
LANG=3D"en-us"><FONT SIZE=3D2 FACE=3D"Arial">s pretty slow.&nbsp; We =
have done a lot of Cougar and Bobcat dbg testing and there are no =
outstanding crashes</FONT><FONT SIZE=3D2 =
FACE=3D"Arial">.</FONT></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"></SPAN><SPAN =
LANG=3D"en-us"></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"></SPAN><SPAN LANG=3D"en-us"><FONT =
SIZE=3D2 FACE=3D"Arial">Note that</FONT></SPAN><SPAN =
LANG=3D"en-us"></SPAN><SPAN LANG=3D"en-us"> <FONT SIZE=3D2 =
FACE=3D"Arial">I have not really done any opt te</FONT><FONT SIZE=3D2 =
FACE=3D"Arial">sting yet</FONT></SPAN><SPAN LANG=3D"en-us"></SPAN><SPAN =
LANG=3D"en-us"><FONT SIZE=3D2 FACE=3D"Arial">; we just did a smoke test =
using dump/restore</FONT><FONT SIZE=3D2 FACE=3D"Arial"> via =
ndmpc</FONT><FONT SIZE=3D2 FACE=3D"Arial">.</FONT></SPAN><SPAN =
LANG=3D"en-us"></SPAN><SPAN LANG=3D"en-us"></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"></SPAN><SPAN =
LANG=3D"en-us"></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"><FONT SIZE=3D2 FACE=3D"Arial">Since =
yesterday:</FONT></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"><FONT SIZE=3D2 =
FACE=3D"Arial">1.&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</FONT></SPAN><SPAN =
LANG=3D"en-us"></SPAN><SPAN LANG=3D"en-us"></SPAN><SPAN LANG=3D"en-us"> =
<FONT SIZE=3D2 FACE=3D"Arial">NDMP dump and restore work =
now.</FONT></SPAN><SPAN LANG=3D"en-us"></SPAN><SPAN =
LANG=3D"en-us"></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"><FONT SIZE=3D2 =
FACE=3D"Arial">2.&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</FONT> <FONT SIZE=3D2 =
FACE=3D"Arial">The FS</FONT> <FONT SIZE=3D2 FACE=3D"Arial">and =
scsi</FONT> <FONT SIZE=3D2 FACE=3D"Arial">sanity checks now =
pass.</FONT></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"><FONT SIZE=3D2 =
FACE=3D"Arial">3.&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</FONT> <FONT SIZE=3D2 =
FACE=3D"Arial">I found the Cougar-specific problems in my resource leak =
detection logic but have left it ifdef</FONT></SPAN><SPAN =
LANG=3D"en-us"></SPAN><SPAN LANG=3D"en-us"><FONT SIZE=3D2 =
FACE=3D"Arial">&#8217;</FONT></SPAN><SPAN LANG=3D"en-us"></SPAN><SPAN =
LANG=3D"en-us"><FONT SIZE=3D2 FACE=3D"Arial">d out.</FONT></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"><FONT SIZE=3D2 =
FACE=3D"Arial">4.&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</FONT> <FONT SIZE=3D2 =
FACE=3D"Arial">I have passed the 4000 parallel client read/write test in =
nfsperftest, with the caveat that a random I/O size has a problem I need =
to resolve later</FONT></SPAN><SPAN LANG=3D"en-us"></SPAN><SPAN =
LANG=3D"en-us"><FONT SIZE=3D2 FACE=3D"Arial">: NFS seems to return more =
data than was requested.&nbsp; This shouldn</FONT></SPAN><SPAN =
LANG=3D"en-us"></SPAN><SPAN LANG=3D"en-us"><FONT SIZE=3D2 =
FACE=3D"Arial">&#8217;</FONT></SPAN><SPAN LANG=3D"en-us"></SPAN><SPAN =
LANG=3D"en-us"><FONT SIZE=3D2 FACE=3D"Arial">t be related to my changes =
but we</FONT></SPAN><SPAN LANG=3D"en-us"></SPAN><SPAN =
LANG=3D"en-us"><FONT SIZE=3D2 FACE=3D"Arial">&#8217;</FONT></SPAN><SPAN =
LANG=3D"en-us"></SPAN><SPAN LANG=3D"en-us"><FONT SIZE=3D2 =
FACE=3D"Arial">ll see.</FONT></SPAN><SPAN LANG=3D"en-us"></SPAN><SPAN =
LANG=3D"en-us"></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"></SPAN><SPAN =
LANG=3D"en-us"></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"></SPAN><SPAN LANG=3D"en-us"><FONT =
SIZE=3D2 FACE=3D"Arial">What doesn</FONT><FONT SIZE=3D2 =
FACE=3D"Arial">&#8217;</FONT><FONT SIZE=3D2 FACE=3D"Arial">t work yet in =
decreasing priority:</FONT></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"><FONT SIZE=3D2 =
FACE=3D"Arial">1.&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</FONT></SPAN><SPAN =
LANG=3D"en-us"></SPAN><SPAN LANG=3D"en-us"> <FONT SIZE=3D2 =
FACE=3D"Arial">The dbg build is too slow.</FONT></SPAN><SPAN =
LANG=3D"en-us"></SPAN><SPAN LANG=3D"en-us"><FONT SIZE=3D2 =
FACE=3D"Arial">&nbsp; I will look into it via kpi(s) later =
on.</FONT></SPAN><SPAN LANG=3D"en-us"></SPAN><SPAN =
LANG=3D"en-us"></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"><FONT SIZE=3D2 =
FACE=3D"Arial">2.&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</FONT></SPAN><SPAN =
LANG=3D"en-us"></SPAN><SPAN LANG=3D"en-us"> <FONT SIZE=3D2 =
FACE=3D"Arial">The I/O coalescing knob f</FONT><FONT SIZE=3D2 =
FACE=3D"Arial">or log writes does not work when set beyond 32; it starts =
sending smaller chunks.&nbsp; I can reproduce this without Cougar.&nbsp; =
I have set the defaults to be 32 since that works</FONT> <FONT SIZE=3D2 =
FACE=3D"Arial">properly on all platforms.</FONT></SPAN><SPAN =
LANG=3D"en-us"></SPAN><SPAN LANG=3D"en-us"></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"><FONT SIZE=3D2 =
FACE=3D"Arial">3.&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</FONT></SPAN><SPAN =
LANG=3D"en-us"></SPAN><SPAN LANG=3D"en-us"> <FONT SIZE=3D2 =
FACE=3D"Arial">My scsi resource leak detection logic doesn</FONT><FONT =
SIZE=3D2 FACE=3D"Arial">&#8217;</FONT><FONT SIZE=3D2 FACE=3D"Arial">t =
work for</FONT> <FONT SIZE=3D2 FACE=3D"Arial">Cougar so I</FONT><FONT =
SIZE=3D2 FACE=3D"Arial">&#8217;</FONT><FONT SIZE=3D2 FACE=3D"Arial">ve =
ifdef</FONT><FONT SIZE=3D2 FACE=3D"Arial">&#8217;</FONT><FONT SIZE=3D2 =
FACE=3D"Arial">d it out for that platform.&nbsp; I will revisit it later =
on</FONT><FONT SIZE=3D2 FACE=3D"Arial">.</FONT></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"></SPAN><SPAN =
LANG=3D"en-us"></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"></SPAN><SPAN =
LANG=3D"en-us"></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"></SPAN><SPAN =
LANG=3D"en-us"></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"></SPAN><SPAN LANG=3D"en-us"><FONT =
SIZE=3D2 =
FACE=3D"Tahoma">_____________________________________________<BR>
</FONT></SPAN><SPAN LANG=3D"en-us"><B></B></SPAN><SPAN =
LANG=3D"en-us"><B><FONT SIZE=3D2 =
FACE=3D"Tahoma">From:</FONT></B></SPAN><SPAN LANG=3D"en-us"></SPAN><SPAN =
LANG=3D"en-us"><FONT SIZE=3D2 FACE=3D"Tahoma"> Jonathan Goldick<BR>
</FONT></SPAN><SPAN LANG=3D"en-us"><B></B></SPAN><SPAN =
LANG=3D"en-us"><B><FONT SIZE=3D2 =
FACE=3D"Tahoma">Sent:</FONT></B></SPAN><SPAN LANG=3D"en-us"></SPAN><SPAN =
LANG=3D"en-us"><FONT SIZE=3D2 FACE=3D"Tahoma"> Friday, March 07, 2008 =
7:56 PM<BR>
</FONT></SPAN><SPAN LANG=3D"en-us"><B></B></SPAN><SPAN =
LANG=3D"en-us"><B><FONT SIZE=3D2 =
FACE=3D"Tahoma">To:</FONT></B></SPAN><SPAN LANG=3D"en-us"></SPAN><SPAN =
LANG=3D"en-us"><FONT SIZE=3D2 FACE=3D"Tahoma"> dl-Cougar<BR>
</FONT></SPAN><SPAN LANG=3D"en-us"><B></B></SPAN><SPAN =
LANG=3D"en-us"><B><FONT SIZE=3D2 =
FACE=3D"Tahoma">Subject:</FONT></B></SPAN><SPAN =
LANG=3D"en-us"></SPAN><SPAN LANG=3D"en-us"><FONT SIZE=3D2 =
FACE=3D"Tahoma"> in-branch testing status for =
fb-jong-perf2</FONT></SPAN><SPAN LANG=3D"en-us"></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"></SPAN><SPAN LANG=3D"en-us"><FONT =
SIZE=3D2 FACE=3D"Arial">Raj and Sandrine have helped me make some real =
progress in shaking out the Cougar-specific bugs.</FONT></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"><FONT SIZE=3D2 FACE=3D"Arial">It would =
appear that our LUN labeling and missing LUN problems are resolved, as =
well as the d</FONT><FONT SIZE=3D2 FACE=3D"Arial">evice id mismatch that =
Tim was working on.&nbsp; Cougar soak seems to work =
reliably.</FONT></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"><FONT SIZE=3D2 FACE=3D"Arial">We can =
run nfsperftest with a variety of load profiles and I/O(s) are being =
coalesced.</FONT></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"><FONT SIZE=3D2 FACE=3D"Arial">What =
doesn</FONT><FONT SIZE=3D2 FACE=3D"Arial">&#8217;</FONT><FONT SIZE=3D2 =
FACE=3D"Arial">t work yet in decreasing priority:</FONT></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"><FONT SIZE=3D2 =
FACE=3D"Arial">1.&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</FONT></SPAN><SPAN =
LANG=3D"en-us"></SPAN><SPAN LANG=3D"en-us"> <FONT SIZE=3D2 =
FACE=3D"Arial">NDMP dump is failing early on so I bro</FONT><FONT =
SIZE=3D2 FACE=3D"Arial">ke tape I/O on Bobcat.&nbsp; Raj has a scsi =
trace but I will likely need help from Tim.</FONT></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"><FONT SIZE=3D2 =
FACE=3D"Arial">2.&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</FONT> <FONT SIZE=3D2 =
FACE=3D"Arial">I fail one of my underflow sanity checks when we take a =
file system exception.&nbsp; I</FONT><FONT SIZE=3D2 =
FACE=3D"Arial">&#8217;</FONT><FONT SIZE=3D2 FACE=3D"Arial">m not sure if =
it</FONT><FONT SIZE=3D2 FACE=3D"Arial">&#8217;</FONT><FONT SIZE=3D2 =
FACE=3D"Arial">s a true double free or a bug in my =
stats.</FONT></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"><FONT SIZE=3D2 =
FACE=3D"Arial">3.&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</FONT> <FONT SIZE=3D2 =
FACE=3D"Arial">The I/O coalescing knob f</FONT><FONT SIZE=3D2 =
FACE=3D"Arial">or log writes does not work when set beyond 32; it starts =
sending smaller chunks.&nbsp; I can reproduce this without Cougar.&nbsp; =
I have set the defaults to be 32 since that works properly on all =
platforms.</FONT></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"><FONT SIZE=3D2 =
FACE=3D"Arial">4.&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</FONT> <FONT SIZE=3D2 =
FACE=3D"Arial">My scsi resource leak detection logic doesn</FONT><FONT =
SIZE=3D2 FACE=3D"Arial">&#8217;</FONT><FONT SIZE=3D2 FACE=3D"Arial">t =
work for</FONT> <FONT SIZE=3D2 FACE=3D"Arial">Cougar so I</FONT><FONT =
SIZE=3D2 FACE=3D"Arial">&#8217;</FONT><FONT SIZE=3D2 FACE=3D"Arial">ve =
ifdef</FONT><FONT SIZE=3D2 FACE=3D"Arial">&#8217;</FONT><FONT SIZE=3D2 =
FACE=3D"Arial">d it out for that platform.&nbsp; I will revisit it later =
on.</FONT></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"></SPAN><SPAN =
LANG=3D"en-us"></SPAN></P>

<P DIR=3DLTR><SPAN LANG=3D"en-us"></SPAN><SPAN =
LANG=3D"en-us"></SPAN></P>

</BODY>
</HTML>
------_=_NextPart_001_01C881C9.0A7A86F0--
