X-MimeOLE: Produced By Microsoft Exchange V6.5
Received: by onstor-exch02.onstor.net 
	id <01C7FF9F.BB586464@onstor-exch02.onstor.net>; Tue, 25 Sep 2007 10:13:11 -0800
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Content-class: urn:content-classes:message
Subject: RE: NFS sequential write performance
Date: Tue, 25 Sep 2007 10:13:14 -0800
Message-ID: <BB375AF679D4A34E9CA8DFA650E2B04E05B46829@onstor-exch02.onstor.net>
In-Reply-To: <BB375AF679D4A34E9CA8DFA650E2B04E019AF8C2@onstor-exch02.onstor.net>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: NFS sequential write performance
Thread-Index: Acf/mipjyus+Vc6NTkKJpNKDXPsIPwAAPodNAACfvAAAAGExZgAAEPVA
From: "Fay Chong" <fay.chong@onstor.com>
To: "Jonathan Goldick" <jonathan.goldick@onstor.com>,
	"Maxim Kozlovsky" <maxim.kozlovsky@onstor.com>,
	"Paul Hammer" <paul.hammer@onstor.com>,
	"Jobi Ariyamannil" <jobi.ariyamannil@onstor.com>,
	"Brian Montero" <brian.montero@onstor.com>,
	"Andy Sharp" <andy.sharp@onstor.com>
Cc: "Amit Bothra" <amit.bothra@onstor.com>,
	"Bill Nadzam" <bill.nadzam@onstor.com>,
	"Bob Miller" <bob.miller@onstor.com>,
	"Brian DeForest" <brian.deforest@onstor.com>,
	"John Rogers" <john.rogers@onstor.com>,
	"Fay Chong" <fay.chong@onstor.com>

Great information. What network cards, what clients, and which version
of BSD are they using?
Thanks
Fay

-----Original Message-----
From: Jonathan Goldick
Sent: Tuesday, September 25, 2007 11:09 AM
To: Maxim Kozlovsky; Paul Hammer; Fay Chong; Jobi Ariyamannil; Brian
Montero; Andy Sharp
Cc: Amit Bothra; Bill Nadzam; Bob Miller; Brian DeForest
Subject: Re: NFS sequential write performance

Btw, just got back from Yahoo. They say that the newer BSD has the perf
problem only with certain network cards, so they are going to have to
swap out the NIC(s) on 2000 clients. This factoid emphasizes the need
to use identical clients in our testing.

-----Original Message-----
From: Maxim Kozlovsky
To: Paul Hammer; Fay Chong; Jonathan Goldick; Jobi Ariyamannil; Brian
Montero; Andy Sharp
CC: Amit Bothra; Bill Nadzam; Bob Miller; Brian DeForest
Sent: Tue Sep 25 11:04:07 2007
Subject: RE: NFS sequential write performance

Sure, we can start now. I am still going to find out what is going on
with the 10x file copy slowdown in 19041. If it is caused by a different
issue, I would like to fix that problem first.

________________________________

From: Paul Hammer
Sent: Tuesday, September 25, 2007 10:56 AM
To: Maxim Kozlovsky; Fay Chong; Jonathan Goldick; Jobi Ariyamannil;
Brian Montero; Andy Sharp
Cc: Amit Bothra; Bill Nadzam; Bob Miller; Brian DeForest
Subject: RE: NFS sequential write performance

Max,

Very nice progress.

Looks like we need to do this in R98; we can make the call later on
whether this drives the whole R98 schedule, i.e. possibly pulls the ship
date in if the work is done and tested earlier than 12/31.

Can we start this work now?

I copied Bill and Amit since they are both working on similar or perhaps
related issues; I want to make sure we are all aligned and not working
on the same issues independently.

Thanks,

-Paul

________________________________

From: Maxim Kozlovsky
Sent: Tue 9/25/2007 10:33 AM
To: Fay Chong; Paul Hammer; Jonathan Goldick; Jobi Ariyamannil; Brian
Montero; Andy Sharp
Subject: NFS sequential write performance

Here is the summary of yesterday's experiments:

Linux RH3 and BSD 6.2 performance was acceptable; BSD was somewhat
better than Linux. This can be attributed to the hardware difference. We
need to rerun the test on identical hardware if anybody is interested in
finding out exactly what the difference is between these two OSs.

Linux RH5 - the problem is how we handle unaligned writes. The bad
news is that it is not easy to fix. The good news is that I have written
most of the code to handle it. It will of course now take me twice as
long to finish the code as it would have taken if I had gotten a chance
to work on it back in March. In any case, this is a release project and
not something that can be patched.

BSD 4.1 - The problem is BSD committing too often. Not much we can do
about it except write the data faster during commits. Implementing
large I/Os may help, which is another thing I was going to do in March.
This proves once again that we should do as I say and everything will be
fine.

I could not reproduce the issue with bug 19041 (slow copy of the files)
in the test environment. RH5 is slower than RH3, but not by a factor of
10 as when copying on compile2 with dogfood as the server. One of the
differences between the test system and dogfood is the link aggregation
on dogfood. We need to set up the test system with the same link
configuration as dogfood and try again. We may also need to try the same
Linux version as on compile2. Fay, could you please make the changes in
the test setup?

Max

