AF:
NF:0
PS:10
SRH:1
SFN:
DSR:
MID:<20080717101830.3438ac5f@ripper.onstor.net>
CFG:
PT:0
S:andy.sharp@onstor.com
RQ:
SSV:onstor-exch02.onstor.net
NSV:
SSH:
R:<ed.kwan@onstor.com>,<vikas.saini@onstor.com>,<paul.hammer@onstor.com>,<raj.kumar@onstor.com>,<brian.nguyen@onstor.com>,<jonathan.goldick@onstor.com>,<john.rogers@onstor.com>
MAID:1
X-Sylpheed-Privacy-System:
X-Sylpheed-Sign:0
SCF:#mh/Mailbox/sent
RMID:#imap/andys@onstor.net@onstor-exch02.onstor.net/INBOX	0	BB375AF679D4A34E9CA8DFA650E2B04E0AF1DF49@onstor-exch02.onstor.net
X-Sylpheed-End-Special-Headers: 1
Date: Thu, 17 Jul 2008 10:19:09 -0700
From: Andrew Sharp <andy.sharp@onstor.com>
To: "Ed Kwan" <ed.kwan@onstor.com>
Cc: "Vikas Saini" <vikas.saini@onstor.com>, "Paul Hammer"
 <paul.hammer@onstor.com>, "Raj Kumar" <raj.kumar@onstor.com>, "Brian
 Nguyen" <brian.nguyen@onstor.com>, "Jonathan Goldick"
 <jonathan.goldick@onstor.com>, John Rogers <john.rogers@onstor.com>
Subject: Re: Permabit script for corruption
Message-ID: <20080717101909.08a88175@ripper.onstor.net>
In-Reply-To: <BB375AF679D4A34E9CA8DFA650E2B04E0AF1DF49@onstor-exch02.onstor.net>
References: <BB375AF679D4A34E9CA8DFA650E2B04E0AF1DD64@onstor-exch02.onstor.net>
	<20080717033341.4bc8baa4@ripper.onstor.net>
	<BB375AF679D4A34E9CA8DFA650E2B04E03B5B7E5@onstor-exch02.onstor.net>
	<20080717095907.481b0b0e@ripper.onstor.net>
	<BB375AF679D4A34E9CA8DFA650E2B04E0AF1DF49@onstor-exch02.onstor.net>
Organization: Onstor
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

The "cp" command that shows last in the log file is still hung on my
system, if anyone cares to take a look.  Talk about a "performance"
problem...
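
For anyone who does take a look, here is a minimal sketch of a first
check, assuming a Linux client with /proc mounted.  proc_wait_info is a
made-up helper name, not part of any tool, and the PID in the comment is
a placeholder.  It only reports what the kernel says about the process:
a process blocked on I/O typically shows state "D (disk sleep)", though
whether that is what this cp is doing is a guess until someone checks.

```shell
#!/bin/sh
# Sketch only: print the scheduler state and kernel wait channel of one
# process.  Assumes Linux with /proc mounted.
proc_wait_info() {
    pid=$1
    if [ ! -r "/proc/$pid/status" ]; then
        echo "no such process: $pid" >&2
        return 1
    fi
    # "D (disk sleep)" is uninterruptible sleep, the usual look of a
    # process stuck waiting on I/O.
    sed -n 's/^State:[[:space:]]*/state: /p' "/proc/$pid/status"
    # wchan names the kernel function the process is sleeping in; it may
    # be 0 or unreadable depending on kernel and permissions.
    printf 'wchan: %s\n' "$(cat "/proc/$pid/wchan" 2>/dev/null)"
}

# Harmless demonstration on the current shell.  For the hang, pass the
# PID of the stuck cp instead, e.g.: proc_wait_info 12345
proc_wait_info "$$"
```

If the state does come back as D, killing the cp is unlikely to work
until the I/O it is waiting on completes or fails, so the wait channel
is the more useful thing to note before touching anything.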

On Thu, 17 Jul 2008 10:11:07 -0700 "Ed Kwan" <ed.kwan@onstor.com> wrote:

> Brian and I have been running the script against 3.1.1.3 and 3.2.0.6
> since last night and yesterday evening respectively, and we haven't
> seen any problems yet.
> 
> Also keep in mind Permabit is seeing I/O retries, high evm times,
> plus 4 disk failures in the past 2 months.  CS is planning to
> replace some of the DotHill array hardware.
> 
> > -----Original Message-----
> > From: Andy Sharp
> > Sent: Thursday, July 17, 2008 9:59 AM
> > To: Vikas Saini
> > Cc: Paul Hammer; Raj Kumar; Brian Nguyen; Ed Kwan; Jonathan Goldick
> > Subject: Re: Permabit script for corruption
> > 
> > Cuz it's all I got.  Who's gonna care at 3am?  Anyway, it seemed to
> > hang up after just a few iterations.  It was terminated on schedule
> > a couple hours later.
> > 
> > 
> > $ cat /home/andy/log.ripper
> > +++ seq 1 10000
> > ++ for x in '`seq 1 10000`'
> > ++ echo 'Time number: 1'
> > Time number: 1
> > +++ date
> > ++ echo 'copy random to onstor Thu Jul 17 03:22:50 PDT 2008'
> > copy random to onstor Thu Jul 17 03:22:50 PDT 2008
> > ++ cp /u1/deleteme.ripper.random ./deleteme.ripper.target
> > 
> > real	5m1.846s
> > user	0m0.006s
> > sys	0m2.248s
> > +++ date
> > ++ echo 'diff 1 Thu Jul 17 03:27:52 PDT 2008'
> > diff 1 Thu Jul 17 03:27:52 PDT 2008
> > ++ diff /u1/deleteme.ripper.random ./deleteme.ripper.target
> > 
> > real	4m45.411s
> > user	0m1.166s
> > sys	0m5.376s
> > ++ '[' 0 '!=' 0 ']'
> > +++ date
> > ++ echo 'copy zero to onstor Thu Jul 17 03:32:38 PDT 2008'
> > copy zero to onstor Thu Jul 17 03:32:38 PDT 2008
> > ++ cp /u1/deleteme.ripper.zero ./deleteme.ripper.target
> > 
> > real	0m0.367s
> > user	0m0.000s
> > sys	0m0.167s
> > +++ date
> > ++ echo 'diff 2 Thu Jul 17 03:32:38 PDT 2008'
> > diff 2 Thu Jul 17 03:32:38 PDT 2008
> > ++ diff /u1/deleteme.ripper.zero ./deleteme.ripper.target
> > 
> > real	0m0.002s
> > user	0m0.000s
> > sys	0m0.001s
> > ++ '[' 0 '!=' 0 ']'
> > ++ for x in '`seq 1 10000`'
> > ++ echo 'Time number: 2'
> > Time number: 2
> > +++ date
> > ++ echo 'copy random to onstor Thu Jul 17 03:32:38 PDT 2008'
> > copy random to onstor Thu Jul 17 03:32:38 PDT 2008
> > ++ cp /u1/deleteme.ripper.random ./deleteme.ripper.target
> > 
> > real	4m39.908s
> > user	0m0.005s
> > sys	0m2.166s
> > +++ date
> > ++ echo 'diff 1 Thu Jul 17 03:37:18 PDT 2008'
> > diff 1 Thu Jul 17 03:37:18 PDT 2008
> > ++ diff /u1/deleteme.ripper.random ./deleteme.ripper.target
> > 
> > real	4m27.721s
> > user	0m1.166s
> > sys	0m3.643s
> > ++ '[' 0 '!=' 0 ']'
> > +++ date
> > ++ echo 'copy zero to onstor Thu Jul 17 03:41:46 PDT 2008'
> > copy zero to onstor Thu Jul 17 03:41:46 PDT 2008
> > ++ cp /u1/deleteme.ripper.zero ./deleteme.ripper.target
> > 
> > 
> > That's all the farther it got.
> > 
> > 
> > 
> > 
> > On Thu, 17 Jul 2008 07:50:13 -0700 "Vikas Saini"
> > <vikas.saini@onstor.com> wrote:
> > 
> > > Why against mightydog?  We should not use mightydog as our testing
> > > machine... we can run it against CS if needed.
> > >
> > >
> > > Vikas
> > >
> > >
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: Andy Sharp
> > > Sent: Thu 7/17/2008 3:33 AM
> > > To: Paul Hammer
> > > Cc: Raj Kumar; Vikas Saini; Brian Nguyen; Ed Kwan; Jonathan Goldick
> > > Subject: Re: Permabit script for corruption
> > >
> > > Here is the unmangled script, which I'm running against MD right
> > > now and for the next couple hours:
> > >
> > > # /permabit/user is mounted from the onstor
> > > b=/permabit/user/trg/testing-trg
> > > # /u1 is a scratch area on local disk
> > > r=/u1/deleteme.`hostname`.random
> > > z=/u1/deleteme.`hostname`.zero
> > > t=$b/deleteme.`hostname`.target
> > > dd if=/dev/urandom of=$r count=1000000 bs=1024
> > > dd if=/dev/zero of=$z count=1 bs=1024
> > > mkdir $b ; cd $b
> > > for x in `seq 1 10000` ; do
> > >   echo "Time number: $x"
> > >   echo "copy 1 `date`"
> > >   time cp $r $t
> > >   echo "diff 1 `date`"
> > >   time diff $r $t
> > >   if [ $? != 0 ] ; then
> > >    echo broken
> > >   fi
> > >   echo "copy 2 `date`"
> > >   time cp $z $t
> > >   echo "diff 2 `date`"
> > >   time diff $z $t
> > >   if [ $? != 0 ] ; then
> > >    echo broken
> > >   fi
> > > done 2>&1 |tee $b/log.`hostname`
> > >
> > >
> > >
> > > On Wed, 16 Jul 2008 20:59:45 -0700 "Paul Hammer"
> > > <paul.hammer@onstor.com> wrote:
> > >
> > > > Hi Guys,
> > > >
> > > > If this really causes a corruption we have to figure it out now.
> > > > Can I ask that one of you take this action item to run the
> > > > script and see if this is really a tool that causes corruptions
> > > > with EverON? Please let me know who is taking on this task. I
> > > > only want us to run this script against 4.0/3.3.
> > > >
> > > > Thanks,
> > > >
> > > > -Paul
> > > >
> > > > -----Original Message-----
> > > > From: Rich LaReau
> > > > Sent: 2008-07-16 13:58
> > > > To: dl-esc-l3
> > > > Subject: Permabit script for corruption
> > > >
> > > >
> > > > Hi Ed and team,
> > > >
> > > > This is the script that Permabit says they can use to generate
> > > > corruption.  I'll post a copy to the case and to the associated
> > > > defect.
> > > >
> > > > Rich
> > > >
> > > >
> > > > -----Original Message-----
> > > > From: Caeli Collins
> > > > Sent: Wednesday, July 16, 2008 1:37 PM
> > > > To: Rich LaReau
> > > > Subject: FW: script to reproduce the problem
> > > >
> > > > This is what I got
> > > >
> > > >
> > > > Caeli
> > > >
> > > > -----Original Message-----
> > > > From: Tracy Gangwer [mailto:trg@permabit.com]
> > > > Sent: Wednesday, July 16, 2008 13:28
> > > > To: Caeli Collins
> > > > Cc: Clint McVey
> > > > Subject: script to reproduce the problem
> > > >
> > > > Caeli -
> > > >
> > > > Below is the script we discussed on the phone.  We are seeing about
> > > > a 3% failure rate.
> > > >
> > > > Thanks,
> > > >
> > > > trg
> > > >
> > > > # /permabit/user is mounted from the onstor
> > > > b=/permabit/user/trg/testing-trg
> > > > # /u1 is a scratch area on local disk
> > > > r=/u1/deleteme.`hostname`.random
> > > > z=/u1/deleteme.`hostname`.zero
> > > > t=$b/deleteme.`hostname`.target
> > > > dd if=/dev/urandom of=$r count=1000000 bs=1024
> > > > dd if=/dev/zero of=$z count=1 bs=1024
> > > > mkdir $b ; cd $b
> > > > for x in `seq 1 10000` ; do
> > > >   echo "Time number: $x"
> > > >   echo "copy 1 `date`"
> > > >   time cp $r $t
> > > >   echo "diff 1 `date`"
> > > >   time diff $r $t
> > > >   if [ $? != 0 ] ; then
> > > >    echo broken
> > > >   fi
> > > >   echo "copy 2 `date`"
> > > >   time cp $z $t
> > > >   echo "diff 2 `date`"
> > > >   time diff $z $t
> > > >   if [ $? != 0 ] ; then
> > > >    echo broken
> > > >   fi
> > > > done 2>&1 |tee $b/log.`hostname`
> > > >
> > >
