AF:
NF:0
PS:10
SRH:1
SFN:
DSR:
MID:<20090401103915.65f4ee0c@ripper.onstor.net>
CFG:
PT:0
S:andy.sharp@onstor.com
RQ:
SSV:mail.onstor.net
NSV:
SSH:
R:<rendell.fong@onstor.com>
MAID:1
X-Sylpheed-Privacy-System:
X-Sylpheed-Sign:0
SCF:#mh/Mailbox/sent
RMID:#imap/andys@onstor.net@exch1.onstor.net/INBOX	0	2779531E7C760D4491C96305019FEEB52AC8BC1D3C@exch1.onstor.net
X-Sylpheed-End-Special-Headers: 1
Date: Wed, 1 Apr 2009 10:39:38 -0700
From: Andrew Sharp <andy.sharp@onstor.com>
To: Rendell Fong <rendell.fong@onstor.com>
Subject: Re: take a 10 minute look at bug 26538
Message-ID: <20090401103938.37806f06@ripper.onstor.net>
In-Reply-To: <2779531E7C760D4491C96305019FEEB52AC8BC1D3C@exch1.onstor.net>
References: <20090401100729.730b11c2@ripper.onstor.net>
	<2779531E7C760D4491C96305019FEEB52AC8BC1D3C@exch1.onstor.net>
Organization: Onstor
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

On Wed, 1 Apr 2009 10:21:56 -0700 Rendell Fong
<rendell.fong@onstor.com> wrote:

> The only way I know how to cause an error is using the sysrq facility (by running "echo c > /proc/sysrq-trigger"), which causes a bad memory access in sysrq.c:sysrq_handle_crashdump().

Can that be done on a per-thread basis?

> 

How about the old standby:

	volatile int *p = NULL;
	int j;

	j = *p;
	<boing>
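
If you want a userspace sanity check of the rlimit side before touching
a kernel thread, here is a minimal sketch.  crash_child() is a made-up
helper name, not anything in our tree, and it only exercises the
process-level case: the child sets its own RLIMIT_CORE soft limit, does
the NULL read above, and the parent reports which signal killed it.

```c
#include <signal.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that sets its RLIMIT_CORE soft
 * limit and then reads through a NULL pointer.  Returns the signal
 * that killed the child, 0 if it exited normally, or -1 if fork or
 * waitpid failed.  If dumped is non-NULL, it is set to 1 when the
 * wait status reports that a core was produced.
 */
int crash_child(rlim_t core_limit, int *dumped)
{
	pid_t pid;
	int status;

	if (dumped)
		*dumped = 0;

	pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		struct rlimit rl;
		volatile int *p = NULL;
		int j;

		if (getrlimit(RLIMIT_CORE, &rl) == 0) {
			/* the soft limit may not exceed the hard limit */
			rl.rlim_cur = core_limit < rl.rlim_max ?
				      core_limit : rl.rlim_max;
			setrlimit(RLIMIT_CORE, &rl);
		}

		j = *p;		/* <boing> */
		_exit(j);	/* not reached if the fault is delivered */
	}

	if (waitpid(pid, &status, 0) != pid)
		return -1;
	if (!WIFSIGNALED(status))
		return 0;
#ifdef WCOREDUMP
	/* WCOREDUMP is not in POSIX, so only use it where it exists */
	if (dumped && WCOREDUMP(status))
		*dumped = 1;
#endif
	return WTERMSIG(status);
}
```

Calling crash_child(RLIM_INFINITY, &dumped) should come back with
SIGSEGV, and dumped tells you whether a core was actually produced.
Where the core lands depends on the core pattern configured on the box.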


> I don't know which kernel thread it can be done in without being considered fatal.  Also, it needs to be done after the NFS root is mounted, so that writing to the crash file is possible, and ideally after tuxrx initialization, upon command.



Your job is to make it no longer considered "fatal" ~:^)


> > -----Original Message-----
> > From: Andy Sharp
> > Sent: Wednesday, April 01, 2009 10:07 AM
> > To: Rendell Fong
> > Subject: Re: take a 10 minute look at bug 26538
> > 
> > Take a kernel thread, have it hit a segfault, and see if you can get a
> > task level core dump.  The kernel code you have should include a thread
> > which pins itself to core#3 on the txrx processor.  Just write a
> > segfault or bus error into it.
> > 
> > Good work on the oops thing.
> > 
> > On Wed, 1 Apr 2009 09:59:27 -0700 Rendell Fong
> > <rendell.fong@onstor.com> wrote:
> > 
> > > I think I'm done with the core dump on tuxrx.  Things will basically
> > > work if the rlimit core is set for the process that crashes.  For
> > > non-fatal crashes in the kernel, I've completed the changes to log the
> > > Oops info in the /var/crash/1.0 file.
> > >
> > > So what's next?
> > >
> > >
> > > > -----Original Message-----
> > > > From: Andy Sharp
> > > > Sent: Wednesday, April 01, 2009 9:55 AM
> > > > To: Rendell Fong
> > > > Subject: Re: take a 10 minute look at bug 26538
> > > >
> > > > TuxRx is more important.  Forget it.
> > > >
> > > > On Wed, 1 Apr 2009 08:46:24 -0700 Rendell Fong
> > > > <rendell.fong@onstor.com> wrote:
> > > >
> > > > > I've looked at the core file and tried comparing it to a valid one.
> > > > > It is not obvious why gdb64 doesn't like it.  I'll have to get the
> > > > > source to debug it, i.e. run gdb on gdb64 or add some debug code to
> > > > > resolve the problem.
> > > > >
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Andy Sharp
> > > > > > Sent: Monday, March 30, 2009 5:35 PM
> > > > > > To: Rendell Fong
> > > > > > Subject: take a 10 minute look at bug 26538
> > > > > >
> > > > > > Hi Rendell,
> > > > > >
> > > > > > > Do Max a quick favor and take 10 minutes to look at bug 26538 to
> > > > > > > see if you can fix the core file.  Max says gdb is having trouble
> > > > > > > parsing it.  I don't know anything about this core file business,
> > > > > > > but maybe you do.
