Date: Mon, 27 Jul 2009 13:52:49 -0700
From: Andrew Sharp <andy.sharp@onstor.com>
To: Rendell Fong <rendell.fong@onstor.com>
Subject: Re: please review - 32712
Message-ID: <20090727135249.5cd794ae@ripper.onstor.net>
In-Reply-To: <2779531E7C760D4491C96305019FEEB52AD708EDDE@exch1.onstor.net>
References: <20090727104136.531fefab@ripper.onstor.net>
	<2779531E7C760D4491C96305019FEEB52AD708EDDE@exch1.onstor.net>
Organization: Onstor
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

On Mon, 27 Jul 2009 11:48:25 -0700 Rendell Fong
<rendell.fong@onstor.com> wrote:

> See comments below.
> 
> 
> > -----Original Message-----
> > From: Andy Sharp
> > Sent: Monday, July 27, 2009 10:42 AM
> > To: Rendell Fong
> > Subject: Re: please review - 32712
> > 
> > 
> > On Wed, 24 Jun 2009 18:45:43 -0700 Rendell Fong
> > <rendell.fong@onstor.com> wrote:
> > >
> > > Finally got the code ported over and working...
> > >
> > > Change 32712 by rendellf@rendellf-test on 2009/06/23
> > > 16:51:57 *pending*
> > >
> > >         Changes to support save of SSC crashdump info
> > > when a crash occurs due to kernel panic or non-fatal die
> > > exception in Cougar (Linux).  The crashdump info is
> > > stored in boot PROM at time of crash and then copied to
> > > the crash file after the system is rebooted.
> > >
> > > Affected files ...
> > >
> > > ... //depot/dev/linux/kernel/linux-mips-2.6/arch/mips/kernel/Makefile#1 edit
> > > ... //depot/dev/linux/kernel/linux-mips-2.6/arch/mips/kernel/nfx_crashdump.c#1 add
> > > ... //depot/dev/linux/kernel/linux-mips-2.6/arch/mips/kernel/nfx_crashdump.h#1 add
> > > ... //depot/dev/linux/kernel/linux-mips-2.6/arch/mips/kernel/traps.c#3 edit
> > > ... //depot/dev/linux/kernel/linux-mips-2.6/cougar-config#5 edit
> > > ... //depot/dev/linux/kernel/linux-mips-2.6/drivers/char/sysrq.c#2 edit
> > > ... //depot/dev/linux/kernel/linux-mips-2.6/drivers/ssc-mgmt-bus/prom.c#3 edit
> > > ... //depot/dev/linux/kernel/linux-mips-2.6/kernel/panic.c#1 edit
> > > ... //depot/dev/nfx-tree/code/sm-except/crashdump.c#5 edit
> > > ... //depot/dev/nfx-tree/code/sm-except/crashdump.h#2 edit
> > > ... //depot/dev/nfx-tree/code/ssc-crashsave/Makefile#1 edit
> > > ... //depot/dev/nfx-tree/code/ssc-crashsave/crashsave-int.h#1 edit
> > > ... //depot/dev/nfx-tree/code/ssc-crashsave/crashsave.c#3 edit
> > > ... //depot/dev/nfx-tree/code/ssc-crashsave/linux.c#2 edit
> > 
> > 
> > 
> > 
> > = Change 32712 by rendellf@rendellf-test on 2009/06/23 16:51:57 *pending*
> > =
> > = 	TED26961: [Sanger - 12690] filer rebooted but no crash file saved
> > =
> > = 	Changes to support save of SSC crashdump info when a crash occurs due
> > = 	to kernel panic or non-fatal die exception in Cougar (Linux).  The
> > = 	crashdump info is stored in boot PROM at time of crash and then copied
> > = 	to the crash file by crashsaved after the system is rebooted.
> > =
> > 
> > 
> > 
> > 
> > linux/kernel/Makefile
> > 
> >      looks good
> > 
> > linux/kernel/linux-mips-2.6/arch/mips/Kconfig
> > 
> > 
> > 
> >      looks good
> > 
> > 
> > 
> > linux/kernel/linux-mips-2.6/arch/mips/Makefile
> > 
> > 
> > 
> >      looks good
> > 
> > 
> > 
> > linux/kernel/linux-mips-2.6/arch/mips/kernel/traps.c
> > 
> > 
> > 
> >      looks good
> > 
> > 
> > linux/kernel/linux-mips-2.6/arch/mips/onstor/Kconfig
> > 
> > 
> > 
> >      looks good
> > 
> > 
> > 
> > linux/kernel/linux-mips-2.6/arch/mips/onstor/common/Makefile
> > 
> > 
> > 
> >      >>add
> >      >>linux/kernel/linux-mips-2.6/arch/mips/onstor/common/Makefile
> > 
> > 	looks good
> > 
> > 
> > 
> > linux/kernel/linux-mips-2.6/arch/mips/onstor/common/ons_crashdump.c
> > 
> > 
> >      >>add
> >      linux/kernel/linux-mips-2.6/arch/mips/onstor/common/ons_crashdump.c
> > 
> >      line 216 this code still seems to be saving the stack rather
> > than letting save_stack_trace do it?
> 
> As I explained in the previous email the intent here is to save the


Oh yeah.


> > 
> >      line 362 ->read can return an error here (negative number),
> >      put in code to handle that.
> > 
> 
> Currently the crashdump info will be saved even though other non
> related data that happens to be in the sector can't be read.  What
> should be done if the sector can't be read?  Skip saving the
> crashdump info?


Hey, you're the programmer, you're supposed to decide!  I'm just the
do-nothing manager NOB.
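
That said, here is one option, as a sketch only.  It is plain C that
compiles in userspace, not the actual ons_crashdump.c code.  The idea:
if ->read returns a negative number, treat the rest of the sector as
erased, still save the crashdump info, and hand the error back so the
caller can log it.  Every name below (merge_crashdump, sector_read_fn,
SECTOR_SIZE, DUMP_OFFSET, the two fake readers) is made up for
illustration, and the 0xff fill assumes erased flash reads back as all
ones.

```c
#include <errno.h>
#include <string.h>

#define SECTOR_SIZE 512   /* assumed sector size, for illustration only */
#define DUMP_OFFSET 128   /* assumed offset of the crashdump area */

/* Hypothetical stand-in for the driver's ->read hook:
 * returns bytes read, or a negative errno on failure. */
typedef long (*sector_read_fn)(void *buf, unsigned long len);

/* Fake reader that succeeds and fills the buffer with 0xAA. */
static long fake_read_ok(void *buf, unsigned long len)
{
    memset(buf, 0xAA, len);
    return (long)len;
}

/* Fake reader that always fails with -EIO. */
static long fake_read_fail(void *buf, unsigned long len)
{
    (void)buf;
    (void)len;
    return -EIO;
}

/*
 * Build the sector image to write back: existing sector contents with
 * the crashdump copied in at DUMP_OFFSET.  If the read fails, the
 * unrelated data is lost (filled with 0xff) but the crashdump is still
 * saved.  Returns 0 on a clean read, the negative read error if the
 * read failed, or -EINVAL if the dump does not fit.
 */
static long merge_crashdump(sector_read_fn read_sector,
                            unsigned char *sector,
                            const unsigned char *dump,
                            unsigned long dump_len)
{
    long err = 0;
    long ret;

    if (dump_len > SECTOR_SIZE - DUMP_OFFSET)
        return -EINVAL;

    ret = read_sector(sector, SECTOR_SIZE);
    if (ret < 0) {
        /* Read failed: remember the error, start from an erased sector. */
        err = ret;
        memset(sector, 0xff, SECTOR_SIZE);
    } else if ((unsigned long)ret < SECTOR_SIZE) {
        /* Short read: pad the part we did not get. */
        memset(sector + ret, 0xff, SECTOR_SIZE - (unsigned long)ret);
    }

    memcpy(sector + DUMP_OFFSET, dump, dump_len);
    return err;
}
```

The other choice is to bail out on a negative return and skip the save
entirely.  Either is defensible; the point is that the negative value
must not be used as a length.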


> > linux/kernel/linux-mips-2.6/arch/mips/onstor/cougar/Makefile
> > 
> > 
> >      looks good
> > 
> > 
> > linux/kernel/linux-mips-2.6/arch/mips/onstor/cougar/version.c
> > 
> > 
> >      >>add
> >      >>linux/kernel/linux-mips-2.6/arch/mips/onstor/cougar/version.c
> > 
> >      I'm confused.  Why do we need the contents of /version?
> > 
> 
> The /version file contains the EverON release number.  If not
> included, then it will be difficult to associate the linux build
> version to this EverON release version number.  Perhaps there's a
> version string that can be referenced instead as a build time
> constant.  The linux build version only contains date/time info.


Why do we need to be able to do this?  The date/time info does tell you
when the release was built.  I'd rather have some userspace program
tell the kernel the release number than have the kernel poking around on
the FS all hack-attack style.  I'm just not sure why the kernel has to
do this.  Can't the userspace crapola fill that in?
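
Something along these lines is what I mean, as a rough sketch.  It is a
small userspace helper that reads the first line of the version file
and writes it to some kernel-provided file at boot.  The function name
publish_version and the destination path are hypothetical; no such
kernel interface exists in this change, it would have to be added.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy the first line of src_path (newline stripped) to dst_path.
 * Returns 0 on success, -1 on any error.
 */
static int publish_version(const char *src_path, const char *dst_path)
{
    char line[128];
    FILE *in;
    FILE *out;
    int rc = -1;

    in = fopen(src_path, "r");
    if (in == NULL)
        return -1;

    if (fgets(line, sizeof(line), in) != NULL) {
        /* Drop the trailing newline, if any. */
        line[strcspn(line, "\n")] = '\0';

        out = fopen(dst_path, "w");
        if (out != NULL) {
            if (fputs(line, out) >= 0)
                rc = 0;
            /* A failed close can mean the write never landed. */
            if (fclose(out) != 0)
                rc = -1;
        }
    }

    fclose(in);
    return rc;
}
```

An init script would call it once, e.g. publish_version("/version",
"/proc/onstor/release"), where the /proc path is an assumed name.  The
kernel then just holds the string and copies it into the crashdump
info at crash time, with no filesystem access from the panic path.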


> > linux/kernel/linux-mips-2.6/cougar-config
> > 
> > 
> > 
> >      looks good
> > 
> > 
> > 
> > linux/kernel/linux-mips-2.6/cougar-debug-config
> > 
> >      >>add linux/kernel/linux-mips-2.6/cougar-debug-config
> > 
> >      looks ok.  we don't need CONFIG_STACKTRACE_SUPPORT?  i can't
> >      remember.
> > 
> 
> Ok, but I can't figure out how to remove it.  Should it be removed
> from cougar-config as well?

N-n-n, what I meant was, don't we need that config variable turned on?
I'm a bit confused because I hacked my kernel to only use -one- config
variable to control stack trace stuff, but there's still 2 or more in
the stock tree.


> > 
> > 
> > linux/kernel/linux-mips-2.6/drivers/char/sysrq.c
> > 
> >      looks good
> > 
> > linux/kernel/linux-mips-2.6/drivers/ssc-mgmt-bus/prom.c
> > 
> > 
> > 
> >      excellent
> > 
> > 
> > 
> > linux/kernel/linux-mips-2.6/include/linux/onstor/ons_crashdump.h
> > 
> > 
> >      >>add
> >      linux/kernel/linux-mips-2.6/include/linux/onstor/ons_crashdump.h
> > 
> > 
> > linux/kernel/linux-mips-2.6/kernel/panic.c
> > 
> > 
> > 
> > 
> >      looks good
> > 
> > 
> > 
> > 
> > nfx-tree/code/sm-except/crashdump.c
> > 
> > 
> > 
> > 
> >      ok
> > 
> > 
> > 
> > 
> > 
> > nfx-tree/code/sm-except/crashdump.h
> > 
> > 
> > 
> > 
> >      looks good
> > 
> > 
> > 
> > 
> > nfx-tree/code/ssc-crashsave/Makefile
> > 
> > 
> > 
> > 
> >      aieeee!
> > 
> >      ok
> > 
> > 
> > nfx-tree/code/ssc-crashsave/crashsave-int.h
> > 
> > 
> > 
> > 
> >      I still don't understand why we are doing a notify_event_cpu
> >      failed since we're already booting back up from said failure.
> >      Such a notification will cause another reboot, won't it?
> > 
> 
> No, it won't.  Notice that the event state is different
> (EVENT_STATE_FAILED rather than EVENT_STATE_DOWN).  It acts as a
> notification that the crashdump info has been copied to the crash
> file.

Hokay, is that its only function?  Perhaps the name should be changed
to reflect its true purpose, for maintainability?  Whatever, forget it.
Bigger fish to fry.  We need to get the vsvr daemon progressing.


> > nfx-tree/code/ssc-crashsave/crashsave.c
> > 
> > 
> > 
> >      hmm, ok
> > 
> > 
> > nfx-tree/code/ssc-crashsave/linux.c
> > 
> > 
> > 
> > 
> >      hokay, if you say so ~:^)
> > 
> >      i can't look at these elog calls, after fixing so many of them
> >      in another branch in another lifetime apparently.
> > 
> 
