AF:
NF:0
PS:10
SRH:1
SFN:
DSR:
MID:<20090414101036.2d54686c@ripper.onstor.net>
CFG:
PT:0
S:andy.sharp@onstor.com
RQ:
SSV:mail.onstor.net
NSV:
SSH:
R:<rendell.fong@onstor.com>
MAID:1
X-Sylpheed-Privacy-System:
X-Sylpheed-Sign:0
SCF:#mh/Mailbox/sent
RMID:#imap/andys@onstor.net@exch1.onstor.net/INBOX	0	2779531E7C760D4491C96305019FEEB52AC8BC1D42@exch1.onstor.net
X-Sylpheed-End-Special-Headers: 1
Date: Tue, 14 Apr 2009 10:10:41 -0700
From: Andrew Sharp <andy.sharp@onstor.com>
To: Rendell Fong <rendell.fong@onstor.com>
Subject: Re: core dump status
Message-ID: <20090414101041.3bb1536a@ripper.onstor.net>
In-Reply-To: <2779531E7C760D4491C96305019FEEB52AC8BC1D42@exch1.onstor.net>
References: <2779531E7C760D4491C96305019FEEB52AC8BC1D42@exch1.onstor.net>
Organization: Onstor
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

This is excellent progress.  It's strange that there is so much
interaction with other subsystems to dump a core.  But perhaps there is
good reason for that, like the FP itself doesn't know enough on its own
to do the whole job.

Little things like register mismatches, ah, someone can file a bug.
I'm kidding, of course, but it's just a small wrinkle that can be
ironed out at some point.

We can talk more when I get into the office.

Cheers,

a


On Mon, 13 Apr 2009 17:51:19 -0700 Rendell Fong
<rendell.fong@onstor.com> wrote:

> I was trying to save the core dump info to the core volume, but it turns out that the system is too tightly coupled to let me write to the core volume.  At initialization, the core file thread on the FP needs to be started because it is needed to initiate opening the core volume.  For it to work as expected, it must communicate with the cluster DB and evm, which run in the SSC.  evm hooks into the scsi and ispfc code on the FP.  The bottom line is that with my hacked-up FP image I can't write the core dump info to the core volume, because I couldn't figure out how to stub out the SSC code and break the existing dependencies between the various apps.
> 
> I believe I have all the code implemented for core dump.  It's just a question of how to test it without being able to run the whole system.
> I'm thinking of waiting until we have SSC to FP communication working before continuing on.
> 
> Meanwhile, I've verified that the crash info is written to the boot prom in the same format that the existing txrx code uses.  The core file thread in the Cougar txrx does recognize it after rebooting the system and switching images.  The crash info is forwarded to crashsaved running in the SSC, and the info is saved in the /var/crash/1.0 file.
> 
> One of the issues is that the register and stack info dumped to the console doesn't match what is being saved by core dump.
> Linux displays it from the die() routine, which eventually calls the panic() routine where core dump is hooked in.
> But I'm not sure the register values and stack data that my code is saving are correct.
> 
> I needed to make changes to the prom code in the drivers/mgmt-bus/prom.c file for use by the core dump code.
> For now, the core dump code is almost entirely in kernel/panic.c, with many #defines copied over from the nfx tree.
> I don't know where the code ought to be put in the interim.
