AF:
NF:0
PS:10
SRH:1
SFN:
DSR:
MID:
CFG:
PT:0
S:andy.sharp@onstor.com
RQ:
SSV:onstor-exch02.onstor.net
NSV:
SSH:
R:<sandrine.boulanger@onstor.com>
MAID:1
X-Sylpheed-Privacy-System:
X-Sylpheed-Sign:0
SCF:#mh/Mailbox/sent
RMID:#imap/andys@onstor.net@onstor-exch02.onstor.net/INBOX	0	BB375AF679D4A34E9CA8DFA650E2B04E05C74671@onstor-exch02.onstor.net
X-Sylpheed-End-Special-Headers: 1
Date: Mon, 7 Apr 2008 14:36:27 -0700
From: Andrew Sharp <andy.sharp@onstor.com>
To: "Sandrine Boulanger" <sandrine.boulanger@onstor.com>
Subject: Re: how do we get more info for kernel oops?
Message-ID: <20080407143627.2535b716@ripper.onstor.net>
In-Reply-To: <BB375AF679D4A34E9CA8DFA650E2B04E05C74671@onstor-exch02.onstor.net>
References: <20080407133847.11f7e78f@ripper.onstor.net>
	<BB375AF679D4A34E9CA8DFA650E2B04E05C74671@onstor-exch02.onstor.net>
Organization: Onstor
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

Yup.
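
As an aside on the mgmt_bus fault described in my earlier message
below: a minimal sketch of the kind of sanity check that would let a
driver reject an address like 0x120 instead of dereferencing it.  This
is not the actual mgmt_bus code.  The window base and size and the
shm_* names are made up for illustration, and it assumes the peer is
only ever supposed to hand over addresses inside one known shared
window.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shared-memory window. The real base and size are not
 * given in this thread, so these values are placeholders. */
#define SHM_WINDOW_BASE 0x10000000ULL
#define SHM_WINDOW_SIZE 0x00100000ULL

/* Return true only if [addr, addr + len) lies entirely inside the window. */
static bool shm_range_ok(uint64_t addr, uint64_t len)
{
    if (len == 0 || len > SHM_WINDOW_SIZE)
        return false;
    if (addr < SHM_WINDOW_BASE)
        return false;
    /* Compare offsets rather than computing addr + len, which could wrap. */
    return (addr - SHM_WINDOW_BASE) <= (SHM_WINDOW_SIZE - len);
}

/* Translate a peer-supplied address into a pointer inside the locally
 * mapped window, or return NULL if the address is out of range. */
static void *shm_phys2virt(uint8_t *window_va, uint64_t addr, uint64_t len)
{
    if (!shm_range_ok(addr, len))
        return NULL;
    return window_va + (addr - SHM_WINDOW_BASE);
}
```

With a check like this the caller can drop the message and log the bad
address when it gets NULL back, rather than taking a page fault in
interrupt context.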

On Mon, 7 Apr 2008 14:32:27 -0700 "Sandrine Boulanger"
<sandrine.boulanger@onstor.com> wrote:

> Is it this one ?
> TED00023192 	Kernel crash on reboot Unable to handle kernel
> paging request
> 
> -----Original Message-----
> From: Andy Sharp 
> Sent: Monday, April 07, 2008 1:39 PM
> To: Sandrine Boulanger
> Subject: Re: how do we get more info for kernel oops?
> 
> That's all the information we need.  Here is the relevant part:
> 
> > tx0: Waiting for FP to start coredump local mem copy
> 
> The FP crashed and sent a bogus address, 0x120, to the SSC via the
> mgmt_bus driver.  The mgmt_bus driver, having no way to know it's a
> bogus address, dereferences it, and of course gets a page fault.
> While I recently changed this from a panic to an oops, when it
> happens in the context of an interrupt, the kernel has no choice but
> to panic.
> 
> I thought there was already a bug filed for this, but I don't see it.
> Someone may have marked it WAD or assigned it to someone else as the
> root cause was the FP crash.
> 
> 
> On Mon, 7 Apr 2008 13:13:19 -0700 "Sandrine Boulanger"
> <sandrine.boulanger@onstor.com> wrote:
> 
> > I just noticed this, don't know how it happened.
> > 
> > g14r10 login: Oops[#1]:
> > CPU 0 Unable to handle kernel paging request at virtual address
> > 0000000000000120, epc == ffffffff82006a88, ra == ffffffff82006a90
> > Oops[#2]:
> > Cpu 0
> > $ 0   : 0000000000000000 0000000010001fe0 000000000000000d
> > 0000000000000001
> > $ 4   : ffffffff8226daf8 0000000000000000 ffffffffffffffff
> > 0000000000004699
> > $ 8   : ffffffff822b0000 ffffffff822b2890 ffffffffffff4699
> > ffffffff82300000
> > $12   : ffffffff82310000 ffffffff82300000 fffffffffffffffd
> > ffffffff8223b508
> > $16   : 0000000000000000 0000000000000000 0000000000000000
> > ffffffff82270000
> > $20   : ffffffff822ca680 900000f81a7084a0 0000000000000018
> > a80000008b430760
> > $24   : 0000000000000000 0000000000000020
> > $28   : a80000008be60000 a80000008be63ab0 0000000000000000
> > ffffffff82006a90
> > Hi    : 0000000000000000
> > Lo    : 0000000000000000
> > epc   : ffffffff82006a88 show_regs+0x38/0x470     Not tainted
> > ra    : ffffffff82006a90 show_regs+0x40/0x470
> > Status: 10001fe2    KX SX UX KERNEL EXL
> > Cause : 80809008
> > BadVA : 0000000000000120
> > PrId  : 00040103
> > Modules linked in: autofs4
> > Process ndmp_cfgd (pid: 1331, threadinfo=a80000008be60000,
> > task=a80000000491b200)
> > Stack : 0000000000000000 0000000000000000 a80000008e21a0e0
> > a80000008eb7f200
> >         ffffffff822ca680 ffffffff82006ffc ffffffff8226df70
> > ffffffff820070cc
> >         0000000000000000 a80000008eb7f210 ffffffff82195e60
> > fffffffe000025a5
> >         900000008f000000 ffffffff82195f84 a80000008e21a0e0
> > ffffffff822ca680
> >         0000000000000018 a80000008eb7f200 ffffffff822ca680
> > a80000008be63da8
> >         ffffffff821a9ea4 ffffffff821a9ea4 0000000000000022
> > a80000008e21a0e0
> >         0000000000000018 ffffffff82220938 0000000000000000
> > 0000000000000008
> >         0000000000000018 0000000000000007 0000000000000001
> > 000000000000001f
> >         ffffffff822ca950 0000000000000000 a80000008be63c88
> > a80000008bb78cc0
> >         0000000000000000 000000007feb7b20 000000007feb7bb0
> > 0000000000000018
> >         ...
> > Call Trace:
> > [<ffffffff82006a88>] show_regs+0x38/0x470
> > [<ffffffff82006ffc>] show_registers+0x14/0x68
> > [<ffffffff820070cc>] die+0x7c/0xe0
> > [<ffffffff82195e60>] MGMTBUS_PHYS2VIRT+0xc8/0x128
> > [<ffffffff82195f84>] mgmtbus_hard_start_xmit+0xc4/0x178
> > [<ffffffff821a9ea4>] dev_queue_xmit+0x30c/0x458
> > [<ffffffff82220938>] eee_dgram_sendmsg+0x2b8/0x440
> > [<ffffffff82199540>] sock_sendmsg+0x98/0xe8
> > [<ffffffff82199998>] sys_sendto+0xe8/0x138
> > [<ffffffff8200fec8>] handle_sys+0x108/0x124
> > 
> > 
> > Code: 0000882d  3c138227  6484daf8 <0c8094da> 8e540120  08801abb
> > 0200282d  24050010  1200001a
> > Kernel panic - not syncing: Fatal exception in interrupt
> > Rebooting in 5 seconds..<6>tx0:
> > tx0:
> > tx0: Exception Cause = Watchdog Timeout/NMI
> > tx0: ERREPC:   0xffffffff83245494
> > tx0: RA:       0xffffffff8324548c
> > tx0: SR:       0x200800e1
> > tx0: Waiting for FP to start coredump local mem copy
> > SiByte Watchdog in danger of initiating system reset in 8.1 seconds
> > 
> > 
> > 
> > PowerOn Self Test........OK
> > 
> > Initializing System......please wait
> > 
> > 
