AF:
NF:0
PS:10
SRH:1
SFN:
DSR:
MID:<20080909102117.1b7f6982@ripper.onstor.net>
CFG:
PT:0
S:andy.sharp@onstor.com
RQ:
SSV:onstor-exch02.onstor.net
NSV:
SSH:
R:<john.keiffer@onstor.com>,<manohar.divate@onstor.com>
MAID:1
X-Sylpheed-Privacy-System:
X-Sylpheed-Sign:0
SCF:#mh/Mailbox/sent
RMID:#imap/andys@onstor.net@onstor-exch02.onstor.net/INBOX	0	BB375AF679D4A34E9CA8DFA650E2B04E0B8A5961@onstor-exch02.onstor.net
X-Sylpheed-End-Special-Headers: 1
Date: Tue, 9 Sep 2008 10:21:39 -0700
From: Andrew Sharp <andy.sharp@onstor.com>
To: "John Keiffer" <john.keiffer@onstor.com>
Cc: "Manohar Divate" <manohar.divate@onstor.com>
Subject: Re: kernel panic
Message-ID: <20080909102139.477ba4c0@ripper.onstor.net>
In-Reply-To: <BB375AF679D4A34E9CA8DFA650E2B04E0B8A5961@onstor-exch02.onstor.net>
References: <BB375AF679D4A34E9CA8DFA650E2B04E0B8A5961@onstor-exch02.onstor.net>
Organization: Onstor
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

Good capture of the output.  It looks like it may have been a
hardware problem, possibly combined with a bug.  Please file a bug and
paste this output into the description.  Set the area to SW-Linux.

Thanks,

a

On Tue, 9 Sep 2008 10:12:56 -0700 "John Keiffer"
<john.keiffer@onstor.com> wrote:

> Andy,
> 
>  
> 
> I'm running a TTE script that reboots one filer in my TTE cluster
> about every 4 minutes. I had just logged in on the SSC console and
> tried to type 'crash', which is an alias for 'ls -l /var/crash', when
> the system appeared to start rebooting. It then hit some kind of
> kernel panic in the middle of the reboot. What do you think?
> 
>  
> 
> g14r10 login: root
> 
> Password:
> 
> Last login: Mon Sep  8 14:11:41 2008 from 10.0.0.41 on pts/1
> 
> Linux g14r10 2.6.22-cg #1 Thu Sep 4 15:20:56 PDT 2008 mips64
> 
>  
> 
> Welcome to the ONStor NAS Gateway.
> 
> g14r10:~#
> 
> Broadcast message from root@g14r10 (Tue Sep  9 09:44:52 2008):
> 
>  
> 
> The system is going down for reboot NOW!
> 
> INIT: SenCPU 0 Unable to handle kernel paging request at virtual address 00000000000001c0, epc == ffffffff8214b568, ra == ffffffff8204f614
> 
> Oops[#1]:
> 
> Cpu 0
> 
> $ 0   : 0000000000000000 0000000014001fe1 9000000010060160 00000000000d000d
> 
> $ 4   : 0000000000000000 ffffffff822afd91 ffffffff822afd90 0000000000000001
> 
> $ 8   : 0000000014001fe0 9000000000000000 ffffffff822f00d8 a80000008d16fd28
> 
> $12   : 0000000014001fe0 000000001000001f 0000000000000000 a80000008d16c000
> 
> $16   : ffffffff8231b390 004189374bc6a7ef 0000000000000800 0000000000000000
> 
> $20   : 00000000000d000d ffffffff80804b08 ffffffff808a1658 ffffffff80804eb8
> 
> $24   : 0000000000000000 ffffffff821e9830
> 
> $28   : ffffffff822ac000 ffffffff822afd90 0000000000000006 ffffffff8204f614
> 
> Hi    : 0000000000000000
> 
> Lo    : 00000000000002c0
> 
> epc   : ffffffff8214b568 duart_int+0x148/0x3f0     Not tainted
> 
> ra    : ffffffff8204f614 handle_IRQ_event+0x6c/0xe8
> 
> Status: 14001fe3    KX SX UX KERNEL EXL IE
> 
> Cause : 00808008
> 
> BadVA : 00000000000001c0
> 
> PrId  : 00040103
> 
> Modules linked in: autofs4
> 
> Process swapper (pid: 0, threadinfo=ffffffff822ac000, task=ffffffff822b02f0)
> 
> Stack : ffffffff82020d00 ffffffff822afd90 a80000000490ede0 0000000000000000
> 
>         0000000000000008 0000000000000000 0000000000000001 ffffffff8204f614
> 
>         ffffffff822b57a0 0000000000000008 a80000000490ede0 fffffffffffffbff
> 
>         ffffffff82310000 ffffffff8204f764 0000000000000100 ffffffff822f0000
> 
>         ffffffff822f0620 ffffffff822f0000 ffffffff80804ab0 ffffffff820011a4
> 
>         0000000000000000 ffffffff82001840 0000000000000000 0000000014001fe1
> 
>         0000000000000000 000000000000888d 0000000000000000 ffffffff822b02f0
> 
>         0000000014001fe0 ffffffffffff00fe ffffffff822ff298 a80000008d16fd30
> 
>         a80000008e6435f8 a80000008d16fd28 ffffffff822affe0 0000000000001f00
> 
>         0000000000000000 a80000008d16c000 ffffffff822f0000 ffffffff822f0000
> 
>         ...
> 
> Call Trace:
> 
> [<ffffffff8214b568>] duart_int+0x148/0x3f0
> 
> [<ffffffff8204f614>] handle_IRQ_event+0x6c/0xe8
> 
> [<ffffffff8204f764>] __do_IRQ+0xd4/0x160
> 
> [<ffffffff820011a4>] plat_irq_dispatch+0x1e4/0x1f0
> 
> [<ffffffff82001840>] ret_from_irq+0x0/0x4
> 
> [<ffffffff82003840>] cpu_idle+0x18/0x68
> 
> [<ffffffff822d4bac>] start_kernel+0x2dc/0x358
> 
>  
> 
>  
> 
> Code: 8c430000  a3a00000  a3a30001 <de6801c0> 11000006  00000000  8d030018  8d02001c  0062102a
> 
> Kernel panic - not syncing: Fatal exception in interrupt
> 
> Rebooting in 5 seconds..<2>SiByte Watchdog in danger of initiating system reset in 8.3 seconds
> 
>  
> 
>  
> 
>  
> 
> PowerOn Self Test........OK
> 
>  
> 
> Initializing System......please wait
> 
>  
> 
>  
> 
>  
> 
>  
> 
>  
> 
> PMON [SSC,EL,FP,64]
> 
> ONStor Inc. PROM_SIBYTE_CG : Cougar-prom-1.0.8 : Thu Jul 31 17:59:23 2008
> 
> CPU type SB1125.  Rev 35  600 MHz
> 
> module: SSC, Slot 0, CPU 0
> 
> Memory size 512 MB.
> 
> Icache size  32 KB, 32/line (4 way)
> 
> Dcache size  32 KB, 32/line (4 way)
> 
> Scache size 256 KB, 32/line (4 way)
> 
> debug IP addr = 10.2.10.14
> 
> debug IP mask = 255.255.0.0
> 
>  
> 
>  
> 
> Initializing Autoloader, hit control-E to bypass
> 
> ........................................................................
> ........
> 
>  
> 
> Thank you,
> 
> John Keiffer
