AF:
NF:0
PS:10
SRH:1
SFN:
DSR:
MID:<20080402155316.23447892@ripper.onstor.net>
CFG:
PT:0
S:andy.sharp@onstor.com
RQ:
SSV:onstor-exch02.onstor.net
NSV:
SSH:
R:<brian.stark@onstor.com>,<manohar.divate@onstor.com>,<dl-Cougar@onstor.com>,<chris.vandever@onstor.com>
MAID:1
X-Sylpheed-Privacy-System:
X-Sylpheed-Sign:0
SCF:#mh/Mailbox/sent
RMID:#imap/andys@onstor.net@onstor-exch02.onstor.net/INBOX	0	20080402142354.779271ea@ripper.onstor.net
X-Sylpheed-End-Special-Headers: 1
Date: Wed, 2 Apr 2008 15:54:18 -0700
From: Andrew Sharp <andy.sharp@onstor.com>
To: "Manohar Divate" <manohar.divate@onstor.com>
Cc: "Brian Stark" <brian.stark@onstor.com>, "dl-Cougar"
 <dl-Cougar@onstor.com>, "Chris Vandever" <chris.vandever@onstor.com>
Subject: Re: Data bus error, epc == ffffffff8218afe4, ra == ffffffff8204f614
Message-ID: <20080402155418.5f2e772b@ripper.onstor.net>
In-Reply-To: <20080402142354.779271ea@ripper.onstor.net>
References: <BB375AF679D4A34E9CA8DFA650E2B04E050CFA07@onstor-exch02.onstor.net>
	<BB375AF679D4A34E9CA8DFA650E2B04E09321DB2@onstor-exch02.onstor.net>
	<20080402142354.779271ea@ripper.onstor.net>
Organization: Onstor
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

I added this info to the bug:

TED00023034 - kernel crash: DBE in yenta_irq during shutdown 
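
For anyone reading the trace below: register $3 (9000000041001000) looks
like an uncached XKPHYS address, and it appears to map to the reported
"DBE physical address: 0041001000". The faulting sequence in the Code:
line includes 8c620000, which decodes as lw $2,0($3), so the bad access
is most likely the load through $3 in yenta_interrupt. Below is a minimal
sketch of that address decode, assuming the standard MIPS64 XKPHYS layout
(bits 63:62 = 10, bits 61:59 = cache attribute, low bits = physical
address). The function name decode_xkphys is made up for illustration
and is not from the kernel source.

```python
# Sketch only: decode a MIPS64 XKPHYS virtual address into its cache
# coherency attribute (CCA) and physical address. Assumes the standard
# layout: bits 63:62 == 0b10, bits 61:59 == CCA, bits 58:0 == physical.
# decode_xkphys is a hypothetical helper, not a kernel function.

XKPHYS_REGION = 0b10


def decode_xkphys(vaddr):
    """Return (cca, paddr) for an XKPHYS address, else raise ValueError."""
    region = (vaddr >> 62) & 0x3
    if region != XKPHYS_REGION:
        raise ValueError(f"{vaddr:016x} is not an XKPHYS address")
    cca = (vaddr >> 59) & 0x7
    paddr = vaddr & ((1 << 59) - 1)
    return cca, paddr


if __name__ == "__main__":
    # Value of $3 taken from the register dump in the oops.
    cca, paddr = decode_xkphys(0x9000000041001000)
    print(f"cca={cca} paddr={paddr:010x}")
```

A CCA of 2 is the uncached attribute, which would be consistent with a
device register access rather than ordinary memory.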


On Wed, 2 Apr 2008 14:23:54 -0700 Andrew Sharp <andy.sharp@onstor.com>
wrote:

> It's definitely software.  A bug has already been filed for it.
> 
> 
> On Wed, 2 Apr 2008 14:20:54 -0700 "Brian Stark"
> <brian.stark@onstor.com> wrote:
> 
> > I don't think this is hardware.  A data bus error typically results
> > when the CPU accesses a bogus address.
> > 
> > 
> > Brian
> > 
> > 
> > > _____________________________________________ 
> > > From: 	Manohar Divate  
> > > Sent:	Wednesday, April 02, 2008 1:56 PM
> > > To:	dl-Cougar
> > > Cc:	Chris Vandever
> > > Subject:	Data bus error, epc == ffffffff8218afe4, ra ==
> > > ffffffff8204f614
> > > 
> > > In a 3 node Cougar cluster
> > > 
> > > > One node rebooted as expected when it lost its sc interface
> > > > (ifconfig eth0 down).
> > > > After the node joined the cluster, the PCC started going down and
> > > > hit this panic.
> > > 
> > > > Is it a hardware error?
> > > 
> > > ThanX
> > > manny
> > > 
> > > 
> > > 
> > > Apr  2 13:46:59 g9r204 : 0:0:vsd:ERROR: vsd_doDomainOp : Error
> > > doing domain opearation for VS 1
> > > Apr  2 13:46:59 g9r204 : 0:0:vsd:ERROR: vsd_doDomainOp : Error
> > > doing domain opearation for VS 5
> > > Apr  2 13:46:59 g9r204 : 0:0:vsd:ERROR: vsd_disableVsProc : Not
> > > able to stop authentication service for VS 1
> > > Apr  2 13:46:59 g9r204 : 0:0:vsd:ERROR: vsd_disableVsProc : Not
> > > able to stop authentication service for VS 5
> > > Apr  2 13:46:59 g9r204 : 0:0:vsd:ERROR: vsd_removeVsReqProc :
> > > remove VS proc failed for VS 1
> > > Apr  2 13:46:59 g9r204 : 0:0:vsd:ERROR: vsd_removeVsReqProc :
> > > remove VS proc failed for VS 5
> > > Apr  2 13:47:00 g9r204 : 0:0:vsd:ERROR: vsd_doDomainOp : Error
> > > doing domain opearation for VS 3
> > > Apr  2 13:47:00 g9r204 : 0:0:vsd:ERROR: vsd_disableVsProc : Not
> > > able to stop authentication service for VS 3
> > > Apr  2 13:47:00 g9r204 : 0:0:vsd:ERROR: vsd_removeVsReqProc :
> > > remove VS proc failed for VS 3
> > > Apr  2 13:47:00 g9r204 : 0:0:cluster2:ERROR: Node going down for
> > > reboot! (cluster_server: invalidating clusDb).
> > > Apr  2 13:47:01 g9r204 : 0:0:eventd:CRITICAL: Process-EVENT Node:
> > > Name 'local', State Down, Msg 'Node going down for reboot!
> > > (cluster_server: invalidating clusDb).'
> > > Apr  2 13:47:01 g9r204 : 0:0:spm:NOTICE: spm_ncmNodeEvent: Lost
> > > connect for
> > > Apr  2 13:47:01 g9r204 : 0:0:spm:NOTICE: spm_ncmNodeEvent:
> > > disconnected
> > > INIT: Sending processes the TERM signal Rcvd post request APP:
> > > unknown EVENT: N
> > > Apr  2 13:47:01 g9r204 : 0:0:nfxsh:NOTICE: cmd[0]: clu show clu :
> > > status[0]
> > > Stopping deferred execution scheduler: atd.
> > > Stopping periodic command scheduler: crond.
> > > Stopping MTA: exim4_liste
> > > Stopping internet superserver: inetd.
> > > Stopping OpenBSD Secure Shell server: sshd.
> > > Stopping automounter: done.
> > > Stopping NTP server: ntpd.
> > > Saving the system clock..
> > > Stopping NFS common utilities: statd.
> > > Stopping kernel log daemon: klogd.
> > > Stopping system log daemon: syslogd.
> > > Stopping ONStor services:DBE physical address: 0041001000
> > > Data bus error, epc == ffffffff8218afe4, ra == ffffffff8204f614
> > > Oops[#1]:
> > > Cpu 0
> > > $ 0   : 0000000000000000 0000000030001fe1 ffffffffffffffff
> > > 9000000041001000
> > > $ 4   : 0000000000000039 a8000000049bc800 0200000000000000
> > > 0000000000000000
> > > $ 8   : a8000000049bc800 9000000000000000 ffffffff822f019c
> > > 6f62657220726f66
> > > $12   : 0000000030001fe0 000000001000001f 0000000000000000
> > > 620a7064752f3436
> > > $16   : a80000000490e520 0000000000000000 0000000000000039
> > > 0000000000000000
> > > $20   : 0000000000000001 900000f81a7084a0 0000000000000500
> > > a80000008eca88e0
> > > $24   : 0000000000000010 ffffffff82195ec0
> > > $28   : a80000008e4c4000 a80000008e4c78c0 0000000000000000
> > > ffffffff8204f614
> > > Hi    : 0000000000000000
> > > Lo    : 0000000000001398
> > > > epc   : ffffffff8218afe4 yenta_interrupt+0x14/0x118     Not tainted
> > > > ra    : ffffffff8204f614 handle_IRQ_event+0x6c/0xe8
> > > Status: 30001fe3    KX SX UX KERNEL EXL IE
> > > Cause : 0080841c
> > > PrId  : 00040103
> > > Modules linked in: autofs4
> > > Process eventd (pid: 996, threadinfo=a80000008e4c4000,
> > > task=a80000008e4c0cb8)
> > > Stack : ffffffff8204f614 fffffffffffffbff ffffffff822b6878
> > > 0000000000000039
> > >         a80000000490e520 fffffffffffffbff ffffffff82310000
> > > ffffffff8204f764
> > >         0200000000000000 a80000008e95a810 a80000008acdfb40
> > > 900000008f0012e0
> > >         900000008f004b40 ffffffff820011a4 0000000000000000
> > > ffffffff82001840
> > >         0000000000000000 00000000004aaa20 900000f81a5a2700
> > > 00000000005a2700
> > >         900000f81a5a2860 a80000008e95a972 00000000000003b0
> > > ffffff0000000000
> > >         0000000000000000 696f672065646f4e 206e776f6420676e
> > > 6f62657220726f66
> > >         0000000000000010 a80000008e4c7de8 0000000000000000
> > > 620a7064752f3436
> > >         0000000000000510 a80000008e95a810 a80000008acdfb40
> > > 900000008f0012e0
> > >         900000008f004b40 900000f81a7084a0 0000000000000500
> > > a80000008eca88e0
> > >         ...
> > > Call Trace:
> > > [<ffffffff8218afe4>] yenta_interrupt+0x14/0x118
> > > [<ffffffff8204f614>] handle_IRQ_event+0x6c/0xe8
> > > [<ffffffff8204f764>] __do_IRQ+0xd4/0x160
> > > [<ffffffff820011a4>] plat_irq_dispatch+0x1e4/0x1f0
> > > [<ffffffff82001840>] ret_from_irq+0x0/0x4
> > > [<ffffffff8212e3e4>] src_unaligned_dst_aligned+0xc/0x50
> > > [<ffffffff82195fd8>] mgmtbus_hard_start_xmit+0x118/0x178
> > > [<ffffffff821a9ea4>] dev_queue_xmit+0x30c/0x458
> > > [<ffffffff82220938>] eee_dgram_sendmsg+0x2b8/0x440
> > > [<ffffffff82199540>] sock_sendmsg+0x98/0xe8
> > > [<ffffffff821997d8>] sys_sendmsg+0x248/0x320
> > > [<ffffffff8200fec8>] handle_sys+0x108/0x124
> > > 
> > > 
> > > Code: ffbf0000  dca30010  8c620000 <0040202d> ac620000  dca60010
> > > 8cc30000  90c20804  1480001f
> > > Kernel panic - not syncing: Fatal exception in interrupt
> > > Rebooting in 5 seconds..<2>SiByte Watchdog in danger of initiating
> > > system reset in 4.1 seconds
> > > SiByte Watchdog in danger of initiating system reset in 4.1
> > > seconds
> > > 
> > > 
> > > 
> > > PowerOn Self Test........OK
> > > 
> > > Initializing System......please wait
> > > 
> > > > Virtual servers on nas gateway g7r204
> > > 
> > >  ID  State                             Name
> > > ====================================================
> > > 6    Enabled                           VS_MGMT_1883
> > > Cluster Name: g9r204       Cluster State:   On
> > > NAS Gateways        IP              State   PCC
> > > ------------------------------------------------------
> > > g9r204              10.2.204.9      UP      YES
> > > g10r204             10.2.204.10     DOWN    NO
> > > g7r204              10.2.204.7      UP      NO
> > > Virtual servers on nas gateway g9r204
> > > 
> > >  ID  State                             Name
> > > ====================================================
> > > 1    Enabled                           VS_MGMT_1874
> > > 2    Disabled                          G9R204-VS-2
> > > 3    Enabled                           VLANTAG
> > > 5    Enabled                           NOLPORT
> > > 8    Disabled                          G10R204-VS-3
> > > Virtual servers on nas gateway g7r204
> > > 
> > >  ID  State                             Name
> > > ====================================================
> > > 6    Enabled                           VS_MGMT_1883
> > > Cluster Name: g9r204       Cluster State:   On
> > > NAS Gateways        IP              State   PCC
> > > ------------------------------------------------------
> > > g9r204              10.2.204.9      UP      YES
> > > g10r204             10.2.204.10     UP      NO
> > > g7r204              10.2.204.7      UP      NO
> > > Virtual servers on nas gateway g9r204
> > > 
> > >  ID  State                             Name
> > > ====================================================
> > > 1    Enabled                           VS_MGMT_1874
> > > 2    Disabled                          G9R204-VS-2
> > > 3    Enabled                           VLANTAG
> > > 5    Enabled                           NOLPORT
> > > 8    Disabled                          G10R204-VS-3
> > > Virtual servers on nas gateway g7r204
> > > 
> > >  ID  State                             Name
> > > ====================================================
> > > 6    Enabled                           VS_MGMT_1883
> > > Cluster Name: g9r204       Cluster State:   On
> > > NAS Gateways        IP              State   PCC
> > > ------------------------------------------------------
> > > g9r204              10.2.204.9      UP      YES
> > > g10r204             10.2.204.10     UP      NO
> > > g7r204              10.2.204.7      UP      NO
> > > Virtual servers on nas gateway g9r204
> > > 
> > >  ID  State                             Name
> > > ====================================================
> > > 1    Enabled                           VS_MGMT_1874
> > > 2    Disabled                          G9R204-VS-2
> > > 3    Enabled                           VLANTAG
> > > 5    Enabled                           NOLPORT
> > > 8    Disabled                          G10R204-VS-3
> > > 
> > > 
