AF:
NF:0
PS:10
SRH:1
SFN:
DSR:
MID:<20080626152047.5543dee7@ripper.onstor.net>
CFG:
PT:0
S:andy.sharp@onstor.com
RQ:
SSV:onstor-exch02.onstor.net
NSV:
SSH:
R:<damon.wong@onstor.com>
MAID:1
X-Sylpheed-Privacy-System:
X-Sylpheed-Sign:0
SCF:#mh/Mailbox/sent
RMID:#imap/andys@onstor.net@onstor-exch02.onstor.net/INBOX	0	BB375AF679D4A34E9CA8DFA650E2B04E0AA0B392@onstor-exch02.onstor.net
X-Sylpheed-End-Special-Headers: 1
Date: Thu, 26 Jun 2008 15:21:42 -0700
From: Andrew Sharp <andy.sharp@onstor.com>
To: "Damon Wong" <damon.wong@onstor.com>
Subject: Re: I hit fs_abort while deleting a volume on this nightly build
 (0626).
Message-ID: <20080626152142.740f4405@ripper.onstor.net>
In-Reply-To: <BB375AF679D4A34E9CA8DFA650E2B04E0AA0B392@onstor-exch02.onstor.net>
References: <BB375AF679D4A34E9CA8DFA650E2B04E0AA0B32E@onstor-exch02.onstor.net>
	<BB375AF679D4A34E9CA8DFA650E2B04E0AA0B392@onstor-exch02.onstor.net>
Organization: Onstor
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

Hi Damon,

Please add Jonathan and Bill to your list of recipients for these
reports.

"Bill Nadzam" <bill.nadzam@onstor.com>
"Jonathan Goldick" <jonathan.goldick@onstor.com>

Thanks,

a


On Thu, 26 Jun 2008 15:03:04 -0700 "Damon Wong" <damon.wong@onstor.com>
wrote:

> Owner block looks ok. Read the volume log header
> 
> Magic             0xdead420
> cksum             0x0
> version           0x1
> state             0x2
> last write        0x48640d3c Thu Jun 26 14:42:20 2008
> chassisId         0x770
> node              g5r204
> 
> history index     0
> first write         0x48543813 Sat Jun 14 14:28:51 2008
> mount time          0x48640a45 Thu Jun 26 14:29:41 2008
> last write          0x48640d3c Thu Jun 26 14:42:20 2008
> nbr_writes          148857
> nbr mounts          22
> chassisId           0x770
> 
> 
> 
> _____________________________________________
> From: Damon Wong 
> Sent: Thursday, June 26, 2008 2:56 PM
> To: Vikas Saini
> Cc: Raj Kumar; Larry Scheer; Andy Sharp; Sandrine Boulanger
> Subject: I hit fs_abort while deleting a volume on this nightly build
> (0626).
> 
> I hit fs_abort while deleting a volume on this nightly build.
> Similar to 24376 but I hit it while creating the volume, not deleting
> the volume.
> 
> Damon
> 
> Core_dump:
> ONStor Core Analysis Tool version v2.1.0
> Core location: /n/users/dwong/private/scripts/1.2
> Core platform: CG
> Core type: FP
> Core version: R4.0.0.0CGDBG-061408
> Boot time: Thu Jun 26 14:21:18 2008
> Crash time: Thu Jun 26 14:23:35 2008
> Image location:
> /n/build-trees/R4.0.0.0/R4.0.0.0-061408-sub26/nfx-tree/Build/cg/dbg/Images/fp_cg
> 
> ===BEGIN STACK TRACE===
> #0  fs_abort () at fs-err.c:1141
> #1  0x831d9128 in fs_owner_mount (context=0x102059d458,
> buf=0x400d5f8280) at fs-owner.c:322
> #2  0x831d9cbc in fs_owner_check_mount (context=0x102059d458) at
> fs-owner.c:625
> #3  0x831da530 in fs_owner_start (context=0x102059d458) at
> fs-owner.c:859
> #4  0x831a7fe0 in fs_doMount (context=0x102059d458, req=0x101c8c4320)
> at fs-mount.c:2479
> #5  0x831ac088 in fs_mountThread (context=0x102059d458) at
> fs-mount.c:3601
> #6  0x830c6f88 in fs_threadBegin (handle=0, arg=0x0) at
> fs-context.c:2950
> ===END STACK TRACE===
> [root@c15r18-rhel4 scripts]#
> 
> /var/log/Onstor/messages:
> Jun 26 14:23:23 g5r204 : 0:0:(null):NOTICE: 'root' logged in through
> remote host(10.2.18.15)
> Jun 26 14:23:24 g5r204 : 0:0:nfxsh:NOTICE: cmd[0]: system show
> nodename : status[0]
> Jun 26 14:23:25 g5r204 : 0:0:nfxsh:NOTICE: cmd[1]: volume show  :
> status[0]
> Jun 26 14:23:25 g5r204 : 0:0:nfxsh:NOTICE: cmd[2]: vsvr set
> G5R204-T1 : status[0]
> Jun 26 14:23:25 g5r204 : 0:0:ea:ERROR: ea_delFsysProc[2590]:
> Volume[g5r204-t1-vol2] is in Mounting state. Operation not allowed.
> Jun 26 14:23:25 g5r204 : 0:0:nfxsh:NOTICE: cmd[3]: volume delete
> g5r204-t1-vol2 : status[11]
> Jun 26 14:23:25 g5r204 : 0:0:nfxsh:NOTICE: cmd[4]: system show
> nodename : status[0]
> Jun 26 14:23:25 g5r204 : 0:0:nfxsh:NOTICE: cmd[5]: lun show disk -t
> free : status[0]
> Jun 26 14:23:26 g5r204 : 0:0:nfxsh:NOTICE: cmd[4]: vsvr id 2 :
> status[0]
> Jun 26 14:23:35 g5r204 : 1:5:efs:ERROR: 81: fs_abort
> Jun 26 14:23:38 g5r204 : 0:0:eventd:WARNING: Process-EVENT CPU: Slot
> 1, CPU 0, State Down
> Jun 26 14:23:38 g5r204 : 0:0:eventd:WARNING: Process-EVENT CPU: Slot
> 1, CPU 1, State Down
> Jun 26 14:23:38 g5r204 : 0:0:vtm:NOTICE: Vtm_ProcEvntMsg:
> NFX_EVENT_CPU, state 2, slot 1, cpu 0
> Jun 26 14:23:38 g5r204 : 0:0:eventd:WARNING: Process-EVENT CPU: Slot
> 1, CPU 2, State Down
> Jun 26 14:23:38 g5r204 : 0:0:eventd:WARNING: Process-EVENT CPU: Slot
> 1, CPU 3, State Down
> Jun 26 14:23:38 g5r204 : 0:0:eventd:WARNING: Process-EVENT CPU: Slot
> 1, CPU 4, State Down
> Jun 26 14:23:38 g5r204 : 0:0:eventd:WARNING: Process-EVENT CPU: Slot
> 1, CPU 5, State Down
> Jun 26 14:23:38 g5r204 : 0:0:sanm:ERROR: SANM: FP NIM down. Aborting
> all mirror sessions.
> Jun 26 14:23:38 g5r204 : 0:0:sanm:ERROR: SANM: FP NIM down. Aborting
> all mirror sessions.
> Jun 26 14:23:38 g5r204 : 0:0:vtm:NOTICE: Vtm_ProcEvntMsg:
> NFX_EVENT_CPU, state 2, slot 1, cpu 1
> Jun 26 14:23:38 g5r204 : 0:0:vtm:NOTICE: Vtm_ProcEvntMsg:
> NFX_EVENT_CPU, state 2, slot 1, cpu 2
> Jun 26 14:23:38 g5r204 : 0:0:vtm:NOTICE: Vtm_ProcEvntMsg:
> NFX_EVENT_CPU, state 2, slot 1, cpu 3
> Jun 26 14:23:38 g5r204 : 0:0:vtm:NOTICE: Vtm_ProcEvntMsg:
> NFX_EVENT_CPU, state 2, slot 1, cpu 4
> Jun 26 14:23:38 g5r204 : 0:0:vtm:NOTICE: Vtm_ProcEvntMsg:
> NFX_EVENT_CPU, state 2, slot 1, cpu 5
> Jun 26 14:23:39 g5r204 : 0:0:eventd:NOTICE: Process-EVENT Volume: name
> 'g5r204-t1-vol2', Id 0x00000770000001dd, Event 'Offline', State 'Down'
> Jun 26 14:23:39 g5r204 : 0:0:eventd:NOTICE: Process-EVENT Volume: name
> 'g5r204-t1-vol4', Id 0x00000770000001df, Event 'Offline', State 'Down'
> Jun 26 14:23:39 g5r204 : 0:0:eventd:NOTICE: Process-EVENT Volume: name
> 'g5r204-t1-vol5', Id 0x00000770000001e0, Event 'Offline', State 'Down'
> Jun 26 14:23:39 g5r204 : 0:0:eventd:NOTICE: Process-EVENT Volume: name
> 'g5r204-t1-vol3', Id 0x00000770000001de, Event 'Offline', State 'Down'
> Jun 26 14:23:39 g5r204 : 0:0:eventd:NOTICE: Process-EVENT Volume: name
> 'g5r204-t1-vol6', Id 0x00000770000001e1, Event 'Offline', State 'Down'
> Jun 26 14:23:39 g5r204 : 0:0:eventd:NOTICE: Process-EVENT Volume: name
> 'vol_mgmt_1904', Id 0x000007700000017d, Event 'Offline', State 'Down'
> Jun 26 14:23:39 g5r204 : 0:0:eventd:NOTICE: Process-EVENT Volume: name
> 'g5r204-t1-vol1', Id 0x00000770000001dc, Event 'Offline', State 'Down'
> Jun 26 14:23:39 g5r204 : 0:0:eventd:NOTICE: Process-EVENT IP i/f: IP
> 192.167.1.1, Port bp0, State Remove
> Jun 26 14:23:39 g5r204 : 0:0:eventd:WARNING: Process-EVENT Vsvr:
> Virtual server 'VS_MGMT_1904', Id 1, State 'Down'
> Jun 26 14:23:39 g5r204 : 0:0:eventd:NOTICE: Process-EVENT IP i/f: IP
> 192.167.2.1, Port bp0, State Remove
> Jun 26 14:23:39 g5r204 : 0:0:eventd:WARNING: Process-EVENT Vsvr:
> Virtual server 'G5R204-T1', Id 2, State 'Down'
> Jun 26 14:23:42 g5r204 : 0:0:vtm:ERROR: vtm_gatherInfo_proc: gather
> info failed for request from g5r204
> Jun 26 14:23:43 g5r204 : 0:0:vtm:ERROR: vtm_failover_vsvr_proc: no
> alternative filer found for vs 2 failover
> Jun 26 14:23:43 g5r204 : 0:0:vtm:ERROR: vtm_failover_vsvr_proc: no
> alternative filer found for vs 3 failover
> Jun 26 14:23:49 g5r204 : 0:0:nfxsh:NOTICE: cmd[5]: system show
> chassis : status[0]
> Jun 26 14:24:03 g5r204 : 0:0:nfxsh:NOTICE: cmd[0]: -> EMRS: Not
> gathering h_res_stats: not all processors are up. : status[2]
> Jun 26 14:24:16 g5r204 : 0:0:(null):NOTICE: 'root' logged in through
> remote host(10.0.0.233)
> Jun 26 14:26:15 g5r204 : 0:0:eventd:CRITICAL: Process-EVENT Node: Name
> 'local', State Down, Msg 'Node going down for reboot! ('system reboot'
> issued from nfxsh).'
> Jun 26 14:26:15 g5r204 : 0:0:spm:NOTICE: spm_ncmNodeEvent: Lost
> connect for
> Jun 26 14:26:15 g5r204 : 0:0:spm:NOTICE: spm_ncmNodeEvent:
> disconnected
> Jun 26 14:26:15 g5r204 pm: pm_terminate: child 992
> (/onstor/bin/crashsaved) terminated
> Jun 26 14:26:16 g5r204 pm: pm_terminate: child 990
> (/onstor/bin/sscccc) terminated
> Jun 26 14:26:16 g5r204 pm: pm_terminate: child 987 (/onstor/bin/asd)
> terminated
> Jun 26 14:26:16 g5r204 pm: pm_terminate: child 986 (/onstor/bin/snmpd)
> terminated
> Jun 26 14:26:16 g5r204 pm: pm_terminate: child 985
> (/onstor/bin/cluster_relay) terminated
> Jun 26 14:26:16 g5r204 pm: pm_terminate: child 981 (/onstor/bin/sanmd)
> terminated
> Jun 26 14:26:16 g5r204 pm: pm_terminate: child 979 (/onstor/bin/vtmd)
> terminated
> Jun 26 14:26:16 g5r204 pm: pm_terminate: child 978 (/onstor/bin/vsd)
> terminated
> Jun 26 14:26:16 g5r204 pm: pm_terminate: child 967
> (/onstor/bin/auth-agent) terminated
> Jun 26 14:26:16 g5r204 pm: pm_terminate: child 966
> (/onstor/bin/ndmp_cfgd) terminated
> Jun 26 14:26:16 g5r204 pm: pm_terminate: forcibly terminating child
> 965 (/onstor/bin/tape-driver)
> Jun 26 14:26:16 g5r204 pm: pm_terminate: child 963 (/onstor/bin/ipmd)
> terminated
> Jun 26 14:26:16 g5r204 pm: pm_terminate: child 962 (/onstor/bin/spm)
> terminated
> Jun 26 14:26:16 g5r204 pm: pm_terminate: child 961 (/onstor/bin/ea)
> terminated
> Jun 26 14:26:16 g5r204 pm: pm_terminate: child 960
> (/onstor/bin/evm_cfgd) terminated
> Jun 26 14:26:17 g5r204 pm: pm_terminate: child 959
> (/onstor/bin/sdm_cfgd) terminated
> Jun 26 14:26:17 g5r204 pm: pm_terminate: child 956
> (/onstor/bin/cluster_server) terminated
> Jun 26 14:26:17 g5r204 pm: pm_terminate: child 955
> (/onstor/bin/chassisd) terminated
> Jun 26 14:26:17 g5r204 pm: pm_terminate: child 954
> (/onstor/bin/timekeeper) terminated
> Jun 26 14:26:17 g5r204 pm: pm_terminate: child 953
> (/onstor/bin/eventd) terminated
> Jun 26 14:26:17 g5r204 pm: pm_terminate: child 945 (/onstor/bin/ncmd)
> terminated
> Jun 26 14:26:18 g5r204 pm: pm_terminate: child 932 (/onstor/bin/elog)
> terminated
> Jun 26 14:29:10 g5r204 pm: /onstor/bin/elog: finished initialization.
> Jun 26 14:29:10 g5r204 : 0:0:cm:NOTICE: CHASSISD: Chassis Manager:
> Started
> Jun 26 14:29:11 g5r204 : 0:0:eventd:NOTICE: Process-EVENT Port: fp1.0
> State Up, Msg ''
> Jun 26 14:29:11 g5r204 : 0:0:eventd:NOTICE: Process-EVENT Port: fp1.1
> State Up, Msg ''
> Jun 26 14:29:11 g5r204 : 0:0:eventd:NOTICE: Process-EVENT Port: fp1.2
> State Down, Msg ''
> Jun 26 14:29:12 g5r204 : 0:0:eventd:NOTICE: Process-EVENT Port: fp1.3
> State Down, Msg ''
> Jun 26 14:29:12 g5r204 : 1:4:scsi:NOTICE: 9: ispfc: ispfc:sp2.1
> Fibrechannel link now online
> Jun 26 14:29:13 g5r204 : 0:0:eventd:NOTICE: Process-EVENT Port: sp2.1
> State Up, Msg ''
> Jun 26 14:29:13 g5r204 : 0:0:eventd:NOTICE: Process-EVENT CPU: Slot 1,
> CPU 0, State Up
> Jun 26 14:29:13 g5r204 : 0:0:eventd:NOTICE: Process-EVENT CPU: Slot 1,
> CPU 1, State Up
> Jun 26 14:29:13 g5r204 : 0:0:eventd:NOTICE: Process-EVENT CPU: Slot 1,
> CPU 2, State Up
> Jun 26 14:29:13 g5r204 : 0:0:eventd:NOTICE: Process-EVENT CPU: Slot 1,
> CPU 3, State Up
> Jun 26 14:29:13 g5r204 : 0:0:eventd:NOTICE: Process-EVENT CPU: Slot 1,
> CPU 4, State Up
> Jun 26 14:29:13 g5r204 : 0:0:eventd:NOTICE: Process-EVENT CPU: Slot 1,
> CPU 5, State Up
> Jun 26 14:29:17 g5r204 : 0:0:cluster2:NOTICE: Using 10.2.204.5 as my
> primary address
> Jun 26 14:29:17 g5r204 : 0:0:cluster2:NOTICE: ubik init with buff size
> 64
> Jun 26 14:29:17 g5r204 : 0:0:eventd:WARNING: Process-EVENT 0.0.0.0:
> Mgmt Port 0.0.0.0 PCC, State Up
> Jun 26 14:29:20 g5r204 : 0:0:cluster2:WARNING:
> ClusterCtrl_iUpdateState: PCC down pccname g5r204
> Jun 26 14:29:20 g5r204 : 0:0:eventd:WARNING: Process-EVENT 0.0.0.0:
> Mgmt Port 0.0.0.0 PCC, State Up
> Jun 26 14:29:23 g5r204 : 0:0:ea:NOTICE: ea_rcvRmcMsg: NCM session open
> failed
> Jun 26 14:29:24 g5r204 : 0:0:evm:NOTICE: evm_procOdCfg:
> LUN[ONStor_0a50a1_0a4f7100480d333d4514000000] volId[0x770000001dd]
> startBlk[0] numBlks[58595200]
> Jun 26 14:29:25 g5r204 : 0:0:eventd:NOTICE: Process-EVENT Volume: name
> 'g5r204-t1-vol2', Id 0x00000770000001dd, Event 'Create', State 'Down'
> Jun 26 14:29:25 g5r204 : 0:0:evm:NOTICE: evm_procOdCfg:
> LUN[ONStor_0a50a1_0a4f7100480d333d4520000000] volId[0x770000001df]
> startBlk[0] numBlks[58595200]
> Jun 26 14:29:25 g5r204 : 0:0:evm:NOTICE: evm_procOdCfg:
> LUN[ONStor_0a50a1_0a4f7100480d333d4501000000] volId[0x770000001e1]
> startBlk[0] numBlks[58595200]
> Jun 26 14:29:25 g5r204 : 0:0:eventd:NOTICE: Process-EVENT Volume: name
> 'g5r204-t1-vol4', Id 0x00000770000001df, Event 'Create', State 'Down'
> Jun 26 14:29:25 g5r204 : 0:0:evm:NOTICE: evm_procOdCfg:
> LUN[ONStor_0a50a1_0a4f7100480d333d4507000000] volId[0x770000001de]
> startBlk[0] numBlks[58595200]
> Jun 26 14:29:25 g5r204 : 0:0:evm:NOTICE: evm_procOdCfg:
> LUN[ONStor_0a50a1_0a4f7100480d333d4509000000] volId[0x770000001e0]
> startBlk[0] numBlks[58595200]
> Jun 26 14:29:25 g5r204 : 0:0:eventd:NOTICE: Process-EVENT Volume: name
> 'g5r204-t1-vol6', Id 0x00000770000001e1, Event 'Create', State 'Down'
> Jun 26 14:29:25 g5r204 : 0:0:evm:NOTICE: evm_procOdCfg:
> LUN[ONStor_0a50a1_0a4f7100480d333d4513000000] volId[0x770000001db]
> startBlk[0] numBlks[58595200]
> Jun 26 14:29:25 g5r204 : 0:0:eventd:NOTICE: Process-EVENT Volume: name
> 'g5r204-t1-vol3', Id 0x00000770000001de, Event 'Create', State 'Down'
> Jun 26 14:29:25 g5r204 : 0:0:evm:NOTICE: evm_procOdCfg:
> LUN[ONStor_0a50a1_0a4f7100480d333d450b000000] volId[0x770000001c8]
> startBlk[0] numBlks[58595200]
> Jun 26 14:29:25 g5r204 : 0:0:eventd:NOTICE: Process-EVENT Volume: name
> 'g5r204-t1-vol5', Id 0x00000770000001e0, Event 'Create', State 'Down'
> Jun 26 14:29:25 g5r204 : 0:0:eventd:NOTICE: Process-EVENT Volume: name
> 'dtest_v1', Id 0x00000770000001db, Event 'Create', State 'Down'
> 
> _____________________________________________
> From: Vikas Saini 
> Sent: Thursday, June 26, 2008 2:08 PM
> To: Damon Wong
> Subject: FW: 
> 
> Can you do some AT on this? Just some IO, nfsperftest, dd.
> 
> Thanks
> Vikas
> 
> 
> _____________________________________________
> From: Larry Scheer 
> Sent: Thursday, June 26, 2008 1:17 PM
> To: Vikas Saini; Raj Kumar
> Cc: Andy Sharp; Sandrine Boulanger
> Subject: 
> 
> Vikas, Raj,
>     As requested by Andy the nightly build from Thursday AM is
> available.
>  
> Top of Tree change list is:
> Change 29844 on 2008/06/25 by billn@billn-dev 'Protection code added
> to make s'
> 
> Source tree
> 		/n/Build-Trees/Nightly/R4.0.0.0-062608/nfx-tree
> 		/n/Build-Trees/Nightly/R4.0.0.0-062608/linux
> 
> Distribution images to use with system upgrade:
> 
> R3.3.0.0 images are here:
> Cheetah optimized:
> http://10.2.0.21/upgrade/NIGHTLY/R3.3.0.0-062608.tar.gz
> <http://10.2.0.21/upgrade/R3.3.0.0-052008.tar.gz> 
> Cheetah debug:
> http://10.2.0.21/upgrade/NIGHTLY/R3.3.0.0DBG-062608.tar.gz
> <http://10.2.0.21/upgrade/R3.3.0.0DBG-052008.tar.gz> 
> 
> Bobcat optimized:
> http://10.2.0.21/upgrade/NIGHTLY/R3.3.0.0BC-062608.tar.gz
> <http://10.2.0.21/upgrade/R3.3.0.0BC-051308.tar.gz> 
> Bobcat debug:
> http://10.2.0.21/upgrade/NIGHTLY/R3.3.0.0BCDBG-062608.tar.gz
> <http://10.2.0.21/upgrade/R3.3.0.0BCDBG-042908.tar.gz> 
> 
> R4.0.0.0 images are here
> Cougar debug:
> http://10.2.0.21/upgrade/NIGHTLY/R4.0.0.0CGDBG-062608.tar.gz
> <http://10.2.0.21/upgrade/R4.0.0.0CGDBG-042908.tar.gz> 
> Cougar optimized:
> http://10.2.0.21/upgrade/NIGHTLY/R4.0.0.0CG-062608.tar.gz
> <http://10.2.0.21/upgrade/R4.0.0.0CG-051308.tar.gz> 
> 
> 
> Larry
> 
