AF:
NF:0
PS:10
SRH:1
SFN:
DSR:
MID:
CFG:
PT:0
S:andy.sharp@lsi.com
RQ:
SSV:mhbs.lsil.com
NSV:
SSH:
R:<Maxim.Kozlovsky@lsi.com>
MAID:2
X-Sylpheed-Privacy-System:
X-Sylpheed-Sign:0
SCF:#mh/Mailbox/sent
RMID:#imap/LSI/INBOX	0	861DA0537719934884B3D30A2666FECC010E3D22D6@cosmail02.lsi.com
X-Sylpheed-End-Special-Headers: 1
Date: Wed, 24 Mar 2010 15:43:23 -0700
From: Andrew Sharp <andy.sharp@lsi.com>
To: "Kozlovsky, Maxim" <Maxim.Kozlovsky@lsi.com>
Subject: Re: kernel stacktrace
Message-ID: <20100324154323.76ba2a7b@ripper.onstor.net>
In-Reply-To: <861DA0537719934884B3D30A2666FECC010E3D22D6@cosmail02.lsi.com>
References: <861DA0537719934884B3D30A2666FECC010E3D2043@cosmail02.lsi.com>
	<20100322145532.6950acfa@ripper.onstor.net>
	<861DA0537719934884B3D30A2666FECC010E3D2086@cosmail02.lsi.com>
	<20100322172110.0550c53c@ripper.onstor.net>
	<861DA0537719934884B3D30A2666FECC010E3D226B@cosmail02.lsi.com>
	<20100323111014.44f376df@ripper.onstor.net>
	<861DA0537719934884B3D30A2666FECC010E3D22A1@cosmail02.lsi.com>
	<20100323111533.7002e8c0@ripper.onstor.net>
	<861DA0537719934884B3D30A2666FECC010E3D22AE@cosmail02.lsi.com>
	<20100323114336.47b05d17@ripper.onstor.net>
	<861DA0537719934884B3D30A2666FECC010E3D22D6@cosmail02.lsi.com>
Organization: LSI
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

I did as you requested, and I get the expected output:

tuxstor0:~# insmod /lib/modules/`uname -r`/kernel/drivers/testm.ko
testm: module license 'unspecified' taints kernel.
test_init: 
CPU 0 Unable to handle kernel paging request at virtual address 0000000000000000, epc == ffffffffc000402c, ra == ffffffffc000402c
Oops[#2]:
Cpu 0
$ 0   : 0000000000000000 0000000010001fe0 000000000000000f ffffffff83380000
$ 4   : ffffffff8337abf8 0000000010001fe0 ffffffffffffffff 0000000000002744
$ 8   : ffffffff83312a60 0000000000000000 ffffffffffff2744 ffffffff83380000
$12   : ffffffff83390000 ffffffff83380000 0000000000000000 a8000001fe288000
$16   : ffffffffc00044e0 ffffffff83310000 a8000001ff58d9b8 0000000000000013
$20   : 0000000000000013 ffffffffc00044e0 c0000000000005e0 c000000000000000
$24   : 0000000000000000 0000000000000020                                  
$28   : a8000001fe7b8000 a8000001fe7bbd50 ffffffff83050050 ffffffffc000402c
Hi    : 0000000000000000
Lo    : 0000000000000000
epc   : ffffffffc000402c init_module+0x24/0x38 [testm]     Tainted: P      
ra    : ffffffffc000402c init_module+0x24/0x38 [testm]
Status: 10001fe3    KX SX UX KERNEL EXL IE 
Cause : 1080800c
BadVA : 0000000000000000
PrId  : 01041100
Modules linked in: testm(P) acpu
Process insmod (pid: 16127, threadinfo=a8000001fe7b8000, task=a8000001ff4acd60)
Stack : ffffffff830519d4 ffffffff83052edc ffffffff832f7050 0000000000000000
        c000000000000534 0000000000000007 0000000000000000 0000000000000020
        0000000000000020 0000000000000000 0000000000000006 c000000000000a60
        c000000000000820 c000000000000a20 a8000001ff76cd60 a8000001ff241dc0
        c000000000000e90 0000000000000011 0000000000000000 0000000000000000
        0000000000000000 0000000000000000 0000000000000000 0000000000000000
        0000000000000000 0000000000000000 0000000000000000 0000000000000000
        0000000000000000 0000000000000000 0000000000000000 0000000000000000
        0000000000000000 000000007f89aedb 0000000000000f2a 0000000000004000
        0000000000442060 000000007f89aedb 0000000000442050 0000000000000003
        ...
Call Trace:
[<ffffffffc000402c>] init_module+0x24/0x38 [testm]
[<ffffffff830519d4>] sys_init_module+0x164/0x19c8
[<ffffffff83011cf4>] handle_sys+0x114/0x130


Code: ffbf0000  0040f809  64a54090 <ac000000> 0000102d  dfbf0000  03e00008  67bd0010  00002801 
primary crash already saved... crash #2 (Oops) will be ignored
Segmentation fault
tuxstor0:~# 
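
A note for reading the Code: line above: the bracketed word, <ac000000>, is the instruction at epc. Below is a minimal sketch of decoding it by hand, assuming the standard MIPS I-type layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate) and opcode 0x2b for sw. The names mips_itype, decode_itype and format_insn are made up for illustration; they are not from the kernel or from the test module.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Fields of a MIPS I-type instruction word: opcode | rs | rt | imm16. */
struct mips_itype {
    unsigned opcode;  /* bits 31..26 */
    unsigned rs;      /* bits 25..21, base register for loads/stores */
    unsigned rt;      /* bits 20..16, register being stored or loaded */
    int16_t  imm;     /* bits 15..0, signed byte offset */
};

#define MIPS_OP_SW 0x2b  /* store word */

/* Split a 32-bit instruction word into its I-type fields. */
static struct mips_itype decode_itype(uint32_t word)
{
    struct mips_itype f;

    f.opcode = (word >> 26) & 0x3f;
    f.rs     = (word >> 21) & 0x1f;
    f.rt     = (word >> 16) & 0x1f;
    f.imm    = (int16_t)(word & 0xffff);
    return f;
}

/*
 * Write a readable form of the instruction into buf.
 * Only sw is decoded; anything else is reported by opcode number.
 * Returns the number of characters snprintf() would have written.
 */
static int format_insn(uint32_t word, char *buf, size_t len)
{
    struct mips_itype f = decode_itype(word);

    assert(buf != NULL);
    if (f.opcode == MIPS_OP_SW)
        return snprintf(buf, len, "sw $%u, %d($%u)", f.rt, f.imm, f.rs);
    return snprintf(buf, len, "opcode 0x%02x (not decoded)", f.opcode);
}
```

Fed 0xac000000, format_insn() writes "sw $0, 0($0)": a store of register zero to address 0. That matches the `*(int *)0 = 0;` line in the test module and the BadVA of 0.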





On Tue, 23 Mar 2010 12:49:39 -0600 "Kozlovsky, Maxim"
<Maxim.Kozlovsky@lsi.com> wrote:

> The obvious difference between the two cases is that this module is
> built as a kernel module and my module is external. Must be something
> in the make files that is not done right for the external modules.
> Can we go through the denial phase faster?
> 
> -----Original Message-----
> From: Andrew Sharp [mailto:andy.sharp@lsi.com] 
> Sent: Tuesday, March 23, 2010 11:44 AM
> To: Kozlovsky, Maxim
> Subject: Re: kernel stacktrace
> 
> Here's what I get:
> 
> tuxstor0:~# modprobe acpu
> acpu_threadg_init: 
> CPU 0 Unable to handle kernel paging request at virtual address 0000000000000000, epc == ffffffffc0002024, ra == ffffffffc0002024
> Oops[#1]:
> Cpu 0
> $ 0   : 0000000000000000 0000000010001fe0 0000000000000017 0000000000000000
> $ 4   : ffffffff83312a60 0000000010001fe0 a80000000a446108 a8000001ff1a2110
> $ 8   : ffffffff83312a68 0000000000000000 ffffffffffff1dea ffffffff83380000
> $12   : ffffffff83390000 ffffffff83380000 0000000000000000 a8000001fda70000
> $16   : ffffffffc0000800 ffffffff83310000 a8000001ff499318 0000000000000016
> $20   : 0000000000000016 ffffffffc0000800 c000000000000800 c000000000000000
> $24   : 0000000000000000 a8000001fef18000
> $28   : a8000001fda70000 a8000001fda73d50 ffffffff83050050 ffffffffc0002024
> Hi    : 000000000000007f
> Lo    : be76c8b4395810fa
> epc   : ffffffffc0002024 acpu_threadg_init+0x24/0x38 [acpu]     Not tainted
> ra    : ffffffffc0002024 acpu_threadg_init+0x24/0x38 [acpu]
> Status: 10001fe3    KX SX UX KERNEL EXL IE
> Cause : 1080800c
> BadVA : 0000000000000000
> PrId  : 01041100
> Modules linked in: acpu
> Process modprobe (pid: 13250, threadinfo=a8000001fda70000, task=a8000001feacf3a8)
> Stack : ffffffff830519d4 ffffffff83052edc ffffffff832f7340 0000000000000000
>         c000000000000734 000000000000000a 0000000000000000 000000000000002d
>         000000000000002d 0000000000000000 0000000000000009 c000000000000d40
>         c000000000000b00 c000000000000d00 0000000000000000 a8000001ffac6378
>         c000000000001428 0000000000000014 0000000000000000 0000000000000000
>         0000000000000000 0000000000000000 0000000000000000 0000000000000000
>         0000000000000000 0000000000000000 0000000000000000 0000000000000000
>         0000000000000000 0000000000000000 0000000000000000 0000000000000000
>         0000000000000000 0000000000449d94 0000000000000000 000000007fa27b34
>         0000000000000000 000000000044b9a0 0000000000449d88 000000002aac0000
>         ...
> Call Trace:
> [<ffffffffc0002024>] acpu_threadg_init+0x24/0x38 [acpu]
> [<ffffffff830519d4>] sys_init_module+0x164/0x19c8
> [<ffffffff83011cf4>] handle_sys+0x114/0x130
> 
> 
> Code: ffbf0000  0040f809  64a500f0 <ac000000> 0000102d  dfbf0000  03e00008  67bd0010  00000009
> Crashdump saved to prom
> Segmentation fault
> tuxstor0:~# 
> 
> 
> 
> 
> 
> 
> On Tue, 23 Mar 2010 12:26:33 -0600 "Kozlovsky, Maxim"
> <Maxim.Kozlovsky@lsi.com> wrote:
> 
> > Try this one. Loading this module produces a bad stack trace. Is this
> > stack hosed as well?
> > 
> > #include <linux/module.h>
> > 
> > int
> > test_init(void)
> > {
> >     printk("%s: \n", __func__);
> >     *(int *)0 = 0;
> >     return 0;
> > }
> > 
> > void
> > test_exit(void)
> > {
> > }
> > 
> > module_init(test_init);
> > module_exit(test_exit);
> > 
> > tuxrx0:~# modprobe eee-test
> > modprobe eee-test
> > Call Trace:
> > [<ffffffffc02e102c>] $LCFI1+0x8/0x1c [eee_test]
> > 
> > 
> > -----Original Message-----
> > From: Andrew Sharp [mailto:andy.sharp@lsi.com] 
> > Sent: Tuesday, March 23, 2010 11:16 AM
> > To: Kozlovsky, Maxim
> > Subject: Re: kernel stacktrace
> > 
> > It works all the time when the stack isn't hosed.
> > 
> > On Tue, 23 Mar 2010 12:13:05 -0600 "Kozlovsky, Maxim"
> > <Maxim.Kozlovsky@lsi.com> wrote:
> > 
> > > This will make it easier for you to fix the problem, it means the
> > > code is not completely broken, sometimes it works.
> > > 
> > > -----Original Message-----
> > > From: Andrew Sharp [mailto:andy.sharp@lsi.com] 
> > > Sent: Tuesday, March 23, 2010 11:10 AM
> > > To: Kozlovsky, Maxim
> > > Subject: Re: kernel stacktrace
> > > 
> > > The stack trace works just fine any other time.
> > > 
> > > On Tue, 23 Mar 2010 11:42:16 -0600 "Kozlovsky, Maxim"
> > > <Maxim.Kozlovsky@lsi.com> wrote:
> > > 
> > > > The stack is not hosed, the stack trace is. This example does
> > > > not do anything with the threads, it crashes in the context of
> > > > modprobe process. You need to add a task to the schedule for
> > > > yourself to fix the stack trace, it should come at higher
> > > > priority than 8-way.
> > > > 
> > > > -----Original Message-----
> > > > From: Andrew Sharp [mailto:andy.sharp@lsi.com] 
> > > > Sent: Monday, March 22, 2010 5:21 PM
> > > > To: Kozlovsky, Maxim
> > > > Subject: Re: kernel stacktrace
> > > > 
> > > > On Mon, 22 Mar 2010 17:01:17 -0600 "Kozlovsky, Maxim"
> > > > <Maxim.Kozlovsky@lsi.com> wrote:
> > > > 
> > > > > Any ideas besides "your stack is hosed"? 
> > > > > 
> > > > > 
> > > > > Here is another bad stack trace, after modifying the
> > > > > code/sm-tests/sm-test.c to crash in mytest_create():
> > > > > 
> > > > >         ...
> > > > > Call Trace:
> > > > > [<ffffffffc03d10c0>] $LVL8+0x0/0x10 [sm_test]
> > > > > [<ffffffffc03d10c0>] $LVL8+0x0/0x10 [sm_test]
> > > > 
> > > > C'mon, that doesn't look hosed to you?  I don't know the full
> > > > context of what you're up to, but it looks like you're mucking
> > > > about in the esm_threads stuff.  Could be hosing the stack.
> > > > Anyway, this example does a page fault right away, and the stack
> > > > trace looks suspiciously like the other one, suggesting it is
> > > > crashing somewhat immediately as well.
> > > > 
> > > > 
> > > > > int32
> > > > > mytest_create(esm_handle_t handle, int32	next_state,
> > > > > esm_event_t	this_event, void *user_info)
> > > > > {
> > > > >     printk("%s:\n", __func__);
> > > > >     *(int *)0 = 0;
> > > > >     return next_state;
> > > > > }
> > > > > 
> > > > > Clearly the stack is not hosed here. Full output below:
> > > > > 
> > > > > modprobe sm-test
> > > > > neteee2: module license 'unspecified' taints kernel.
> > > > > eee_init: ENTRY
> > > > > eee_initIPCQueues: ENTRY my_index 0
> > > > > eee_initIPCQueues: EXIT; num_ipc_queues 6
> > > > > eee_initFixedFwdQueues: Initialize eee.rcv_queue[]
> > > > > eee_initFixedFwdQueues: EXIT
> > > > > eee_initFWDQueues: Initialize eee.fwd_queue[]
> > > > > > eee_initFwdQueues() eee.fwd_queue[0] fwd_start 0xa8000001ffa8f950, end 0xa8000001ffa8fa50
> > > > > > eee_initFwdQueues() eee.fwd_queue[0] fwd_head 0xa8000001ffa8f950, tail 0xa8000001ff1c3a78
> > > > > > eee_initFwdQueues() eee.fwd_queue[1] fwd_start 0xa8000001fe873000, end 0xa8000001fe874000
> > > > > > eee_initFwdQueues() eee.fwd_queue[1] fwd_head 0xa8000001fe873000, tail 0xa8000001ff1c3f48
> > > > > > eee_initFwdQueues() eee.fwd_queue[2] fwd_start 0xa8000001fdd52000, end 0xa8000001fdd53000
> > > > > > eee_initFwdQueues() eee.fwd_queue[2] fwd_head 0xa8000001fdd52000, tail 0xa8000001ff1c3960
> > > > > > eee_initFWDQueues EXIT
> > > > > > eee_app_init: ENTRY/EXIT
> > > > > > eee_thread: enter
> > > > > > eee_init: EXIT
> > > > > > eee_thread: enter
> > > > > > mytest_create:
> > > > > > CPU 2 Unable to handle kernel paging request at virtual address 0000000000000000, epc == ffffffffc03d10c0, ra == ffffffffc03d10c0
> > > > > > Oops[#1]:
> > > > > > Cpu 2
> > > > > > $ 0   : 0000000000000000 0000000010001fe0 0000000000000012 0000000000000000
> > > > > > $ 4   : ffffffff832f2a60 0000000010001fe0 0000000000020000 a8000001ffab2d40
> > > > > > $ 8   : ffffffff832f2a68 0000000000000027 0000000000000001 0000000000000080
> > > > > > $12   : 00000000000008fc a8000000870c2000 ffffffff83330000 a8000001ff478000
> > > > > > $16   : 0000000000000000 ffffffffc03d0000 ffffffffc03d7b18 0000000000000001
> > > > > > $20   : 0000000000000000 ffffffffc03d7b08 ffffffffc03d7e20 c000000000000000
> > > > > > $24   : 0000000000000001 a8000001ff280000
> > > > > > $28   : a8000001ff478000 a8000001ff47bca0 ffffffff83050190 ffffffffc03d10c0
> > > > > > Hi    : 00000000000000a0
> > > > > > Lo    : 00000000000000be
> > > > > > epc   : ffffffffc03d10c0 $LVL8+0x0/0x10 [sm_test]     Tainted: P
> > > > > > ra    : ffffffffc03d10c0 $LVL8+0x0/0x10 [sm_test]
> > > > > > Status: 10001fe3    KX SX UX KERNEL EXL IE
> > > > > > Cause : 1080800c
> > > > > > BadVA : 0000000000000000
> > > > > > PrId  : 05041100
> > > > > > Modules linked in: sm_test(P) esm_threads(P) elog_mod(P) neteee2(P) ipv6
> > > > > > Process modprobe (pid: 1075, threadinfo=a8000001ff478000, task=a80000000b15edc0)
> > > > > > Stack : 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > > > >         a8000001fd36a758 ffffffffc03b3fb0 0000000000000000 0000000000000020
> > > > > >         0000000000000000 0000000000000000 0000000000000000 a8000001ff47bd40
> > > > > >         ffffffffc02e1398 ffffffffc03d1630 a8000001fee43640 0000000000000021
> > > > > >         0000000000000021 ffffffffc03d7b60 c000000000006878 ffffffffc03d11b4
> > > > > >         ffffffffc03d7b60 ffffffff832f0000 ffffffff83051b14 ffffffff83053020
> > > > > >         ffffffffc02ece38 ffffffffc03533a0 c000000000006748 0000000000000007
> > > > > >         0000000000000000 0000000000000345 0000000000000345 0000000000000000
> > > > > >         0000000000000006 c000000000007078 c000000000006af8 c000000000007038
> > > > > >         0000000000000000 a8000001ff1c3d18 c0000000000112d0 000000000000001f
> > > > > >         ...
> > > > > > Call Trace:
> > > > > > [<ffffffffc03d10c0>] $LVL8+0x0/0x10 [sm_test]
> > > > > > [<ffffffffc03d10c0>] $LVL8+0x0/0x10 [sm_test]
> > > > > > 
> > > > > > 
> > > > > > Code: ffa70008  0040f809  ffa80010 <ac000000> 0200102d  dfbf0028  dfb00020  03e00008  67bd0030
> > > > > > Crashdump saved to prom
> > > > > Segmentation fault
> > > > > tuxrx0:~#
> > > > > 
> > > > > 
> > > > > -----Original Message-----
> > > > > From: Kozlovsky, Maxim 
> > > > > Sent: Monday, March 22, 2010 2:56 PM
> > > > > To: Sharp, Andy
> > > > > Subject: RE: kernel stacktrace
> > > > > 
> > > > > 
> > > > > 
> > > > > -----Original Message-----
> > > > > From: Andrew Sharp [mailto:andy.sharp@lsi.com] 
> > > > > Sent: Monday, March 22, 2010 2:56 PM
> > > > > To: Kozlovsky, Maxim
> > > > > Subject: Re: kernel stacktrace
> > > > > 
> > > > > On Mon, 22 Mar 2010 15:49:16 -0600 "Kozlovsky, Maxim"
> > > > > <Maxim.Kozlovsky@lsi.com> wrote:
> > > > > 
> > > > > > Hello,
> > > > > > 
> > > > > > > Why am I getting such an uninformative stack trace from a kernel
> > > > > > > crash?
> > > > > > 
> > > > > > Call Trace:
> > > > > > [<ffffffffc0bba578>] $LVL86+0x8/0x10 [scsi_mod]
> > > > > > [<ffffffffc0bba568>] $LVL85+0x0/0x8 [scsi_mod]
> > > > > > 
> > > > > > Is there something I can do to make it better?
> > > > > > 
> > > > > > Max
> > > > > > 
> > > > > 
> > > > > Normally you shouldn't get such a short stack trace, hence I
> > > > > would say your stack is hosed. 
> > > > > 
> > > > > [MK] 
> > > > > The stack is not hosed. 
