Date: Fri, 9 Apr 2010 13:38:51 -0700
From: Andrew Sharp <andy.sharp@lsi.com>
To: "Kozlovsky, Maxim" <Maxim.Kozlovsky@lsi.com>
Subject: Re: kernel stacktrace
Message-ID: <20100409133851.62431bc2@ripper.onstor.net>
In-Reply-To: <861DA0537719934884B3D30A2666FECC010E3D27B9@cosmail02.lsi.com>
References: <861DA0537719934884B3D30A2666FECC010E3D2043@cosmail02.lsi.com>
	<20100322145532.6950acfa@ripper.onstor.net>
	<861DA0537719934884B3D30A2666FECC010E3D2086@cosmail02.lsi.com>
	<20100322172110.0550c53c@ripper.onstor.net>
	<861DA0537719934884B3D30A2666FECC010E3D226B@cosmail02.lsi.com>
	<20100323111014.44f376df@ripper.onstor.net>
	<861DA0537719934884B3D30A2666FECC010E3D22A1@cosmail02.lsi.com>
	<20100323111533.7002e8c0@ripper.onstor.net>
	<861DA0537719934884B3D30A2666FECC010E3D22AE@cosmail02.lsi.com>
	<20100323114336.47b05d17@ripper.onstor.net>
	<861DA0537719934884B3D30A2666FECC010E3D22D6@cosmail02.lsi.com>
	<20100324154323.76ba2a7b@ripper.onstor.net>
	<861DA0537719934884B3D30A2666FECC010E3D27B9@cosmail02.lsi.com>
Organization: LSI
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable

Ah, but I did, simulating our makefiles exactly, including installing
the module, with the same results.

From your test:

> > > > > Modules linked in: sm_test(P) esm_threads(P) elog_mod(P)


Not only are these modules linked in, but the log also shows output
from eee_thread, so it is being called and running.  Problem found?

Max owes me a case of beer!  Yay!
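
For what it's worth, here is a rough sketch (Python, purely illustrative)
of how one could pick the Call Trace frames out of a pasted oops and flag
the ones that resolve to names starting with $L, like $LVL8 or $LCFI1 in
your traces.  It assumes those names are compiler-generated local labels
rather than real function names.  The helper name parse_frames and the
frame regex are hypothetical and only cover the frame format seen in this
thread.

```python
import re

# Matches one frame as it appears in this thread, e.g.
#   [<ffffffffc03d10c0>] $LVL8+0x0/0x10 [sm_test]
# The trailing [module] part is optional (absent for built-in symbols).
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\][ \t]+"
    r"(?P<sym>\S+?)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:[ \t]+\[(?P<mod>\w+)\])?"
)


def parse_frames(text):
    """Return one dict per call-trace frame found in text."""
    frames = []
    for m in FRAME_RE.finditer(text):
        sym = m.group("sym")
        frames.append({
            "addr": int(m.group("addr"), 16),
            "symbol": sym,
            "offset": int(m.group("off"), 16),
            "size": int(m.group("size"), 16),
            "module": m.group("mod"),  # None for built-in symbols
            # Assumption: a leading "$L" marks a compiler-local label.
            "local_label": sym.startswith("$L"),
        })
    return frames


# Frames copied from the traces quoted in this thread.
SAMPLE = """\
[<ffffffffc03d10c0>] $LVL8+0x0/0x10 [sm_test]
[<ffffffffc000402c>] init_module+0x24/0x38 [testm]
[<ffffffff830519d4>] sys_init_module+0x164/0x19c8
"""

if __name__ == "__main__":
    for f in parse_frames(SAMPLE):
        flag = "SUSPECT" if f["local_label"] else "ok"
        print(f"{f['addr']:016x} {f['symbol']:<16} {f['module'] or '-':<8} {flag}")
```

Feeding it a whole console log works too, since anything that does not
look like a frame is skipped.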



On Wed, 24 Mar 2010 16:45:42 -0600 "Kozlovsky, Maxim"
<Maxim.Kozlovsky@lsi.com> wrote:

> No you did not. Build and install a module with our makefiles.
>
> -----Original Message-----
> From: Andrew Sharp [mailto:andy.sharp@lsi.com]
> Sent: Wednesday, March 24, 2010 3:43 PM
> To: Kozlovsky, Maxim
> Subject: Re: kernel stacktrace
>
> I did as you requested, and I get the expected output:
>
> tuxstor0:~# insmod /lib/modules/`uname -r`/kernel/drivers/testm.ko
> testm: module license 'unspecified' taints kernel.
> test_init:
> CPU 0 Unable to handle kernel paging request at virtual address
> 0000000000000000, epc == ffffffffc000402c, ra == ffffffffc000402c
> Oops[#2]:
> Cpu 0
> $ 0   : 0000000000000000 0000000010001fe0 000000000000000f ffffffff83380000
> $ 4   : ffffffff8337abf8 0000000010001fe0 ffffffffffffffff 0000000000002744
> $ 8   : ffffffff83312a60 0000000000000000 ffffffffffff2744 ffffffff83380000
> $12   : ffffffff83390000 ffffffff83380000 0000000000000000 a8000001fe288000
> $16   : ffffffffc00044e0 ffffffff83310000 a8000001ff58d9b8 0000000000000013
> $20   : 0000000000000013 ffffffffc00044e0 c0000000000005e0 c000000000000000
> $24   : 0000000000000000 0000000000000020
> $28   : a8000001fe7b8000 a8000001fe7bbd50 ffffffff83050050 ffffffffc000402c
> Hi    : 0000000000000000
> Lo    : 0000000000000000
> epc   : ffffffffc000402c init_module+0x24/0x38 [testm]     Tainted: P
> ra    : ffffffffc000402c init_module+0x24/0x38 [testm]
> Status: 10001fe3    KX SX UX KERNEL EXL IE
> Cause : 1080800c
> BadVA : 0000000000000000
> PrId  : 01041100
> Modules linked in: testm(P) acpu
> Process insmod (pid: 16127, threadinfo=a8000001fe7b8000, task=a8000001ff4acd60)
> Stack : ffffffff830519d4 ffffffff83052edc ffffffff832f7050 0000000000000000
>         c000000000000534 0000000000000007 0000000000000000 0000000000000020
>         0000000000000020 0000000000000000 0000000000000006 c000000000000a60
>         c000000000000820 c000000000000a20 a8000001ff76cd60 a8000001ff241dc0
>         c000000000000e90 0000000000000011 0000000000000000 0000000000000000
>         0000000000000000 0000000000000000 0000000000000000 0000000000000000
>         0000000000000000 0000000000000000 0000000000000000 0000000000000000
>         0000000000000000 0000000000000000 0000000000000000 0000000000000000
>         0000000000000000 000000007f89aedb 0000000000000f2a 0000000000004000
>         0000000000442060 000000007f89aedb 0000000000442050 0000000000000003 ...
> Call Trace:
> [<ffffffffc000402c>] init_module+0x24/0x38 [testm]
> [<ffffffff830519d4>] sys_init_module+0x164/0x19c8
> [<ffffffff83011cf4>] handle_sys+0x114/0x130
>
>
> Code: ffbf0000  0040f809  64a54090 <ac000000> 0000102d  dfbf0000  03e00008  67bd0010  00002801
> primary crash already saved... crash #2 (Oops) will be ignored
> Segmentation fault
> tuxstor0:~#
>
>
>
>
>
> On Tue, 23 Mar 2010 12:49:39 -0600 "Kozlovsky, Maxim"
> <Maxim.Kozlovsky@lsi.com> wrote:
>
> > The obvious difference between the two cases is that this module is
> > built as a kernel module and my module is external. Must be
> > something in the make files that is not done right for the external
> > modules. Can we go through the denial phase faster?
> >
> > -----Original Message-----
> > From: Andrew Sharp [mailto:andy.sharp@lsi.com]
> > Sent: Tuesday, March 23, 2010 11:44 AM
> > To: Kozlovsky, Maxim
> > Subject: Re: kernel stacktrace
> >
> > Here's what I get:
> >
> > tuxstor0:~# modprobe acpu
> > acpu_threadg_init:
> > CPU 0 Unable to handle kernel paging request at virtual address
> > 0000000000000000, epc == ffffffffc0002024, ra == ffffffffc0002024
> > Oops[#1]:
> > Cpu 0
> > $ 0   : 0000000000000000 0000000010001fe0 0000000000000017 0000000000000000
> > $ 4   : ffffffff83312a60 0000000010001fe0 a80000000a446108 a8000001ff1a2110
> > $ 8   : ffffffff83312a68 0000000000000000 ffffffffffff1dea ffffffff83380000
> > $12   : ffffffff83390000 ffffffff83380000 0000000000000000 a8000001fda70000
> > $16   : ffffffffc0000800 ffffffff83310000 a8000001ff499318 0000000000000016
> > $20   : 0000000000000016 ffffffffc0000800 c000000000000800 c000000000000000
> > $24   : 0000000000000000 a8000001fef18000
> > $28   : a8000001fda70000 a8000001fda73d50 ffffffff83050050 ffffffffc0002024
> > Hi    : 000000000000007f
> > Lo    : be76c8b4395810fa
> > epc   : ffffffffc0002024 acpu_threadg_init+0x24/0x38 [acpu]     Not tainted
> > ra    : ffffffffc0002024 acpu_threadg_init+0x24/0x38 [acpu]
> > Status: 10001fe3    KX SX UX KERNEL EXL IE
> > Cause : 1080800c
> > BadVA : 0000000000000000
> > PrId  : 01041100
> > Modules linked in: acpu
> > Process modprobe (pid: 13250, threadinfo=a8000001fda70000, task=a8000001feacf3a8)
> > Stack : ffffffff830519d4 ffffffff83052edc ffffffff832f7340 0000000000000000
> >         c000000000000734 000000000000000a 0000000000000000 000000000000002d
> >         000000000000002d 0000000000000000 0000000000000009 c000000000000d40
> >         c000000000000b00 c000000000000d00 0000000000000000 a8000001ffac6378
> >         c000000000001428 0000000000000014 0000000000000000 0000000000000000
> >         0000000000000000 0000000000000000 0000000000000000 0000000000000000
> >         0000000000000000 0000000000000000 0000000000000000 0000000000000000
> >         0000000000000000 0000000000000000 0000000000000000 0000000000000000
> >         0000000000000000 0000000000449d94 0000000000000000 000000007fa27b34
> >         0000000000000000 000000000044b9a0 0000000000449d88 000000002aac0000 ...
> > Call Trace:
> > [<ffffffffc0002024>] acpu_threadg_init+0x24/0x38 [acpu]
> > [<ffffffff830519d4>] sys_init_module+0x164/0x19c8
> > [<ffffffff83011cf4>] handle_sys+0x114/0x130
> >
> >
> > Code: ffbf0000  0040f809  64a500f0 <ac000000> 0000102d  dfbf0000  03e00008  67bd0010  00000009
> > Crashdump saved to prom
> > Segmentation fault
> > tuxstor0:~#
> >
> >
> >
> >
> >
> >
> > On Tue, 23 Mar 2010 12:26:33 -0600 "Kozlovsky, Maxim"
> > <Maxim.Kozlovsky@lsi.com> wrote:
> >
> > > Try this one. Loading this module produces bad stack trace. Is
> > > this stack hosed as well?
> > >
> > > #include <linux/module.h>
> > >
> > > int
> > > test_init(void)
> > > {
> > >     printk("%s: \n", __func__);
> > > >     *(int *)0 = 0;
> > >     return 0;
> > > }
> > >
> > > void
> > > test_exit(void)
> > > {
> > > }
> > >
> > > module_init(test_init);
> > > module_exit(test_exit);
> > >
> > > tuxrx0:~# modprobe eee-test
> > > modprobe eee-test
> > > Call Trace:
> > > [<ffffffffc02e102c>] $LCFI1+0x8/0x1c [eee_test]
> > >
> > >
> > > -----Original Message-----
> > > From: Andrew Sharp [mailto:andy.sharp@lsi.com]
> > > Sent: Tuesday, March 23, 2010 11:16 AM
> > > To: Kozlovsky, Maxim
> > > Subject: Re: kernel stacktrace
> > >
> > > It works all the time when the stack isn't hosed.
> > >
> > > On Tue, 23 Mar 2010 12:13:05 -0600 "Kozlovsky, Maxim"
> > > <Maxim.Kozlovsky@lsi.com> wrote:
> > >
> > > > This will make it easier for you to fix the problem, it means
> > > > the code is not completely broken, sometimes it works.
> > > >
> > > > -----Original Message-----
> > > > From: Andrew Sharp [mailto:andy.sharp@lsi.com]
> > > > Sent: Tuesday, March 23, 2010 11:10 AM
> > > > To: Kozlovsky, Maxim
> > > > Subject: Re: kernel stacktrace
> > > >
> > > > The stack trace works just fine any other time.
> > > >
> > > > On Tue, 23 Mar 2010 11:42:16 -0600 "Kozlovsky, Maxim"
> > > > <Maxim.Kozlovsky@lsi.com> wrote:
> > > >
> > > > > The stack is not hosed, the stack trace is. This example does
> > > > > not do anything with the threads, it crashes in the context of
> > > > > modprobe process. You need to add a task to the schedule for
> > > > > yourself to fix the stack trace, it should come at higher
> > > > > priority than 8-way.
> > > > >
> > > > > -----Original Message-----
> > > > > From: Andrew Sharp [mailto:andy.sharp@lsi.com]
> > > > > Sent: Monday, March 22, 2010 5:21 PM
> > > > > To: Kozlovsky, Maxim
> > > > > Subject: Re: kernel stacktrace
> > > > >
> > > > > On Mon, 22 Mar 2010 17:01:17 -0600 "Kozlovsky, Maxim"
> > > > > <Maxim.Kozlovsky@lsi.com> wrote:
> > > > >
> > > > > > Any ideas besides "your stack is hosed"?
> > > > > >
> > > > > >
> > > > > > Here is another bad stack trace, after modifying the
> > > > > > code/sm-tests/sm-test.c to crash in mytest_create():
> > > > > >
> > > > > >         ...
> > > > > > Call Trace:
> > > > > > [<ffffffffc03d10c0>] $LVL8+0x0/0x10 [sm_test]
> > > > > > [<ffffffffc03d10c0>] $LVL8+0x0/0x10 [sm_test]
> > > > >
> > > > > C'mon, that doesn't look hosed to you?  I don't know the full
> > > > > context of what you're up to, but it looks like you're mucking
> > > > > about in the esm_threads stuff.  Could be hosing the stack.
> > > > > Anyway, this example does a page fault right away, and the
> > > > > > stack trace looks suspiciously like the other one, suggesting
> > > > > it is crashing somewhat immediately as well.
> > > > >
> > > > >
> > > > > > int32
> > > > > > mytest_create(esm_handle_t handle, int32      next_state,
> > > > > > esm_event_t   this_event, void *user_info)
> > > > > > {
> > > > > >     printk("%s:\n", __func__);
> > > > > > >     *(int *)0 = 0;
> > > > > >     return next_state;
> > > > > > }
> > > > > >
> > > > > > Clearly the stack is not hosed here. Full output below:
> > > > > >
> > > > > > modprobe sm-test
> > > > > > neteee2: module license 'unspecified' taints kernel.
> > > > > > eee_init: ENTRY
> > > > > > eee_initIPCQueues: ENTRY my_index 0
> > > > > > eee_initIPCQueues: EXIT; num_ipc_queues 6
> > > > > > eee_initFixedFwdQueues: Initialize eee.rcv_queue[]
> > > > > > eee_initFixedFwdQueues: EXIT
> > > > > > eee_initFWDQueues: Initialize eee.fwd_queue[]
> > > > > > eee_initFwdQueues() eee.fwd_queue[0] fwd_start
> > > > > > 0xa8000001ffa8f950, end 0xa8000001ffa8fa50
> > > > > > eee_initFwdQueues() eee.fwd_queue[0] fwd_head
> > > > > > 0xa8000001ffa8f950, tail 0xa8000001ff1c3a78
> > > > > > eee_initFwdQueues() eee.fwd_queue[1] fwd_start
> > > > > > 0xa8000001fe873000, end 0xa8000001fe874000
> > > > > > eee_initFwdQueues() eee.fwd_queue[1] fwd_head
> > > > > > 0xa8000001fe873000, tail 0xa8000001ff1c3f48
> > > > > > eee_initFwdQueues() eee.fwd_queue[2] fwd_start
> > > > > > 0xa8000001fdd52000, end 0xa8000001fdd53000
> > > > > > eee_initFwdQueues() eee.fwd_queue[2] fwd_head
> > > > > > 0xa8000001fdd52000, tail 0xa8000001ff1c3960
> > > > > > > eee_initFWDQueues EXIT
> > > > > > > eee_app_init: ENTRY/EXIT
> > > > > > > eee_thread: enter
> > > > > > > eee_init: EXIT
> > > > > > > eee_thread: enter
> > > > > > > mytest_create:
> > > > > > > CPU 2 Unable to handle kernel paging request at virtual address
> > > > > > > 0000000000000000, epc == ffffffffc03d10c0, ra == ffffffffc03d10c0
> > > > > > > Oops[#1]:
> > > > > > > Cpu 2
> > > > > > > $ 0   : 0000000000000000 0000000010001fe0 0000000000000012 0000000000000000
> > > > > > > $ 4   : ffffffff832f2a60 0000000010001fe0 0000000000020000 a8000001ffab2d40
> > > > > > > $ 8   : ffffffff832f2a68 0000000000000027 0000000000000001 0000000000000080
> > > > > > > $12   : 00000000000008fc a8000000870c2000 ffffffff83330000 a8000001ff478000
> > > > > > > $16   : 0000000000000000 ffffffffc03d0000 ffffffffc03d7b18 0000000000000001
> > > > > > > $20   : 0000000000000000 ffffffffc03d7b08 ffffffffc03d7e20 c000000000000000
> > > > > > > $24   : 0000000000000001 a8000001ff280000
> > > > > > > $28   : a8000001ff478000 a8000001ff47bca0 ffffffff83050190 ffffffffc03d10c0
> > > > > > > Hi    : 00000000000000a0
> > > > > > > Lo    : 00000000000000be
> > > > > > > epc   : ffffffffc03d10c0 $LVL8+0x0/0x10 [sm_test]     Tainted: P
> > > > > > > ra    : ffffffffc03d10c0 $LVL8+0x0/0x10 [sm_test]
> > > > > > > Status: 10001fe3    KX SX UX KERNEL EXL IE
> > > > > > > Cause : 1080800c
> > > > > > > BadVA : 0000000000000000
> > > > > > > PrId  : 05041100
> > > > > > > Modules linked in: sm_test(P) esm_threads(P) elog_mod(P) neteee2(P) ipv6
> > > > > > > Process modprobe (pid: 1075, threadinfo=a8000001ff478000, task=a80000000b15edc0)
> > > > > > > Stack : 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > > > > >         a8000001fd36a758 ffffffffc03b3fb0 0000000000000000 0000000000000020
> > > > > > >         0000000000000000 0000000000000000 0000000000000000 a8000001ff47bd40
> > > > > > >         ffffffffc02e1398 ffffffffc03d1630 a8000001fee43640 0000000000000021
> > > > > > >         0000000000000021 ffffffffc03d7b60 c000000000006878 ffffffffc03d11b4
> > > > > > >         ffffffffc03d7b60 ffffffff832f0000 ffffffff83051b14 ffffffff83053020
> > > > > > >         ffffffffc02ece38 ffffffffc03533a0 c000000000006748 0000000000000007
> > > > > > >         0000000000000000 0000000000000345 0000000000000345 0000000000000000
> > > > > > >         0000000000000006 c000000000007078 c000000000006af8 c000000000007038
> > > > > > >         0000000000000000 a8000001ff1c3d18 c0000000000112d0 000000000000001f ...
> > > > > > > Call Trace:
> > > > > > > [<ffffffffc03d10c0>] $LVL8+0x0/0x10 [sm_test]
> > > > > > > [<ffffffffc03d10c0>] $LVL8+0x0/0x10 [sm_test]
> > > > > > >
> > > > > > >
> > > > > > > Code: ffa70008  0040f809  ffa80010 <ac000000> 0200102d  dfbf0028  dfb00020  03e00008  67bd0030
> > > > > > > Crashdump saved to prom
> > > > > > > Segmentation fault
> > > > > > tuxrx0:~#
> > > > > >
> > > > > >
> > > > > > -----Original Message-----
> > > > > > From: Kozlovsky, Maxim
> > > > > > Sent: Monday, March 22, 2010 2:56 PM
> > > > > > To: Sharp, Andy
> > > > > > Subject: RE: kernel stacktrace
> > > > > >
> > > > > >
> > > > > >
> > > > > > -----Original Message-----
> > > > > > From: Andrew Sharp [mailto:andy.sharp@lsi.com]
> > > > > > Sent: Monday, March 22, 2010 2:56 PM
> > > > > > To: Kozlovsky, Maxim
> > > > > > Subject: Re: kernel stacktrace
> > > > > >
> > > > > > On Mon, 22 Mar 2010 15:49:16 -0600 "Kozlovsky, Maxim"
> > > > > > <Maxim.Kozlovsky@lsi.com> wrote:
> > > > > >
> > > > > > > Hello,
> > > > > > >
> > > > > > > > Why am I getting such an uninformative stack trace from a
> > > > > > > > kernel crash?
> > > > > > >
> > > > > > > Call Trace:
> > > > > > > [<ffffffffc0bba578>] $LVL86+0x8/0x10 [scsi_mod]
> > > > > > > [<ffffffffc0bba568>] $LVL85+0x0/0x8 [scsi_mod]
> > > > > > >
> > > > > > > Is there something I can do to make it better?
> > > > > > >
> > > > > > > Max
> > > > > > >
> > > > > >
> > > > > > Normally you shouldn't get such a short stack trace, hence I
> > > > > > would say your stack is hosed.
> > > > > >
> > > > > > [MK]
> > > > > > The stack is not hosed.
