X-MimeOLE: Produced By Microsoft Exchange V6.5
Received: by onstor-exch02.onstor.net 
	id <01C80B62.A2ADDB3D@onstor-exch02.onstor.net>; Wed, 10 Oct 2007 09:26:04 -0800
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Content-class: urn:content-classes:message
Subject: RE: common ssc core stack
Date: Wed, 10 Oct 2007 09:26:04 -0800
Message-ID: <BB375AF679D4A34E9CA8DFA650E2B04E030E38D6@onstor-exch02.onstor.net>
In-Reply-To: <20071009192416.2828d30a@ripper.onstor.net>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: common ssc core stack
Thread-Index: AcgK5Ki9uqD5HxavTsGf8mbCbXCQdgAfbiQg
From: "Mike Lee" <mike.lee@onstor.com>
To: "Andy Sharp" <andy.sharp@onstor.com>
Cc: "dl-Cougar" <dl-Cougar@onstor.com>

Andy:
I have a separate mail chain with Larry on this; I will work with him today.
I will take you up on your offer as needed.  Thanks!
-Mike

-----Original Message-----
From: Andy Sharp
Sent: Tuesday, October 09, 2007 7:24 PM
To: Mike Lee
Cc: dl-Cougar
Subject: Re: common ssc core stack


you will likely need the packages to be part of the development (cross
build) area, as well as the runtime package.  you can install the
runtime package on your filer with this command:

# apt-get install libdmalloc4

installing the libdmalloc-dev package into the cross build area is
juuuuust a bit harder.  larry or i or maybe even maximillion can help
you.

On Tue, 9 Oct 2007 17:52:13 -0700 "Mike Lee" <mike.lee@onstor.com>
wrote:

> Thanks all.
> Since Max had already introduced dmalloc into our code, I will look
> into dmalloc first.
> However, I don't think it is part of the "golden" root file system
> yet.
>
> I will try to install it first; I'll check out mpatrol if dmalloc does
> not work out.
> -Mike
>
> -----Original Message-----
> From: Tim Gardner
> Sent: Tuesday, October 09, 2007 5:30 PM
> To: Mike Lee; dl-Cougar
> Subject: RE: common ssc core stack
>
>
> Exact same core that I saw when I worked on the spm problem that
> turned out to be a free of a pointer that was offset from the
> location returned from the malloc. Linux memory
> allocation/deallocation appears to be much more strict about proper
> coding than BSD.
>=20
> Tim
>
> -----Original Message-----
> From: Mike Lee
> Sent: Tuesday, October 09, 2007 5:24 PM
> To: dl-Cougar
> Subject: common ssc core stack
>
>
> Hi Team:
>
> In three of the ssc daemon crashes I've analyzed thus far, we're
> getting the same stack:
>
> Program terminated with signal 6, Aborted.
> #0  0x2b52ab04 in kill () from /lib/libc.so.6
> (gdb) where
> #0  0x2b52ab04 in kill () from /lib/libc.so.6
> #1  0x2b52c200 in abort () from /lib/libc.so.6
> #2  0x2b568454 in __fsetlocking () from /lib/libc.so.6
> #3  0x2b568454 in __fsetlocking () from /lib/libc.so.6
> Previous frame identical to this frame (corrupt stack?)
> (gdb)
> The instruction addresses in the stack frames are not the same, but the
> function names are.
>
> Specifically, this stack was observed in:
> defect 20632 - spm crash
> defect 20649 - vsd crash
> defect 20651 - sanmd crash
>
> So, I think we're seeing manifestations of the same problem.
> Please let me know if you have recommendations/insights on this
> symptom. For now, I'm trying to reproduce the crash on a case-by-case
> basis.
>
> Thanks.
>
> -Mike
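
For reference, the failure class Tim describes in this thread (calling free on a pointer that is offset from the address malloc returned) can be sketched in C roughly as below. This is only an illustration under assumptions: skip_prefix_copy is a hypothetical helper, not code from spm, vsd, or sanmd, and the thread does not confirm that all three crashes share this cause. With glibc, handing an offset pointer to free() typically ends in abort() and signal 6, which would be consistent with the stack shown.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from the ssc code): duplicate msg, skip the
 * first prefix_len bytes, and return a fresh copy of the remainder.
 * Returns NULL on allocation failure or if the prefix is longer than msg.
 * The caller owns the returned buffer and must free() it.
 */
char *skip_prefix_copy(const char *msg, size_t prefix_len)
{
    size_t len = strlen(msg);
    if (prefix_len > len)
        return NULL;

    /* base is the exact address malloc returned; only this may be freed. */
    char *base = malloc(len + 1);
    if (base == NULL)
        return NULL;
    memcpy(base, msg, len + 1);

    /* cursor points into the middle of the allocation. */
    char *cursor = base + prefix_len;

    char *out = malloc(strlen(cursor) + 1);
    if (out != NULL)
        strcpy(out, cursor);

    /* WRONG: free(cursor); cursor is offset from the malloc'd address. */
    free(base); /* Correct: release the original pointer. */
    return out;
}
```

The design point is simply to keep the pointer returned by malloc unchanged and do any walking with a separate cursor, so the value passed to free() is always the original one.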
