X-MimeOLE: Produced By Microsoft Exchange V6.5
Received: by onstor-exch02.onstor.net 
	id <01C7F72E.C724C2FA@onstor-exch02.onstor.net>; Fri, 14 Sep 2007 16:24:28 -0800
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Content-class: urn:content-classes:message
Subject: RE: smells like hardware to me
Date: Fri, 14 Sep 2007 16:24:00 -0800
Message-ID: <BB375AF679D4A34E9CA8DFA650E2B04E0587B60B@onstor-exch02.onstor.net>
In-Reply-To: <20070914151804.5f897191@ripper.onstor.net>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: smells like hardware to me
Thread-Index: Acf3HR9mJaDRfc4BRaKRbnZa7YLToQAD+hAQ
References: <20070913213514.167a4f46@ripper.onstor.net><BB375AF679D4A34E9CA8DFA650E2B04E0587B1E0@onstor-exch02.onstor.net> <20070914151804.5f897191@ripper.onstor.net>
From: "Brian Stark" <brian.stark@onstor.com>
To: "Andy Sharp" <andy.sharp@onstor.com>

Dude, I give you a ton of credit!  When I first got Cougar, I thought
1GB was populated on the SSC, so it tripped me up when accesses to the
upper 512MB didn't work.

I'm sure we can get this segment working by enabling the SX bit in PROM.
Since we normally don't use this region, we don't have it enabled.  The
address maps to physical zero, which is a memory region, so there's no
reason we can't use it.  We'll take a look at it, but you might also be
able to work some magic in Linux by setting that bit in the CP0 status
register.

The fact you're not seeing exceptions is a bit concerning.  We see the
following when accessing that region in PROM:

COUGAR-PROM> m ffffffffc0000000
XTLB Miss Exception:
   EPC = 0x80800580
   BADVADDR = 0xC0000000
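
For reference, here is a minimal sketch in C of the address and Status-bit
arithmetic discussed above. It is not taken from the PROM or the kernel; the
macro and function names are hypothetical. The bit positions (UX = bit 5,
SX = bit 6, KX = bit 7) and the cksseg bounds follow the MIPS64 architecture
definition. It also assumes the PROM prints only the low 32 bits of BADVADDR,
which would explain why 0xC0000000 appears for an access to
0xffffffffc0000000.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the 64-bit enable bits in the CP0 Status register. */
#define STATUS_UX (UINT32_C(1) << 5)  /* 64-bit user segments */
#define STATUS_SX (UINT32_C(1) << 6)  /* 64-bit supervisor segments */
#define STATUS_KX (UINT32_C(1) << 7)  /* 64-bit kernel segments */

static_assert(STATUS_SX == UINT32_C(0x40), "SX is bit 6 of Status");

/* Bounds of the 512MB cksseg compatibility segment. */
#define CKSSEG_BASE UINT64_C(0xffffffffc0000000)
#define CKSSEG_END  UINT64_C(0xffffffffdfffffff)

/* Sign-extend a 32-bit address to its 64-bit compatibility-segment form. */
static inline uint64_t sign_extend32(uint32_t addr32)
{
    uint64_t wide = addr32;
    if (addr32 & UINT32_C(0x80000000))
        wide |= UINT64_C(0xffffffff00000000);
    return wide;
}

/* True if the 64-bit virtual address falls inside cksseg. */
static inline bool in_cksseg(uint64_t vaddr)
{
    return vaddr >= CKSSEG_BASE && vaddr <= CKSSEG_END;
}

/* Return the given Status value with the SX bit set; other bits unchanged. */
static inline uint32_t status_with_sx(uint32_t status)
{
    return status | STATUS_SX;
}
```

On real hardware the Status value would be read and written with
mfc0/mtc0; the sketch covers only the bit arithmetic, so it can be checked
on any host.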



Brian

> -----Original Message-----
> From: Andy Sharp
> Sent: Friday, September 14, 2007 3:18 PM
> To: Brian Stark
> Subject: Re: smells like hardware to me
>
> On Fri, 14 Sep 2007 09:45:17 -0700 "Brian Stark"
> <brian.stark@onstor.com> wrote:
>
> > Andy,
> >
> > First, I just want to make sure you're not trying to access physical
> > 0xc0000000.  This points to memory that isn't populated at this point.
>
> Please.  Give me some credit.
>
> > Since cksseg maps to 0, I don't think this is the case, but I thought
> > I'd point it out.
> >
> > To use cksseg, you have to set the SX bit in the Status register.  We
> > can do this in the PROM if you want.  We typically have not used this
> > space for addressing, so we do not enable it by default.
>
> This is the kernel doing all this, I'm just reporting the facts.  It
> could be a kernel issue, but it doesn't seem likely based on current
> evidence.  But I thought you guys could easily try it yourselves to
> see if the same thing happens to you.
>
> I would normally expect to get an exception of some sort if it was
> some kind of paging error or address violation.  Instead the whole
> thing just locks up.
>
> >
> >
> > Brian
> >
> >
> > > -----Original Message-----
> > > From: Andy Sharp
> > > Sent: Thursday, September 13, 2007 9:35 PM
> > > To: Brian Stark
> > > Subject: smells like hardware to me
> > >
> > > Hi Brian,
> > >
> > > I think I may have bumped into a hardware-ish problem while trying
> > > to diagnose the module loading problem.  It seems that the system
> > > locks solid when it tries to access address 0xffffffffc0000000.
> > >
> > > This should be standard CKSSEG space on mips64 so there shouldn't be
> > > a problem.  Is it possible that there is something we're not doing
> > > quite right when setting up the 1125 wrt all the mips64 segments and
> > > so on?  The kernel should be programming things as much as possible,
> > > but I'm wondering if there isn't something that has to be done in
> > > the setup stream or something like that.
> > >
> > > Any of this ring any bells?  As always, an education is appreciated
> > > if warranted ~:^)
> > >
> > > Cheers,
> > >
> > > a
> > >
>
