X-MimeOLE: Produced By Microsoft Exchange V6.5
Received: by onstor-exch02.onstor.net 
	id <01C874E5.E570DA62@onstor-exch02.onstor.net>; Thu, 21 Feb 2008 17:00:12 -0700
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Content-class: urn:content-classes:message
Subject: RE: Warren's PROM upgrade crash
Date: Thu, 21 Feb 2008 17:00:11 -0700
Message-ID: <BB375AF679D4A34E9CA8DFA650E2B04E0875EB10@onstor-exch02.onstor.net>
In-Reply-To: <BB375AF679D4A34E9CA8DFA650E2B04E085F9D8A@onstor-exch02.onstor.net>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: Warren's PROM upgrade crash
Thread-Index: Ach032swmPgmFLB2SRemN/use0cBvQAA1nCJAACFG5A=
References: <BB375AF679D4A34E9CA8DFA650E2B04E085F9D89@onstor-exch02.onstor.net> <BB375AF679D4A34E9CA8DFA650E2B04E085F9D8A@onstor-exch02.onstor.net>
From: "Warren Gale" <warren.gale@onstor.com>
To: "Mike Lee" <mike.lee@onstor.com>
Cc: "dl-Cougar" <dl-Cougar@onstor.com>

Mike,

The code from the SSC (which hasn't changed, as far as I know):
 File: ssc-prom-upgrade/prom-upgrade.c

  upgrade_remote_prom(int slot, int cpu, char *prom_file)
     status = sendStartMessage(s, slot, cpu);

static int
sendStartMessage(int s, int slot, int cpu)
{
        promUpgradeHdr_t hdr;
        promUpgradeHdr_t resp;
        struct timeval tv = { 10, 0 };
        hdr.msgType = promUpgradeStart;
        if (sendAgileMsg(s, &hdr, sizeof(hdr), "prom-upgrade", cpu, slot, 0,
                         &tv, &resp, sizeof(resp)) == sizeof(resp))
        {
                /* ready to accept upgrade */
                if (resp.msgType == promUpgradeStartAck) {
                        return 1;
                }
                else /* upgrade denied */
                if (resp.msgType == promUpgradeStartNak) {
                        return 0;
                        return 0;
                }
        }
        return -1;
}


If I read this right, the sendAgileMsg call sets the app_id to
"prom-upgrade", so I don't know where it's getting dropped.

Warren

-----Original Message-----
From: Mike Lee
Sent: Thursday, February 21, 2008 3:46 PM
To: Warren Gale
Cc: dl-Cougar
Subject: RE: Warren's PROM upgrade crash


Warren:

The panic is triggered by a zero App Id in the destination port field of
the SSC-bound ack edescriptor header.

This dest_port field should have originated from the SSC, and presently
has a value of 0x40000000, where the lowest 3 nibbles make up the app id,
which should be 46 for "prom upgrade".

I'm guessing that the message coming from the SSC was not correct to
begin with, since this dest_port was copied from the src_port field of
the initial message from the SSC.
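
To make the arithmetic concrete, here is a minimal sketch in C. It assumes
the app id is simply the low 12 bits (three nibbles) of a port value, as
described above. The names sketch_app_id and SKETCH_APP_ID_MASK are made up
for illustration; the real EEE_GET_APP_ID macro is not shown in this thread
and may be defined differently. The two port values are the ones printed by
gdb for the crashing edesc.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the app id occupies the low 12 bits of a port value.
 * This mask is illustrative, not the actual EEE definition. */
#define SKETCH_APP_ID_MASK 0xFFFu

/* Hypothetical helper: extract the app id from a port value. */
static unsigned sketch_app_id(uint64_t port)
{
    return (unsigned)(port & SKETCH_APP_ID_MASK);
}

/* Check the two port values taken from the gdb dump of the edesc header. */
static void sketch_check_thread_values(void)
{
    uint64_t dest_port = 0x40000000u; /* hdr.dest_port in the dump */
    uint64_t src_port  = 0x4003102eu; /* hdr.src_port in the dump  */

    /* dest_port carries app id 0, which is what trips the panic check. */
    assert(sketch_app_id(dest_port) == 0);

    /* src_port carries 0x02e == 46, the expected "prom upgrade" id. */
    assert(sketch_app_id(src_port) == 46);
}
```

Calling sketch_check_thread_values() from a small main() passes silently
under this assumption: the ack's source port carries app id 46, while its
destination port, which was copied from the incoming message's source port,
carries 0.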

Is your code setting up any edescriptor chain that gets sent down to the
embedded processors?

-Mike


-----Original Message-----
From: Mike Lee
Sent: Thu 2/21/2008 3:13 PM
To: dl-Cougar
Subject: Warren's PROM upgrade crash

Hi All:

Warren is seeing a crash on the embedded processors when he attempts a
PROM upgrade.

The crashing stack is as follows:

heuristic-fence-post' command.
(gdb) where
#0  eee_forwardPacket (edesc=0x200200b900) at eee-fwd.c:2266
#1  0x8301525c in eee_sendFragmentedMessage (src=0xffffffff86518c20,
    num_bytes=4, dest_port=1073741824, source_port=1073942574, firstDesc=0x0)
    at eee-msg.c:183
#2  0x83015334 in eee_alloc_or_send_message (src=0xffffffff86518c20,
    num_bytes=4, dest_port=1073741824, source_port=1073942574, first_edesc=0x0)
    at eee-msg.c:245
#3  0x830157e8 in eee_sendMessage (src=0xffffffff86518c20, num_bytes=4,
    dest_port=1073741824, source_port=1073942574) at eee-msg.c:384
#4  0x83417478 in do_ack (descr=0x40031d5d80, msgType=promUpgradeStartAck)
    at upgrade-app.c:74
#5  0x834174b4 in ack_start (descr=0x40031d5d80) at upgrade-app.c:92
#6  0x83417710 in do_start (descr=0x40031d5d80, cb=0xffffffff8383b590)
    at upgrade-app.c:206
#7  0x83417c58 in receive_upgrade_message (descr=0x40031d5d80)
    at upgrade-app.c:409
#8  0x83012d4c in eee_deliverPacketToApp (edesc=0x40031d5d80, app_id=46)
    at eee-fwd.c:2005
#9  0x83012e28 in eee_processPacket (edesc=0x40031d5d80) at eee-fwd.c:2117
#10 0x830103bc in eee_poll_local_queue (cb=0xffffffff83936ec0, tref=2)
    at eee-fwd.c:242
#11 0x8301746c in eee_poll (num_loops=21) at eee-poll.c:551
#12 0x8304ef84 in getchar () at serio-api.c:333
#13 0x830435e8 in get_line (p=0xffffffff86518f28 "", usehist=1) at hist.c:145
#14 0x83043d34 in get_input (p=0xffffffff86518f28 "") at hist.c:259
#15 0x83043d6c in get_cmd (p=0xffffffff86518f28 "") at hist.c:284
#16 0x8304d97c in runtime_prompt () at test.c:558
#17 0x8340f168 in smp_entry (a0=18446656112779329535, cpu_num=2, a2=0,
    a3=1128678705) at smp.c:246
warning: Warning: GDB can't find the start of the function at
0xffffffff9fc0bccc.
(gdb) list
2261
2262        dp = edesc->hdr.dest_port;
2263
2264        if (EEE_GET_APP_ID(dp) == 0)
2265            panic("eee_forwardPacket, appid = 0, edesc = %p\n", edesc);
2266        if (EEE_GET_PKT_LEN(edesc) == 0)
2267            panic("eee_forwardPacket, pktlen = 0, edesc = %p\n", edesc);
2268
2269        slot_id = EEE_GET_SLOT_NUM(dp);
2270        cpu_id = EEE_GET_DEST_CPU(dp);
(gdb) p/x edesc
$1 = 0x200200b900
(gdb) p/x *edesc
$2 = {hdr = {control = 0x40100004, offset = 0x4b000000,
    dest_port = 0x40000000, src_port = 0x4003102e, attr = 0x0, next = 0x0},
  bd = {{control = 0x0, len = 0x0, res_ = 0x0, buf = 0x0}, {control = 0x8000,
      len = 0x4, res_ = 0x0, buf = 0x200f9c5840}, {control = 0x0, len = 0x0,
      res_ = 0x0, buf = 0x0}, {control = 0x0, len = 0x0, res_ = 0x0,
      buf = 0x0}, {control = 0x0, len = 0x0, res_ = 0x0, buf = 0x0}, {
      control = 0x0, len = 0x0, res_ = 0x0, buf = 0x0}}}
(gdb)

Can anyone quickly tell what is going wrong?
Or is anyone willing to take this one up?

-Mike




