X-MimeOLE: Produced By Microsoft Exchange V6.5
Received: by onstor-exch02.onstor.net 
	id <01C740DE.57C14C17@onstor-exch02.onstor.net>; Thu, 25 Jan 2007 16:10:10 -0800
MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----_=_NextPart_001_01C740DE.57C14C17"
Content-class: urn:content-classes:message
Subject: RE: CF Issue
Date: Thu, 25 Jan 2007 16:10:10 -0800
Message-ID: <BB375AF679D4A34E9CA8DFA650E2B04E0138C429@onstor-exch02.onstor.net>
In-Reply-To: <BB375AF679D4A34E9CA8DFA650E2B04E022FA0E3@onstor-exch02.onstor.net>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: CF Issue
Thread-Index: Acc/+99wWNAdjQqjT0aNgaCoKidOeAAAR4+wACk5pSAABX+SAAAGTLRAAAJgfrA=
From: "Chris Vandever" <chris.vandever@onstor.com>
To: "Mark Farabaugh" <mark.farabaugh@onstor.com>
Cc: "Andy Sharp" <andy.sharp@onstor.com>

This is a multi-part message in MIME format.

------_=_NextPart_001_01C740DE.57C14C17
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Andy, I'm not sure we need to know about the error handling for CF
errors, so I think you're off the hook here, although any insights would
still be appreciated.

Okay, I think I have the picture now, so correct me if I'm wrong.
*	We DON'T think it's a problem with the flash failing because we
can rewrite the flash with new images and it then works (the
"re-programming" part you mentioned).
*	We think the problem is persistent with the flash because it
repeats after rebooting (the bit about our contract manufacturer seeing
the problem, shipping us the flash, and then us seeing it here).
*	You mentioned having "validated" the flash.  Does that mean you
did a "system compare -s" between the flash and the release supposedly
installed on it?  If not, what do you mean?

There was a similar, transient problem that was fixed in R1.3.2.0
(change list #19519): pm got bad data from BSD, erroneously thought
timekeeper had died, and restarted it.  The restarted timekeeper failed,
terminating pm, and pm killed all of its children except the original
timekeeper.  But your data isn't consistent with that problem, and
you're running code that should have the fix in it.

I'll need the following info:
*	How did the contract manufacturer get the image on the flash?
Did it ever work, or did it fail immediately?
*	If you haven't already done a "system compare -s" against the
appropriate R1.3.3.2 build, I would suggest doing so.  It is entirely
possible this is a known problem where the system upgrade process
resulted in one or more files being corrupted.  I would guess this is
the most likely cause, although you didn't mention anyone doing a
system upgrade.  A few files are expected not to match, so send me the
output.
*	If the compare looks good, then I'll need the elog (level info)
from a fresh boot that exhibits the problem.  Please include the syslog
as well just in case.

I sit on the last aisle between the Maui and Oahu conference rooms,
closer to Oahu, across from Bill Nadzam if that helps.

ChrisV

_____________________________________________
From: Mark Farabaugh=20
Sent: Thursday, January 25, 2007 3:32 PM
To: Chris Vandever
Cc: Andy Sharp
Subject: RE: CF Issue

Chris, Andy,
I'm relatively new here at ONStor so I'm not totally up on the Bobcat.
Our contract manufacturer has had multiple failures with the symptom
below.
We have validated and re-programmed a flash with this symptom and it
came out working fine.  We do not believe this to be a part issue.
What could cause this symptom?  Could this be a corruption issue because
of an improper power down?  1.3.3.2 issue?  I have this flash if you
want to try it.
Also, please introduce yourselves; it's better to put faces with names.
Regards, Mark

_____________________________________________
From: Chris Vandever=20
Sent: Thursday, January 25, 2007 11:37 AM
To: Mark Farabaugh
Cc: Andy Sharp
Subject: FW: CF Issue

When the system boots it starts all of the OS processes for BSD and then
starts a program called pm (process manager).  pm will start all of the
other ONStor processes based on the contents of the
/usr/local/agile/etc/pmtab file.  This file tells it what processes to
start and in what order.  When pm starts a process it waits for a signal
from that process to indicate it has finished its initialization.  Once
it gets that signal pm will then start the next program in the list.
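
The startup sequence described above can be sketched as follows.  This
is a minimal illustration, not pm's actual code: the function name, the
two callbacks, and the example table (including the position of chassisd
after ncmd) are assumptions, and the real pmtab format is not shown
here.

```python
from typing import Callable, Iterable, List


def start_in_order(
    table: Iterable[str],
    start: Callable[[str], object],
    wait_ready: Callable[[object], bool],
) -> List[str]:
    """Start each entry in table order; stop at the first one that
    never reports it has finished initializing.

    Returns the names that were started, including a stalled one.
    """
    started: List[str] = []
    for name in table:
        handle = start(name)          # launch the process
        started.append(name)
        if not wait_ready(handle):    # no "initialized" signal arrived
            break                     # later entries are never started
    return started


# Hypothetical table; the real order lives in pmtab.
EXAMPLE_TABLE = ["elog", "registryMgr", "ncmd", "chassisd"]

# Simulate ncmd hanging before it signals readiness.
result = start_in_order(
    EXAMPLE_TABLE,
    start=lambda name: name,
    wait_ready=lambda handle: handle != "ncmd",
)
print(result)  # ['elog', 'registryMgr', 'ncmd']
```

In this model a process that never signals readiness blocks everything
listed after it, which would be one way for a later daemon to end up not
running.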

In your case pm has started the following programs:

		PID   TT  STAT    TIME    COMMAND
		30799 ??  S       0:02.26 /usr/local/agile/bin/elog
		19755 ??  I       0:00.02 /usr/local/agile/bin/registryMgr
		16903 ??  D       0:00.05 (ncmd)
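
A listing like this can be scanned mechanically for stuck processes.  In
BSD ps, a STAT value beginning with D marks a process in uninterruptible
(usually disk I/O) wait.  The sketch below is illustrative only: the
function name is an assumption, and it expects each process on a single
line in the column layout shown above.

```python
from typing import List, Tuple

SAMPLE = """\
PID   TT  STAT    TIME    COMMAND
30799 ??  S       0:02.26 /usr/local/agile/bin/elog
19755 ??  I       0:00.02 /usr/local/agile/bin/registryMgr
16903 ??  D       0:00.05 (ncmd)
"""


def blocked_processes(ps_output: str) -> List[Tuple[int, str]]:
    """Return (pid, command) for every row whose STAT starts with 'D'."""
    blocked = []
    for line in ps_output.splitlines():
        # PID, TT, STAT, TIME, then the rest of the line as COMMAND.
        fields = line.split(None, 4)
        if len(fields) < 5 or not fields[0].isdigit():
            continue  # header row, blank line, or wrapped fragment
        pid, _tt, stat, _time, command = fields
        if stat.startswith("D"):
            blocked.append((int(pid), command))
    return blocked


print(blocked_processes(SAMPLE))  # [(16903, '(ncmd)')]
```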

A state of 'D' indicates that ncmd is in a disk I/O wait state.  I would
have expected the disk request to time out, fail, and cause ncmd to die,
resulting in pm trying to restart it.  This should be an infinite loop
unless we are eventually able to read what we need.  I would also expect
the CF error would be logged to the syslog (not elog) in
/var/log/messages*, although we may not be able to write the log.  If
this isn't happening, then we may need to get Andy Sharp involved to
find out what BSD and the CF driver are doing.  However, I'm not sure
what you're expecting to try to do.  If the CF is failing there's not
really anything we can do about it that I know of, although Andy would
have more insight.  You could try copying the files in
/usr/local/agile/bin and /usr/local/agile/lib to try to identify what
files are on failing parts of the flash, but again, I'm not sure what
purpose it would serve.
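
The copy idea above amounts to reading every file and noting which ones
fail.  A minimal sketch, assuming a Python interpreter is available on
the box (it may not be) and that a bad region surfaces as an I/O error
rather than a hang; the function name is hypothetical.

```python
import os
from typing import Iterable, List, Tuple


def find_unreadable_files(
    roots: Iterable[str], chunk_size: int = 64 * 1024
) -> List[Tuple[str, str]]:
    """Read every file under the given roots from start to end.

    Returns (path, error text) for each file that could not be opened
    or read.  Roots that do not exist are silently skipped by os.walk.
    """
    failures = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in sorted(filenames):
                path = os.path.join(dirpath, name)
                try:
                    with open(path, "rb") as handle:
                        # Discard the data; only read errors matter.
                        while handle.read(chunk_size):
                            pass
                except OSError as exc:
                    failures.append((path, str(exc)))
    return failures


if __name__ == "__main__":
    for path, error in find_unreadable_files(
        ["/usr/local/agile/bin", "/usr/local/agile/lib"]
    ):
        print(path, error)
```

If a read blocks the way ncmd appears to, the script will stall on the
offending file instead of reporting it, so this only helps when the
driver actually returns an error.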

ChrisV

_____________________________________________
From: Mark Farabaugh=20
Sent: Thursday, January 25, 2007 9:03 AM
To: Chris Vandever
Cc: Brian Stark
Subject: FW: CF Issue

Chris,
Brian Stark suggested that I follow up with you on an issue.  Our
Contract Manufacturer is seeing several compact flash failures where
system commands will not execute (see below).  Brian asked that I run a
ps -ax to look at what processes are running.  I have a failing flash
available.  The output of the ps -ax is attached.

Regards, Mark

abcd diag> system show chassis
Timed out waiting for response
% Command failure.
abcd diag> system show version
Timed out waiting for response
% Command failure.
abcd diag> system show temperature
Timed out waiting for response
% Command failure.
abcd diag>
abcd diag>
abcd diag> ps -ax
% Unknown command/option.
abcd diag> system reboot
Are you sure ? [y|n] : y
system reboot
Are you sure ? [y|n] : y
nfxsh_send: Unable to open rmc session to eventd_rmc, error -20.

 << File: abcd diag.doc >>=20




_____________________________________________
From: Brian Stark=20
Sent: Wednesday, January 24, 2007 2:27 PM
To: Mark Farabaugh
Subject: RE: CF Issue

Mark,

The system commands are timing out because chassisd is not running.  At
this point, you'll need to get someone in the software group involved to
help understand why this is happening.  I think that Chris Vandever
would be the person to first talk with since she's helped Abdallah with
similar issues in the past.


Brian


	_____________________________________________=20
	From: 	Mark Farabaugh =20
	Sent:	Wednesday, January 24, 2007 1:09 PM
	To:	Brian Stark
	Subject:	CF Issue

	Brian,
	I received another CF where system commands time out.  I believe
	you asked to run a ps -ax when I saw this.  Attached is the
	output.  Let me know what you want me to do.

	Regards, Mark

	 << File: abcd diag.doc >>=20

------_=_NextPart_001_01C740DE.57C14C17--
