X-MimeOLE: Produced By Microsoft Exchange V6.5
Received: by onstor-exch02.onstor.net 
	id <01C8B60B.7DEB8553@onstor-exch02.onstor.net>; Wed, 14 May 2008 14:43:05 -0700
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Content-class: urn:content-classes:message
Subject: RE: Defect  SW-BSD Opened TED00023791
Date: Wed, 14 May 2008 14:43:04 -0700
Message-ID: <BB375AF679D4A34E9CA8DFA650E2B04E09EE846B@onstor-exch02.onstor.net>
In-Reply-To: <BB375AF679D4A34E9CA8DFA650E2B04E09EE8468@onstor-exch02.onstor.net>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: Defect  SW-BSD Opened TED00023791
Thread-Index: Aci2BtTJKjsjhoXeRyWXATcxyCFOOQAABypwAACMEIAAACfEUAAAG5GAAAAgJmAAADCbcA==
References: <BB375AF679D4A34E9CA8DFA650E2B04E09EE842B@onstor-exch02.onstor.net> <BB375AF679D4A34E9CA8DFA650E2B04E09EE8455@onstor-exch02.onstor.net> <BB375AF679D4A34E9CA8DFA650E2B04E09EE8459@onstor-exch02.onstor.net> <BB375AF679D4A34E9CA8DFA650E2B04E09EE845B@onstor-exch02.onstor.net> <BB375AF679D4A34E9CA8DFA650E2B04E09EE8468@onstor-exch02.onstor.net>
From: "Maxim Kozlovsky" <maxim.kozlovsky@onstor.com>
To: "Raj Kumar" <raj.kumar@onstor.com>,
	"Andy Sharp" <andy.sharp@onstor.com>
Cc: "Jonathan Goldick" <jonathan.goldick@onstor.com>,
	"Tim Gardner" <tim.gardner@onstor.com>,
	"Sandrine Boulanger" <sandrine.boulanger@onstor.com>

Well, find the one who did it and kick him.

WAD.

>-----Original Message-----
>From: Raj Kumar
>Sent: Wednesday, May 14, 2008 2:42 PM
>To: Maxim Kozlovsky; Andy Sharp
>Cc: Jonathan Goldick; Tim Gardner; Sandrine Boulanger
>Subject: RE: Defect SW-BSD Opened TED00023791
>
>Hmm, that's weird. I started only 2 ssh sessions, but I see at least
>10 sh sessions. I wonder who kicked these off. There are 521
>processes, but most of them are idle.
>
>
>Based on top, it looks like we have enough memory:
>
>load averages:  1.14,  1.13,  1.09    14:41:34
>521 processes: 2 running, 519 idle
>CPU states:  1.6% user,  0.0% nice,  0.0% system,  0.0% interrupt, 98.4% idle
>Memory: Real: 35M/108M act/tot  Free: 123M  Swap: 4K/30M used/tot
>
>  PID USERNAME PRI NICE  SIZE   RES STATE WAIT     TIME    CPU COMMAND
>28107 root       2    0 1132K 1396K sleep select  10:54  0.15% pm
> 9876 root      28    0  612K 1220K run   -        0:28  0.00% top
> 1356 root       2    0 1428K 2216K run   -        0:06  0.00% vtmd
>
>
>
>-----Original Message-----
>From: Maxim Kozlovsky
>Sent: Wednesday, May 14, 2008 2:34 PM
>To: Raj Kumar; Andy Sharp
>Cc: Jonathan Goldick; Tim Gardner; Sandrine Boulanger
>Subject: RE: Defect SW-BSD Opened TED00023791
>
>Quit some of the shells that you have started. You have at least 5 in the
>"top" output.
>
>>-----Original Message-----
>>From: Raj Kumar
>>Sent: Wednesday, May 14, 2008 2:31 PM
>>To: Maxim Kozlovsky; Andy Sharp
>>Cc: Jonathan Goldick; Tim Gardner; Sandrine Boulanger
>>Subject: RE: Defect SW-BSD Opened TED00023791
>>
>>ps fails.
>>
>>># ps ax | grep onstor
>>>sh: cannot fork - try again
>>
>>However, I had "top" running in one session, so I have provided that
>>information in the defect.
>>
>>-----Original Message-----
>>From: Maxim Kozlovsky
>>Sent: Wednesday, May 14, 2008 2:30 PM
>>To: Raj Kumar; Andy Sharp
>>Cc: Jonathan Goldick; Tim Gardner; Sandrine Boulanger
>>Subject: RE: Defect SW-BSD Opened TED00023791
>>
>>Yes, the list of the processes.
>>
>>>-----Original Message-----
>>>From: Raj Kumar
>>>Sent: Wednesday, May 14, 2008 2:11 PM
>>>To: Maxim Kozlovsky; Andy Sharp
>>>Cc: Jonathan Goldick; Tim Gardner; Sandrine Boulanger
>>>Subject: FW: Defect SW-BSD Opened TED00023791
>>>
>>>Guys,
>>>
>>>Is there anything that needs to be collected?
>>>
>>>Thanks.
>>>
>>>-----Original Message-----
>>>From: raj.kumar@onstor.com [mailto:raj.kumar@onstor.com]
>>>Sent: Wednesday, May 14, 2008 2:10 PM
>>>To: Andy Sharp
>>>Cc: Raj Kumar
>>>Subject: Defect SW-BSD Opened TED00023791
>>>
>>>id: TED00023791
>>>Headline: S-Soak (G8R9): BSD can not fork any more processes (May 14
>>>14:07:59 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_get_procs)
>>>Severity: 2-Major
>>>Build: Submittal 20 Beta
>>>Description: Submittal : 20_BETA
>>>Setup: SS
>>>Node: G8r9
>>>Elog at /n/newcorevol/defect_23791
>>>
>>>BSD on this particular node is not able to fork any more processes.
>>>
>>>I was trying to get a SGA on this node and the CLI failed. Then I noticed
>>>several pm-related messages in the elog. When I tried to look at the process
>>>list using ps, ps failed.
>>>
>>>I wonder whether this is due to the fact that I have started using NCM on
>>>this node.
>>>
>>># ps ax | grep onstor
>>>sh: cannot fork - try again
>>># Connection to g8r9 closed.
>>>
>>>g8r9 diag> system get all
>>>% Command failure.
>>>
>>># nfxsh
>>>
>>>sh: cannot fork - try again
>>>
>>>************** Elog*********
>>>
>>>May 14 14:07:59 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_get_procs: not enough pid entries, got(512) need(521)
>>>May 14 14:07:59 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_timeout_work: pm_get_procs failed, -13
>>>May 14 14:08:00 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_get_procs: not enough pid entries, got(512) need(521)
>>>May 14 14:08:00 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_timeout_work: pm_get_procs failed, -13
>>>May 14 14:08:01 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_get_procs: not enough pid entries, got(512) need(521)
>>>May 14 14:08:01 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_timeout_work: pm_get_procs failed, -13
>>>May 14 14:08:02 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_get_procs: not enough pid entries, got(512) need(521)
>>>May 14 14:08:02 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_timeout_work: pm_get_procs failed, -13
>>>May 14 14:08:03 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_get_procs: not enough pid entries, got(512) need(521)
>>>May 14 14:08:03 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_timeout_work: pm_get_procs failed, -13
>>>May 14 14:08:04 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_get_procs: not enough pid entries, got(512) need(521)
>>>May 14 14:08:04 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_timeout_work: pm_get_procs failed, -13
>>>May 14 14:08:05 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_get_procs: not enough pid entries, got(512) need(521)
>>>May 14 14:08:05 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_timeout_work: pm_get_procs failed, -13
>>>May 14 14:08:06 g2r5-2280.onstor.lab : 0:0:cluster2:INFO: Cluster_SendMsgSock: sendto to 10.4.1.1 failed, msgId 10452, code 64 (Host is down)
>>>May 14 14:08:06 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_get_procs: not enough pid entries, got(512) need(521)
>>>May 14 14:08:06 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_timeout_work: pm_get_procs failed, -13
>>>May 14 14:08:07 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_get_procs: not enough pid entries, got(512) need(521)
>>>May 14 14:08:07 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_timeout_work: pm_get_procs failed, -13
>>>May 14 14:08:09 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_get_procs: not enough pid entries, got(512) need(521)
>>>May 14 14:08:09 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_timeout_work: pm_get_procs failed, -13
>>>May 14 14:08:10 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_get_procs: not enough pid entries, got(512) need(521)
>>>May 14 14:08:10 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_timeout_work: pm_get_procs failed, -13
>>>May 14 14:08:11 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_get_procs: not enough pid entries, got(512) need(521)
>>>May 14 14:08:11 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_timeout_work: pm_get_procs failed, -13
>>>May 14 14:08:12 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_get_procs: not enough pid entries, got(512) need(521)
>>>May 14 14:08:12 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_timeout_work: pm_get_procs failed, -13
>>>May 14 14:08:13 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_get_procs: not enough pid entries, got(512) need(521)
>>>May 14 14:08:13 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_timeout_work: pm_get_procs failed, -13
>>>May 14 14:08:14 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_get_procs: not enough pid entries, got(512) need(521)
>>>May 14 14:08:14 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_timeout_work: pm_get_procs failed, -13
>>>May 14 14:08:16 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_get_procs: not enough pid entries, got(512) need(521)
>>>May 14 14:08:16 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_timeout_work: pm_get_procs failed, -13
>>>
>>>
>>>Release_Project: Cougar
>>>
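
For anyone triaging the elog in this defect, a minimal sketch of pulling the got/need counts out of the pm_get_procs warnings may help. It assumes each warning sits on a single line in the format quoted in the defect; the function name pid_shortfalls and the sample string are illustrative only and are not part of pm or any ONStor tool. Reading got(512) as the number of pid entries pm could hold and need(521) as the number it required is an interpretation, though 521 does match the process count in the "top" output.

```python
import re

# Matches the warning text as it appears in the elog, one warning per line.
# Assumption: the line has not been split by mail-client wrapping.
PATTERN = re.compile(
    r"pm_get_procs: not enough pid entries, got\((\d+)\) need\((\d+)\)"
)


def pid_shortfalls(log_text):
    """Return a (got, need, shortfall) tuple for each pm_get_procs warning."""
    results = []
    for match in PATTERN.finditer(log_text):
        got = int(match.group(1))
        need = int(match.group(2))
        results.append((got, need, need - got))
    return results


# Illustrative sample built from two lines of the elog in this defect.
sample = (
    "May 14 14:07:59 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_get_procs: "
    "not enough pid entries, got(512) need(521)\n"
    "May 14 14:07:59 g8r9-2260.onstor.lab : 0:0:pm:WARNING: pm_timeout_work: "
    "pm_get_procs failed, -13\n"
)

for got, need, short in pid_shortfalls(sample):
    print(f"got={got} need={need} short by {short}")
```

Only the pm_get_procs lines match; the pm_timeout_work lines carry no counts and are skipped.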

