AF:
NF:0
PS:10
SRH:1
SFN:
DSR:
MID:<20081031105618.2e26831c@ripper.onstor.net>
CFG:
PT:0
S:andy.sharp@onstor.com
RQ:
SSV:onstor-exch02.onstor.net
NSV:
SSH:
R:<maxim.kozlovsky@onstor.com>,<ed.kwan@onstor.com>,<jonathan.goldick@onstor.com>
MAID:1
X-Sylpheed-Privacy-System:
X-Sylpheed-Sign:0
SCF:#mh/Mailbox/sent
RMID:#imap/andys@onstor.net@onstor-exch02.onstor.net/INBOX	0	BB375AF679D4A34E9CA8DFA650E2B04E0C341120@onstor-exch02.onstor.net
X-Sylpheed-End-Special-Headers: 1
Date: Fri, 31 Oct 2008 10:56:41 -0700
From: Andrew Sharp <andy.sharp@onstor.com>
To: "Maxim Kozlovsky" <maxim.kozlovsky@onstor.com>
Cc: "Ed Kwan" <ed.kwan@onstor.com>, "Jonathan Goldick"
 <jonathan.goldick@onstor.com>
Subject: Re: Defect  Yes TED00025710 [10206 - Onstor] Over 200 Exim
 processes running Onstor
Message-ID: <20081031105641.02f29b44@ripper.onstor.net>
In-Reply-To: <BB375AF679D4A34E9CA8DFA650E2B04E0C341120@onstor-exch02.onstor.net>
References: <ONSTOR-EXCH01GcvBNn000055b5@onstor-exch01.onstor.net>
	<BB375AF679D4A34E9CA8DFA650E2B04E0C341104@onstor-exch02.onstor.net>
	<20081031102718.5dd17ab7@ripper.onstor.net>
	<BB375AF679D4A34E9CA8DFA650E2B04E0C341120@onstor-exch02.onstor.net>
Organization: Onstor
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

Only one process [that utilizes RMC] will ever be running at a time if
they don't hang, so no limit needed.  Besides, as I've mentioned
before, a fork limit won't do what you think it will.  Exim is just not
as simplistic a program as you imagine it is.

On Fri, 31 Oct 2008 10:49:36 -0700 "Maxim Kozlovsky"
<maxim.kozlovsky@onstor.com> wrote:

> By the way, we will still need to implement a fork limit on the
> exim processes. Even if they do not hang forever, too many exim
> processes running simultaneously can still cause a temporary service
> disruption.
> 
> >-----Original Message-----
> >From: Andy Sharp
> >Sent: Friday, October 31, 2008 10:27 AM
> >To: Maxim Kozlovsky
> >Cc: Ed Kwan; Jonathan Goldick
> >Subject: Re: Defect Yes TED00025710 [10206 - Onstor] Over 200 Exim
> >processes running Onstor
> >
> >I'll take care of it.  The suspected (likely) use of SIGALRM in exim
> >can hopefully be replaced with something equivalent.  If not, then a
> >save-and-restore step, as Max suggests, will have to be used.
> >
> >What I will need to test it is some way to force RMC to [always]
> >retry, which apparently only happens on MD ~:^)
> >
> >Cheers,
> >
> >a
> >
> >
> >On Fri, 31 Oct 2008 10:04:34 -0700 "Maxim Kozlovsky"
> ><maxim.kozlovsky@onstor.com> wrote:
> >
> >> Hello,
> >>
> >> I think Ed's team should be able to take care of the problem from
> >> here. The SIGALRM usage in the exim4 code needs to be investigated
> >> and the code needs to be modified to co-exist with the RMC. In the
> >> current code the exim SIGALRM handler is called and the RMC SIGALRM
> >> handler is not. The signal handling code needs to be modified to
> >> save the previously established signal handlers and to call them,
> >> similar to what is done in the RMC signal handling code. Andy
> >> should be able to help if necessary.
> >>
> >> Max
> >>
> >> >-----Original Message-----
> >> >From: maxim.kozlovsky@onstor.com [mailto:maxim.kozlovsky@onstor.com]
> >> >Sent: Thursday, October 30, 2008 6:03 PM
> >> >To: dl-Escalation
> >> >Cc: Andy Sharp; Timothy Swenson
> >> >Subject: Defect Yes TED00025710 [10206 - Onstor] Over 200 Exim
> >> >processes running Onstor
> >> >company_name: Onstor
> >> >id: TED00025710
> >> >Headline: [10206 - Onstor] Over 200 Exim processes running
> >> >State: Opened
> >> >Note_Entry: The exim4 hangs forever in the clustering code because
> >> >the RMC retries are not working. The RMC retries are not working
> >> >because both rmc and exim mess with SIGALRM. The exim4 code needs
> >> >to be fixed to coexist with rmc.
> >> >Release_Project: 4.0.1.0
> >>
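
For reference, the "save the previously established handler and call
it" approach that Max describes in the quoted thread might look roughly
like the sketch below. This is only an illustration under assumptions:
it is not taken from the exim4 or RMC sources, and every name in it
(lib_alarm_handler, app_alarm_handler, install_library_handler,
install_chained_alarm_handler, restore_alarm_handler, the counters) is
hypothetical. lib_alarm_handler stands in for whatever handler the
library installed first; app_alarm_handler stands in for the
application's own handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical counters so the effect of each handler is observable. */
static volatile sig_atomic_t lib_alarm_count = 0;
static volatile sig_atomic_t app_alarm_count = 0;

/* Disposition that was in place before the application installed its own. */
static struct sigaction prev_alarm_action;

/* Hypothetical stand-in for the handler a library installed first. */
static void lib_alarm_handler(int signo)
{
    (void)signo;
    lib_alarm_count++;
}

/* Application handler: do our own work, then chain to the saved handler. */
static void app_alarm_handler(int signo, siginfo_t *info, void *ctx)
{
    app_alarm_count++;

    if (prev_alarm_action.sa_flags & SA_SIGINFO) {
        /* Previous handler used the three-argument form. */
        if (prev_alarm_action.sa_sigaction != NULL)
            prev_alarm_action.sa_sigaction(signo, info, ctx);
    } else if (prev_alarm_action.sa_handler != SIG_DFL &&
               prev_alarm_action.sa_handler != SIG_IGN) {
        /* Previous handler used the classic one-argument form. */
        prev_alarm_action.sa_handler(signo);
    }
}

/* Simulates the library setting up its SIGALRM handler. */
static int install_library_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = lib_alarm_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGALRM, &sa, NULL);
}

/* Installs the application handler and saves the old disposition. */
static int install_chained_alarm_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = app_alarm_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGALRM, &sa, &prev_alarm_action);
}

/* Puts the saved disposition back once the application is done with it. */
static int restore_alarm_handler(void)
{
    return sigaction(SIGALRM, &prev_alarm_action, NULL);
}
```

Chaining keeps both handlers running on every SIGALRM, but the two
sides still share a single alarm() timer, so whoever arms the timer
last decides when the signal actually fires. That part would need
separate handling and is not covered by this sketch.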
