AF:
NF:0
PS:10
SRH:1
SFN:
DSR:
MID:<20081031110140.6317d9cc@ripper.onstor.net>
CFG:
PT:0
S:andy.sharp@onstor.com
RQ:
SSV:onstor-exch02.onstor.net
NSV:
SSH:
R:<maxim.kozlovsky@onstor.com>,<ed.kwan@onstor.com>,<jonathan.goldick@onstor.com>
MAID:1
X-Sylpheed-Privacy-System:
X-Sylpheed-Sign:0
SCF:#mh/Mailbox/sent
RMID:#imap/andys@onstor.net@onstor-exch02.onstor.net/INBOX	0	BB375AF679D4A34E9CA8DFA650E2B04E0C341120@onstor-exch02.onstor.net
X-Sylpheed-End-Special-Headers: 1
Date: Fri, 31 Oct 2008 11:02:12 -0700
From: Andrew Sharp <andy.sharp@onstor.com>
To: "Maxim Kozlovsky" <maxim.kozlovsky@onstor.com>
Cc: "Ed Kwan" <ed.kwan@onstor.com>, "Jonathan Goldick"
 <jonathan.goldick@onstor.com>
Subject: Re: Defect  Yes TED00025710 [10206 - Onstor] Over 200 Exim
 processes running Onstor
Message-ID: <20081031110212.0dcdc429@ripper.onstor.net>
In-Reply-To: <BB375AF679D4A34E9CA8DFA650E2B04E0C341120@onstor-exch02.onstor.net>
References: <ONSTOR-EXCH01GcvBNn000055b5@onstor-exch01.onstor.net>
	<BB375AF679D4A34E9CA8DFA650E2B04E0C341104@onstor-exch02.onstor.net>
	<20081031102718.5dd17ab7@ripper.onstor.net>
	<BB375AF679D4A34E9CA8DFA650E2B04E0C341120@onstor-exch02.onstor.net>
Organization: Onstor
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

Let me say that there is some kind of config option that lets exim
limit its own processes, and I will look into using it.  That would
let exim manage how many processes it creates according to its own
design, which I'm fine with.
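
For the SIGALRM side, here is a rough sketch of the save-and-call
approach Max describes in the quoted mail below.  It is only an
illustration under my own assumptions: every name in it is made up,
none of it is taken from the exim4 or RMC sources, and
first_alarm_handler just stands in for whatever handler was installed
first.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* All names here are hypothetical; none come from the exim4 or RMC code. */

static struct sigaction prev_alrm;              /* disposition found at install time */
static volatile sig_atomic_t first_alarm_seen;  /* set by the stand-in earlier handler */
static volatile sig_atomic_t app_alarm_seen;    /* set by our own handler */

/* Stand-in for whichever handler was registered first. */
static void first_alarm_handler(int signo)
{
    (void)signo;
    first_alarm_seen = 1;
}

/* Our handler: do our own work, then pass the signal on to the saved one. */
static void chained_alarm_handler(int signo, siginfo_t *info, void *ctx)
{
    app_alarm_seen = 1;

    if (prev_alrm.sa_flags & SA_SIGINFO) {
        if (prev_alrm.sa_sigaction != NULL)
            prev_alrm.sa_sigaction(signo, info, ctx);
    } else if (prev_alrm.sa_handler != SIG_DFL &&
               prev_alrm.sa_handler != SIG_IGN) {
        prev_alrm.sa_handler(signo);
    }
}

/* Install our handler and remember the previous disposition in prev_alrm. */
int install_chained_alarm_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = chained_alarm_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGALRM, &sa, &prev_alrm);
}

/* Put the saved disposition back once we no longer need SIGALRM. */
int restore_alarm_handler(void)
{
    return sigaction(SIGALRM, &prev_alrm, NULL);
}

/* Usage sketch: register the earlier handler, chain ours, fire SIGALRM. */
void demo_chain(void)
{
    struct sigaction sa;
    int rc;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = first_alarm_handler;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGALRM, &sa, NULL);
    assert(rc == 0);

    rc = install_chained_alarm_handler();
    assert(rc == 0);

    raise(SIGALRM);                 /* both handlers should run */
    assert(first_alarm_seen == 1);
    assert(app_alarm_seen == 1);
}
```

Note that chaining only covers handler delivery.  alarm() gives a
process a single timer, so two users of alarm() can still cancel each
other's timeouts; that part would need separate handling.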

On Fri, 31 Oct 2008 10:49:36 -0700 "Maxim Kozlovsky"
<maxim.kozlovsky@onstor.com> wrote:

> By the way, we will still need to implement a fork limit for exim.
> Even if the processes no longer hang forever, too many exim processes
> running simultaneously can still cause a temporary service
> disruption.
> 
> >-----Original Message-----
> >From: Andy Sharp
> >Sent: Friday, October 31, 2008 10:27 AM
> >To: Maxim Kozlovsky
> >Cc: Ed Kwan; Jonathan Goldick
> >Subject: Re: Defect Yes TED00025710 [10206 - Onstor] Over 200 Exim
> >processes running Onstor
> >
> >I'll take care of it.  The suspected (likely) use of SIGALRM in exim
> >can hopefully be replaced with something equivalent.  If not, then a
> >save-and-restore step like the one Max suggests will have to be used.
> >
> >What I will need to test it is some way to force RMC to [always]
> >retry, which apparently only happens on MD ~:^)
> >
> >Cheers,
> >
> >a
> >
> >
> >On Fri, 31 Oct 2008 10:04:34 -0700 "Maxim Kozlovsky"
> ><maxim.kozlovsky@onstor.com> wrote:
> >
> >> Hello,
> >>
> >> I think Ed's team should be able to take care of the problem from
> >> here. The SIGALRM usage in the exim4 code needs to be investigated
> >> and the code needs to be modified to co-exist with the RMC. In the
> >> current code the exim SIGALRM handler is called and the RMC SIGALRM
> >> handler is not. The signal handling code needs to be modified to
> >> save the previously established signal handlers and to call them,
> >> similar to what is done in the RMC signal handling code. Andy
> >> should be able to help if necessary.
> >>
> >> Max
> >>
> >> >-----Original Message-----
> >> >From: maxim.kozlovsky@onstor.com
> >> >[mailto:maxim.kozlovsky@onstor.com]
> >> >Sent: Thursday, October 30, 2008 6:03 PM
> >> >To: dl-Escalation
> >> >Cc: Andy Sharp; Timothy Swenson
> >> >Subject: Defect Yes TED00025710 [10206 - Onstor] Over 200 Exim
> >> >processes running Onstor
> >> >
> >> >company_name: Onstor
> >> >id: TED00025710
> >> >Headline: [10206 - Onstor] Over 200 Exim processes running
> >> >State: Opened
> >> >Note_Entry: The exim4 hangs forever in the clustering code because
> >> >the RMC retries are not working. The RMC retries are not working
> >> >because both rmc and exim mess with SIGALRM. The exim4 code needs
> >> >to be fixed to coexist with rmc.
> >> >Release_Project: 4.0.1.0
> >>
