AF:
NF:0
PS:10
SRH:1
SFN:
DSR:
MID:<20080926225423.0031f3b3@ripper.onstor.net>
CFG:
PT:0
S:andy.sharp@onstor.com
RQ:
SSV:onstor-exch02.onstor.net
NSV:
SSH:
R:<john.rogers@onstor.com>
MAID:1
X-Sylpheed-Privacy-System:
X-Sylpheed-Sign:0
SCF:#mh/Mailbox/sent
RMID:#imap/andys@onstor.net@onstor-exch02.onstor.net/INBOX	0	BB375AF679D4A34E9CA8DFA650E2B04E09C429EA@onstor-exch02.onstor.net
X-Sylpheed-End-Special-Headers: 1
Date: Fri, 26 Sep 2008 22:54:50 -0700
From: Andrew Sharp <andy.sharp@onstor.com>
To: "John Rogers" <john.rogers@onstor.com>
Subject: Re: proper procedure for exim
Message-ID: <20080926225450.5cfd1eb3@ripper.onstor.net>
In-Reply-To: <BB375AF679D4A34E9CA8DFA650E2B04E09C429EA@onstor-exch02.onstor.net>
References: <20080925173806.0b909ed2@ripper.onstor.net>
	<BB375AF679D4A34E9CA8DFA650E2B04E09C429EA@onstor-exch02.onstor.net>
Organization: Onstor
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

On Thu, 25 Sep 2008 17:57:11 -0700 "John Rogers"
<john.rogers@onstor.com> wrote:

> In the same log file, you can see there were still wedged messages in
> the spool directory and the number of processes running on the system
> was piling up. At the time I last checked, it was in the 90s.
> 
> I suspect that those messages in the spool would have remained there
> indefinitely if their headers were derived from bunk config files.
> Exim would have continued to jam up.

Interesting.  Very possible.  I wish I could have had a chance to
analyze that. The tug-of-war between us is that you want the resource
(mightydog) to do its job and leave you alone, and I want to get at the
details of the situation and debug some shit and like that.  Such is
life.

> There wasn't any log at /var/log/onstor/exim-cleaning.log. So
> apparently my cron.daily/exim-rm-frozen script didn't decide that any
> of the messages in the queue were rm worthy. The script says don't
> exceed 40M and rm messages older than 5 days. I don't think we were
> even close to 40M, but certainly there were messages older than 5
> days. In the elog we had the mta mail-queue full messages, maybe that
> prevents the script from running or something.

No, and I've been thinking that perhaps there's some more work that can
be done there with regard to making exim-cleaning take some action based
on the number of useless or frozen jobs, not just the amount of disk
space used.  I'll be looking at that going forward.
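
A rough sketch of what count-based or age-based cleaning might look at,
not the actual exim-cleaning code.  It assumes the queue listing printed
by "exim4 -bp", where the first line of each entry carries the age, size,
message id and sender, and frozen messages end with "*** frozen ***".
The function name and the sample ids used to try it out are hypothetical;
the 5 day limit is the one John's script is described as using.

```python
import re

# Age suffixes used in Exim queue listings (minutes, hours, days).
_UNIT_MINUTES = {"m": 1, "h": 60, "d": 24 * 60}

# First line of a queue entry: age, size, message id, sender, then an
# optional trailing marker such as "*** frozen ***".
_ENTRY = re.compile(r"^\s*(\d+)([mhd])\s+\S+\s+(\S+)\s+\S+(.*)$")


def frozen_older_than(listing, max_age_days=5):
    """Return ids of frozen messages older than max_age_days.

    `listing` is the text of an `exim4 -bp` style queue listing.
    """
    limit = max_age_days * 24 * 60
    stale = []
    for line in listing.splitlines():
        m = _ENTRY.match(line)
        if not m:
            continue  # recipient lines and blank separators
        age = int(m.group(1)) * _UNIT_MINUTES[m.group(2)]
        if "*** frozen ***" in m.group(4) and age > limit:
            stale.append(m.group(3))
    return stale
```

The returned ids could then be counted against some threshold, or handed
to "exim4 -Mrm" one by one, rather than deleting spool files by hand.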

> I think the process should be posted in the wiki, it can be denoted as
> an internal engineering procedure.

OK, that's where I'll put some info up.  I'll let you know when I have
something and you can give me your opinion of it.  I think there's
more that needs doing here with regard to the product, but I'm still
not sure what that is.  Hopefully this experience, and your opinion as
a user of the product, will provide some direction on how to sew this
part of the product up better than it is.
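
As a starting point for the wiki, the restart procedure from the first
mail in this thread (stop exim, "killall exim4", start exim) could be
written down as something like the sketch below.  The init script path
is an assumption (a Debian-style /etc/init.d/exim4); the thread only
says "stop exim" and "start exim".  The function names are hypothetical,
and the default is a dry run that only prints the commands.

```python
import subprocess

# Assumed location of the init script; adjust for the actual system.
INIT_SCRIPT = "/etc/init.d/exim4"


def restart_commands(init_script=INIT_SCRIPT):
    """Return the stop / killall / start sequence as argument lists."""
    return [
        [init_script, "stop"],
        ["killall", "exim4"],  # catch stragglers the stop did not reach
        [init_script, "start"],
    ]


def restart_exim(dry_run=True):
    """Print the restart sequence, or run it when dry_run is False."""
    for cmd in restart_commands():
        if dry_run:
            print(" ".join(cmd))
            continue
        # killall exits non-zero when nothing was left to kill, so only
        # the init script steps are treated as hard failures.
        subprocess.run(cmd, check=(cmd[0] != "killall"))
```

Note that nothing in this sequence touches the files under
/var/spool/exim4; exim is left to clean out the queue itself.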

Thanks,

a

> 
> -----Original Message-----
> From: Andy Sharp 
> Sent: Thursday, September 25, 2008 5:38 PM
> To: John Rogers
> Cc: Larry Scheer
> Subject: Re: proper procedure for exim
> 
> On Thu, 25 Sep 2008 17:30:43 -0700 "John Rogers"
> <john.rogers@onstor.com> wrote:
> 
> > We did just stop and killall and start yesterday and it didn't
> > provide any relief, so the log you saw was what was done today. Btw
> > it seems to have worked great.
> 
> Can you be more specific about what "any relief" means?  It should
> have worked great, so something tells me we're not out of the woods,
> but I'll keep my fingers crossed anyway.
> 
> I'm asking about where to document this because I want to write down
> the procedure for clearing out the queue, but just sending it to the
> two of you seems like not quite getting the word out widely enough.
> 
> > -----Original Message-----
> > From: Andy Sharp 
> > Sent: Thursday, September 25, 2008 5:08 PM
> > To: John Rogers; Larry Scheer
> > Subject: fyi: proper procedure for exim
> > 
> > Howdy guys,
> > 
> > I saw this in the screen log files for the dogfood, and I said,
> > huh, I never said to do that, so I guess you were just doing what
> > comes naturally after using BSD for so long.
> > 
> > Dogfood:/var/spool/exim4/msglog# rm -rf *
> > 
> > Please do not do this in the future, it is not the correct thing to
> > do with exim.  Generally the files in these directories should not
> > be mucked with by hand.  For the future, that's why I told Larry
> > that all that needed doing is to stop exim, do a "killall exim4"
> > and then to start exim.  If file deleting had been necessary, I
> > would have mentioned that.
> > 
> > If things are working as they should, exim will clean out the queue
> > as it sees fit.
> > 
> > I'm not sure where the right place is to document this sort of
> > thing. I think it's important to try and do so because I'm sure
> > that there is nostalgic information floating about the company
> > about how to deal with email queues and things filling up, and I
> > would like to inform folks of the new ways.
> > 
> > Cheers,
> > 
> > a
> > 
