AF:
NF:0
PS:10
SRH:1
SFN:
DSR:
MID:<20081109211308.29f3db0e@ripper.onstor.net>
CFG:
PT:0
S:andy.sharp@onstor.com
RQ:
SSV:exch1.onstor.net
NSV:
SSH:
R:<sandrine.boulanger@onstor.com>
MAID:1
X-Sylpheed-Privacy-System:
X-Sylpheed-Sign:0
SCF:#mh/Mailbox/sent
RMID:#imap/andys@onstor.net@exch1.onstor.net/INBOX	0	2779531E7C760D4491C96305019FEEB5175D40243A@exch1.onstor.net
X-Sylpheed-End-Special-Headers: 1
Date: Sun, 9 Nov 2008 21:13:17 -0800
From: Andrew Sharp <andy.sharp@onstor.com>
To: Sandrine Boulanger <sandrine.boulanger@onstor.com>
Subject: Re: status after reboot
Message-ID: <20081109211317.15c5d3e0@ripper.onstor.net>
In-Reply-To: <2779531E7C760D4491C96305019FEEB5175D40243A@exch1.onstor.net>
References: <2779531E7C760D4491C96305019FEEB5175C0037A4@exch1.onstor.net>
	<20081107193442.0a23194a@ripper.onstor.net>
	<2779531E7C760D4491C96305019FEEB5175D4023F6@exch1.onstor.net>
	<20081109113029.11ecfcfe@ripper.onstor.net>
	<2779531E7C760D4491C96305019FEEB5175D40243A@exch1.onstor.net>
Organization: Onstor
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

I put yet another test version on g11r10, and I'm hoping for the best.
I'd like to see how it's doing in the morning.
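
For the morning check, a rough sketch like the one below could tally the
exim processes by role instead of eyeballing the ps listing. It assumes
the binary lives at /usr/sbin/exim4 and that ps is invoked with the same
column list as in the output quoted further down. The function names and
role labels are made up for illustration.

```python
import subprocess
from collections import Counter

# Assumption: same binary path and ps columns as in the quoted listing.
EXIM_PATH = "/usr/sbin/exim4"
PS_COLUMNS = "pid,ppid,tt,wchan,state,start,time,command"


def classify(command):
    """Bucket an exim4 command line by role, based on its flags."""
    args = command.split()
    if "-bd" in args:
        return "daemon (-bd)"
    if "-Mc" in args:
        return "message delivery (-Mc)"
    if "-MC" in args:
        return "continued SMTP delivery (-MC)"
    if "-q" in args:
        return "queue runner (-q)"
    return "other"


def tally(ps_output):
    """Count exim4 processes per role in the text output of ps."""
    counts = Counter()
    for line in ps_output.splitlines():
        # Locate the command by its path so the column layout in front of
        # it does not matter; lines such as "grep exim" are skipped.
        idx = line.find(EXIM_PATH)
        if idx == -1:
            continue
        counts[classify(line[idx:])] += 1
    return counts


def current_tally():
    """Run ps on the local node and tally its exim4 processes."""
    result = subprocess.run(
        ["ps", "ax", "-o", PS_COLUMNS],
        capture_output=True, text=True, check=True,
    )
    return tally(result.stdout)
```

Calling current_tally() on the node and printing the counts should make
it obvious whether the -Mc and -MC deliveries are piling up again.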

Thanks,

a


On Sun, 9 Nov 2008 14:33:43 -0800 Sandrine Boulanger
<sandrine.boulanger@onstor.com> wrote:

> Too bad. Good luck. I'm wondering if we would be better off using the
> Bobcat method for sending emails. 
> 
> -----Original Message-----
> From: Andy Sharp 
> Sent: Sunday, November 09, 2008 11:30 AM
> To: Sandrine Boulanger
> Subject: Re: status after reboot
> 
> I'm running an even more experimental version on my cougar, which I
> thought would work better, but it has similar stuck processes on it.
> I'll work on it some more today.  ~:^(
> 
> On Sat, 8 Nov 2008 11:11:28 -0800 Sandrine Boulanger
> <sandrine.boulanger@onstor.com> wrote:
> 
> > If you thawed everything on g11r10 last night and fixed the hosts
> > file, how come we have so many stuck since then? I have never seen
> > that many on a cougar soak yet. What can we try next?
> > 
> > g11r10:~# ps ax -o pid,ppid,tt,wchan,state,start,time,command | grep exim
> >   771  1311 ?        wait   S 04:29:54 00:00:00 /usr/sbin/exim4 -q
> >   775   771 ?        select S 04:29:55 00:00:01 /usr/sbin/exim4 -q
> >   802     1 ?        wait   S 16:33:19 00:00:00 /usr/sbin/exim4 -q
> >   811   802 ?        select S 16:33:20 00:00:02 /usr/sbin/exim4 -q
> >  1311     1 ?        select S 16:34:07 00:00:00 /usr/sbin/exim4 -bd -q30m
> >  1972     1 ?        select S 23:12:02 00:00:01 /usr/sbin/exim4 -Mc 1KyhzG-0000Vc-Dz
> >  6102     1 ?        select S 23:14:02 00:00:01 /usr/sbin/exim4 -Mc 1Kyi1C-0001aP-7m
> >  6475     1 ?        select S 06:50:04 00:00:00 /usr/sbin/exim4 -Mc 1Kyp8V-0001gQ-MV
> >  9068  1311 ?        wait   S 04:59:54 00:00:00 /usr/sbin/exim4 -q
> >  9080  9068 ?        select S 04:59:58 00:00:00 /usr/sbin/exim4 -q
> >  9875  6694 pts/1    -      R 11:07:46 00:00:00 grep exim
> > 13653     1 ?        select S 05:20:03 00:00:00 /usr/sbin/exim4 -Mc 1KynjP-0003YC-0e
> > 15420     1 ?        select S 07:30:03 00:00:00 /usr/sbin/exim4 -Mc 1KyplD-00040B-6C
> > 15558     1 ?        select S 09:30:04 00:00:00 /usr/sbin/exim4 -Mc 1KyrdL-00042U-M0
> > 15897  1311 ?        wait   S 05:29:54 00:00:00 /usr/sbin/exim4 -q
> > 15902 15897 ?        select S 05:29:55 00:00:00 /usr/sbin/exim4 -q
> > 17151  1311 ?        wait   S 03:29:54 00:00:00 /usr/sbin/exim4 -q
> > 17154 17151 ?        select S 03:29:55 00:00:01 /usr/sbin/exim4 -q
> > 20159     1 ?        select S 07:38:04 00:00:00 /usr/sbin/exim4 -MCS -MCP -MC remote_smtp mail.onstor.com 66.201.51.107 2 1Kypr0-00056v-3v
> > 20866     1 ?        select S 07:40:07 00:00:00 /usr/sbin/exim4 -MCS -MCP -MC remote_smtp mail.onstor.com 66.201.51.107 3 1Kypn8-0004uM-69
> > 20988     1 ?        select S 07:42:04 00:00:00 /usr/sbin/exim4 -MCS -MCP -MC remote_smtp mail.onstor.com 66.201.51.107 2 1KypjG-0003v2-8F
> > 21430     1 ?        select S 05:40:04 00:00:01 /usr/sbin/exim4 -Mc 1Kyo2l-0005Zd-MT
> > 22904     1 ?        select S 23:06:03 00:00:01 /usr/sbin/exim4 -Mc 1KyhtS-0005x7-NV
> > 26583  1311 ?        wait   S 03:59:54 00:00:00 /usr/sbin/exim4 -q
> > 26586 26583 ?        select S 03:59:55 00:00:00 /usr/sbin/exim4 -q
> > 32630     1 ?        select S 02:30:05 00:00:00 /usr/sbin/exim4 -Mc 1Kyl4t-0008TS-Kt
> > g11r10:~#
> > 
> > -----Original Message-----
> > From: Andy Sharp 
> > Sent: Friday, November 07, 2008 7:35 PM
> > To: Sandrine Boulanger
> > Subject: Re: status after reboot
> > 
> > I fixed g11r10. The format of the hosts file is very important to
> > exim for some reason: the bare node name has to come before the
> > node.sc0 name.
> > 
> > I unfroze the messages, and on the next queue run they all got
> > thrown away.
> > 
> > I'm having trouble getting to g12r10.
> > 
> > 
> > 
> > 
> > On Fri, 7 Nov 2008 17:43:22 -0800 Sandrine Boulanger
> > <sandrine.boulanger@onstor.com> wrote:
> > 
> > > The 2 nodes I rebooted have extra processes, and one of them
> > > already has 160 frozen messages.
> > > 
> > > 
> > > g12r10:/var/log/onstor#  ps ax | grep exim
> > >   816 ?        S      0:00 /usr/sbin/exim4 -q
> > >   819 ?        S      0:00 /usr/sbin/exim4 -q
> > >  1263 ?        Ss     0:00 /usr/sbin/exim4 -bd -q30m
> > > 13474 pts/0    R+     0:00 grep exim
> > > 
> > > g11r10:/var/log/onstor# ps ax | grep exim
> > >   802 ?        S      0:00 /usr/sbin/exim4 -q
> > >   811 ?        S      0:00 /usr/sbin/exim4 -q
> > >  1311 ?        Ss     0:00 /usr/sbin/exim4 -bd -q30m
> > > 25076 pts/0    R+     0:00 grep exim
> > > 
> > > g11r10:/var/log/onstor# exiqgrep -z -c
> > > 160 matches out of 161 messages
> > > 
