AF:
NF:0
PS:10
SRH:1
SFN:
DSR:
MID:<20081109112936.081c809c@ripper.onstor.net>
CFG:
PT:0
S:andy.sharp@onstor.com
RQ:
SSV:exch1.onstor.net
NSV:
SSH:
R:<sandrine.boulanger@onstor.com>
MAID:1
X-Sylpheed-Privacy-System:
X-Sylpheed-Sign:0
SCF:#mh/Mailbox/sent
RMID:#imap/andys@onstor.net@exch1.onstor.net/INBOX	0	2779531E7C760D4491C96305019FEEB5175D4023F6@exch1.onstor.net
X-Sylpheed-End-Special-Headers: 1
Date: Sun, 9 Nov 2008 11:30:29 -0800
From: Andrew Sharp <andy.sharp@onstor.com>
To: Sandrine Boulanger <sandrine.boulanger@onstor.com>
Subject: Re: status after reboot
Message-ID: <20081109113029.11ecfcfe@ripper.onstor.net>
In-Reply-To: <2779531E7C760D4491C96305019FEEB5175D4023F6@exch1.onstor.net>
References: <2779531E7C760D4491C96305019FEEB5175C0037A4@exch1.onstor.net>
	<20081107193442.0a23194a@ripper.onstor.net>
	<2779531E7C760D4491C96305019FEEB5175D4023F6@exch1.onstor.net>
Organization: Onstor
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

I'm running an even more experimental version on my cougar, which I
thought would work better, but it has similar stuck processes on it.
I'll work on it some more today.  ~:^(

On Sat, 8 Nov 2008 11:11:28 -0800 Sandrine Boulanger
<sandrine.boulanger@onstor.com> wrote:

> If you said you thawed everything on g11r10 last night and fixed the
> hosts file, how come we have so many stuck processes since then? I have
> never seen that many on cougar soak yet. What can we try next?
> 
> g11r10:~# ps ax -o pid,ppid,tt,wchan,state,start,time,command | grep exim
>   771  1311 ?        wait   S 04:29:54 00:00:00 /usr/sbin/exim4 -q
>   775   771 ?        select S 04:29:55 00:00:01 /usr/sbin/exim4 -q
>   802     1 ?        wait   S 16:33:19 00:00:00 /usr/sbin/exim4 -q
>   811   802 ?        select S 16:33:20 00:00:02 /usr/sbin/exim4 -q
>  1311     1 ?        select S 16:34:07 00:00:00 /usr/sbin/exim4 -bd -q30m
>  1972     1 ?        select S 23:12:02 00:00:01 /usr/sbin/exim4 -Mc 1KyhzG-0000Vc-Dz
>  6102     1 ?        select S 23:14:02 00:00:01 /usr/sbin/exim4 -Mc 1Kyi1C-0001aP-7m
>  6475     1 ?        select S 06:50:04 00:00:00 /usr/sbin/exim4 -Mc 1Kyp8V-0001gQ-MV
>  9068  1311 ?        wait   S 04:59:54 00:00:00 /usr/sbin/exim4 -q
>  9080  9068 ?        select S 04:59:58 00:00:00 /usr/sbin/exim4 -q
>  9875  6694 pts/1    -      R 11:07:46 00:00:00 grep exim
> 13653     1 ?        select S 05:20:03 00:00:00 /usr/sbin/exim4 -Mc 1KynjP-0003YC-0e
> 15420     1 ?        select S 07:30:03 00:00:00 /usr/sbin/exim4 -Mc 1KyplD-00040B-6C
> 15558     1 ?        select S 09:30:04 00:00:00 /usr/sbin/exim4 -Mc 1KyrdL-00042U-M0
> 15897  1311 ?        wait   S 05:29:54 00:00:00 /usr/sbin/exim4 -q
> 15902 15897 ?        select S 05:29:55 00:00:00 /usr/sbin/exim4 -q
> 17151  1311 ?        wait   S 03:29:54 00:00:00 /usr/sbin/exim4 -q
> 17154 17151 ?        select S 03:29:55 00:00:01 /usr/sbin/exim4 -q
> 20159     1 ?        select S 07:38:04 00:00:00 /usr/sbin/exim4 -MCS -MCP -MC remote_smtp mail.onstor.com 66.201.51.107 2 1Kypr0-00056v-3v
> 20866     1 ?        select S 07:40:07 00:00:00 /usr/sbin/exim4 -MCS -MCP -MC remote_smtp mail.onstor.com 66.201.51.107 3 1Kypn8-0004uM-69
> 20988     1 ?        select S 07:42:04 00:00:00 /usr/sbin/exim4 -MCS -MCP -MC remote_smtp mail.onstor.com 66.201.51.107 2 1KypjG-0003v2-8F
> 21430     1 ?        select S 05:40:04 00:00:01 /usr/sbin/exim4 -Mc 1Kyo2l-0005Zd-MT
> 22904     1 ?        select S 23:06:03 00:00:01 /usr/sbin/exim4 -Mc 1KyhtS-0005x7-NV
> 26583  1311 ?        wait   S 03:59:54 00:00:00 /usr/sbin/exim4 -q
> 26586 26583 ?        select S 03:59:55 00:00:00 /usr/sbin/exim4 -q
> 32630     1 ?        select S 02:30:05 00:00:00 /usr/sbin/exim4 -Mc 1Kyl4t-0008TS-Kt
> g11r10:~#
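
A listing like the one above is easier to compare between nodes once it is
tallied by role. The following is only a rough sketch, not a tool we have:
it assumes one process per line in the pid,ppid,tt,wchan,state,start,time,command
layout, and the role labels are my reading of the exim4 options (-bd for the
listening daemon, -q for a queue runner, -Mc for a delivery attempt on one
message, -MC for a delivery continued on an existing SMTP connection). The
function names and the "reparented to pid 1" heuristic are made up for
illustration. The sample rows are copied from the listing.

```python
from collections import Counter

# A few rows copied from the quoted listing, one process per line.
SAMPLE = """\
  771  1311 ?        wait   S 04:29:54 00:00:00 /usr/sbin/exim4 -q
  775   771 ?        select S 04:29:55 00:00:01 /usr/sbin/exim4 -q
 1311     1 ?        select S 16:34:07 00:00:00 /usr/sbin/exim4 -bd -q30m
 1972     1 ?        select S 23:12:02 00:00:01 /usr/sbin/exim4 -Mc 1KyhzG-0000Vc-Dz
 9875  6694 pts/1    -      R 11:07:46 00:00:00 grep exim
20159     1 ?        select S 07:38:04 00:00:00 /usr/sbin/exim4 -MCS -MCP -MC remote_smtp mail.onstor.com 66.201.51.107 2 1Kypr0-00056v-3v
"""


def parse_ps(text):
    """Split each row into the eight requested columns; the command keeps its spaces."""
    rows = []
    for line in text.splitlines():
        fields = line.split(None, 7)
        if len(fields) < 8 or not fields[0].isdigit():
            continue  # skip blank lines, prompts, and any header row
        pid, ppid, tt, wchan, state, start, cpu_time, command = fields
        rows.append({"pid": int(pid), "ppid": int(ppid), "wchan": wchan,
                     "start": start, "command": command})
    return rows


def classify(command):
    """Label a command line by its exim4 options (labels are an assumption)."""
    args = command.split()
    if not args or not args[0].endswith("exim4"):
        return "other"
    if "-bd" in args:
        return "daemon"
    if "-MC" in args:
        return "continued delivery"
    if "-Mc" in args:
        return "single-message delivery"
    if "-q" in args:
        return "queue runner"
    return "exim (unclassified)"


def tally(rows):
    """Count processes per role."""
    return Counter(classify(row["command"]) for row in rows)


def reparented(rows):
    """PIDs of non-daemon exim processes whose parent is pid 1 (heuristic only)."""
    return [row["pid"] for row in rows
            if row["ppid"] == 1
            and classify(row["command"]) not in ("other", "daemon")]


if __name__ == "__main__":
    rows = parse_ps(SAMPLE)
    for role, count in sorted(tally(rows).items()):
        print(f"{count:3d}  {role}")
    print("reparented to pid 1:", reparented(rows))
```

Running the same tally against each node's output would show whether the
delivery processes or the queue runners are the ones piling up.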
> 
> -----Original Message-----
> From: Andy Sharp 
> Sent: Friday, November 07, 2008 7:35 PM
> To: Sandrine Boulanger
> Subject: Re: status after reboot
> 
> I fixed g11r10. The format of the hosts file is very important to exim
> for some reason: the bare node name has to come before the node.sc0
> name.
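
The ordering rule above could be checked on each node before a soak run.
This is a minimal sketch under stated assumptions: the helper name
bare_name_first is hypothetical, the 10.0.0.10 address is a placeholder
rather than the real one, and it only looks at lines that carry both the
bare node name and the node.sc0 name.

```python
def bare_name_first(hosts_text, node):
    """Return True if no hosts line lists '<node>.sc0' ahead of the bare '<node>'."""
    alias = node + ".sc0"
    for line in hosts_text.splitlines():
        # Drop any trailing comment, then drop the leading address field.
        names = line.split("#", 1)[0].split()[1:]
        if node in names and alias in names:
            if names.index(alias) < names.index(node):
                return False
    return True


if __name__ == "__main__":
    # Placeholder hosts entries; the address is not the real one.
    good = "127.0.0.1 localhost\n10.0.0.10 g11r10 g11r10.sc0\n"
    bad = "127.0.0.1 localhost\n10.0.0.10 g11r10.sc0 g11r10\n"
    print(bare_name_first(good, "g11r10"))
    print(bare_name_first(bad, "g11r10"))
```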
> 
> I unfroze the messages, and the next queue run they all got thrown
> away.
> 
> I'm having trouble getting to g12r10.
> 
> 
> 
> 
> On Fri, 7 Nov 2008 17:43:22 -0800 Sandrine Boulanger
> <sandrine.boulanger@onstor.com> wrote:
> 
> > The 2 nodes I rebooted have extra processes, and one of them
> > already has 160 frozen messages.
> > 
> > 
> > g12r10:/var/log/onstor#  ps ax | grep exim
> >   816 ?        S      0:00 /usr/sbin/exim4 -q
> >   819 ?        S      0:00 /usr/sbin/exim4 -q
> >  1263 ?        Ss     0:00 /usr/sbin/exim4 -bd -q30m
> > 13474 pts/0    R+     0:00 grep exim
> > 
> > g11r10:/var/log/onstor# ps ax | grep exim
> >   802 ?        S      0:00 /usr/sbin/exim4 -q
> >   811 ?        S      0:00 /usr/sbin/exim4 -q
> >  1311 ?        Ss     0:00 /usr/sbin/exim4 -bd -q30m
> > 25076 pts/0    R+     0:00 grep exim
> > 
> > g11r10:/var/log/onstor# exiqgrep -z -c
> > 160 matches out of 161 messages
> > 
