Date: Thu, 7 Feb 2008 18:05:04 -0800
From: Andrew Sharp <andy.sharp@onstor.com>
To: "John Rogers" <john.rogers@onstor.com>
Cc: "Brian Baker" <brian.baker@onstor.com>, "John VanderWerf"
 <john.vanderwerf@onstor.com>
Subject: Re: Possible link problem in the lab
Message-ID: <20080207180504.3557042a@ripper.onstor.net>
In-Reply-To: <BB375AF679D4A34E9CA8DFA650E2B04E0346AF5B@onstor-exch02.onstor.net>
References: <BB375AF679D4A34E9CA8DFA650E2B04E08350B74@onstor-exch02.onstor.net>
	<BB375AF679D4A34E9CA8DFA650E2B04E0346AF5B@onstor-exch02.onstor.net>
Organization: Onstor
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

Well, no, this is a routing problem.  Besides, that traffic is not
high.  And there are only 7 -- mine is running off flash ~:^)  The NFS
root traffic is significant only when booting/starting up, or if a filer
were trying to write an infinite # of elog messages per second, which we
all know they aren't because there aren't any bugs.

Here's another traceroute I just ran when things hung up for 20
seconds:

$ traceroute -n 10.2.10.235
traceroute to 10.2.10.235 (10.2.10.235), 30 hops max, 52 byte packets
 1  10.0.0.19  0.992 ms  0.863 ms  0.924 ms
 2  10.0.0.1  5.101 ms 10.3.0.1  7.861 ms  2.954 ms
 3  * * *
 4  * * *
 5  10.2.10.235  1.166 ms  0.989 ms  1.021 ms

But traceroutes from, say, 10.1.1.189 are always instantaneous:

$ traceroute -n 10.2.10.235
traceroute to 10.2.10.235 (10.2.10.235), 64 hops max, 40 byte packets
 1  10.1.1.1  1.831 ms  0.971 ms  1.70 ms
 2  66.201.51.116  3.670 ms  3.803 ms  2.944 ms
 3  10.0.0.1  2.693 ms  2.881 ms  6.215 ms
 4  10.3.0.1  2.225 ms  3.752 ms  2.196 ms
 5  10.2.10.235  4.280 ms  2.883 ms  2.840 ms
$ traceroute -n 10.2.10.235 
traceroute to 10.2.10.235 (10.2.10.235), 64 hops max, 40 byte packets
 1  10.1.1.1  1.857 ms  1.47 ms  0.966 ms
 2  66.201.51.116  4.897 ms  2.830 ms  6.757 ms
 3  10.0.0.1  2.739 ms  5.300 ms  5.778 ms
 4  10.3.0.1  3.736 ms  2.260 ms  3.264 ms
 5  10.2.10.235  2.874 ms  4.502 ms  2.856 ms
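
For comparing traces like the ones above, a small script can flag the
hops that look off. What follows is only a minimal sketch, assuming
`traceroute -n` style output; the names parse_hops, flag_hops and
SLOW_TRACE are made up for illustration. It marks a hop when every probe
timed out or when more than one router answered at the same TTL. Neither
symptom proves a routing fault on its own (routers may rate-limit ICMP,
and load balancing can legitimately show two addresses at one hop), so
treat the result as a hint about where to look.

```python
import re

# Dotted-quad address as printed by `traceroute -n`.
IP_RE = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")

# The slow trace from this message, pasted in as sample input.
SLOW_TRACE = """\
traceroute to 10.2.10.235 (10.2.10.235), 30 hops max, 52 byte packets
 1  10.0.0.19  0.992 ms  0.863 ms  0.924 ms
 2  10.0.0.1  5.101 ms 10.3.0.1  7.861 ms  2.954 ms
 3  * * *
 4  * * *
 5  10.2.10.235  1.166 ms  0.989 ms  1.021 ms
"""


def parse_hops(text):
    """Return {ttl: {"routers": set, "rtts": list, "lost": int}}."""
    hops = {}
    for line in text.splitlines():
        # Hop lines start with the TTL; the banner and prompts do not.
        m = re.match(r"\s*(\d+)\s+(.*)", line)
        if not m:
            continue
        hop = hops.setdefault(
            int(m.group(1)), {"routers": set(), "rtts": [], "lost": 0}
        )
        for tok in m.group(2).split():
            if tok == "*":                 # probe that got no reply
                hop["lost"] += 1
            elif IP_RE.fullmatch(tok):     # address of the replying router
                hop["routers"].add(tok)
            elif tok != "ms":
                try:
                    hop["rtts"].append(float(tok))
                except ValueError:
                    pass                   # ignore anything unrecognised
    return hops


def flag_hops(hops):
    """List (ttl, reason) for hops that deserve a closer look."""
    flagged = []
    for ttl in sorted(hops):
        hop = hops[ttl]
        if hop["lost"] and not hop["rtts"]:
            flagged.append((ttl, "all probes lost"))
        elif len(hop["routers"]) > 1:
            flagged.append((ttl, "multiple routers answered"))
    return flagged


for ttl, reason in flag_hops(parse_hops(SLOW_TRACE)):
    print(f"hop {ttl}: {reason}")
```

Run against the slow trace, this flags hop 2 (both 10.0.0.1 and 10.3.0.1
answered) and hops 3 and 4 (no replies); the traces from 10.1.1.189 come
back with nothing flagged.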




On Thu, 7 Feb 2008 17:26:14 -0800 "John Rogers"
<john.rogers@onstor.com> wrote:

> We are looking into this. Offhand I would say that it's the NFS root
> file systems on rack 10 causing the network there to be near or at
> capacity. There are 8 cougars all NFS-root mounted there and most of
> them are in heavy QE test. I'd say I wouldn't be too far off in saying
> that the 100Mb network there just ain't cutting it.
> 
>  
> 
> ________________________________
> 
> From: Brian Baker 
> Sent: Thursday, February 07, 2008 5:23 PM
> To: John Rogers; John VanderWerf; Andy Sharp
> Subject: Possible link problem in the lab
> 
>  
> 
> Johns,
> 
> Andy is experiencing high latency in the lab. Corp appears to hand off
> this traffic, but 10.3.0.1 is lagging to 10.2.10.235.
> 
>  
> 
> 3 - 2/7/2008 5:19:05 PM - Brian Baker (Brian Baker) - Closed
> 
> Andy, thanks for the info. It tells me enough to know it's not my
> problem ;) Corp hands off at 10.0.0.1. The problem appears to be at
> 10.3.0.1. This is elab's domain. I will forward this info to the
> Johns, but you may want to re-enter this ticket through the elab
> support system.
> 
> 
> 
>  
> 
> 2 - 2/7/2008 5:11:38 PM - Andy Sharp (Guest) - Edit
> 
> Another traceroute:
> 
> ripper:~$ traceroute 10.2.10.235 
> traceroute to 10.2.10.235 (10.2.10.235), 30 hops max, 52 byte packets
> 1  10.0.0.19 (10.0.0.19)  0.990 ms  0.871 ms  0.960 ms
> 2  10.0.0.1 (10.0.0.1)  0.901 ms 10.3.0.1 (10.3.0.1)  2.004 ms  0.454 ms
> 3  * * *
> 4  * * *
> 5  * 10.2.10.235 (10.2.10.235)  28.732 ms *
> 
> 
> 
>  
> 
> 1 - 2/7/2008 4:52:15 PM - Andy Sharp (Guest) - Create
> 
> OK, I got a little excited there, but network routing between my
> workstation on 10.0.0.42 and the terminal servers (10.2.10.23[56])
> seems to be hurting, causing very bad response times at times: 5-10
> seconds for a keystroke.
> 
> I ran 3 traceroutes in a row; you can see something isn't right:
> 
> ripper:~/src/dev$ traceroute 10.2.10.235
> traceroute to 10.2.10.235 (10.2.10.235), 30 hops max, 52 byte packets
> 1  10.0.0.19 (10.0.0.19)  1.001 ms  0.909 ms  0.877 ms
> 2  10.0.0.1 (10.0.0.1)  1.010 ms 10.3.0.1 (10.3.0.1)  1.921 ms  0.399 ms
> 3  * 10.2.10.235 (10.2.10.235)  37.671 ms *
> ripper:~/src/dev$ traceroute 10.2.10.235
> traceroute to 10.2.10.235 (10.2.10.235), 30 hops max, 52 byte packets
> 1  10.0.0.1 (10.0.0.1)  1.061 ms  0.980 ms  0.940 ms
> 2  10.3.0.1 (10.3.0.1)  0.469 ms  0.400 ms  0.448 ms
> 3  * * 10.2.10.235 (10.2.10.235)  1.242 ms
> ripper:~/src/dev$ traceroute 10.2.10.235
> traceroute to 10.2.10.235 (10.2.10.235), 30 hops max, 52 byte packets
> 1  10.0.0.1 (10.0.0.1)  0.987 ms  2.099 ms  0.976 ms
> 2  10.3.0.1 (10.3.0.1)  0.600 ms  0.403 ms  0.397 ms
> 3  10.2.10.235 (10.2.10.235)  1.056 ms  0.917 ms  0.913 ms
> 
> 
> 
> 
> 
>  
> 
