Date: Wed, 21 Oct 2009 15:45:03 -0700
From: Andrew Sharp <andy.sharp@lsi.com>
To: "Kozlovsky, Maxim" <Maxim.Kozlovsky@lsi.com>
Cc: "Fong, Rendell" <Rendell.Fong@lsi.com>, "Stark, Brian"
 <Brian.Stark@lsi.com>, "Fisher, Bill" <Bill.Fisher@lsi.com>, "Ariyamannil,
 Jobi" <Jobi.Ariyamannil@lsi.com>, "Vandever, Chris"
 <Chris.Vandever@lsi.com>
Subject: Re: description of virtual server changes
Message-ID: <20091021154503.2edd5d7e@ripper.onstor.net>
In-Reply-To: <861DA0537719934884B3D30A2666FECC010B98CCF8@cosmail02.lsi.com>
References: <1255973717.20354.112.camel@rendellf>
	<861DA0537719934884B3D30A2666FECC010B98C96A@cosmail02.lsi.com>
	<1256146428.20354.174.camel@rendellf>
	<861DA0537719934884B3D30A2666FECC010B98CCF8@cosmail02.lsi.com>
Organization: LSI
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

On Wed, 21 Oct 2009 13:42:27 -0600 "Kozlovsky, Maxim"
<Maxim.Kozlovsky@lsi.com> wrote:

> 
> 
> -----Original Message-----
> From: Rendell Fong [mailto:Rendell.Fong@lsi.com] 
> Sent: Wednesday, October 21, 2009 10:34 AM
> To: Kozlovsky, Maxim
> Cc: Sharp, Andy; Stark, Brian; Fisher, Bill; Ariyamannil, Jobi;
> Vandever, Chris
> Subject: RE: description of virtual server changes
> 
> > 4. Which setsockopt()/getsockopt() call is that? 
> IP_PKTINFO or perhaps SO_BINDTODEVICE.  What ever will work for us.
> Do you know of some other way to accomplish this?
> [MK] 
> 
> IP_PKTINFO might work. In the cases where the virtual server has
> multiple addresses this may become ugly, you will have to make some
> other call into the kernel to determine which source IP address to
> pick before sending each packet. SO_BINDTODEVICE can't be of use here.

Can you list the use cases you're referring to?
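
For reference, here is a minimal sketch of the per-packet IP_PKTINFO
approach being discussed, assuming Linux and an IPv4 UDP socket.  The
helper name send_from() is hypothetical and is not taken from the
design document.  The source address travels as ancillary data on the
single sendmsg() call, so no socket-wide state is changed between
"set" and "send".

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes struct in_pktinfo on glibc */
#endif
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: send one UDP datagram with an explicitly chosen
 * local source address.  Returns what sendmsg() returns. */
ssize_t send_from(int fd, const struct sockaddr_in *dst, struct in_addr src,
                  const void *buf, size_t len)
{
    /* The union keeps the control buffer aligned for struct cmsghdr. */
    union {
        char buf[CMSG_SPACE(sizeof(struct in_pktinfo))];
        struct cmsghdr align;
    } ctl;
    struct iovec iov = { .iov_base = (void *)buf, .iov_len = len };
    struct msghdr msg;
    struct cmsghdr *cmsg;
    struct in_pktinfo pi;

    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_name = (void *)dst;
    msg.msg_namelen = sizeof(*dst);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = IPPROTO_IP;
    cmsg->cmsg_type = IP_PKTINFO;
    cmsg->cmsg_len = CMSG_LEN(sizeof(pi));

    memset(&pi, 0, sizeof(pi));     /* ipi_ifindex 0: let routing choose */
    pi.ipi_spec_dst = src;          /* local source address for this packet */
    memcpy(CMSG_DATA(cmsg), &pi, sizeof(pi));

    return sendmsg(fd, &msg, 0);
}
```

The caller still has to decide which source address to pass in, which
is the part that gets awkward when a vsvr has several addresses.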

> How is connect() going to be handled? This needs to be spelled out.

You need to spell out what you're talking about.  Connect() in what
context?  A client calling connect()?  One of our daemons calling
connect()?

> > How are packets from the service VM going to be forwarded through
> > the virtual server interfaces? 
> This is not an issue in the merged approach.  If not the case, the
> existing scheme could be used.
> [MK] The document needs to describe how this is going to be done.

The document will only cover vsvrs in phase I and II.  VMs, if they're
still part of the picture, will have to be examined separately when we
get to them.

> > They can't call setsockopt(). What is the kernel interface for the
> > same? 
> Don't know yet. Needs some investigation.
> [MK] ok, please investigate.

We will investigate in the fullness of time.

In the meantime, the kernel interfaces are unchanged from what is
currently in eee, and the document lists those API functions.  That is,
for some definition of unchanged: very slight modifications that no one
need concern themselves with too terribly much right now.

How the kernel does this or that trivial network operation is an
unimportant implementation detail as far as I can see.  Right before
someone needs to implement the functionality of one of the APIs,
they will look up the relevant kernel function call and type it in.

> > > Doing setsockopt() and then sending a packet will not work because
> > > kernel is multithreaded. Setsockopt() will not work for
> > > multithreaded user space code either.
> > 
> > > 5. Currently there are separate listening sockets per virtual
> > > server on txrx. Is this going to be changed?  
> > 
> > Should it be changed?  Is there some issue that needs to be dealt
> > with?

> [MK] The nfs and netbios outgoing udp sockets need to change, they
> bind to INADDR_ANY, which will not work anymore.

It should work if we can bind them to just the interfaces.  Some extra
footwork might be needed for the multiple-interface cases.
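
One possible reading of "bind to just the interfaces" is to bind each
outgoing UDP socket to a specific local address instead of INADDR_ANY.
A minimal sketch of that, assuming IPv4; the helper name
udp_socket_on() is hypothetical.  SO_BINDTODEVICE would be the other
candidate, but it normally needs elevated privileges, so it is not
shown here.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a UDP socket bound to one local IPv4
 * address rather than INADDR_ANY.  Port 0 asks the kernel to pick an
 * ephemeral port.  Returns the descriptor, or -1 on any failure. */
int udp_socket_on(const char *local_ip, unsigned short port)
{
    struct sockaddr_in sa;
    int fd;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    if (inet_pton(AF_INET, local_ip, &sa.sin_addr) != 1)
        return -1;              /* not a valid dotted-quad address */

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
        close(fd);              /* address not local, or port in use */
        return -1;
    }
    return fd;
}
```

With several addresses per vsvr this means one socket per address,
which is presumably where the extra footwork comes in.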

> The NFS/CIFS create a listening socket per IP address. While this is
> not optimal solution, if it works there is no need to change.

> > 6. If this is going to be changed, there needs to be some code to
> > find out correct virtual server when a new connection is created or
> > a packet is received on listening udp socket.
> > 
> There will have to be some new api created to use instead of the
> vstack macros for getting/setting vsvr context.
> [MK] I don't see a need for such api, the virtual server context
> should not be a global variable and should be determined from a
> connection or a packet structures.
> 
> > 7. Why do we need to reduce the reference count dependencies?
> > Reference counts are there so you can depend on them.
> > 
> Some vsvr references may not need to be held in which case the
> reference count wouldn't be changed.
> [MK] Sounds like hand waving. Do you have an example of the code
> which has a problem? If not please remove this section from the
> document.

This is just about the arguments to some of the vs_ methods.  Reference
counting isn't free.  Rather than copying a pointer and then maintaining
a ref count regardless, copy an integer and be done.  Those places that
*actually* need a copy of the pointer can get one.  A minor point,
really.
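
To illustrate the distinction, here is a toy sketch.  Every name in it
(struct vsvr, vsvr_table, vs_id_valid(), vsvr_get(), vsvr_put()) is
made up for the example and is not the actual vs_ API; real code would
also need atomic or locked counters.

```c
#include <stddef.h>

#define MAX_VSVR 4

/* Toy virtual server object with a plain (non-atomic) reference count. */
struct vsvr {
    int id;
    int refcnt;
};

static struct vsvr vsvr_table[MAX_VSVR] = {
    { 0, 0 }, { 1, 0 }, { 2, 0 }, { 3, 0 },
};

/* Callers that only need to name a vsvr pass the integer id around.
 * Nothing is pinned, so there is no reference to take or drop. */
int vs_id_valid(int id)
{
    return id >= 0 && id < MAX_VSVR;
}

/* Callers that really need the object look it up and take a reference. */
struct vsvr *vsvr_get(int id)
{
    if (!vs_id_valid(id))
        return NULL;
    vsvr_table[id].refcnt++;
    return &vsvr_table[id];
}

/* Drop a reference taken with vsvr_get(). */
void vsvr_put(struct vsvr *vs)
{
    if (vs != NULL)
        vs->refcnt--;
}
```

Passing the id costs nothing; only the get/put pair touches the count.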

> > 8. In the current code the networking can be reconfigured without
> > disabling a virtual server, are there going to be any limitations
> > regarding that? The supporting code has to deal with terminating
> > the connections that have been established to the deleted ip
> > addresses.
> > 
> There shouldn't be.
> 
> > 9. Is the code to terminate all virtual server connections when the
> > virtual server is disabled part of these changes?
> > 
> I guess so.  Haven't thought about it yet.
> [MK] We need to know who is going to do this code. If it is not you
> then it has to be Bill. You can't both implement this functionality,
> or both not implement it.

What's wrong with the current code?  In the case of an interface being
brought down, the kernel will do the right thing.  The backend code
should already be there in some form.

> > 10. interface enable/disable - this does not enable or disable the
> > physical link, this disables the virtual server ability to
> > send/receive packets using the ip addresses assigned to the
> > interface and closes all the connections established on these
> > addresses.
> > 
> ok
> 
> > Max
> > 
> > -----Original Message-----
> > From: Rendell Fong [mailto:Rendell.Fong@lsi.com] 
> > Sent: Monday, October 19, 2009 10:35 AM
> > To: Sharp, Andy; Stark, Brian; Kozlovsky, Maxim; Fisher, Bill;
> > Ariyamannil, Jobi
> > Cc: Vandever, Chris
> > Subject: description of virtual server changes
> > 
> > Enclosed is a description of the virtual server changes (as I see
> > it so far) for TuxStor.
> > 
> 
