Date: Wed, 19 Dec 2007 16:07:02 -0800
From: Andrew Sharp <andy.sharp@onstor.com>
To: Matt Williams <mrw@originatelabs.com>
Cc: Jonathan Goldick <jonathan.goldick@onstor.com>, Robert Diamond
 <robert@originatelabs.com>, Andrew Thompson <andrew@originatelabs.com>,
 mosath@originatelabs.com
Subject: Re: notes from 12/19 meeting
Message-ID: <20071219160702.6295507a@ripper.onstor.net>
In-Reply-To: <2EED6A16-5B1C-4E96-A135-8020BA70198D@originatelabs.com>
References: <2EED6A16-5B1C-4E96-A135-8020BA70198D@originatelabs.com>
Organization: Onstor
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

On Wed, 19 Dec 2007 15:49:25 -0800 Matt Williams
<mrw@originatelabs.com> wrote:

> for custom builds, start with debian source packages, build our own
> packages off of that.  for handling updates and upgrades.
> 
> does stuff with patches -- merges main source tree back into

examples:

any of the packages in perforce://depot/dev/linux/Pkgs/source/

Currently only kexec-tools, exim4 and wget are there, but after some
floggings, sshd, kerberos and openldap will also be there.  libc is an
example of a package that is too difficult to do in this manner.  At
least at the last assessment.  It is my hope and dream that System-X
will not require any libc modifications.

> file system standard - LSB
> 
> choose what requires the least amount of maintenance
> 
> trick with debian:
> often, we can install kernel packages from next release
> 
> in final blush of this thing in '09, we'll probably be running a
> custom kernel.  - stable release, go for it.  pick something that will
> be up in february.
> 
> think we're okay with per-interface arp as-is; was unclear
> documentation.
> 
> physical interfaces are parents of logical interfaces
> 
> packages -> binary / source
>   check in binaries - those are used to build the running system
>   check in the .deb files
>   when we do compile our own, we'll check in code *and* the .deb
> 
> specifics of customizing things wrt a custom debian repository and
> relative to perforce.  don't need to reinvent wheel here but we do
> need to pick a wheel.  pita to change it after the fact.
> 
> minimize compilation for developers.
> 
> have some good documentation - a howto - for how to compile build and
> work with.
> 
> action item for us to work on that issue.
> 
> git - read up on it
> Andy keeps full git repository checked into perforce - but that's
> mostly for his MIPS work, not sure if it applies to x86 work.
> Absolutely needed for his MIPS work.
> 
> 
> Have done some work on socket-binding API.  For Cougar project.  Andy
> will send us pointer to source.

see pointer above.  also, the actual hacks on glibc are in
//depot/dev/linux/src/glibc-2.3.6.ds1/

> 
> 
> ---
> 
> 
> mgmt0 can be busted!  make sure that the one IP address does wind up
> going to an interface that's actually up.
> 
> 
> would need a wrapper to implement retry loops -- have to handle
> "reliable open"
> 
> receive callback, close callback
> 
> throttle not needed for sctp
> 
> would be nice to have an "open" callback, a "receive" callback, and a
> "close" callback.
> 
> one API for both userspace and kernelspace implementations of it
> 
> have to have acknowledgements that send is completed.
> 
> don't think that we need a "send noack" - unreliable send
> 
> currently includes opaque pointer to convey context
> 
> currently includes a return code for the embedded RPC errors (which is
> largely unnecessary).  not a lot of value out of having multiple
> error code spaces.  transport should only know status of transport
> layer.
> 
> current problem is that RMC is so hard to call that nobody ever calls
> it right.
> 
> RMC includes an alloc function for its own crazy thing.
> 
> "RMC has an RPC capability.  Don't ever use it."
> 
> RMC header included session information; only one file descriptor per
> communications pair (no matter how many sessions)
> 
> <process monitor>
> 
> everybody talks to mxa
> 
> ncmd handles lots of communication
> 
> scenario of starting virtual server -- client sends message to ncmd,
> which sends message to vcmd, which sends message to vsd.
> 
> we need to have a kernel module which serves as a listener for
> kernel-level communications.
> 
> kernel-level stuff is cifs server, nfs server, file system.
> 
> should be one per kernel that is shared amongst these.
> 
> There will be server- and client-side uses of sctp in the kernel.
> 
> Wants to pass pointers for buffers.
> 
> Use case (client and server in kernel space, no user-space component
> at all):
> mirroring product
> block-based replication
> figures out next list of blocks to send
> sends 200 parallel ios to filesystem, they come back asynchronously
> and out of order.
> scatter-gather
> this is the buffer for block 200, that gets written to block 200.
> needs to get acknowledgements, throttling, etc.
> a knob to set throttling down to less than network can handle.
> 
> needs to know from limits perspective when will the thing break.
> 
> portmapper question:  how do we locate services?
> 
> not a burden for our services to put them on fixed port numbers.  we
> control them.  not particularly motivated to configure.
> 
> if the other side has flow-control, full-buffer, etc., and it can't
> process them fast enough, does SCTP tell the source side anything?
> E_PIPE_FULL.
> 
> Wants to know what the error codes are.
> 
> How much work are we going to push out into the application?
> 
> Any layering we put on top, we want the ability to trace it to figure
> out what's going on.  Stats, figure out when it's throttled, flow
> control, etc.  Infrastructure to troubleshoot this system is
> essential.
> 
> Fine with options as long as they don't require a PhD to use.
> 
> How long without forward progress should I wait?  Arbitrary timeouts
> so that retry loop doesn't become infinite.
> 
> 
> 
