Date: Mon, 8 Feb 2010 16:22:25 -0800
From: Andrew Sharp <andy.sharp@lsi.com>
To: "Ariyamannil, Jobi" <Jobi.Ariyamannil@lsi.com>
Cc: "Vandever, Chris" <Chris.Vandever@lsi.com>, "Agarwal, Anurag"
 <Anurag.Agarwal@lsi.com>, "Goldick, Jonathan" <Jonathan.Goldick@lsi.com>,
 DL-ONStor-Design Review <dl-designreview@lsi.com>, Brian Stark
 <brian.stark@lsi.com>, Narayan Venkat <narayan.venkat@lsi.com>
Subject: Re: Flexible Log Design Document for Review
Message-ID: <20100208162225.66241dcb@ripper.onstor.net>
In-Reply-To: <5D3F3304755C724285AA5789BA159920BAD3A179@cosmail03.lsi.com>
References: <5D3F3304755C724285AA5789BA159920BAD3959C@cosmail03.lsi.com>
	<2E3074EBA7791D4E8CDEEAA5DC8EFC27103232EB@cosmail03.lsi.com>
	<4B6B9C4E.3070306@lsi.com>
	<F9DCB1C30AC37B4EB352D0DE0AE18E5F011D35A1BE@cosmail02.lsi.com>
	<5D3F3304755C724285AA5789BA159920BAD3A179@cosmail03.lsi.com>
Organization: LSI
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

On Fri, 5 Feb 2010 16:34:55 -0700 "Ariyamannil, Jobi"
<Jobi.Ariyamannil@lsi.com> wrote:

> Hi Chris,
> 
> With the new file system we are working on, no backward
> compatibility is expected. The current users are supposed to migrate
> their data using backup/restore or possibly with a new tool(?) from
> us.
> 
> The new file system will have an entirely different disk layout to
> support various new file system features, and it is virtually
> impossible to do that while supporting the current layout as well.
> Also, the luns are going to use a new lun label scheme to avoid the
> many problems we had with the current label scheme.


I find it difficult to believe that we think we can get away with no
upgrade path.  Or are we going to try to float the
"start-the-restore-and-run-for-the-door" upgrade method with Marketing?

There are lots of customers who don't have backup facilities
in place that could handle a full backup of their datasets.

I just don't see how you are going to get away without an
upgrade/migration path for both the new lun label and new filesystem
layout, even if all it can do is read the old layout well enough to
convert it.  That could potentially take days, but a backup/restore
would take even longer.



> Regards,
> Jobi
> 
> _____________________________________________
> From: Vandever, Chris
> Sent: Friday, February 05, 2010 3:28 PM
> To: Agarwal, Anurag; Goldick, Jonathan
> Cc: Ariyamannil, Jobi; DL-ONStor-Design Review
> Subject: RE: Flexible Log Design Document for Review
> 
> 
> *       Section 3, last bullet: We are currently planning a
> gateway-only product.  I would expect we would require support for a
> migration from cougar.  This means the code must understand the older
> FS format.  Given that, support of a mixed revision cluster is easy -
> you simply disallow "vol create", "vol modify", and "fs convert" to
> the new format until the entire cluster has been upgraded.  Also,
> "vol show" will need to support both formats.  By not supporting
> this, you are precluding the possibility of using any TuxStor code in
> a cougar refresh release.
> *       Section 5.1: The owner block cannot move until after the
> entire cluster has been upgraded, as it is used for split-brain
> resolution.  This will require a clusDb upgrade of the version number.
> *       Section 5.12:  Add a section for testing in a mixed revision
> cluster to make sure the specified commands are disabled until the
> entire cluster has been upgraded.
> *       Section 5.13:  Clustering requires that all nodes in the
> cluster be able to access the fs-owner block for split-brain
> resolution.
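
The mixed-revision rule in the first bullet above can be sketched as
a simple gate. This is only an illustration under assumptions: the
function names and the integer version numbers are hypothetical, and
the gated command names stand for their new-format variants.

```python
# Commands that would write the new format, per the bullet above.
GATED_COMMANDS = frozenset({"vol create", "vol modify", "fs convert"})


def cluster_fully_upgraded(node_versions, new_format_version):
    """True only if every known node runs at least the new version."""
    # An empty node list is treated as "not upgraded" to stay safe.
    return bool(node_versions) and all(
        v >= new_format_version for v in node_versions
    )


def command_allowed(command, node_versions, new_format_version):
    """Allow ungated commands always; gate the rest on a full upgrade."""
    if command not in GATED_COMMANDS:
        return True  # e.g. "vol show", which must handle both formats
    return cluster_fully_upgraded(node_versions, new_format_version)
```

With one node still at the old version, "vol create" is refused while
"vol show" keeps working; once every node reports the new version, the
gated commands are allowed.
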
> 
> ChrisV
> 
> -----Original Message-----
> From: Anurag Agarwal [mailto:anurag.agarwal@lsi.com]
> Sent: Thursday, February 04, 2010 8:19 PM
> To: Goldick, Jonathan
> Cc: Ariyamannil, Jobi; DL-ONStor-Design Review
> Subject: Re: Flexible Log Design Document for Review
> 
> Hi Jonathan,
> 
> Thanks for the comments. My replies are inline.
> 
> Goldick, Jonathan wrote:
> > 1. Fix the copyright to be 2010 LSI
> >
> > 2. Table of Contents is empty
> > 3. Chapter 1, Can we get a hyperlink to a document that we add to
> > the ONStorNAS sharepoint site?  We really need to get all documents
> > placed in the appropriate directory.  If there are access problems
> > you can let me, BrianS, or the helpdesk know.
> > 4. Chapter 3, the default listed is after block 1026 but that is
> > inconsistent with the subsequent statement of being 1MB aligned.
> > This comment appears twice.
> >
> I will update all these in the document.
> > 5. In 5.1, you cannot have the owner block move.  This has very bad
> > failure modes in a cluster during a change that could impact split
> > brain resolution.
> > 6. In 5.3, how can the log possibly be extended to the next MByte
> > if the owner block immediately follows it?  Doesn't this ensure
> > that there will never be available contiguous space?  Again, we
> > should not move the owner block.
> >
> This has been clarified by Jobi. The owner block is still in the
> file system. The size of the reserved blocks has been changed from
> 64M to 1M, and the reserved area now contains just the owner block,
> at its end. File system space starts at the end of the reserved
> blocks. The log is now part of file system space and can be anywhere
> in the file system, not at a fixed location.
> > 7. In 5.4, how will you store the log lun with only these changes?
> > I imagine you could add the log lun to the start or end of the file
> > system and then use the associated block number but that should be
> > spelled out.
> >
> This feature does not add a new lun to the file system as a log
> lun. It just allows the log to be placed on a lun that is already
> part of the file system. That lun will not be used exclusively for
> the log. Only the start block for the log will be computed from that
> lun.
> > 8. In 5.6, keeping the log segment count the same may not be
> > optimal.  We only truncate a segment at a time and that can be
> > pretty large now.  We could reduce the problems described in 5.7 if
> > the segment is not so large.  What is the advantage of keeping the
> > segments limited to 64?
> >
> This was discussed during the design. Increasing the number of
> segments would have required more bits in each inode, which would
> have increased the size of the in-core inode. So it was decided to
> keep the number of segments constant at 64 and change the segment
> size depending on the log size. Even with the largest log of 1G, the
> max segment size will be 16M.
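
A minimal sketch of the segment arithmetic described above, assuming
the log divides evenly among the 64 segments. The function name and
the even-division check are illustrative, not taken from the design
document.

```python
MIB = 1 << 20
GIB = 1 << 30

# Fixed segment count, per the discussion above.
SEGMENT_COUNT = 64


def segment_size(log_size_bytes):
    """Return the per-segment size in bytes for a log of the given size.

    Assumes (hypothetically) that the log divides evenly into segments.
    """
    if log_size_bytes <= 0 or log_size_bytes % SEGMENT_COUNT != 0:
        raise ValueError("log size must be a positive multiple of 64")
    return log_size_bytes // SEGMENT_COUNT


if __name__ == "__main__":
    print(segment_size(1 * GIB) // MIB)  # prints 16
```

Since a segment is the unit of truncation, the 1G case means each
truncation covers 16M of log.
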
> > 9. In 5.7, what are the best case and worst case number of disk
> > I/O(s) needed as a function of log size?  Same for memory.  Given a
> > 1-3ms avg read I/O time how long will replay take?
> >
> The number of disk I/Os would depend on the size of the active log.
> I have not worked out the worst-case log replay time. I would assume
> that in the worst case the size of the active log could be 1G, and
> hence the worst-case disk I/O can be 16 times the current worst
> case. In fact, it will be 32 times, as the log now needs to be read
> twice from the disk. In the current logic, the complete log is read
> once and both passes of log replay are done on the in-core log. Now
> the log needs to be read from the disk for both passes.
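
A rough sketch of the estimate above, combined with the 1-3ms read
time from the question. Assumptions, none of them from the design
document: the current worst-case log is 64M (inferred from the "16
times" figure), each read is 64K (purely hypothetical), and reads are
issued serially with no overlap.

```python
MIB = 1 << 20
GIB = 1 << 30


def replay_read_ios(active_log_bytes, io_bytes, passes):
    """Number of read I/Os needed to scan the active log `passes` times."""
    ios_per_pass = -(-active_log_bytes // io_bytes)  # ceiling division
    return ios_per_pass * passes


def replay_time_ms(read_ios, ms_per_io):
    """Serial read time in milliseconds (assumes no I/O overlap)."""
    return read_ios * ms_per_io


if __name__ == "__main__":
    io_bytes = 64 * 1024  # hypothetical read size
    current = replay_read_ios(64 * MIB, io_bytes, passes=1)
    new = replay_read_ios(1 * GIB, io_bytes, passes=2)
    print(new // current)          # prints 32
    print(replay_time_ms(new, 1))  # prints 32768
    print(replay_time_ms(new, 3))  # prints 98304
```

Under these assumptions the worst case is on the order of 33 to 98
seconds of serial reads per replay; a different read size or any
read-ahead changes the numbers but not the 32x ratio.
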
> 
> As far as memory consumption is concerned, the complete log is no
> longer going to be kept in memory, so the memory pressure of log
> replay should not be very high. This was discussed at length.
> > 10. Will this change affect the number of parallel log replays we
> > allow?  Currently that is a function of log I/O(s) that will be
> > required.
> >
> This logic is yet to be worked out.
> > 11. In 5.13, this is not true if we allow the owner block to move
> > as this would affect split brain resolution.
> >
> >
> >
> The owner block is still in the file system; only the location of
> the owner block is now read from the lun label. All the logic of
> owner block update still remains in the file system, so it should
> not affect the split brain resolution logic. There should not be any
> clustering related issues.
> 
> Regards,
> Anurag.
> >
> >
> > -----Original Message-----
> > From: Ariyamannil, Jobi
> > Sent: Monday, February 01, 2010 1:22 PM
> > To: DL-ONStor-Design Review
> > Cc: Agarwal, Anurag
> > Subject: FW: Flexible Log Design Document for Review
> >
> >
> >
> > -----Original Message-----
> > From: Anurag Agarwal [mailto:anurag.agarwal@lsi.com]
> > Sent: Thursday, January 28, 2010 6:14 PM
> > To: Ariyamannil, Jobi; Kozlovsky, Maxim
> > Subject: Flexible Log Design Document for Review
> >
> > Hi,
> >
> > Here is the design document for Flexible Log for review. Please
> > send me your review comments.
> >
> > I still have a problem sending email to the following address:
> >
> > <dl-onstor-design review@lsi.com>. It does not work from my
> > thunderbird email-client.
> >
> > Regards,
> > Anurag.
> >
> 
> 
