Date: Wed, 16 Jul 2008 18:47:39 -0700
From: Andrew Sharp <andy.sharp@onstor.com>
To: "Jonathan Goldick" <jonathan.goldick@onstor.com>
Subject: Re: clustering fencing
Message-ID: <20080716184739.00f3d6fc@ripper.onstor.net>
In-Reply-To: <BB375AF679D4A34E9CA8DFA650E2B04E0AF1DD0E@onstor-exch02.onstor.net>
References: <BB375AF679D4A34E9CA8DFA650E2B04E0AF1DD09@onstor-exch02.onstor.net>
	<20080716183542.2cc89bce@ripper.onstor.net>
	<BB375AF679D4A34E9CA8DFA650E2B04E0AF1DD0E@onstor-exch02.onstor.net>
Organization: Onstor
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

They already consider it to be down, or things wouldn't have gotten
to this state.  But they/it expect the 'old' owner to nicely give up
ownership, and to reboot if it doesn't, which is not how it should be
done according to this explanation.  I'm just saying.
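The ordering being argued for in this thread -- survivors preempt the old
owner's claim first, and only then bring the volume up, instead of the new
node backing off or panicking -- can be sketched roughly like this (Python
used purely as pseudocode; every name here is invented for illustration):

```python
class Volume:
    """Toy stand-in for a shared volume with a single 'reserve' holder."""

    def __init__(self, owner):
        self.owner = owner          # node id currently holding the reserve

    def preempt(self, new_owner):
        """Forcibly take ownership, fencing out the previous holder."""
        fenced = self.owner
        self.owner = new_owner
        return fenced


def take_over(volume, surviving_node):
    # 1. Fence the old owner first -- the cluster already considers it down,
    #    so there is no point waiting for it to give up ownership nicely.
    fenced = volume.preempt(surviving_node)
    # 2. Only now is it safe to serve the volume from the new owner.
    return fenced


vol = Volume(owner="old-node")
fenced = take_over(vol, "new-node")
assert vol.owner == "new-node" and fenced == "old-node"
```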

On Wed, 16 Jul 2008 18:37:44 -0700 "Jonathan Goldick"
<jonathan.goldick@onstor.com> wrote:

> Not exactly; in the case we are talking about, we never tell the FP to
> stop renewing its equivalent of a SCSI reserve (i.e. writing to the LUN
> label), so the other cluster nodes would consider it to be up.
> 
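The renew-by-writing-the-LUN-label scheme Jonathan describes amounts to a
lease: peers treat the owner as up for as long as its label write is fresh.
A minimal sketch of that idea (the field names and the 3-second lease are
invented for illustration, not taken from the FP code):

```python
import time

LEASE_SECONDS = 3.0    # assumed renewal window, purely illustrative


class LunLabel:
    """Toy model of an on-disk LUN label used as an ownership lease."""

    def __init__(self):
        self.owner = None
        self.stamp = 0.0

    def renew(self, node):
        """The owner periodically rewrites the label to keep its claim alive."""
        self.owner = node
        self.stamp = time.monotonic()

    def owner_is_up(self):
        """Peers consider the owner up until its lease goes stale."""
        return self.owner is not None and \
               time.monotonic() - self.stamp < LEASE_SECONDS


label = LunLabel()
label.renew("fp-1")
print(label.owner_is_up())   # True while the lease is fresh
```

This is why the other cluster nodes in the case above still consider the FP
to be up: nothing ever tells it to stop renewing, so the lease never expires.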
> -----Original Message-----
> From: Andy Sharp 
> Sent: Wednesday, July 16, 2008 6:36 PM
> To: Jonathan Goldick
> Subject: Re: clustering fencing
> 
> At least the Solaris explanation would be the exact opposite of what
> we're doing.  So in that sense I'm completely correct.  Instead of the
> new node panicking/rebooting because the old node is still pinging the
> volume, it should force takeover of the volume ownership and fence the
> "dying" node out.
> 
> On Wed, 16 Jul 2008 18:27:33 -0700 "Jonathan Goldick"
> <jonathan.goldick@onstor.com> wrote:
> 
> > Linux-HA  http://en.wikipedia.org/wiki/STONITH  The other node is
> > powered down.  
> > 
> > Solaris clustering
> > http://docs.sun.com/app/docs/doc/819-2969/6n57kl13o?a=view#caccajda
> 
> > About Failure Fencing
> >
> > A major issue for clusters is a failure that causes the cluster to
> > become partitioned (called split brain). When split brain occurs,
> > not all nodes can communicate, so individual nodes or subsets of
> > nodes might try to form individual or subset clusters. Each subset
> > or partition might "believe" it has sole access and ownership to the
> > multihost devices. When multiple nodes attempt to write to the
> > disks, data corruption can occur.
> >
> > Failure fencing limits node access to multihost devices by
> > physically preventing access to the disks. Failure fencing applies
> > only to nodes, not to zones. When a node leaves the cluster (it
> > either fails or becomes partitioned), failure fencing ensures that
> > the node can no longer access the disks. Only current member nodes
> > have access to the disks, resulting in data integrity.
> >
> > Device services provide failover capability for services that use
> > multihost devices. When a cluster member that currently serves as
> > the primary (owner) of the device group fails or becomes
> > unreachable, a new primary is chosen. The new primary enables
> > access to the device group to continue with only minor
> > interruption. During this process, the old primary must forfeit
> > access to the devices before the new primary can be started.
> > However, when a member drops out of the cluster and becomes
> > unreachable, the cluster cannot inform that node to release the
> > devices for which it was the primary. Thus, you need a means to
> > enable surviving members to take control of and access global
> > devices from failed members.
> >
> > The Sun Cluster software uses SCSI disk reservations to implement
> > failure fencing. Using SCSI reservations, failed nodes are "fenced"
> > away from the multihost devices, preventing them from accessing
> > those disks. SCSI-2 disk reservations support a form of reservation
> > that either grants access to all nodes attached to the disk (when no
> > reservation is in place) or restricts access to a single node (the
> > node that holds the reservation).
> >
> > When a cluster member detects that another node is no longer
> > communicating over the cluster interconnect, it initiates a failure
> > fencing procedure to prevent the other node from accessing shared
> > disks. When this failure fencing occurs, the fenced node panics with
> > a "reservation conflict" message on its console. The discovery that
> > a node is no longer a cluster member triggers a SCSI reservation on
> > all the disks that are shared between this node and other nodes. The
> > fenced node might not be "aware" that it is being fenced and if it
> > tries to access one of the shared disks, it detects the reservation
> > and panics.
> > 
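The SCSI-2 sequence the Sun doc describes boils down to a small state
machine: with no reservation in place everyone can write; once a survivor
reserves the disk, the fenced node's next I/O fails with a reservation
conflict and it panics. A toy model of just that logic (purely illustrative;
real reservations live in the HBA/target, not in host code like this):

```python
class ReservationConflict(Exception):
    """Stands in for the SCSI 'reservation conflict' status."""
    pass


class SharedDisk:
    """Toy SCSI-2-style disk: open to all, or locked to one reserving node."""

    def __init__(self):
        self.reserved_by = None     # None => access granted to all nodes

    def reserve(self, node):
        self.reserved_by = node

    def write(self, node, data):
        # SCSI-2 semantics: either no reservation (everyone may access)
        # or exactly one holder (everyone else gets a conflict).
        if self.reserved_by not in (None, node):
            raise ReservationConflict(f"{node}: reservation conflict")
        return len(data)


disk = SharedDisk()
disk.write("node-a", b"ok")          # no reservation in place: allowed

# node-b stops hearing node-a on the interconnect and fences it
disk.reserve("node-b")
try:
    disk.write("node-a", b"stale")   # the fenced node's I/O now fails...
except ReservationConflict:
    print("node-a panics: reservation conflict")   # ...and it panics
```

Note how this matches Andy's complaint upthread: in this model the survivor
never waits for the old owner to release anything, it just reserves the disk
and lets the fenced node discover that on its next access.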
