X-MimeOLE: Produced By Microsoft Exchange V6.5
Received: by onstor-exch02.onstor.net 
	id <01C7A79D.F0CBC9DD@onstor-exch02.onstor.net>; Tue, 5 Jun 2007 11:18:39 -0700
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Content-class: urn:content-classes:message
Subject: RE: watchdog device
Date: Tue, 5 Jun 2007 11:18:39 -0700
Message-ID: <BB375AF679D4A34E9CA8DFA650E2B04E04020423@onstor-exch02.onstor.net>
In-Reply-To: <20070605103919.6df10e8d@ripper.onstor.net>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: watchdog device
Thread-Index: AcenmHIFA4uijk/1TJGdhE1Wy4zrgQABWHvA
References: <20070604161540.711da0db@ripper.onstor.net><BB375AF679D4A34E9CA8DFA650E2B04E1049D7@onstor-exch02.onstor.net><20070605083854.06e4b50d@ripper.onstor.net><BB375AF679D4A34E9CA8DFA650E2B04E04020374@onstor-exch02.onstor.net><20070605101359.3f05f35c@ripper.onstor.net><BB375AF679D4A34E9CA8DFA650E2B04E0402038C@onstor-exch02.onstor.net><20070605102049.0ebc9368@ripper.onstor.net><BB375AF679D4A34E9CA8DFA650E2B04E040203AD@onstor-exch02.onstor.net> <20070605103919.6df10e8d@ripper.onstor.net>
From: "Tim Gardner" <tim.gardner@onstor.com>
To: "Andy Sharp" <andy.sharp@onstor.com>

Nothing. It's OK for the timer to be disabled by another device.
This is typically done when doing things like system upgrade.

-----Original Message-----
From: Andy Sharp
Sent: Tuesday, June 05, 2007 10:39 AM
To: Tim Gardner
Subject: Re: watchdog device

Unless some sync method is used, what would stop one process from
disabling it a microsecond before another enables it again?
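
For illustration only, one possible sync method is an advisory file lock
that every process takes before it enables or disables the watchdog. The
sketch below is an assumption, not how the chassis code works today:
watchdog_lock and the lock file path are hypothetical names, and it only
helps if every process that touches the watchdog takes the same lock.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def watchdog_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block.

    lock_path is a hypothetical lock file agreed on by all cooperating processes.
    """
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        # Blocks until no other open file description holds the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield fd
    finally:
        # Closing the descriptor releases the lock.
        os.close(fd)


# Hypothetical usage: serialize a disable/enable sequence.
# with watchdog_lock("/var/run/watchdog.lock"):
#     ...  # disable or enable the watchdog here
```

The lock is advisory, so a process that skips it can still toggle the
watchdog at any time.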

On Tue, 5 Jun 2007 10:29:08 -0700 "Tim Gardner"
<tim.gardner@onstor.com> wrote:

> Mostly. But the chassis application does not behave this way and
> sends the syscalls directly. Not sure if other applications
> utilize the chassis library to also send syscalls directly.
>
> -----Original Message-----
> From: Andy Sharp
> Sent: Tuesday, June 05, 2007 10:21 AM
> To: Tim Gardner
> Subject: Re: watchdog device
>
> Isn't it really only one process?  Other processes merely send
> messages to that process?  I can't imagine how it could possibly work
> reliably otherwise.
>
> On Tue, 5 Jun 2007 10:15:43 -0700 "Tim Gardner"
> <tim.gardner@onstor.com> wrote:
>
> > Yup. That is the current architecture and I don't want to make
> > unnecessary architecture changes.
> >
> > -----Original Message-----
> > From: Andy Sharp
> > Sent: Tuesday, June 05, 2007 10:14 AM
> > To: Tim Gardner
> > Subject: Re: watchdog device
> >
> > Really?
> >
> > On Tue, 5 Jun 2007 10:07:04 -0700 "Tim Gardner"
> > <tim.gardner@onstor.com> wrote:
> >
> > > The watchdog needs to be disabled by multiple processes.
> > >
> > > -----Original Message-----
> > > From: Andy Sharp
> > > Sent: Tuesday, June 05, 2007 8:39 AM
> > > To: Tim Gardner
> > > Subject: Re: watchdog device
> > >
> > > You can disable it simply by closing the file descriptor.
> > >=20
> > > Cheers,
> > >=20
> > > a
> > >=20
> > >=20
> > > On Mon, 4 Jun 2007 20:22:02 -0700 "Tim Gardner"
> > > <tim.gardner@onstor.com> wrote:
> > >=20
> > > > Thanks Andy. Very close to what we want but not quite.
> > > > We need to be able to enable/disable the watchdog as well as set
> > > > the timeout value. I will probably just steal the source for
> > > > this driver and add a few ioctls.
> > > >=20
> > > > ________________________________
> > > >=20
> > > > From: Andy Sharp
> > > > Sent: Mon 6/4/2007 4:15 PM
> > > > To: Tim Gardner
> > > > Subject: watchdog device
> > > >=20
> > > >=20
> > > >=20
> > > > Here is the kernel help text for the watchdog device.  You can
> > > > configure the software watchdog by adding support for
> > > > SOFT_WATCHDOG. CONFIG_WATCHDOG is already set to 'y'.  So, add a
> > > > > line CONFIG_SOFT_WATCHDOG=y
> > > > > after the CONFIG_WATCHDOG line in .config and do a 'make' in
> > > > > linux-mips-2.6, or a 'make kernel-build' in the directory above
> > > > > (cougar/linux/kernel).
> > > > >
> > > > > The user process then has to open /dev/watchdog and write to
> > > > > the file descriptor at least once a minute, or the kernel will
> > > > > reboot.  I haven't tested it ~:^)
> > > > >
> > > > > CONFIG_WATCHDOG=y
> > > > >
> > > > If you say Y here (and to one of the following options) and
> > > > create a character special file /dev/watchdog with major number
> > > > 10 and minor number 130 using mknod ("man mknod"), you will get
> > > > a watchdog, i.e.: subsequently opening the file and then
> > > > failing to write to it for longer than 1 minute will result in
> > > > rebooting the machine. This could be useful for a networked
> > > > machine that needs to come back on-line as fast as possible
> > > > after a lock-up. There's both a watchdog implementation
> > > > entirely in software (which can sometimes fail to reboot the
> > > > machine) and a driver for hardware watchdog boards, which are
> > > > more robust and can also keep track of the temperature inside
> > > > your computer. For details, read
> > > > <file:Documentation/watchdog/watchdog.txt> in the kernel source.
> > > >=20
> > > > The watchdog is usually used together with the watchdog daemon
> > > > which is available from
> > > > <ftp://ibiblio.org/pub/Linux/system/daemons/watchdog/>. This
> > > > daemon can also monitor NFS connections and can reboot the
> > > > machine when the process table is full.
> > > >=20
> > > >=20
> > > >=20
> > > >=20
> > > > CONFIG_SOFT_WATCHDOG=3D[y|m]
> > > >=20
> > > > A software monitoring watchdog. This will fail to reboot your
> > > > system from some situations that the hardware watchdog will
> > > > recover from. Equally it's a lot cheaper to install.
> > > >=20
> > > > To compile this driver as a module, choose M here: the
> > > > module will be called softdog.
> > > >=20
> > > >=20
> > > >=20
> > > > CONFIG_WATCHDOG_NOWAYOUT=3Dn
> > > >=20
> > > > The default watchdog behaviour (which you get if you say N here)
> > > > is to stop the timer if the process managing it closes the file
> > > > /dev/watchdog. It's always remotely possible that this process
> > > > might get killed. If you say Y here, the watchdog cannot be
> > > > stopped once it has been started.
> > > >=20
> > > >=20
