X-MimeOLE: Produced By Microsoft Exchange V6.5
Received: by onstor-exch02.onstor.net 
	id <01C7A797.064B7273@onstor-exch02.onstor.net>; Tue, 5 Jun 2007 10:29:09 -0700
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
Content-class: urn:content-classes:message
Subject: RE: watchdog device
Date: Tue, 5 Jun 2007 10:29:08 -0700
Message-ID: <BB375AF679D4A34E9CA8DFA650E2B04E040203AD@onstor-exch02.onstor.net>
In-Reply-To: <20070605102049.0ebc9368@ripper.onstor.net>
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: watchdog device
Thread-Index: Acenldx3gLf7iB8mTVy46aV2/pKYbQAAQEmQ
References: <20070604161540.711da0db@ripper.onstor.net><BB375AF679D4A34E9CA8DFA650E2B04E1049D7@onstor-exch02.onstor.net><20070605083854.06e4b50d@ripper.onstor.net><BB375AF679D4A34E9CA8DFA650E2B04E04020374@onstor-exch02.onstor.net><20070605101359.3f05f35c@ripper.onstor.net><BB375AF679D4A34E9CA8DFA650E2B04E0402038C@onstor-exch02.onstor.net> <20070605102049.0ebc9368@ripper.onstor.net>
From: "Tim Gardner" <tim.gardner@onstor.com>
To: "Andy Sharp" <andy.sharp@onstor.com>

Mostly. But the chassis application does not behave this way; it
sends the syscalls directly. I'm not sure whether other applications
also use the chassis library to send syscalls directly.

-----Original Message-----
From: Andy Sharp
Sent: Tuesday, June 05, 2007 10:21 AM
To: Tim Gardner
Subject: Re: watchdog device

Isn't it really only one process?  Other processes merely send messages
to that process?  I can't imagine how it could possibly work reliably
otherwise.

On Tue, 5 Jun 2007 10:15:43 -0700 "Tim Gardner"
<tim.gardner@onstor.com> wrote:

> Yup. That is the current architecture and I don't want to make
> unnecessary architecture changes.
>
> -----Original Message-----
> From: Andy Sharp
> Sent: Tuesday, June 05, 2007 10:14 AM
> To: Tim Gardner
> Subject: Re: watchdog device
>
> Really?
>
> On Tue, 5 Jun 2007 10:07:04 -0700 "Tim Gardner"
> <tim.gardner@onstor.com> wrote:
>
> > The watchdog needs to be disabled by multiple processes.
> >
> > -----Original Message-----
> > From: Andy Sharp
> > Sent: Tuesday, June 05, 2007 8:39 AM
> > To: Tim Gardner
> > Subject: Re: watchdog device
> >
> > You can disable it simply by closing the file descriptor.
> >
> > Cheers,
> >
> > a
> >
> > On Mon, 4 Jun 2007 20:22:02 -0700 "Tim Gardner"
> > <tim.gardner@onstor.com> wrote:
> >=20
> > > Thanks Andy. Very close to what we want but not quite.
> > > We need to be able to enable/disable the watchdog as well as set
> > > the timeout value. I will probably just steal the source for this
> > > driver and add a few ioctls.
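For reference, the generic Linux watchdog interface in <linux/watchdog.h> already defines ioctls for both operations, so new ones may not be needed. The sketch below wraps WDIOC_SETTIMEOUT and WDIOC_SETOPTIONS. The helper names wd_set_timeout and wd_set_enabled are made up for illustration, and whether the softdog driver in this kernel tree handles WDIOC_SETOPTIONS is an assumption that should be checked against its source.

```c
#include <assert.h>
#include <sys/ioctl.h>
#include <linux/watchdog.h>

/* Ask the driver for a new timeout in seconds. On success the driver
 * writes the value it actually applied back into the same variable;
 * return that value, or -1 on error (errno is set by ioctl). */
int wd_set_timeout(int fd, int seconds)
{
    assert(seconds > 0);
    int t = seconds;
    if (ioctl(fd, WDIOC_SETTIMEOUT, &t) < 0)
        return -1;
    return t;
}

/* Enable (on != 0) or disable (on == 0) the watchdog through
 * WDIOC_SETOPTIONS. Not every driver implements this request;
 * an unsupported driver makes ioctl fail and this returns -1. */
int wd_set_enabled(int fd, int on)
{
    int opt = on ? WDIOS_ENABLECARD : WDIOS_DISABLECARD;
    return ioctl(fd, WDIOC_SETOPTIONS, &opt) < 0 ? -1 : 0;
}
```

Both helpers take a descriptor obtained by opening /dev/watchdog. A return of -1 from wd_set_enabled on the real device would suggest the driver needs the extra ioctl after all.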
> > >=20
> > > ________________________________
> > >=20
> > > From: Andy Sharp
> > > Sent: Mon 6/4/2007 4:15 PM
> > > To: Tim Gardner
> > > Subject: watchdog device
> > >=20
> > >=20
> > >=20
> > > > Here is the kernel help text for the watchdog device.  You can
> > > > enable the software watchdog by setting CONFIG_SOFT_WATCHDOG.
> > > > CONFIG_WATCHDOG is already set to 'y'.  So, add the line
> > > > CONFIG_SOFT_WATCHDOG=y
> > > > after the CONFIG_WATCHDOG line in .config and do a 'make' in
> > > > linux-mips-2.6, or a 'make kernel-build' in the directory above
> > > > (cougar/linux/kernel).
> > > >
> > > > The user process then has to open /dev/watchdog and write to
> > > > the file descriptor at least once a minute or the kernel will
> > > > reboot.  I haven't tested it ~:^)
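A minimal sketch of the keepalive side described above, assuming the descriptor was already opened on /dev/watchdog. The names wd_pet and wd_pet_loop are hypothetical, and the bounded loop is only there so the sketch terminates; a real daemon would keep going for as long as the system is healthy.

```c
#include <assert.h>
#include <unistd.h>

/* Write one byte to an already-open watchdog descriptor.
 * Any successful write counts as a keepalive.
 * Returns 0 on success, -1 on failure. */
int wd_pet(int fd)
{
    assert(fd >= 0);
    ssize_t n = write(fd, "\0", 1);
    return n == 1 ? 0 : -1;
}

/* Pet the watchdog `count` times, sleeping interval_s seconds
 * after each write. interval_s must stay well below the driver
 * timeout (one minute by default, per the help text). */
int wd_pet_loop(int fd, unsigned interval_s, unsigned count)
{
    for (unsigned i = 0; i < count; i++) {
        if (wd_pet(fd) != 0)
            return -1;
        sleep(interval_s);
    }
    return 0;
}
```

Note that opening /dev/watchdog is what starts the timer, so the open should happen only once the process is ready to begin petting.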
> > >=20
> > >=20
> > > > CONFIG_WATCHDOG=y
> > > >
> > > If you say Y here (and to one of the following options) and
> > > create a character special file /dev/watchdog with major number
> > > 10 and minor number 130 using mknod ("man mknod"), you will get a
> > > watchdog, i.e.: subsequently opening the file and then failing to
> > > write to it for longer than 1 minute will result in rebooting the
> > > machine. This could be useful for a networked machine that needs
> > > to come back on-line as fast as possible after a lock-up. There's
> > > both a watchdog implementation entirely in software (which can
> > > sometimes fail to reboot the machine) and a driver for hardware
> > > watchdog boards, which are more robust and can also keep track of
> > > the temperature inside your computer. For details, read
> > > <file:Documentation/watchdog/watchdog.txt> in the kernel source.
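The device node mentioned in the help text is normally created from the shell with mknod. The same thing in C looks roughly like the sketch below; wd_devno and wd_make_node are made-up names, the 0600 mode is an assumption, and the call needs root.

```c
#define _DEFAULT_SOURCE   /* expose mknod() and S_IFCHR under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/* Major and minor numbers quoted in the kernel help text. */
enum { WD_MAJOR = 10, WD_MINOR = 130 };

/* Device number for the watchdog character device. */
dev_t wd_devno(void)
{
    return makedev(WD_MAJOR, WD_MINOR);
}

/* Create the character special file at `path` (requires root).
 * Returns 0 on success, -1 with errno set on failure. */
int wd_make_node(const char *path)
{
    assert(path != NULL);
    return mknod(path, S_IFCHR | 0600, wd_devno());
}
```

Calling wd_make_node("/dev/watchdog") once at install time would be enough; the node persists until it is removed.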
> > >=20
> > > The watchdog is usually used together with the watchdog daemon
> > > which is available from
> > > <ftp://ibiblio.org/pub/Linux/system/daemons/watchdog/>. This
> > > daemon can also monitor NFS connections and can reboot the
> > > machine when the process table is full.
> > >=20
> > >=20
> > >=20
> > >=20
> > > CONFIG_SOFT_WATCHDOG=3D[y|m]
> > >=20
> > > A software monitoring watchdog. This will fail to reboot your
> > > system from some situations that the hardware watchdog will
> > > recover from. Equally it's a lot cheaper to install.
> > >=20
> > > To compile this driver as a module, choose M here: the
> > > module will be called softdog.
> > >=20
> > >=20
> > >=20
> > > CONFIG_WATCHDOG_NOWAYOUT=3Dn
> > >=20
> > > The default watchdog behaviour (which you get if you say N here)
> > > is to stop the timer if the process managing it closes the file
> > > /dev/watchdog. It's always remotely possible that this process
> > > might get killed. If you say Y here, the watchdog cannot be
> > > stopped once it has been started.
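One caveat on stopping by close: the kernel's watchdog API notes describe a "magic close" convention, where a driver only stops the timer if the character 'V' was written immediately before the close. Whether the softdog build discussed here enforces that is an assumption to verify; the sketch below simply follows the convention, which costs nothing on drivers that ignore it. The name wd_stop is hypothetical, and with CONFIG_WATCHDOG_NOWAYOUT=y none of this stops the timer.

```c
#include <assert.h>
#include <unistd.h>

/* Stop the watchdog cleanly: write the magic character 'V',
 * then close the descriptor.
 * Returns 0 on success, -1 if the write or the close failed. */
int wd_stop(int fd)
{
    assert(fd >= 0);
    int rc = (write(fd, "V", 1) == 1) ? 0 : -1;
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

If several processes need to disable the watchdog, each would still need a path to the one descriptor that is open on the device, which is the single-owner question raised earlier in the thread.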
> > >=20
> > >=20
