AF:
NF:0
PS:10
SRH:1
SFN:
DSR:
MID:
CFG:
PT:0
S:andy.sharp@lsi.com
RQ:
SSV:mhbs.lsil.com
NSV:
SSH:
R:<brian.stark@lsi.com>
MAID:2
X-Sylpheed-Privacy-System:
X-Sylpheed-Sign:0
SCF:#mh/Mailbox/sent
X-Sylpheed-End-Special-Headers: 1
Date: Fri, 18 Dec 2009 14:46:04 -0800
From: Andrew Sharp <andy.sharp@lsi.com>
To: Brian Stark <brian.stark@lsi.com>
Subject: ecc errors on ripper
Message-ID: <20091218144604.6a6924ea@ripper.onstor.net>
Organization: LSI
X-Mailer: Sylpheed-Claws 2.6.0 (GTK+ 2.8.20; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

Aieee,

Suddenly I'm getting a rash of ECC errors on ripper:

Dec 18 14:11:32 ripper kernel:  Northbridge Error, node 0, core: -1
Dec 18 14:11:32 ripper kernel: K8 ECC error.
Dec 18 14:11:32 ripper kernel: EDAC amd64 MC0: CE ERROR_ADDRESS= 0xc4f80900
Dec 18 14:11:32 ripper kernel: EDAC MC0: CE page 0xc4f80, offset 0x900, grain 0, syndrome 0x3ea8, row 0, channel 1, label "": amd64_edac
Dec 18 14:11:32 ripper kernel: EDAC MC0: CE - no information available: amd64_edacError Overflow
Dec 18 14:11:33 ripper kernel:  Northbridge Error, node 0, core: 0
Dec 18 14:11:33 ripper kernel: K8 ECC error.
Dec 18 14:11:33 ripper kernel: EDAC amd64 MC0: CE ERROR_ADDRESS= 0x6918cd80
Dec 18 14:11:33 ripper kernel: EDAC MC0: CE page 0x6918c, offset 0xd80, grain 0, syndrome 0xfcd6, row 0, channel 1, label "": amd64_edac
Dec 18 14:11:33 ripper kernel: EDAC MC0: CE - no information available: amd64_edacError Overflow
Dec 18 14:27:48 ripper kernel:  Northbridge Error, node 0, core: 1
Dec 18 14:27:48 ripper kernel: K8 ECC error.
Dec 18 14:27:48 ripper kernel: EDAC amd64 MC0: CE ERROR_ADDRESS= 0x6818cdb0
Dec 18 14:27:48 ripper kernel: EDAC MC0: CE page 0x6818c, offset 0xdb0, grain 0, syndrome 0xfcd6, row 0, channel 1, label "": amd64_edac
Dec 18 14:27:48 ripper kernel: EDAC MC0: CE - no information available: amd64_edacError Overflow


It seemed to coincide with doing compiles, i.e., heavy CPU usage.
This wasn't happening yesterday!  Do I have a CPU or memory module
going bad?  I just shelled out $400 for these CPUs!
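
One quick check is whether the corrected errors all land on the same
controller/row/channel. Below is a rough Python sketch that tallies them,
assuming the syslog line format pasted above. The regex and the function
name tally_ce_errors are my own for illustration, not part of any EDAC
tool, and the sample lines are copied from the log in this message.

```python
import re
from collections import Counter

# Matches the per-error detail line from the EDAC core, e.g.
# "EDAC MC0: CE page 0xc4f80, offset 0x900, grain 0, syndrome 0x3ea8, row 0, channel 1, ..."
# (pattern is an assumption based only on the lines seen in this log)
CE_LINE = re.compile(
    r"EDAC MC(?P<mc>\d+): CE page (?P<page>0x[0-9a-f]+), "
    r"offset (?P<offset>0x[0-9a-f]+), grain \d+, "
    r"syndrome (?P<syndrome>0x[0-9a-f]+), "
    r"row (?P<row>\d+), channel (?P<channel>\d+)"
)


def tally_ce_errors(lines):
    """Count corrected errors per (memory controller, row, channel)."""
    counts = Counter()
    for line in lines:
        m = CE_LINE.search(line)
        if m:
            key = (int(m.group("mc")), int(m.group("row")), int(m.group("channel")))
            counts[key] += 1
    return counts


# Sample input: the three detail lines from the log above.
SAMPLE = [
    'Dec 18 14:11:32 ripper kernel: EDAC MC0: CE page 0xc4f80, offset 0x900, grain 0, syndrome 0x3ea8, row 0, channel 1, label "": amd64_edac',
    'Dec 18 14:11:33 ripper kernel: EDAC MC0: CE page 0x6918c, offset 0xd80, grain 0, syndrome 0xfcd6, row 0, channel 1, label "": amd64_edac',
    'Dec 18 14:27:48 ripper kernel: EDAC MC0: CE page 0x6818c, offset 0xdb0, grain 0, syndrome 0xfcd6, row 0, channel 1, label "": amd64_edac',
]

for (mc, row, channel), n in tally_ce_errors(SAMPLE).most_common():
    print(f"MC{mc} row {row} channel {channel}: {n} corrected error(s)")
```

For the three errors above this reports everything on MC0, row 0,
channel 1, though three events is too small a sample to say whether the
module, the slot, or the CPU is at fault.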
