nanog mailing list archives

Re: Most energy efficient (home) setup


From: Steven Bellovin <smb () cs columbia edu>
Date: Wed, 18 Apr 2012 23:09:44 -0400


On Apr 18, 2012, at 5:55:32 PM, Douglas Otis wrote:

On 4/18/12 12:35 PM, Jeroen van Aart wrote:
Laurent GUERBY wrote:
Do you have reference to recent papers with experimental data about
non ECC memory errors? It should be fairly easy to do
Maybe this provides some information:

http://en.wikipedia.org/wiki/ECC_memory#Problem_background

"Work published between 2007 and 2009 showed widely varying error
rates with over 7 orders of magnitude difference, ranging from
10^-10 to 10^-17 error/bit·h, roughly one bit error, per hour, per
gigabyte of memory to one bit error, per century, per gigabyte of
memory.[2][4][5] A very large-scale study based on Google's very
large number of servers was presented at the
SIGMETRICS/Performance’09 conference.[4] The actual error rate found
was several orders of magnitude higher than previous small-scale or
laboratory studies, with 25,000 to 70,000 errors per billion device
hours per megabit (about 3–10×10^-9 error/bit·h), and more than 8% of
DIMM memory modules affected by errors per year."
Dear Jeroen,

In the work that led up to RFC 3309, many of the errors found on the Internet pertained to single interface bits, and 
not single data bits.  I worked at a large chip manufacturer that foolishly removed internal memory error detection to 
save space; it cost them dearly, because they then needed to do far more exhaustive four-corner testing.  The checksums 
used by TCP and UDP are able to detect single-bit data errors, but may miss as much as 2% of single interface bit 
errors.  It would be surprising to find memory designs lacking internal error detection logic.
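
The point about TCP and UDP checksums catching single-bit data errors can be checked with a small sketch. It assumes the 16-bit one's-complement checksum described in RFC 1071; the function names and the sample payload are made up for illustration and are not taken from the thread. It covers only single-bit flips in the data and does not try to model the interface-bit errors mentioned above.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit words
    # Fold any carries back into the low 16 bits (end-around carry).
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def undetected_single_bit_flips(payload: bytes) -> int:
    """Count single-bit flips that leave the checksum unchanged."""
    reference = internet_checksum(payload)
    missed = 0
    for bit in range(len(payload) * 8):
        corrupted = bytearray(payload)
        corrupted[bit // 8] ^= 1 << (bit % 8)  # flip exactly one bit
        if internet_checksum(bytes(corrupted)) == reference:
            missed += 1
    return missed


# Hypothetical sample payload; any byte string can be substituted.
payload = b"Most energy efficient (home) setup"
print(undetected_single_bit_flips(payload))  # prints 0
```

A single flipped bit changes one 16-bit word by a power of two, which is never a multiple of 65535, so the one's-complement sum always changes and the count of missed flips is zero.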


mallet:~ smb$ head -14 doc/ietf/rfc/rfc3309.txt | sed 1,7d | sed 2,5d; date
Request for Comments: 3309                                      Stanford
                                                          September 2002

Wed Apr 18 23:07:53 EDT 2012


We are not in a static field...  (3309 is one of my favorite RFCs -- but
the specific findings (errors happen more often than you think), as
opposed to the general lesson (understand your threat model), may be OBE.)


                --Steve Bellovin, https://www.cs.columbia.edu/~smb






