
Re: [PAPER] Juggling with packets: floating data storage


From: Nicholas Weaver <nweaver () CS berkeley edu>
Date: Wed, 8 Oct 2003 12:47:17 -0700

On Wed, Oct 08, 2003 at 08:18:12PM +0200, Michal Zalewski composed:
The higher-level juggling is still pointless.  Let's assume I still have
the 1 Gb link to the outside world, and the data round-trip latency is a
minute (the amount of time data can be stored externally before it comes
back to me).  That's still just 60 Gb of data, or 7.5 GB.  And I'm having
to burn a 1 Gb link to do it!

Should you actually read the paper, the idea would be easier to
comprehend. The paper proposes several approaches that do not require you
to send the data back and forth all the time; in some cases, you can
achieve near-infinite latency with only a very low "sustaining traffic"
needed to prevent the data from being discarded. Besides, the storage
size is not the main point; the entire approach is much more interesting
from the limited-deniability, high-privacy storage point of view.

You need some refresh interval, just like DRAM or any other
lossy/decaying medium.  Without the refresh, you will have
uncorrectable data decay.  The capacity comes down to the refresh
bandwidth and the refresh interval.

So let's assume a 1 DAY refresh time and a 1 Gb refresh bandwidth.

That's still a maximum of ~10 TB of storage (1 Gb/s * 3600 * 24 s / 8
~= 10,800 GB), and you are going to have to saturate the link to
maintain it.

With a 100 Mb refresh bandwidth, that's 1 TB.  I can go out and buy a
box which holds nearly 1 TB for <$2000.  Heck, I can buy a >2 TB RAID
array, FROM APPLE (hardly a low-price vendor) for $11k!
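
To make the back-of-the-envelope numbers above explicit, here is a
minimal Python sketch; the link rates and refresh intervals are just the
assumptions used in this thread, not anything from the paper:

    # Usable "in flight" storage is roughly refresh bandwidth times
    # refresh interval: every bit must be retransmitted once per interval.
    def capacity_bytes(link_bps, refresh_sec):
        return link_bps * refresh_sec / 8

    GB, TB = 10**9, 10**12

    print(capacity_bytes(10**9, 60) / GB)              # 1 Gb/s, 1 min RTT: ~7.5 GB
    print(capacity_bytes(10**9, 24 * 3600) / TB)       # 1 Gb/s, 1 day:     ~10.8 TB
    print(capacity_bytes(100 * 10**6, 24 * 3600) / TB) # 100 Mb/s, 1 day:   ~1.1 TB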


In terms of serving data, CDNs reportedly charge around $20/GB served,
or you can always construct a BitTorrent-like network to use the edge
hosts, which is what Valve wants to do for selling/patching Half-Life.



Even given a low/moderate-sized (~1,000-user) network, the data cost of
maintaining the refresh is going to be pretty extreme, and as a result
rather unstealthy.  Let's assume 1 TB shared between 1,000 users, and a
1-hour refresh time (probably still far longer than realistic).

That's still roughly 2.2 Mb/s (about 280 KB/s) of continuous upload per
user.  I don't know about you, but I can't get that reliably on my cable
modem upload, and if I DID use that much bandwidth, ALL THE TIME, the
cable company would probably come knocking.
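
For the per-user figure, a rough sketch under the same assumptions
(1 TB total, 1,000 users, 1-hour refresh):

    users = 1000
    total_bytes = 10**12          # 1 TB shared store
    refresh_sec = 3600            # 1-hour refresh cycle
    per_user_bps = total_bytes * 8 / users / refresh_sec
    print(per_user_bps / 10**6)   # ~2.2 Mb/s of continuous upload per user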

Why not just build a distributed filesystem for those 1K users?



The only real advantage is a modicum of stealth, but it isn't really
stealthy if you are storing a non-trivial amount of data: the network
traffic is "storage volume / refresh time", and the refresh time is
going to have to be fairly short to have a modicum of reliability.
Burning even just 10 Kb/s/user isn't going to be "stealthy" in
practice, due to the continual load.
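
Turning that relation around, here is a sketch of how little total
storage a "stealthy" per-user budget actually sustains; the 10 Kb/s
budget and 1-hour refresh are just illustrative assumptions:

    users = 1000
    budget_bps = 10 * 10**3       # 10 Kb/s per user, continuously
    refresh_sec = 3600            # 1-hour refresh cycle
    total_bytes = users * budget_bps * refresh_sec / 8
    print(total_bytes / 10**9)    # ~4.5 GB across the entire network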


For storing a small quantity of data (e.g., ~1-2 GB or less), there are
much better covert repositories one could consider.

The second observation is that not all the data has to be sent back and
forth all the time. Interesting.

A refresh cycle is required for parasitic storage, to prevent data
decay and to recompute checksums, etc., to handle data loss.

Even given a 1 WEEK refresh cycle, a 100 Mb continual refresh
bandwidth only gets you about 7.5 TB of storage; 1 Gb of aggregate
bandwidth gets about 75 TB.
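
The 1-week numbers follow from the same arithmetic (again just a sketch
of the figures above):

    week_sec = 7 * 24 * 3600
    print(100 * 10**6 * week_sec / 8 / 10**12)  # ~7.5 TB at 100 Mb/s
    print(10**9 * week_sec / 8 / 10**12)        # ~75 TB at 1 Gb/s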

-- 
Nicholas C. Weaver                                 nweaver () cs berkeley edu

_______________________________________________
Full-Disclosure - We believe in it.
Charter: http://lists.netsys.com/full-disclosure-charter.html

