Wireshark mailing list archives

Re: slow when loading big pcaps


From: Guy Harris <guy () alum mit edu>
Date: Mon, 25 Oct 2010 16:52:43 -0700


On Oct 25, 2010, at 12:25 PM, cco wrote:

On Wed, Oct 20, 2010 at 04:07:22AM -0700, Guy Harris wrote:

On Oct 20, 2010, at 3:42 AM, cco wrote:

why is wireshark so slow when loading up >500 MB pcaps?

Are you saying that the time taken to read a file, as a function of the size of the file, is discontinuous, with a 
jump at about 500 MB?

cristian: hi! I have not tested with continuous values of file sizes (I
hope this is not becoming too mathematical...)

Well, if we want to get *mathematical*, you can't test with continuous values of file sizes, as e, for example, isn't a 
valid file size. :-)  (Well, maybe it is if your machine uses base e - it is, if I remember correctly, the base that 
requires the fewest bits, on average, to encode a number.  I'm still not sure what .718281828... of a byte would be. 
:-))

I guess a better-phrased question would be whether, for example, a file of 200MB takes about twice as long to read as a 
file of 100MB, and a file of 300MB takes about 3 times as long, and a file of 400MB takes about 4 times as long, and a 
file of 500MB takes about 5 times as long, but a file of 600MB takes significantly longer than 6 times as long?  (Or, 
if there's a discontinuity - in the informal sense, not the mathematical sense :-) - at some other value.)

If so, that might be due to the working set size of Wireshark growing above the amount of memory available on the 
machine.  The main way to improve that would be to try to reduce the per-packet memory consumption of Wireshark; there 
are a number of ways that this might be doable, although a number of them involve a significant amount of work.

If not - but the time is, as indicated, linear in the file size - that might just be an O(n) algorithm, of which there 
are probably many in Wireshark, and, for a lot of them, the best we can do is reduce the constant factor, which might 
also be doable (at least some of the fixes for the previous case would help here as well).

If not - and the time *isn't* linear in the file size, so a file of 200MB takes significantly more than twice as long 
as a file of 100MB, and a file of 300MB takes even more "significantly more", etc., then there might be some O(bigger 
than n) algorithms in there.
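
One rough way to tell these cases apart is just to time a read of captures of increasing size and compare the ratios. Here's a minimal sketch of that - not anything in the Wireshark tree; it assumes tshark is on your PATH and that the listed capture files exist (the file names are placeholders), and note that tshark's read path isn't identical to the GUI's load path, so treat the numbers as a rough indication of the scaling, not of the absolute GUI load time:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void)
{
    /* Capture files of increasing size - names are placeholders. */
    const char *files[] = { "100mb.pcap", "200mb.pcap", "400mb.pcap", "800mb.pcap" };
    size_t i;

    for (i = 0; i < sizeof files / sizeof files[0]; i++) {
        struct timespec start, end;
        char cmd[256];
        double elapsed;

        snprintf(cmd, sizeof cmd, "tshark -r %s > /dev/null", files[i]);

        clock_gettime(CLOCK_MONOTONIC, &start);
        if (system(cmd) != 0) {
            fprintf(stderr, "failed to read %s\n", files[i]);
            continue;
        }
        clock_gettime(CLOCK_MONOTONIC, &end);

        elapsed = (end.tv_sec - start.tv_sec)
                  + (end.tv_nsec - start.tv_nsec) / 1e9;
        /* Roughly linear: doubling the size roughly doubles the time.
         * A jump at some size suggests the working set stopped fitting
         * in RAM (or something worse than O(n)). */
        printf("%-12s %.1f s\n", files[i], elapsed);
    }
    return 0;
}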

what I wanted to say was that large files take far too long to get
loaded by wireshark. (a 2 GB file takes 45 minutes...)

So how long does, for example, a 1GB file take?  About 22 minutes, or significantly less than 22 minutes?  And what 
about a 500MB file?

If you're paging:

Make sure you're running Wireshark 1.4.0 or later - *no* columns can have their text generated on the fly in earlier 
releases, but some can in 1.4.0.

cristian: do you mean the gui is that slow?

To the extent that the GUI requires that, when you're reading in a capture, the values of several of the columns be 
computed for each packet and stored as strings, yes, the GUI could be that slow.  The more such columns there are, the 
slower it gets.  If we can reduce the number of such columns to zero, that would reduce both the memory use - and thus 
the working set size, even though, while the file is being read in, the column text for packets read a while ago will 
eventually leave the working set - and the constant factor.
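
To make the tradeoff concrete, here's a minimal sketch - hypothetical types and function names, not Wireshark's actual frame or column code - of storing column text per packet at load time versus regenerating it only when a row is drawn:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct pkt_record {
    double    rel_time;       /* raw data the column text is derived from */
    char     *time_col_text;  /* precomputed column text, kept per packet */
};

/* Eager approach: build the string at load time and keep it for the
 * whole session - one allocation per packet per such column. */
static void store_time_column(struct pkt_record *pkt)
{
    char buf[32];
    snprintf(buf, sizeof buf, "%.6f", pkt->rel_time);
    pkt->time_col_text = strdup(buf);
}

/* Lazy approach: regenerate the string only when the GUI asks for the
 * row - nothing retained per packet. */
static void format_time_column(const struct pkt_record *pkt, char *buf, size_t len)
{
    snprintf(buf, len, "%.6f", pkt->rel_time);
}

int main(void)
{
    /* With a few million packets in a 2 GB capture, even ~30 bytes of stored
     * text per packet is on the order of 100 MB per such column, all of it
     * in the working set while the file is being read in. */
    struct pkt_record pkt = { 1.234567, NULL };
    char buf[32];

    store_time_column(&pkt);
    format_time_column(&pkt, buf, sizeof buf);
    printf("stored: %s, generated on the fly: %s\n", pkt.time_col_text, buf);

    free(pkt.time_col_text);
    return 0;
}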
___________________________________________________________________________
Sent via:    Wireshark-dev mailing list <wireshark-dev () wireshark org>
Archives:    http://www.wireshark.org/lists/wireshark-dev
Unsubscribe: https://wireshark.org/mailman/options/wireshark-dev
             mailto:wireshark-dev-request () wireshark org?subject=unsubscribe

