Dailydave mailing list archives
Re: A small fun Python puzzle
From: Adrien Krunch Kunysz <adrien () kunysz be>
Date: Wed, 10 Feb 2010 21:45:19 +0000
On Tue, Feb 09, 2010 at 07:13:47PM -0800, travis+ml-dailydave () subspacefield org wrote:
On Tue, Apr 01, 2008 at 04:08:41PM +0200, ergosum wrote:Hi all,for l in [100000, 1000000, 5000000, 10000000]:... print '%10d %f' % (l, test(l)) ... 100000 0.006711 1000000 0.764886 5000000 28.554786 10000000 111.738498 (wow - so not linear ...)So if this is true, slicing data[1024:] should be O(M) where M = 100000, 1000000, 5000000, 10000000 respectively, while slicing data[i:i+1024] should always be O(N) where N = 1024. There is a huge gain here that accounts for the more or less homogeneous times of the second algorithm. Nevertheless it puzzles me why the slicing operation isn't linear. Anyone with a better python internal knowledge can throw some light?I think the slicing operation /is/ linear (O(M)), assuming malloc/free as O(1) and no GC overhead. However, in the first version of the program, it is called in a loop, which is called ceil(M/1024) times. So as a result, the whole loop runs in O(M^2) time. The second, iterative version of the program has a loop called ceil(M/1024) times, and it uses the array slice operator, which runs in O(1024) time. So the run time is O(M), which is borne out by the timings. I can't find the link now, but I saw a similar issue on ha.ckers.org recently where he was doing something similar with strings in javascript. There's nothing surprising when you realize that creating a new string or buffer object takes time O(M) to copy the data, where M is the size of the new object. You could possibly do zero-copy operations, where one object points into the buffer used for another, but then garbage collection and object deallocation becomes trickier.
For those who are more familiar with C than with Python and algorithm-speak, I think it means the program is doing this: for (len = strlen(data); len > 0; len -= 1024) { for (i = 1024; i < len; i++) data[i-1024] = data[i]; } Which was this in Python: ~ while data!="": ~ data=data[1024:] Fixing the buffer underrun in the C version is left as an exercise for the reader.
Attachment:
signature.asc
Description: Digital signature
_______________________________________________ Dailydave mailing list Dailydave () lists immunitysec com http://lists.immunitysec.com/mailman/listinfo/dailydave
Current thread:
- Re: A small fun Python puzzle travis+ml-dailydave (Feb 10)
- Re: A small fun Python puzzle Adrien Krunch Kunysz (Feb 10)