Dailydave mailing list archives

Re: PAPER: Dynamic Data Flow Analysis via Virtual Code Integration (aka The SpiderPig case)


From: "Piotr Bania" <bania.piotr () gmail com>
Date: Mon, 18 May 2009 20:20:20 +0200

Yo,

I've got a few questions regarding your approach.

1) In section 4.4 you discuss predicting data propagation and you use
the term 'symbolic execution'. Does this mean you treat all input as
symbolic? e.g. everything from a recv() call is marked as 'tainted'
2) If the answer to the previous question is 'yes'; how do you deal
with symbolic read/writes using your O_in/O_out register mechanism? I
can't see this working for memory, as the size of those sets becomes
potentially unbounded (well, bounded by the amount of usable memory)
e.g. how do you describe the memory written to by **mov dword ptr
[eax], ebx** if eax is symbolic and dependent on user input? A more
tangible situation might be the case where a child object is created,
then written to memory at a symbolic offset and then later read again.


First of all, SpiderPig in its current shape requires the user to pick the 
starting point (that was the main idea from the beginning). In other 
words, the user must specify which register or memory region is tainted 
(pick the root of the taint). SpiderPig can taint either a memory location 
or CPU elements such as registers and flags. Regarding section 4.4 
(Predicting Data Propagation): the symbolic execution approach (O_in/O_out 
variants) applies only to elements of the CPU architecture, not to the 
memory locations they point to.

So if SpiderPig meets an instruction like "mov dword ptr [eax], ebx", the 
"ProcessStandardInstruction()" function (see Algorithm 1, page 22) is used. 
In this case it does the following:
1) kills the 4-byte memory region pointed to by EAX (and saves all the 
information about the killer instruction)
2) if the EBX value is tainted, then the 4-byte memory region pointed to by 
EAX is also tainted

To save time, the referenced memory address (in this case the one pointed 
to by EAX) is computed on the fly by the instrumentation code inside the 
target process.
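
The two steps above, with the destination address already resolved to a 
concrete value, can be sketched roughly as follows. This is only an 
illustration and not SpiderPig's actual code: the TaintState class, the 
process_mov_store() helper, the byte-granular bookkeeping and the example 
addresses are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class TaintState:
    """Hypothetical taint bookkeeping (not SpiderPig's real structures)."""
    tainted_regs: set = field(default_factory=set)   # e.g. {"ebx"}
    tainted_mem: set = field(default_factory=set)    # tainted byte addresses
    killers: dict = field(default_factory=dict)      # byte address -> address of last writer


def process_mov_store(state, insn_addr, dest_addr, src_reg, size=4):
    """Model `mov dword ptr [dest], src_reg` for a concrete dest_addr.

    dest_addr is assumed to have been computed at run time, so no
    symbolic memory address is ever needed here.
    """
    region = range(dest_addr, dest_addr + size)
    # Step 1: kill the destination bytes and remember the killer instruction.
    for addr in region:
        state.tainted_mem.discard(addr)
        state.killers[addr] = insn_addr
    # Step 2: if the source register is tainted, the written bytes become tainted.
    if src_reg in state.tainted_regs:
        state.tainted_mem.update(region)
    return state


# Example: EBX is tainted and EAX resolved to 0x1000 at run time.
state = TaintState(tainted_regs={"ebx"})
process_mov_store(state, insn_addr=0x401000, dest_addr=0x1000, src_reg="ebx")
print(sorted(hex(a) for a in state.tainted_mem))
```

Because the address is concrete by the time the store is processed, the 
set of affected bytes is always exactly the size of the write.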

Now if there are any possible data propagations afterwards between CPU 
elements in the Dataflow Region, the symbolic execution approach is used. I 
think it is important to note that each Dataflow Region is considered 
side-effect free (see Definition 4, page 24).
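
A rough way to picture register-level propagation inside one side-effect 
free region is a summary that maps each output register to the input 
registers it depends on. The sketch below is an assumed simplification, 
not the paper's definition of O_in/O_out: the (dst, srcs) instruction 
encoding and the summarize_region() and apply_summary() helpers are 
hypothetical.

```python
def summarize_region(instructions):
    """Build {output_reg: set(input_regs)} for a straight-line region.

    Each instruction is a (dst_reg, [src_regs]) pair; memory is ignored,
    matching the assumption that the region is side-effect free.
    """
    deps = {}
    for dst, srcs in instructions:
        new_deps = set()
        for src in srcs:
            # A register not yet written in the region still holds its input value.
            new_deps |= deps.get(src, {src})
        deps[dst] = new_deps
    return deps


def apply_summary(summary, tainted_in):
    """Return the registers tainted at region exit, given those tainted at entry."""
    # Registers the region never writes keep whatever taint they had.
    tainted_out = {reg for reg in tainted_in if reg not in summary}
    for out_reg, in_regs in summary.items():
        if in_regs & tainted_in:
            tainted_out.add(out_reg)
    return tainted_out


# Hypothetical region:
#   mov ecx, eax ; add ecx, edx ; xor eax, eax ; mov esi, ecx
region = [
    ("ecx", ["eax"]),
    ("ecx", ["ecx", "edx"]),
    ("eax", []),            # xor eax, eax is modeled as a constant write
    ("esi", ["ecx"]),
]
summary = summarize_region(region)
print(sorted(apply_summary(summary, {"eax"})))
```

The summary is computed once per region and can then be reused for any 
set of registers tainted at entry.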

3) What DynamoRIO plugin are you comparing your code to?

If you are referring to "Test application's performance" (page 39), it was 
a very simple plugin (written by me) whose task was to gather and save the 
CPU context for each executed instruction. As I stated in section 5.2.3 
("Analysis (Instrumentation) Performance", page 38), there is nothing really 
to compare. VCI is VCI, DBI is DBI, and IMHO they should be treated 
separately. In short, with VCI I don't need to waste time on dispatcher 
calls at "every"* transfer instruction. Anyway, DynamoRIO is personally my 
favorite DBI so far, and I really admire Derek and the rest of the authors 
for providing such an excellent tool. I think it is quite possible I will 
port SpiderPig to DynamoRIO, especially now that it has become an open 
source project[1].


Cheers, and good work,

I'm glad you liked it and I hope I have answered your questions.

cheers,
pb


* I am aware DynamoRIO has some optimizations for that

[1] - http://code.google.com/p/dynamorio/

On Mon, May 18, 2009 at 1:32 PM, Piotr Bania <bania.piotr () gmail com> wrote:
SpiderPig is a project created for performing and visualizing data flow
analysis of a selected binary program. SpiderPig was created for the purpose
of providing a tool which would be able to help vulnerability and security
researchers with tracing and analyzing any necessary data and its further
propagation. Such tasks are very often crucial in the vulnerability
discovery/identification process and typically require a lot of
time-consuming manual work. The following paper discusses methods and
techniques implemented in SpiderPig in order to perform semi-automatic data
flow analysis.

Paper is available here:
http://piotrbania.com/all/spiderpig/pbania-spiderpig2008.pdf


Simple video demo and some other things available on project website:
http://piotrbania.com/all/spiderpig/


best regards,
Piotr Bania

--
--------------------------------------------------------------------
Piotr Bania - <bania.piotr () gmail com> - 0xCD, 0x19
Fingerprint: 413E 51C7 912E 3D4E A62A BFA4 1FF6 689F BE43 AC33
http://www.piotrbania.com - Key ID: 0xBE43AC33
--------------------------------------------------------------------

- "The more I learn about men, the more I love dogs."


P.S Did ya know adult pigs can run at speeds of up to 11 miles an hour?

_______________________________________________
Dailydave mailing list
Dailydave () lists immunitysec com
http://lists.immunitysec.com/mailman/listinfo/dailydave




-- 
http://www.unprotectedhex.com
http://www.smashthestack.org 


