tcpdump mailing list archives
Optimizer uses stale control flow graph data structures
From: "Archit P. Shah via tcpdump-workers" <tcpdump-workers () lists tcpdump org>
Date: Tue, 03 Nov 2020 22:22:06 -0800
--- Begin Message --- From: "Archit P. Shah" <archit.shah () alum mit edu>
Date: Tue, 03 Nov 2020 22:22:06 -0800
I submitted two pull requests fixing bugs in the libpcap optimizer (#972 and #976). Both proposed patches are relatively narrow hacks to address specific bugs (links [1] and [2] below). I'm writing here because it appears both bugs are specific instances of a broader issue that can be resolved with a larger restructuring of the optimzier. I have mocked up a patch with such a larger restructuring of the optimizer's looping structure. See https://github.com/the-tcpdump-group/libpcap/compare/master...tenarchits:2020-recalculate-cfg Currently opt_loop builds a series of data structures related to the control flow graph (block levels, dominators, use/def information, edge dominators, etc.) and then calls opt_blks which walks through the control graph calling opt_blk on each block to optimize. The issue is that individual optimizations to one block or statement may make alterations that render the computed data structures stale. The next optimization may then be relying on stale data. In the bug in link [1] below, or_pullup rearranges control flow to invalidate the calculations made by find_dom. The next call to or_pullup or and_pullup may rely on stale data. In the bug in link [2] below, opt_deadstore deletes instructions impacting the structures computed by find_ud, which may impact optimization of the of the next block. First, my patch uses setjmp/longjmp to jump back to the calculation of the control flow graph-related data structures in opt_loop any time an optimization is made that invalidates any of the computed structures. Obviously, a more finegrained recalculation would be preferable, but I believe this approach prioritizes correctness at a reasonable performance penalty. Second, my patch restructures the control flow in opt_loop. Currently, opt_loop runs opt_blks repeatedly until a fixed point is reached. It is first run with do_stmts=false to find a fixed point and then with do_stmts=true to find a fixed point. With the patch, opt_loop alternates between running opt_blks with do_stmts=false and running opt_blks with do_stmts=true until a fixed point is reached. Thus first there is a recalculation of the control flow graph data structures, then a pass of opt_blks with do_stmts=false and then a pass of opt_blks with do_stmts=true. I hope these observations and this proposal is useful. I welcome any feedback for improvements to this patch or suggestions for alternate approaches. I have also mocked a patch adding tests to the tcpdump project reflecting these two bugs. Please let me know if it would helpful for me to create a pull request for these tests. See [3]. Links [1] https://github.com/the-tcpdump-group/libpcap/issues/945 [2] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=144325 [3] https://github.com/the-tcpdump-group/tcpdump/compare/master...tenarchits:2020-optimizer-tests Regards, -- Archit
--- End Message ---
_______________________________________________ tcpdump-workers mailing list tcpdump-workers () lists tcpdump org https://lists.sandelman.ca/mailman/listinfo/tcpdump-workers
Current thread:
- Optimizer uses stale control flow graph data structures Archit P. Shah via tcpdump-workers (Nov 03)