Full Disclosure mailing list archives

Lame patch to flawfinder for format string hunting


From: Georgi Guninski <guninski () guninski com>
Date: Tue, 06 May 2003 13:06:10 +0300

Comes the time to throw some of my lame stuff on the water.

Attached is a patch which adds some fuzzy format string hunting
capabilities to flawfinder.
Also available at:
http://www.guninski.com/flaw-patch.1
http://www.guninski.com/flaw.README.georgi.txt

Georgi Guninski

--- flawfinder  2002-09-06 21:05:29.000000000 +0300
+++ flawfinder6 2003-05-06 12:11:04.000000000 +0300
@@ -36,7 +36,9 @@
 #    along with this program; if not, write to the Free Software
 #    Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 
-
+#    Georgi's patch is distributed in the hope that it will be useful,
+#    but WITHOUT ANY WARRANTY; without even the implied warranty of
+#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 
 import sys, re, string, getopt
 import pickle               # To support load/save/diff of hitlist
@@ -46,6 +48,8 @@
 # import formatter
 
 # Program Options - these are the default values:
+do_hunt = 0
+huntaggressive = 0
 show_context = 0
 minimum_level = 1
 show_immediately = 0
@@ -67,6 +71,8 @@
 displayed_header = 0    # Have we displayed the header yet?
 num_ignored_hits = 0    # Number of ignored hits (used if never_ignore==0)
 
+huntfname="loghu1"
+
 def error(message):
   sys.stderr.write("Error: %s\n"% message)
 
@@ -283,12 +289,13 @@
         if parenlevel <= 0:
             parameters.append(string.strip(text[currentstart:i]))
             return parameters
-      elif c == ';':
+      elif c == ';' and not do_hunt:
           internal_warn("Parsing failed to find end of parameter list; "
                         "semicolon terminated it in %s" % text[pos:pos+200])
           return parameters
     i = i + 1
-  internal_warn("Parsing failed to find end of parameter list in %s" %
+  if not do_hunt:
+    internal_warn("Parsing failed to find end of parameter list in %s" %
                 text[pos:pos+200])
 
 
@@ -398,6 +405,67 @@
       return
   c_buffer(hit)
 
+def c_applog(hit):
+  format_position = hit.format_position
+  # if format_position <= len(hit.parameters)-1:
+#  hit.category = hit.parameters[4]
+#  add_warning(hit)
+  try:
+   if len(hit.parameters) <= (format_position+1) and len(hit.parameters)>0 and hit.parameters[format_position] and not (hit.parameters[format_position][0] == '"' or hit.parameters[format_position][0]=='('):
+     hit.category += " argument: " + hit.parameters[format_position]
+     add_warning(hit)
+     # hit.category = hit.parameters[4]
+     # hit.level = 7
+     # Assume that translators are trusted to not insert "evil" formats:
+     #source = strip_i18n(hit.parameters[4])
+     #if c_constant_string(source):
+     #  # Parameter is constant, so there's no risk of format string problems.
+     #  if hit.name == "snprintf" or hit.name == "vsnprintf":
+     #    hit.level = 1
+     #    hit.warning = \
+     #      "On some very old systems, snprintf is incorrectly implemented " \
+     #      "and permits buffer overflows; there are also incompatible " \
+     #      "standard definitions of it"
+     #    hit.suggestion = "Check it during installation, or use something else"
+     #    hit.category = "port"
+     #  else:
+       # We'll pass it on, just in case it's needed, but at level 0 risk.
+     #         hit.level = 0
+     #         hit.note = "Constant format string, so not considered very risky (there's some residual risk, especially in a loop)."
+ #  add_warning(hit)
+  except:
+   return 0
+p_c_fmtstring = re.compile( r'%[sdxlcu]')
+p_c_fmtstring2 = re.compile( r'\.\.\.')
+p_c_fmtstring3 = re.compile( r'va_list')
+
+def c_fmtstring(text):
+  "Returns true if text looks like a format string."
+  if p_c_fmtstring.search(text): 
+   return 1 
+  else:
+   if p_c_fmtstring2.search(text) and huntaggressive: 
+    return 1
+   else:
+    if p_c_fmtstring3.search(text) and huntaggressive: 
+     return 1
+  return 0
+
+huntlist = {}
+whitehuntlist = {"if":1, "for":1,"while":1,"switch":1,"volatile":1}
+
+def c_hunter(hit):
+  i = 1
+  while hit.parameters and i < len(hit.parameters):
+    if c_fmtstring(hit.parameters[i]) and not c_ruleset.has_key(hit.name) and not whitehuntlist.has_key(hit.name):
+#      print " format string in " + hit.name + " argument " + "%d" % i
+#      add_warning(hit)
+      if not (huntlist.has_key(hit.name) and huntlist[hit.name]<i):
+        huntlist[hit.name] = i
+      return 1
+    i = i + 1          
+  return 0
+ 
 def c_printf(hit):
   format_position = hit.format_position
   if format_position <= len(hit.parameters)-1:
@@ -612,19 +680,28 @@
       "Use snprintf or vsnprintf",
       "buffer", "", {}),
 
-  # TODO: Add "wide character" versions of these functions.
-  "printf|vprintf":
-     (c_printf, 4,
-      "If format strings can be influenced by an attacker, they can be exploited",
-      "Use a constant for the format specification",
-      "format", "", {}),
 
-  "fprintf|vfprintf":
-     (c_printf, 4,
-      "If format strings can be influenced by an attacker, they can be exploited",
-      "Use a constant for the format specification",
-      "format", "", { 'format_position' : 2}),
+#  "ap_log_error":
+#     (c_applog, 4,
+#      "app_log_error",
+#      "Use snprintf or vsnprintf",
+#      "format", "", { 'format_position' : 4}),
+##!
 
+##!
+  # TODO: Add "wide character" versions of these functions.
+#  "printf|vprintf":
+#7     (c_printf, 4,
+#    "If format strings can be influenced by an attacker, they can be exploited",
+#      "Use a constant for the format specification",
+#      "format", "", {}),
+#
+#  "fprintf|vfprintf":
+#     (c_printf, 4,
+#      "If format strings can be influenced by an attacker, they can be exploited",
+#      "Use a constant for the format specification",
+#      "format", "", { 'format_position' : 2}),
+#
   # The "syslog" hook will raise "format" issues.
   "syslog":
      (c_printf, 4,
@@ -1057,7 +1134,17 @@
             i = endpos
             word = text[startpos:endpos]
             # print "Word is:", text[startpos:endpos]
-            if c_ruleset.has_key(word):  # FOUND A MATCH, setup & call hook.
+            if do_hunt:
+                  hit = Hit( (c_buffer,4,"","","","",{}) )
+                  hit.start, hit.end = startpos, endpos
+                  hit.line = linenumber
+                  hit.line, hit.column = linenumber, find_column(text, startpos)
+                  hit.filename=filename
+                  hit.context_text = get_context(text, startpos)
+                  hit.parameters = extract_c_parameters(text, endpos)
+                  hit.name = word
+                  apply(c_hunter, (hit, ))
+            elif c_ruleset.has_key(word) and do_hunt != 1:  # FOUND A MATCH, setup & call hook.
               # print "HIT: #%s#\n" % word
               hit = Hit(c_ruleset[word])
               hit.name = word
@@ -1128,9 +1215,9 @@
       print "<h1>Flawfinder Results</h1>"
       print "Here are the security scan results from"
       print '<a href="http://www.dwheeler.com/flawfinder">Flawfinder version %s</a>,' % version
-      print '(C) 2001-2002 <a href="http://www.dwheeler.com">David A. Wheeler</a>.'
+      print '(C) 2001-2002 <a href="http://www.dwheeler.com">David A. Wheeler</a>.<br>Format string patch by Georgi Guninski.<br>Georgi: If it was compatible with the GPL, I would have forbidden access to this patch to microsoft and governments.'
     else:
-      print "Flawfinder version %s, (C) 2001-2002 David A. Wheeler." % version
+      print "Flawfinder version %s, (C) 2001-2002 David A. Wheeler.\nFormat string patch by Georgi Guninski.\nGeorgi: If it was compatible with the GPL, I would have forbidden access to this patch to microsoft and governments." % version
     displayed_header = 1
 
 
@@ -1210,7 +1297,8 @@
 flawfinder [--help] [--context]  [-c]  [--columns]  [--html] [  -m  X ] [ -minlevel=X ]
            [--immediate] [-i] [--inputs] [-n] [--neverignore] [--quiet]
            [--loadhitlist=F ] [ --savehitlist=F ] [ --diffhitlist=F ]
-           [--listrules]
+           [--listrules] [--hunt=filename] [--loadhunt=filename] 
+               [--huntaggressive]
            [--] [ source code file or source root directory ]+
 
   --help      Show this usage help
@@ -1267,17 +1355,22 @@
 
   --diffhitlist=F
               Show only hits (loaded or analyzed) not in F.
-
+  --hunt=filename
+               Store format functions in filename, first pass.
+  --loadhunt=filename
+               Load format functions from filename and examine their usage, second pass. 
+  --huntaggressive
+               Looks also for functions whose definitions contain "..." or "va_list", even more false positives, not quite recommended.
 
   For more information, please consult the manpage or available
   documentation.
 """
 
 def process_options():
-  global show_context, show_inputs, allowlink, omit_time
+  global show_context, show_inputs, allowlink, omit_time, do_hunt, huntaggressive
   global output_format, minimum_level, show_immediately, single_line
   global show_columns, never_ignore, quiet, showheading, list_rules
-  global loadhitlist, savehitlist, diffhitlist
+  global loadhitlist, savehitlist, diffhitlist,huntfname
   try:
     # Note - as a side-effect, this sets sys.argv[].
     optlist, args = getopt.getopt(sys.argv[1:], "cm:nih?S",
@@ -1286,7 +1379,7 @@
                      "columns", "listrules", "omittime", "allowlink",
                      "neverignore", "quiet", "dataonly", "html", "singleline",
                      "loadhitlist=", "savehitlist=", "diffhitlist=",
-                     "version", "help" ])
+                     "version", "help", "hunt=","loadhunt=","huntaggressive" ])
     for (opt,value) in optlist:
       if   opt == "--context" or opt == "-c":
         show_context = 1
@@ -1333,6 +1426,14 @@
       elif opt == "--version":
         print version
         sys.exit(0)
+      elif opt == "--hunt":
+        huntfname=value
+        do_hunt = 1
+      elif opt == "--huntaggressive":
+        huntaggressive=1
+      elif opt == "--loadhunt":
+        huntfname=value
+        loadhuntlist()
       elif opt in [ '-h', '-?', '--help' ]:
         usage()
         sys.exit(0)
@@ -1369,8 +1470,32 @@
     return 1
 
 
+def loadhuntlist():
+ global huntlist
+ global huntfname
+ print 
+ print "FNAME="+huntfname
+ print
+ f = open(huntfname)
+ huntlist = pickle.load(f)
+ print " Potential format string bugs (function, argument)"
+ for i in huntlist:
+  print " * " + i + " "+"%d" % huntlist[i]
+  c_ruleset[i]= (c_applog,4,"app_log_error","Generated","generic format","", { 'format_position' : huntlist[i] } )   
 def show_final_results():
   global hitlist
+  global huntfname
+#joro
+  if do_hunt:
+    print " Potential format string bugs (function, argument)"
+    for i in huntlist:
+     print " * " + i + " "+"%d" % huntlist[i] 
+    print " Generated code:\n"
+    for i in huntlist:
+     print "  \"" + i + "\":\n (c_applog, 4,\"app_log_error\",\"Generated\",\"generic format\",\"\", { 'format_position' : "+"%d" % huntlist[i]+"})," 
+    f = open (huntfname,"w")
+    pickle.dump(huntlist,f)
+    f.close()
   count = 0
   if show_immediately:   # Separate the final results.
     print
This is a patch for flawfinder 1.21 which adds some fuzzy format string
hunting capabilities.
The basic idea is to look for functions whose argument lists contain
format strings and to record the position of the format mask - the first
pass. Then, in the second pass, uses of these functions are searched for
ones in which the format argument is the last one and is not a constant
string. This patch caught a few bugs (grep catches them too, of course).
I wrote it in a few hours without having a clue about Python, so don't
flame me that I can't code, I know it :)
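
The first pass can be sketched roughly as follows. This is a standalone
illustration in Python 3, not the patch's code: the helper names
(split_args, first_pass), the call-matching regex and the sample C snippet
are made up for the example, and the argument splitting is much cruder than
flawfinder's own parser. Only the %[sdxlcu] pattern and the keyword skip
list are taken from the patch; the patch also skips names already in
flawfinder's ruleset, which is not modeled here.

```python
import re

# Same heuristic pattern as the patch: a conversion such as %s or %d.
FORMAT_RE = re.compile(r'%[sdxlcu]')
# Rough pattern for "identifier(args)" with no nested parentheses (assumption).
CALL_RE = re.compile(r'\b([A-Za-z_]\w*)\s*\(([^()]*)\)')
# Keywords that look like calls but are not; taken from the patch's whitehuntlist.
SKIP = {"if", "for", "while", "switch", "volatile"}


def split_args(arglist):
    """Split on commas that are outside string literals (hypothetical helper)."""
    args, current, in_string, prev = [], "", False, ""
    for ch in arglist:
        if ch == '"' and prev != "\\":
            in_string = not in_string
        if ch == "," and not in_string:
            args.append(current.strip())
            current = ""
        else:
            current += ch
        prev = ch
    args.append(current.strip())
    return args


def first_pass(source):
    """Map function name -> lowest 1-based position of a format-like argument."""
    positions = {}
    for name, arglist in CALL_RE.findall(source):
        if name in SKIP:
            continue
        for pos, arg in enumerate(split_args(arglist), start=1):
            if FORMAT_RE.search(arg):
                # Keep the lowest position seen for this function.
                positions[name] = min(pos, positions.get(name, pos))
                break
    return positions


# Made-up C fragment: log_msg is a hypothetical logging function.
sample = '''
log_msg(LOG_ERR, "cannot open %s", path);
log_msg(LOG_ERR, user_input);
if (count > 0) { total += count; }
'''
print(first_pass(sample))
```

The recorded positions are what the second pass later checks each use
against.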


Options:
--hunt=filename
Store format functions in filename, first pass.
--loadhunt=filename
Load format functions from filename and examine their usage, second pass. 
--huntaggressive
Looks also for functions whose definitions contain "..." or "va_list",
even more false positives, not quite recommended.

Usage:
./flawfinder6 --hunt=loghu ../ethereal-0.9.0
#^^^^(first pass, stores potential functions in file loghu)
./flawfinder6 --loadhunt=loghu ../ethereal-0.9.0 >logethe2
#^^^^(second pass, examine the usage of functions).
Sometimes functions which do not take a format mask are reported.
In this case, open loghu (or whatever file it was saved to) and add "-"
(minus sign) after the "I".
For example, to exclude "return", edit loghu this way:
-----
sS'return'
p25
I-1
-----
Minus added between I and 1.
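
The same edit can be made without touching the pickle text by hand. The
sketch below assumes the hunt file holds a plain dictionary of function
name to argument position, which is what the patch dumps. The exclude
helper, the throwaway file and the position 5 are illustrative only, and
the code is written for Python 3, while the patch targets the Python 2 of
its day, so pickle compatibility between the two is an assumption to
verify before relying on it.

```python
import os
import pickle
import tempfile


def exclude(huntfile, names):
    """Negate the stored position of each listed function (hypothetical helper)."""
    with open(huntfile, "rb") as f:
        huntlist = pickle.load(f)
    for name in names:
        # Only flip positive entries, so running this twice is harmless.
        if name in huntlist and huntlist[name] > 0:
            huntlist[name] = -huntlist[name]
    with open(huntfile, "wb") as f:
        # Protocol 0 is the text-based pickle format.
        pickle.dump(huntlist, f, protocol=0)
    return huntlist


# Demonstration on a throwaway file standing in for "loghu".
path = os.path.join(tempfile.mkdtemp(), "loghu")
with open(path, "wb") as f:
    pickle.dump({"return": 1, "proto_tree_add_text": 5}, f, protocol=0)

print(exclude(path, ["return"]))
```

A negative position can never satisfy the second-pass test that the format
argument is the last one, so the function is no longer reported.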
Then in logethe2, after skipping a lot of false positives, the
following line shows an exploitable format string bug:
----
../ethereal-0.9.0/packet-socks.c:914  [4] 
(generic format argument: format_text(data, linelen)) proto_tree_add_text:
  app_log_error. Generated. 
----
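
The second-pass test behind a report like this one can be sketched as
below. It is a Python 3 paraphrase of the patch's check using a
hypothetical is_suspicious helper. The argument lists are made up, apart
from the final format_text(data, linelen) argument quoted in the report,
and the position 5 is assumed for illustration.

```python
def is_suspicious(args, format_position):
    """Return True when the format argument is last and is not a literal.

    args is the list of arguments at the call site; format_position is
    1-based, as recorded by the first pass. A negative position (an
    excluded function) never matches.
    """
    if format_position < 1 or format_position != len(args):
        return False
    fmt = args[format_position - 1].strip()
    # Skip string literals and parenthesised expressions.
    return bool(fmt) and fmt[0] not in ('"', '(')


# Hypothetical call sites for a function whose format is argument 5.
flagged = ["tree", "tvb", "offset", "linelen", "format_text(data, linelen)"]
with_format = ["tree", "tvb", "offset", "linelen", '"%s"',
               "format_text(data, linelen)"]
literal_only = ["tree", "tvb", "offset", "linelen", '"Done"']

print(is_suspicious(flagged, 5))       # non-literal in the format slot, nothing after it
print(is_suspicious(with_format, 5))   # more arguments follow the format
print(is_suspicious(literal_only, 5))  # format is a string literal
print(is_suspicious(flagged, -5))      # excluded by negating the position
```

Only the first call is flagged: the value that ends up in the format slot
is computed at run time and nothing follows it, which is the pattern the
report above points at.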

Bugs:
False positives and others. 

Availability:
flawfinder 1.21 is at 
http://www.dwheeler.com/flawfinder/flawfinder-1.21.tar.gz
or at
http://packetstormsecurity.nl/UNIX/security/
The patch is at:
http://www.guninski.com/flaw-patch.1
md5sum f5d91465ed91a44eceaf7faea9519a99  flaw-patch.1
This readme is at:
http://www.guninski.com/flaw.README.georgi.txt

Misc:
If it was compatible with the GPL, I would have forbidden access to this 
patch to microsoft and governments.




Georgi Guninski

