1
0
mirror of https://github.com/erusev/parsedown.git synced 2023-08-10 21:13:06 +03:00

Compare commits

...

31 Commits
1.6.4 ... 1.7.0

Author SHA1 Message Date
6678d59be4 Merge pull request #495 from aidantwoods/anti-xss
Prevent various XSS attacks [rebase and update of #276]
2018-02-28 13:41:37 +02:00
c999a4b61b improve readme 2018-01-29 20:55:30 +02:00
e938ab4ffe improve readme 2018-01-29 20:54:40 +02:00
e69374af0d improve readme 2018-01-29 20:52:27 +02:00
1196ed9512 Merge pull request #548 from m1guelpf-forks/patch-1
Update license year
2018-01-01 18:48:54 +02:00
1244122b84 Update LICENSE.txt 2018-01-01 14:09:31 +01:00
d98d60aaf3 Update license year 2017-12-31 22:10:48 +01:00
296ebf0e60 Merge pull request #429 from pablotheissen/patch-1
Support html tags containing dashes
2017-11-19 11:15:43 +02:00
a60ba300b1 Merge pull request #540 from jbafford/patch-1
Fix typo in README
2017-11-15 10:31:22 +02:00
089789dfff Fix typo in README 2017-11-14 17:13:31 -05:00
67c3efbea0 according to https://tools.ietf.org/html/rfc3986#section-3 the colon is a required part of the syntax, other methods of achieving the colon character (as to browser interpretation) should be taken care of by htmlencoding that is done on all attribute content 2017-05-10 16:57:18 +01:00
bbb7687f31 safeMode will either apply all sanitisation techniques to an element or none (note that encoding HTML entities is done regardless because it speaks to character context, and that the only attributes/elements we should permit are the ones we actually mean to create) 2017-05-09 19:31:36 +01:00
b1e5aebaf6 add single safeMode option that encompasses protection from link destination xss and plain markup based xss into a single on/off switch 2017-05-09 19:22:58 +01:00
c63b690a79 remove duplicates 2017-05-09 14:50:15 +01:00
226f636360 remove $safe flag 2017-05-07 13:45:59 +01:00
2e4afde68d faster check substr at beginning of string 2017-05-06 16:32:51 +01:00
dc30cb441c add more protocols to the whitelist 2017-05-05 21:32:27 +01:00
054ba3c487 urlencode urls that are potentially unsafe:
this should break urls that attempt to include a protocol, or port (these are absolute URLs and should have a whitelisted protocol for use)
but URLs that are relative, or relative from the site root should be preserved (though characters non essential for the URL structure may be urlencoded)

this approach has significant advantages over attempting to locate something like `javascript:alert(1)` or `javascript:alert(1)` (which are both valid) because browsers have been known to ignore ridiculous characters when encountered (meaning something like `jav\ta\0\0script:alert(1)` would be xss :( ). Instead of trying to chase down a way to interpret a URL to decide whether there is a protocol, this approach ensures that two essential characters needed to achieve a colon are encoded `:` (obviously) and `;` (from `:`). If these characters appear in a relative URL then they are equivalent to their URL encoded form and so this change will be non breaking for that case.
2017-05-03 17:01:27 +01:00
4bae1c9834 whitelist regex for good attribute (no
no chars that could form a delimiter allowed
2017-05-03 00:39:01 +01:00
aee3963e6b jpeg, not jpg 2017-05-02 19:55:03 +01:00
4dc98b635d whitelist changes:
* add gif and jpg as allowed data images
* ensure that user controlled content fall only in the "data section" of the data URI (and does not intersect content-type definition in any way (best to be safe than sorry ;-)))
  "data section" as defined in: https://tools.ietf.org/html/rfc2397#section-3
2017-05-02 19:48:25 +01:00
e4bb12329e array_keys is probably faster 2017-05-02 01:32:24 +01:00
6d0156d707 dump attributes that contain characters that are impossible for validity, or very unlikely 2017-05-02 00:48:48 +01:00
131ba75851 filter onevent attributes 2017-05-01 15:44:04 +01:00
af04ac92e2 add xss tests 2017-05-01 03:33:49 +01:00
6bb66db00f anti-xss
protect all attributes and content from xss via element method
filter special attributes (a href, img src)
expand url whitelist slightly to permit data images and mailto links
2017-05-01 03:25:07 +01:00
b3d45c4bb9 Add html escaping to all attributes capable of holding user input. 2017-05-01 02:00:38 +01:00
1d4296f34d Customizable whitelist of schemas for safeLinks 2017-05-01 01:58:34 +01:00
bf5105cb1a Improve safeLinks with whitelist. 2017-05-01 01:58:34 +01:00
1140613fc7 Prevent various XSS attacks 2017-05-01 01:58:34 +01:00
1a44cbd62c Update Parsedown.php
Made parsedown compatible with html-tags containing dashes.
see https://github.com/erusev/parsedown/issues/407#issuecomment-248833563
2016-09-22 12:21:39 +02:00
10 changed files with 202 additions and 24 deletions

View File

@ -1,6 +1,6 @@
The MIT License (MIT)
Copyright (c) 2013 Emanuil Rusev, erusev.com
Copyright (c) 2013-2018 Emanuil Rusev, erusev.com
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
@ -17,4 +17,4 @@ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

View File

@ -75,6 +75,32 @@ class Parsedown
protected $urlsLinked = true;
function setSafeMode($safeMode)
{
$this->safeMode = (bool) $safeMode;
return $this;
}
protected $safeMode;
protected $safeLinksWhitelist = array(
'http://',
'https://',
'ftp://',
'ftps://',
'mailto:',
'data:image/png;base64,',
'data:image/gif;base64,',
'data:image/jpeg;base64,',
'irc:',
'ircs:',
'git:',
'ssh:',
'news:',
'steam:',
);
#
# Lines
#
@ -342,8 +368,6 @@ class Parsedown
{
$text = $Block['element']['text']['text'];
$text = htmlspecialchars($text, ENT_NOQUOTES, 'UTF-8');
$Block['element']['text']['text'] = $text;
return $Block;
@ -354,7 +378,7 @@ class Parsedown
protected function blockComment($Line)
{
if ($this->markupEscaped)
if ($this->markupEscaped or $this->safeMode)
{
return;
}
@ -457,8 +481,6 @@ class Parsedown
{
$text = $Block['element']['text']['text'];
$text = htmlspecialchars($text, ENT_NOQUOTES, 'UTF-8');
$Block['element']['text']['text'] = $text;
return $Block;
@ -515,10 +537,10 @@ class Parsedown
),
);
if($name === 'ol')
if($name === 'ol')
{
$listStart = stristr($matches[0], '.', true);
if($listStart !== '1')
{
$Block['element']['attributes'] = array('start' => $listStart);
@ -678,12 +700,12 @@ class Parsedown
protected function blockMarkup($Line)
{
if ($this->markupEscaped)
if ($this->markupEscaped or $this->safeMode)
{
return;
}
if (preg_match('/^<(\w*)(?:[ ]*'.$this->regexHtmlAttribute.')*[ ]*(\/)?>/', $Line['text'], $matches))
if (preg_match('/^<(\w[\w-]*)(?:[ ]*'.$this->regexHtmlAttribute.')*[ ]*(\/)?>/', $Line['text'], $matches))
{
$element = strtolower($matches[1]);
@ -1074,7 +1096,6 @@ class Parsedown
if (preg_match('/^('.$marker.'+)[ ]*(.+?)[ ]*(?<!'.$marker.')\1(?!'.$marker.')/s', $Excerpt['text'], $matches))
{
$text = $matches[2];
$text = htmlspecialchars($text, ENT_NOQUOTES, 'UTF-8');
$text = preg_replace("/[ ]*\n/", ' ', $text);
return array(
@ -1253,8 +1274,6 @@ class Parsedown
$Element['attributes']['title'] = $Definition['title'];
}
$Element['attributes']['href'] = str_replace(array('&', '<'), array('&amp;', '&lt;'), $Element['attributes']['href']);
return array(
'extent' => $extent,
'element' => $Element,
@ -1263,12 +1282,12 @@ class Parsedown
protected function inlineMarkup($Excerpt)
{
if ($this->markupEscaped or strpos($Excerpt['text'], '>') === false)
if ($this->markupEscaped or $this->safeMode or strpos($Excerpt['text'], '>') === false)
{
return;
}
if ($Excerpt['text'][1] === '/' and preg_match('/^<\/\w*[ ]*>/s', $Excerpt['text'], $matches))
if ($Excerpt['text'][1] === '/' and preg_match('/^<\/\w[\w-]*[ ]*>/s', $Excerpt['text'], $matches))
{
return array(
'markup' => $matches[0],
@ -1284,7 +1303,7 @@ class Parsedown
);
}
if ($Excerpt['text'][1] !== ' ' and preg_match('/^<\w*(?:[ ]*'.$this->regexHtmlAttribute.')*[ ]*\/?>/s', $Excerpt['text'], $matches))
if ($Excerpt['text'][1] !== ' ' and preg_match('/^<\w[\w-]*(?:[ ]*'.$this->regexHtmlAttribute.')*[ ]*\/?>/s', $Excerpt['text'], $matches))
{
return array(
'markup' => $matches[0],
@ -1343,14 +1362,16 @@ class Parsedown
if (preg_match('/\bhttps?:[\/]{2}[^\s<]+\b\/*/ui', $Excerpt['context'], $matches, PREG_OFFSET_CAPTURE))
{
$url = $matches[0][0];
$Inline = array(
'extent' => strlen($matches[0][0]),
'position' => $matches[0][1],
'element' => array(
'name' => 'a',
'text' => $matches[0][0],
'text' => $url,
'attributes' => array(
'href' => $matches[0][0],
'href' => $url,
),
),
);
@ -1363,7 +1384,7 @@ class Parsedown
{
if (strpos($Excerpt['text'], '>') !== false and preg_match('/^<(\w+:\/{2}[^ >]+)>/i', $Excerpt['text'], $matches))
{
$url = str_replace(array('&', '<'), array('&amp;', '&lt;'), $matches[1]);
$url = $matches[1];
return array(
'extent' => strlen($matches[0]),
@ -1401,6 +1422,11 @@ class Parsedown
protected function element(array $Element)
{
if ($this->safeMode)
{
$Element = $this->sanitiseElement($Element);
}
$markup = '<'.$Element['name'];
if (isset($Element['attributes']))
@ -1412,7 +1438,7 @@ class Parsedown
continue;
}
$markup .= ' '.$name.'="'.$value.'"';
$markup .= ' '.$name.'="'.self::escape($value).'"';
}
}
@ -1426,7 +1452,7 @@ class Parsedown
}
else
{
$markup .= $Element['text'];
$markup .= self::escape($Element['text'], true);
}
$markup .= '</'.$Element['name'].'>';
@ -1485,10 +1511,77 @@ class Parsedown
return $markup;
}
protected function sanitiseElement(array $Element)
{
static $goodAttribute = '/^[a-zA-Z0-9][a-zA-Z0-9-_]*+$/';
static $safeUrlNameToAtt = array(
'a' => 'href',
'img' => 'src',
);
if (isset($safeUrlNameToAtt[$Element['name']]))
{
$Element = $this->filterUnsafeUrlInAttribute($Element, $safeUrlNameToAtt[$Element['name']]);
}
if ( ! empty($Element['attributes']))
{
foreach ($Element['attributes'] as $att => $val)
{
# filter out badly parsed attribute
if ( ! preg_match($goodAttribute, $att))
{
unset($Element['attributes'][$att]);
}
# dump onevent attribute
elseif (self::striAtStart($att, 'on'))
{
unset($Element['attributes'][$att]);
}
}
}
return $Element;
}
protected function filterUnsafeUrlInAttribute(array $Element, $attribute)
{
foreach ($this->safeLinksWhitelist as $scheme)
{
if (self::striAtStart($Element['attributes'][$attribute], $scheme))
{
return $Element;
}
}
$Element['attributes'][$attribute] = str_replace(':', '%3A', $Element['attributes'][$attribute]);
return $Element;
}
#
# Static Methods
#
protected static function escape($text, $allowQuotes = false)
{
return htmlspecialchars($text, $allowQuotes ? ENT_NOQUOTES : ENT_QUOTES, 'UTF-8');
}
protected static function striAtStart($string, $needle)
{
$len = strlen($needle);
if ($len > strlen($string))
{
return false;
}
else
{
return strtolower(substr($string, 0, $len)) === strtolower($needle);
}
}
static function instance($name = 'default')
{
if (isset(self::$instances[$name]))

View File

@ -1,4 +1,4 @@
> You might also like [Caret](https://caret.io?ref=parsedown) - our Markdown editor for Mac / Windows / Linux.
> I also make [Caret](https://caret.io?ref=parsedown) - a Markdown editor for Mac and PC.
## Parsedown
@ -38,7 +38,7 @@ More examples in [the wiki](https://github.com/erusev/parsedown/wiki/) and in [t
### Security
Parsedown does not sanitize the HTML that it generates. When you deal with untrusted content (ex: user commnets) you should also use a HTML sanitizer like [HTML Purifier](http://htmlpurifier.org/).
Parsedown does not sanitize the HTML that it generates. When you deal with untrusted content (ex: user comments) you should also use a HTML sanitizer like [HTML Purifier](http://htmlpurifier.org/).
### Questions

View File

@ -48,6 +48,8 @@ class ParsedownTest extends TestCase
$expectedMarkup = str_replace("\r\n", "\n", $expectedMarkup);
$expectedMarkup = str_replace("\r", "\n", $expectedMarkup);
$this->Parsedown->setSafeMode(substr($test, 0, 3) === 'xss');
$actualMarkup = $this->Parsedown->text($markdown);
$this->assertEquals($expectedMarkup, $actualMarkup);

View File

@ -0,0 +1,6 @@
<p><a href="https://www.example.com&quot;">xss</a></p>
<p><img src="https://www.example.com&quot;" alt="xss" /></p>
<p><a href="https://www.example.com&#039;">xss</a></p>
<p><img src="https://www.example.com&#039;" alt="xss" /></p>
<p><img src="https://www.example.com" alt="xss&quot;" /></p>
<p><img src="https://www.example.com" alt="xss&#039;" /></p>

View File

@ -0,0 +1,11 @@
[xss](https://www.example.com")
![xss](https://www.example.com")
[xss](https://www.example.com')
![xss](https://www.example.com')
![xss"](https://www.example.com)
![xss'](https://www.example.com)

View File

@ -0,0 +1,16 @@
<p><a href="javascript%3Aalert(1)">xss</a></p>
<p><a href="javascript%3Aalert(1)">xss</a></p>
<p><a href="javascript%3A//alert(1)">xss</a></p>
<p><a href="javascript&amp;colon;alert(1)">xss</a></p>
<p><img src="javascript%3Aalert(1)" alt="xss" /></p>
<p><img src="javascript%3Aalert(1)" alt="xss" /></p>
<p><img src="javascript%3A//alert(1)" alt="xss" /></p>
<p><img src="javascript&amp;colon;alert(1)" alt="xss" /></p>
<p><a href="data%3Atext/html;base64,PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==">xss</a></p>
<p><a href="data%3Atext/html;base64,PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==">xss</a></p>
<p><a href="data%3A//text/html;base64,PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==">xss</a></p>
<p><a href="data&amp;colon;text/html;base64,PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==">xss</a></p>
<p><img src="data%3Atext/html;base64,PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==" alt="xss" /></p>
<p><img src="data%3Atext/html;base64,PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==" alt="xss" /></p>
<p><img src="data%3A//text/html;base64,PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==" alt="xss" /></p>
<p><img src="data&amp;colon;text/html;base64,PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==" alt="xss" /></p>

31
test/data/xss_bad_url.md Normal file
View File

@ -0,0 +1,31 @@
[xss](javascript:alert(1))
[xss]( javascript:alert(1))
[xss](javascript://alert(1))
[xss](javascript&colon;alert(1))
![xss](javascript:alert(1))
![xss]( javascript:alert(1))
![xss](javascript://alert(1))
![xss](javascript&colon;alert(1))
[xss](data:text/html;base64,PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==)
[xss]( data:text/html;base64,PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==)
[xss](data://text/html;base64,PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==)
[xss](data&colon;text/html;base64,PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==)
![xss](data:text/html;base64,PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==)
![xss]( data:text/html;base64,PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==)
![xss](data://text/html;base64,PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==)
![xss](data&colon;text/html;base64,PHNjcmlwdD5hbGVydCgxKTwvc2NyaXB0Pg==)

View File

@ -0,0 +1,7 @@
<p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
<p>&lt;script&gt;</p>
<p>alert(1)</p>
<p>&lt;/script&gt;</p>
<p>&lt;script&gt;
alert(1)
&lt;/script&gt;</p>

View File

@ -0,0 +1,12 @@
<script>alert(1)</script>
<script>
alert(1)
</script>
<script>
alert(1)
</script>