this should break urls that attempt to include a protocol, or port (these are absolute URLs and should have a whitelisted protocol for use)
but URLs that are relative, or relative from the site root should be preserved (though characters non essential for the URL structure may be urlencoded)
this approach has significant advantages over attempting to locate something like `javascript:alert(1)` or `javascript:alert(1)` (which are both valid) because browsers have been known to ignore ridiculous characters when encountered (meaning something like `jav\ta\0\0script:alert(1)` would be xss :( ). Instead of trying to chase down a way to interpret a URL to decide whether there is a protocol, this approach ensures that two essential characters needed to achieve a colon are encoded `:` (obviously) and `;` (from `:`). If these characters appear in a relative URL then they are equivalent to their URL encoded form and so this change will be non breaking for that case.
* add gif and jpg as allowed data images
* ensure that user controlled content fall only in the "data section" of the data URI (and does not intersect content-type definition in any way (best to be safe than sorry ;-)))
"data section" as defined in: https://tools.ietf.org/html/rfc2397#section-3
protect all attributes and content from xss via element method
filter special attributes (a href, img src)
expand url whitelist slightly to permit data images and mailto links
Failing tests don't break builds on purpose, Parsedown doesn't fully comply with the CommonMark specs at the moment. We should switch to test/CommonMarkTest.php later, see #423 for details.
This commit serves to add comments detailing parts of the new functionality, and to adjust syntax preferences to match that of the surrounding document. The commit title also now reflects the most significant change made.
1. Add the ability for a parsed element to enforce that the line handler not parse any (immediate) child elements of a specified type.
2. Use 1. to allow parsed Url elements to tell the line handler not to parse any child Links or Urls where they are immediate children.
Okay, so maybe I should have looked 20 lines or so above where I made the edit in the element function – looks like it already supports adding attributes ;p
Have amended the change to blocklist to use the already existing functionality, and have reverted the change that I made to the element function.
Looks like I might need to return the pattern which was used previously
Reverting last change as build still failed
This build will still fail, but I'm hoping it will only fair where the list start value has been inserted
As Packagist has now implemented the feature of listing packages
depending on another package, VersionEye is no longer needed for that.
As VersionEye scrapes the Packagist API to do the same, the original
source of information should be preferred.
Fixeserusev/parsedown-extra#67.
This introduces PHP 5.3+ late static binding to the Singleton pattern in Parsedown.
It will return an instance of Parsedown which inherits the class which
called the `instance()` method rather than always returning instance of just `Parsedown`.
Tests are testing this feature with a test class which inherits from Parsedown.
Notice that calling `instance()` with the default arguments after an instance of
`Parsedown` was already created, it will return it even though it is from just
an instance of `Parsedown`. So this is fixing the problem just partially.
Fixes inline reference links with int 0 as reference
The link [link][0] where [0] is set at the bottom of the md file current breaks and it's truthy value is false.
You could run the Parsedown testsuite only with:
phpunit --testsuite ParsedownTests
And you could run the Standard Markdown one with:
phpunit --testsuite StandardMarkdown
See more at http://standardmarkdown.com/
2014-09-05 23:12:33 +03:00
43 changed files with 1255 additions and 600 deletions
More examples in [the wiki](https://github.com/erusev/parsedown/wiki/Usage) and in [this video tutorial](http://youtu.be/wYZBY8DEikI).
More examples in [the wiki](https://github.com/erusev/parsedown/wiki/) and in [this video tutorial](http://youtu.be/wYZBY8DEikI).
### Security
Parsedown is capable of escaping user-input within the HTML that it generates. Additionally Parsedown will apply sanitisation to additional scripting vectors (such as scripting link destinations) that are introduced by the markdown syntax itself.
To tell Parsedown that it is processing untrusted user-input, use the following:
```php
$parsedown = new Parsedown;
$parsedown->setSafeMode(true);
```
If instead, you wish to allow HTML within untrusted user-input, but still want output to be free from XSS it is recommended that you make use of a HTML sanitiser that allows HTML tags to be whitelisted, like [HTML Purifier](http://htmlpurifier.org/).
In both cases you should strongly consider employing defence-in-depth measures, like [deploying a Content-Security-Policy](https://scotthelme.co.uk/content-security-policy-an-introduction/) (a browser security feature) so that your page is likely to be safe even if an attacker finds a vulnerability in one of the first lines of defence above.
#### Security of Parsedown Extensions
Safe mode does not necessarily yield safe results when using extensions to Parsedown. Extensions should be evaluated on their own to determine their specific safety against XSS.
### Escaping HTML
> ⚠️ **WARNING:** This method isn't safe from XSS!
If you wish to escape HTML **in trusted input**, you can use the following:
```php
$parsedown = new Parsedown;
$parsedown->setMarkupEscaped(true);
```
Beware that this still allows users to insert unsafe scripting vectors, such as links like `[xss](javascript:alert%281%29)`.
### Questions
**How does Parsedown work?**<br/>
Parsedown recognises that the Markdown syntax is optimised for humans so it tries to read like one. It goes through text line by line. It looks at how lines start to identify blocks. It looks for special characters to identify inline elements.
**How does Parsedown work?**
**Why doesn’t Parsedown use namespaces?**<br/>
Using namespaces would mean dropping support for PHP 5.2. We believe that since Parsedown is a single class with an uncommon name, making this trade wouldn't be worth it.
It tries to read Markdown like a human. First, it looks at the lines. It’s interested in how the lines start. This helps it recognise blocks. It knows, for example, that if a line starts with a `-` then perhaps it belongs to a list. Once it recognises the blocks, it continues to the content. As it reads, it watches out for special characters. This helps it recognise inline elements (or inlines).
**Is Parsedown compliant with CommonMark?**<br/>
We are [working on it](https://github.com/erusev/parsedown/tree/commonmark).
We call this approach "line based". We believe that Parsedown is the first Markdown parser to use it. Since the release of Parsedown, other developers have used the same approach to develop other Markdown parsers in PHP and in other languages.
**Who uses Parsedown?**<br/>
[phpDocumentor](http://www.phpdoc.org/), [October CMS](http://octobercms.com/), [Bolt CMS](http://bolt.cm/), [RaspberryPi.org](http://www.raspberrypi.org/) and [more](https://www.versioneye.com/php/erusev:parsedown/references).
**Is it compliant with CommonMark?**
It passes most of the CommonMark tests. Most of the tests that don't pass deal with cases that are quite uncommon. Still, as CommonMark matures, compliance should improve.
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.