Issue #6930 open

Support some or all HTML in Markdown (BB-6931)

SynapDx Inc
created an issue

Markdown doc says embedded HTML should work; it doesn't. This would be a huge workaround to the inordinate number of missing features in Markdown.

Comments (64)

  1. Christopher Keele

    While there is no Markdown spec, inline HTML is supported by most markdown implementations. Notably, Github. Comparisons will be made.

    I bring this up because no markdown implementation supports some kind of anchor construct outside of hyperlinks. This makes it impossible to link to particular segments of a README or add a nice internally linked table of contents without adding your own raw HTML anchor tags. My READMEs kind of fell apart after migrating to Bitbucket, because it doesn't support raw tags.

    Daring Fireball mentions this feature pretty explicitly. It's not exactly a spec, but it's the main implementation documentation for the language.

  2. trgn

    I agree with the above:

    "My READMEs kind of fell apart after migrating to Bitbucket, because it doesn't support raw tags"

    I had the same thing happening after migrating from github. Support for this would be a nice improvement.

  3. trgn

    FWIW:

    maybe this relates more to #6322 which is only about the anchor links. But that was closed as duplicate of this one.

    The headers in the HTML generated from the markdown get an id-property.

    Bitbucket seems to prepend "markdown-header" to the text of the header, strip all special characters (./#), transform to lower case, and replace whitespace with a dash.

    So as a work-around, I hardcode these IDs as my anchor links

    e.g. my markdown looks like this.

    The forward anchor link

    1. **[Link to Header](#markdown-header-1-header)**: some 
    

    Somewhere else in the document, the actual anchor.

    ##  1. Header
    

    The header tag in the generated html looks something like this.

    <h2 id="markdown-header-1-header">1. Header</h2>
    

    The anchor link will work correctly. It's hacky, I know, and presumably only works because of Bitbucket's particular HTML generator, but at least it gets the anchor links working. It's probably not the most robust, forward-compatible solution though.
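The ID rule described in the comment above can be sketched in a few lines of Python. This is only a guess at the behaviour as the commenter observed it, not Bitbucket's actual code; the function name `bitbucket_header_id` is hypothetical, and the exact set of characters that gets stripped is an assumption.

```python
import re


def bitbucket_header_id(header_text: str) -> str:
    """Approximate the header id described above (assumed, not official)."""
    # Strip "special" characters such as . / # (assumption: anything that is
    # not a word character, whitespace, or a hyphen is dropped).
    cleaned = re.sub(r"[^\w\s-]", "", header_text)
    # Lower-case the text and replace each run of whitespace with one dash.
    slug = re.sub(r"\s+", "-", cleaned.strip().lower())
    # Prepend the prefix seen in the generated HTML.
    return f"markdown-header-{slug}"


print(bitbucket_header_id("1. Header"))
```

For the header in the comment, this yields the same id that appears in the generated `<h2>` tag, so the link target would be `#markdown-header-1-header`.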

  4. jeduden

    Daring Fireball explicitly says in the Spec:

    For any markup that is not covered by Markdown’s syntax, 
    you simply use HTML itself. 
    There’s no need to preface it or delimit it to indicate 
    that you’re switching from Markdown to HTML; 
    you just use the tags.
    

    Even if you decide not to include the feature, it would be good to have the differences documented somewhere.

  5. SynapDx Inc reporter

    For folks trying to create a table of contents note that putting [TOC] in the document will do this automatically. This doesn't eliminate the desirability of embedded HTML (e.g. for better list control) but it does solve one important use case.

  6. Jesper Öqvist

    It would be great if embedded HTML could be supported by bitbucket. It would be very useful together with the Wiki feature. How can we make Atlassian change their mind on this?

  7. Kirill Muzykov

    Not sure this is a duplicate. First of all, the duplicate issue is about supporting a "small subset" of HTML and the "strike" tag in particular, while this issue has a wider scope. Secondly, this issue has more votes. I would reverse the duplicate relationship between those two issues.

  8. Gabriel Schubiner

    I agree, Kirill Muzykov

    Also, as a note on only supporting a subset of HTML tags, I use emacs org-mode for my documentation, which can export to markdown, but assumes a full set of HTML tags are available. It would be incredibly helpful to have the full set of tags available, since this is the assumption in general, and it would make bitbucket compatible with any number of tools that can export to markdown.

  9. Christopher Keele

    I'd concur with Kirill Muzykov. The markdown spec encourages full access to HTML; obviously this is a tall order if you're compiling other peoples' arbitrary markdown the way Bitbucket is. There has to be a blacklist/whitelist. On the other hand, it's not clear which tags people on this issue want supported, but it's a lot more than <strike>.

    Perhaps we can start to list the tags we'd like to have access to, so that they have a starting point? I, for one, would love full access to the anchor element, with internal linking, and all tags I'm likely to embed inside a link (as markdown won't process markdown inside raw HTML).

    Alternatively, what's a reasonable blacklist of HTML elements?

  10. Gabriel Schubiner

    I'm not sure why there has to be a blacklist.. I understand it may not be an insignificant amount of work to implement, but there could be an option to use a library like PageDown for rendering.

    I don't want to be presumptuous, since I don't know how Bitbucket's backend for rendering markdown currently works, but they are already rendering the markdown to HTML. Just leave the HTML in markdown files alone, and send it to the browser to render. It undoubtedly takes more than just that to integrate HTML in the current rendering engine, but I can't imagine there isn't a generic solution to recognizing and embedding HTML.

    The only tags I can think that should be blacklisted are <script> and other tags that bring security concerns.

  11. Christopher Case

    BitBucket (as mentioned in #4412) uses Python Markdown for its markdown rendering. In the documentation (specifically the part about safe_mode), the authors comment:

    See Also: HTML sanitizers (like Bleach) may provide a better solution for dealing with markdown text submitted by untrusted users. That way, both the HTML generated by Markdown and user submitted raw HTML are fully sanitized.

    There's no reason to have the community whitelist/blacklist tags, when a library (recommended by the authors of the currently used markdown library) exists to do exactly what we want.
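To make the sanitizing idea concrete, here is a minimal sketch of a whitelist pass over rendered HTML. It deliberately uses only Python's standard `html.parser` module rather than Bleach, so it is a stand-in for the technique and not the library the comment recommends. The tag and attribute lists, the class name `WhitelistSanitizer`, and the `sanitize` helper are all hypothetical, and nothing here has been security reviewed.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist, loosely based on tags requested in this thread.
ALLOWED_TAGS = {"a", "b", "i", "br", "sub", "sup", "strike", "pre", "code"}
ALLOWED_ATTRS = {"a": {"href", "name", "id"}}
VOID_TAGS = {"br"}  # tags that never get a closing tag
SAFE_HREF_PREFIXES = ("#", "http://", "https://")


class WhitelistSanitizer(HTMLParser):
    """Re-emit only whitelisted tags and attributes; escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # drop the tag itself; its text content is kept as text
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, ()):
                continue
            # Only allow link targets with an explicitly safe prefix.
            if name == "href" and not value.startswith(SAFE_HREF_PREFIXES):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.parts.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is always escaped, so stray markup cannot survive as markup.
        self.parts.append(escape(data, quote=False))


def sanitize(html_text: str) -> str:
    parser = WhitelistSanitizer()
    parser.feed(html_text)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.parts)


print(sanitize('H<sub>2</sub>O <script>alert(1)</script>'))
print(sanitize('<a href="#top" onclick="x()">top</a>'))
```

A real deployment would more likely rely on a maintained sanitizer, as the quoted documentation suggests; the sketch only shows that the whitelist decision can live in one place instead of being negotiated tag by tag.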

  12. Christopher Case

    Do the developers have any feelings about solutions such as Bleach? I would assume it would need to go through some sort of security audit before it could be used. What I'm more interested in knowing, I guess, is whether a sanitization solution like that would meet security requirements, or whether a whitelist-like solution is the only acceptable route.

    As a side note, I completely understand the tradeoffs that need to be made with regards to security. Frankly, I'm just excited this issue is getting attention. (And, I'll completely accept a whitelist if that's what's mandated.) It would just be nice to know the context to have a useful discussion within.

  13. Bogdan Vatula

    I wanted to add an embedded youtube video / a video tag to a video in a bitbucket repository (simple how-to video). But without inline html support it is not possible in markdown.

  14. Theodore Brown

    I have copyright symbols in my READMEs (&copy;), and it really annoys me that they appear literally, rather than as the intended symbol the way other markdown parsers display them.

  15. Ændrew Rininsland

    Strike. Please, add strike. In Atlassian world, people who file issues may be smart enough to create separate issues, but us in the real world frequently have to deal with folks who try to cram a list of 10 points into a single issue, and short of taking half an hour to separate all those out into separate issues, using strikeout is the best way of depicting which points have been addressed.

    tl;dr -- not having strike imposes a somewhat-serious usability issue for some users.

  16. Paul Rupe

    <pre>, <code> tags with the ability to use formatting like <b>, <i>, etc. within them.

    Say I have /path1/path2/path3/path4/file.ext maybe by itself or as part of a larger code block, and I want to emphasize the path4/file.ext part. I don't see any way to do this with the existing wiki languages.

    /path1/path2/path3/**path4/file.ext**
    /path1/path2/path3/<b>path4/file.ext</b>

  17. Christopher Case

    Honestly, I don't see a sane way to do that. What if you're writing C code?

    char **foobar;

    How is the wiki syntax supposed to differentiate between an un-closed bold syntax, or a pointer to a pointer? Using the html tags makes it even worse; you can never put html code in a code block if you do that.

    Honestly, I'll just be happy with strikeout.
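The ambiguity raised in the comment above can be shown with a deliberately naive bold rule. This is an illustrative sketch only: the regular expression and the function name `naive_bold` are hypothetical and are not how any real Markdown parser handles emphasis.

```python
import re


def naive_bold(text: str) -> str:
    # Treat every **...** pair as bold, with no notion of code context.
    return re.sub(r"\*\*(.+?)\*\*", r"<b>\1</b>", text)


# The path example from earlier in the thread works as hoped...
print(naive_bold("/path1/path2/path3/**path4/file.ext**"))
# ...but a C declaration with two pointer-to-pointer names is mangled.
print(naive_bold("char **argv, **envp;"))
```

The second call turns the two `**` markers of the declaration into a bold span, which is the kind of misreading that makes emphasis inside code blocks hard to support.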

  18. Christopher Keele

    Paul Rupe Markdown has never supported extra emphasis tags inside code blocks. If you think about it, that's entirely counterintuitive to the purpose of <pre>.

    This issue isn't about inventing a new markup syntax. It's also not asking Bitbucket to support a single new tag, such as <strike>. It's about the existing markdown parser, which fails to comply with a well-defined spec where all of these considerations are already addressed:

    For any markup that is not covered by Markdown’s syntax, you simply use HTML itself. There’s no need to preface it or delimit it to indicate that you’re switching from Markdown to HTML; you just use the tags.

    The only restrictions are that block-level HTML elements — e.g. <div>, <table>, <pre>, <p>, etc. — must be separated from surrounding content by blank lines, and the start and end tags of the block should not be indented with tabs or spaces.

    - Daring Fireball

  19. Alexander Lukanin

    I agree with Paul Rupe, it would be nice to allow HTML <pre><code> with embedded <b> and <i>. Sometimes you need to provide an example of working in shell and highlight user input. With indented code block it's not possible by design, so using <pre> is the only option.

  20. stylig

    I am using inline SVG in my markdown… well, I was using it, as it seems now. :> What I like is the approach some markdown processors take of allowing the author to use any kind of HTML within a <div> or <span> tag. If you fear the security risks, maybe it's an option to allow it at least for private repositories.

  21. Adrien Tétar

    Bitbucket, you are violating the gist of Markdown by not supporting this &ndash; Markdown is not a subset of HTML, it is HTML with syntax sugar for common notation patterns.

    You won't compete with GitHub if you don't resolve issues like this.

  22. Ændrew Rininsland

    The glacial pace at which Atlassian is working to resolve this is more than a little bit unsettling... The new ~~ and ++ symbols should be removed and HTML allowed. That's really all there is to it. I now regret advocating for something as simple as strikeout earlier on, didn't realise it would create such a completely-ignored regressive issue...

  23. Patrick O'Keeffe

    Please, please, please, just disable 'safe_mode' and use a proper tool like Bleach for HTML sanitization, as the author of Python-Markdown suggests. I can understand the need to sanitize, but the inability to add simple entities like copyright and trademark symbols is baffling to a lot of users. Some might go so far as to say the situation is "stupid."

    Alternatively, switch to a superior version of markdown, like Github-Flavored-Markdown. Sure, it's Github, yeah yeah, but it is a superior variant.

  24. Michael Lippert

    I understand the need to sanitize the html for security reasons, but I see other comments (such as that from Patrick O'Keeffe) on how to do that.

    Personally, I got here when I found I couldn't write 10<sup>n</sup> and H<sub>2</sub>O. (The first is my real case; the second is an example.)

  25. Pat Sissons

    I was a bit shocked to find out that my <br/>'s were not being rendered as actual line breaks. Even more shocked to find out that it is actually not possible to insert line breaks into a table. The funny thing here is that a whitelist would take any normal dev team, even for a massive project like Bitbucket, only a few weeks at most to implement. Years later, everyone is still scratching their heads over why various simple HTML tags are not supported.

  26. Christopher Keele

    When this issue was created, I don't think that CommonMark was a thing, yet. If I were re-evaluating the Markdown parsing of any major project today, I'd embrace it as the most future-proof, inter-operable choice of technology. There's a Python implementation, and as a plus it eliminates the burden of maintenance and questions of feature-support from the organization using it by deferring to a community standard.

  27. Timm Wagener

    Just signed up with Bitbucket for the nice opportunity to have private repos. I also imported all my public repos from GitHub, but all my carefully crafted README files fell apart big time, since I rely entirely on HTML in there. I feel like a beautiful README is the inviting entry point to your repo and therefore not unimportant, at least for public repos. HTML often means the difference between merely neutral and informative (without) and designed and inviting (with). Would be great to see it enabled here.
