Allow .html instead of .md for pages and sections

A week and a half ago we deployed our new website at https://www.fastmail.com, and it’s using Zola. (The /help pages are not currently in Zola, but will be at some point not too distant.)

I personally only set the Zola project up at the start and did a few bits of content here and there towards the end, like the signup page, but I observed a regular annoyance: Markdown only works well for pages that are predominantly prose with no special formatting. On pages that need to be almost exclusively fancy formatting, Markdown is really annoying, because it loves to mangle the arbitrary HTML that you’re writing. Most specifically, you must eschew either the conventional indentation of your HTML or blank lines, because if you use both, your indented HTML following a blank line will be treated as a code block instead. (Our old site also suffered from this nuisance, as content was all fed through Markdown.pl, which handles this situation a little differently; I just now found our DPA page ruined by this incompatibility between Markdown.pl and CommonMark, which I’ll deploy a fix for on Monday.)
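To make the problem concrete, here is a minimal sketch of the kind of markup that trips it. The element and class names are made up for illustration; the behaviour described in the comments follows from the CommonMark rules for HTML blocks and indented code blocks.

```html
<!-- The HTML block that starts here ends at the first blank line. -->
<section class="plans">
    <h2>Plans</h2>

    <p>After the blank line, this line is indented four spaces.</p>
</section>
<!-- The <p> line above is therefore treated as an indented code block
     and ends up escaped inside <pre><code> rather than as a paragraph. -->
```

Removing either the blank line or the indentation avoids the code block, which is exactly the trade-off described above.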

It seems to me that, at the user level, there is a simple and obvious solution for this pain: allow pages and sections to use the extension .html instead of .md, and disable Markdown processing in that case.

A couple of alternatives available to users are: making a template for each such page, and having the pages be empty stubs; or having the HTML sit beside the page Markdown file and loading it with load_data. Either way, you now have two files to care about, possibly in different places, and it’s just not neat. They’re both clearly workarounds rather than solutions. (I am also assuming in_search_index = false on these pages, because that’s our case; if that is not your case, think through the implications yourself.)
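For reference, the load_data workaround could look roughly like the sketch below. The template name, the file path, and the existence of a base.html with a content block are all assumptions for illustration, and how load_data resolves paths has varied between Zola versions, so check the documentation for the version in use.

```html
{# templates/signup.html (hypothetical): chosen by the stub page via
   template = "signup.html" in its front matter #}
{% extends "base.html" %}
{% block content %}
{# Read the sibling HTML file as plain text; the path is an assumption. #}
{% set body = load_data(path="content/signup/body.html", format="plain") %}
{# Mark the loaded markup as safe so Tera does not escape it. #}
{{ body | safe }}
{% endblock content %}
```

This shows the two-file problem directly: the stub page, the template, and the HTML body all have to be kept in step.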

I have not looked at the implementation ramifications of such a change. I imagine that there should be no substantial problem; for pages, there are already multiple candidates, foo.md and foo/index.md, and this is essentially just adding two more; sections I cannot speak for.

A variant that would probably be easier to implement, but which I don’t much like, would be allowing markdown = false in the frontmatter to disable Markdown interpretation, yielding a .md file that is actually HTML rather than Markdown.

I ran into the same issues with sites I’ve made in SSG before.

My preference is one of the options you mentioned, having a template per page. I’m not comfortable with loading .html as pages/sections as it would essentially be 2 completely different models (no front-matter for .html), causing issues when iterating sections/pages. It would also mean that templates would be in 2 different directories that would need to be watched.

Why? The front-matter is not part of Markdown, it’s a Zola thing.

I’m not suggesting that these HTML files be templates: just that by changing from .md to .html, Markdown processing will not occur.

I see, but people would still expect to be able to use Tera constructs in it, right? It doesn’t look like a worthwhile improvement to me over setting template and having all the templates in one place.

These are not templates. This is a matter of content that happens to need a lot of manually-written markup.

Perhaps the clearest way for me to express why this should be content and not templates is searching. (Ideally search operates upon source text with markup stripped, whether it be Markdown or HTML.) If you shift such pages as choose to use HTML into templates, you can no longer search them.

I confess that sometimes I would like to be able to write a page as a template, but that’s definitely a different feature request from this.

I totally agree, but people are definitely going to ask for things like extending blocks. I’m not sure what the best way is there; are shortcodes not good enough?
If I had to pick, I would have a raw = true in the front-matter, equivalent to your markdown = false, just to be sure people don’t misinterpret that as being a template; the file would still be a .md.

I just came up with another approach for this: wrap everything in a noop shortcode, because that stuff doesn’t get treated as Markdown.

templates/shortcodes/raw.html:

{{ body | safe }}

Then, at the start of the page body, add {% raw() %}, and at the end, {% end %}.
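As a sketch, a page body wrapped this way might look like the following (front matter omitted; the markup itself is made up for illustration):

```html
{% raw() %}
<section class="hero">
    <h1>Sign up</h1>

    <p>This line is indented and follows a blank line.</p>
</section>
{% end %}
```

The expectation, per the approach described above, is that the shortcode body is passed through untouched instead of the indented line becoming a code block.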

I don’t like this solution, partly because it confuses syntax highlighters that think they’re still dealing with Markdown because it’s a .md file, and partly because it’ll be confusing to other authors too. But it’s what I’m going with on my own website now, where I have concluded that I really don’t mind writing in HTML, and doing so saves me a surprising amount of trouble, mostly caused by indentation following blank lines. (My own currently-deployed website uses Hyde, where pages are full Jinja2 templates; there, I wrote everything in HTML with a few conveniences courtesy of Jinja2 extensions, and didn’t use Markdown at all.)

@keats Markdown is very limiting for anything but the simplest of pages. There is no way of declaring HTML5 tags and ids / classes which is a complete deal breaker.

Writing the entire page in “shortcodes” seems more of a hack, rather than a solution.

But you can just set your page to have a specific template and write your page in it; that’s what I do if a page needs to be completely custom. This way you can also use your template macros

I agree with @tafale here though, this feels like a hack. You would basically be writing a post in the templates/ directory which is strange, no?

What is so bad about using tera templating in a post if you need it? If you need different tera code for different posts, you’ll be polluting the templates/shortcodes dir all the time when this should really be contained in content?

Wouldn’t it be much cleaner if the entire content of a single post was inside a single .html file?

What is so bad about using tera templating in a post if you need it?

That would clash with shortcodes, since they use the same syntax. I don’t really see what the difference would be between plain HTML and Tera in that instance: you don’t have variables to use, and you wouldn’t be able to use macros or anything defined in your templates folder without a lot of hacking with the Zola Tera instance, and no shortcodes. Also, as mentioned earlier in the topic, those wouldn’t be templates anyway, just plain HTML files.

If you don’t want markdown processing, you can just wrap the content in a <div>, a no-op shortcode or a new raw=true attribute in the front-matter

Yeah, I see. OK, I was comparing it to Jekyll, where you can use .html directly with Liquid inside, but all the use cases I can think of would be covered by shortcodes.

Maybe you could have a different type of template which takes:

  • No input
  • Non-md input

and produces some other output format. This way we can generate HTML, XML, JSON, or media files as needed.

In addition some form of scripting support would be welcome.

If you can specify in the front matter which template to apply, then the template can produce the needed HTML? I guess this is how it is achieved in Hugo.

Was this issue ever solved? I have a few existing pages that are currently written in HTML that I’d like to just include verbatim as pages in my blog. I don’t want to put them in the templates directory because they aren’t templates.

If you don’t want markdown processing, you can just wrap the content in a <div>, a no-op shortcode or a new raw=true attribute in the front-matter.

The idea is that you put your HTML content in a .md file and include the front-matter. When testing this, I also found that a suitable template was needed. Let me show you my preferred option, using raw = true.

+++
title = "Raw HTML in content/ folder"
template = "bare.html"
raw = true
+++
<h1>Go to town with HTML</h1>
<p>This is the entire &lt;body&gt;, it uses its own template <code>bare.html</code></p>

bare.html is just this:

{% extends "base.html" %}
{% block content %}
{{ page.content | safe }}
{% endblock content %}

where base.html is another template to include all the usual HTML stuff, for example

<!DOCTYPE html>
<html>

<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width,initial-scale=1">
    <title>My Awesome Website</title>
    <link rel="stylesheet" href="stylesheet.css" type="text/css">
</head>

<body>
    {% block content %} {% endblock %}
</body>

</html>

Hi there, I am just getting started with Zola. I have tried this approach, but sadly my HTML renders as <code>. :confused:
raw=true or not does not seem to make any difference. Do I need to do something special for it to work? I am on 0.15.2.

Actually confusingly it starts properly, but then starts falling apart at the first <h1>
[Screenshot from 2021-12-21 10:32:33]

If you have random code blocks being inserted, it’s likely that you have some 4-space indentation in your Markdown, and Markdown treats that as a code block.

Oh, I didn’t know that. Thanks. It’s indeed four spaces. Moving blocks to the left fixed them. A bit unhandy; I will try to work my way around this, but +1 for proper HTML file support, please. :confused:

Edit: after deleting empty lines, the HTML that follows is parsed properly. That’s why the top part looked fine in my screenshot, as it’s indented by 4 spaces there as well. :rocket:
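The two fixes from this exchange can be sketched as follows. The markup is made up for illustration, and the comments describe what CommonMark's block rules imply rather than anything specific to Zola.

```html
<!-- Fix 1: keep the blank line, but move the markup flush left. -->
<section>
<h1>Heading</h1>

<p>No leading spaces, so this is not an indented code block.</p>
</section>

<!-- Fix 2: keep the indentation, but delete the blank line. -->
<section>
    <h1>Heading</h1>
    <p>No blank line above, so this stays inside the same HTML block.</p>
</section>
```

Either way, the combination to avoid is a blank line followed by a line indented four or more spaces.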

It seems the “new raw=true attribute” discussed here is not actually implemented. At least, it’s not a documented option:

I have an old Jekyll blog with dozens of “.html” posts and am considering moving to Zola, but I have to figure out what to do with all those pages.
