Generating Pages At Build Time

Hello, I was wondering whether it is, or could be made, possible for Zola to generate pages at build time from a data source, such as one read with load_data. This would be incredibly useful for generating a relatively large number of pages with a small footprint. The specific use case I had in mind is rust-lang’s “governance” section, which lists members and alumni and is generated from a static JSON file that is itself generated from a git repository. I saw there was an issue <https://github.com/getzola/zola/issues/401> about generating pages from a headless CMS, but it didn’t seem to go anywhere.

It would be very tricky imo, since that would need to happen before building the site and would essentially turn Zola into a partial headless CMS, which is an awful lot of work.

Well, let me explain in more detail what kind of API I had in mind. I think it’s a lot smaller in scope than trying to be a CMS, but still powerful enough to be useful for most users. I’m still new to contributing to Zola, so I’m not the most familiar with it, but I do believe this should be possible as an initial or early step in building the site’s library.

What I would propose is adding a [children] key (name can be changed) to sections that would accept arguments similar to those of load_data, plus a template argument. Zola would attempt to load this data either as a map with child objects or as an array of objects, and then iterate over it and construct a page from each of those objects.

Inside each object Zola would expect a @zola key (name can be changed) that points to an object containing the front matter needed to generate the page: title, description, date, slug, etc. Every other key contained within that object would be placed into the extra variable map.

This would allow you to easily generate a list of pages from a single data set that is either stored locally or published remotely, with minimal and purely additive changes to the data. Requiring a @zola key means that Zola won’t be able to work with arbitrary data sets, but that’s fine; I wouldn’t expect it to be able to.

Section Reference

[children]
template: String;
# Either `path` or `url` has to be provided
path: String?;
url: String?;
format: String;
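
To make the reference above concrete, here is a minimal Python sketch of how the proposed table could be checked. Nothing like this exists in Zola; `validate_children` is a hypothetical helper, and it reads "either `path` or `url` has to be provided" as "at least one of them", since the proposal does not say whether both may be given.

```python
def validate_children(children: dict) -> dict:
    """Check a hypothetical [children] table against the reference above."""
    # `template` and `format` are the two non-optional strings in the proposal.
    for key in ("template", "format"):
        if not isinstance(children.get(key), str):
            raise ValueError(f"[children] requires a string `{key}`")

    # Assumption: "either path or url" means at least one must be present.
    sources = [children.get("path"), children.get("url")]
    if all(source is None for source in sources):
        raise ValueError("[children] needs `path` or `url`")
    if any(source is not None and not isinstance(source, str) for source in sources):
        raise ValueError("`path` and `url` must be strings when given")

    return children
```

A table with only `template` and `format` would be rejected, while one that adds `path = "./data.json"` would pass.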

Data Reference

Object

{
  // When passed an object the slug is inferred from the key.
  "key": {
    "@zola": Object,
    // Extra...
  }
}

Array

[
  {
    // When the data is passed as an array, "slug" is required in the metadata.
    "@zola": {
       "slug": "key",
    },
    // Extra...
  }
]
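
The two shapes above could be reduced to one internal form before any pages are built. The following Python sketch shows one way to do that; `normalize_entries` is a hypothetical name, and letting an explicit slug override the map key is an assumption the proposal does not settle.

```python
ZOLA_KEY = "@zola"  # key name from the proposal ("name can be changed")


def normalize_entries(data):
    """Return a list of (slug, front_matter, extra) tuples for either shape."""
    if isinstance(data, dict):
        # Object form: the key of each child is the inferred slug.
        items = list(data.items())
    elif isinstance(data, list):
        # Array form: there is no key, so the slug must come from the metadata.
        items = [(None, entry) for entry in data]
    else:
        raise TypeError("expected a map of objects or an array of objects")

    pages = []
    for key, entry in items:
        if ZOLA_KEY not in entry:
            raise ValueError(f"entry {key!r} has no {ZOLA_KEY} object")
        front_matter = dict(entry[ZOLA_KEY])
        if key is not None:
            # Assumption: an explicit slug in the metadata wins over the key.
            front_matter.setdefault("slug", key)
        if "slug" not in front_matter:
            raise ValueError("array entries must set a slug in their metadata")
        # Everything outside the metadata object goes to the extra map.
        extra = {k: v for k, v in entry.items() if k != ZOLA_KEY}
        pages.append((front_matter["slug"], front_matter, extra))
    return pages
```

Both forms then yield the same tuples, so the rest of a page builder would not need to care which shape the data file used.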

Section Example

+++
title = "Teams"

[children]
template = "child.html"
path = "./data.json"
url = "https://example.com/data.json"
format = "json"
+++

Data Example

{
  "compiler": {
    "@zola": {
      "title": "Compiler Team"
    },
    "members": ["jane", "mike", "laura"]
  }
}
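
For illustration, this is roughly the page the data example would turn into if the same mapping were done today by a script that runs before the build. It is a sketch only: `render_page` and `toml_value` are hypothetical helpers, values are limited to strings, numbers, booleans and lists, keys are assumed to be valid bare TOML keys, and JSON string quoting is reused because it is valid TOML for ordinary text.

```python
import json


def toml_value(value):
    """Format a JSON-like scalar or list as a TOML value (sketch only)."""
    if isinstance(value, (bool, int, float, str)):
        # JSON literals for these types are also valid TOML for ordinary input.
        return json.dumps(value, ensure_ascii=False)
    if isinstance(value, list):
        return "[" + ", ".join(toml_value(item) for item in value) + "]"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def render_page(entry, template):
    """Build the text of a page file from one entry of the data file."""
    front_matter = dict(entry["@zola"])
    # The template comes from the section's [children] table in the proposal.
    front_matter.setdefault("template", template)
    extra = {k: v for k, v in entry.items() if k != "@zola"}

    lines = ["+++"]
    lines += [f"{key} = {toml_value(value)}" for key, value in front_matter.items()]
    if extra:
        lines += ["", "[extra]"]
        lines += [f"{key} = {toml_value(value)}" for key, value in extra.items()]
    lines.append("+++")
    return "\n".join(lines) + "\n"


data = {
    "compiler": {
        "@zola": {"title": "Compiler Team"},
        "members": ["jane", "mike", "laura"],
    }
}
page = render_page(data["compiler"], "child.html")
```

Writing `page` to a file named after the key (for example compiler.md inside the section's directory) would give Zola an ordinary page whose template can read `page.extra.members`.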

I’m not sure, honestly. I do generate pages automatically for the Zola website for themes, and that wouldn’t cover my one and only use case (getting data from git submodules). Having to have a @zola key severely restricts its utility imo, since you need to edit the input, which you might not control. Ideally there would be some nice headless CMS features, but I think generating with some Python script or whatever in the meantime is better than a measure that only allows for some narrow use cases.

Could you expand on the submodules feature? This is also something I would be interested in, as there are members who have expressed interest in having the blog part of the website as a git submodule in Zola.

Well yes, I am making the assumption that you have some way to control the schema of the file, which I don’t think is an unreasonable constraint. I would find it complex for Zola to generate pages from purely arbitrary data.

The themes section of the docs is generated with a Python script in the getzola/themes repository: https://github.com/getzola/themes/blob/master/generate_docs.py
If I didn’t need to parse TOML for the themes, it would work without any 3rd-party libraries.

Just want to chime in that having a CSV file to replace a directory of markdown files would be very useful. I’ve got an idea for a static site where the data that powers the content could be a spreadsheet, and I would love to generate a page for each row. Perhaps a section could have a config option to mark that its content comes from a CSV/JSON file instead of markdown files?
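
Until something like that exists, the one-page-per-row idea can be approximated with a small script run before the build. This is a sketch under assumptions: `pages_from_csv` is a hypothetical helper, the spreadsheet is assumed to have `slug` and `title` columns, every other column is written as a string under [extra], and column names are assumed to be valid bare TOML keys.

```python
import csv
import io
import json


def pages_from_csv(csv_text, slug_column="slug", title_column="title"):
    """Map each CSV row to a page file name and its contents."""
    pages = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        slug = row.pop(slug_column)
        title = row.pop(title_column)
        lines = ["+++", f"title = {json.dumps(title, ensure_ascii=False)}", "", "[extra]"]
        # CSV cells are always strings, so every extra value is quoted.
        lines += [
            f"{column} = {json.dumps(cell, ensure_ascii=False)}"
            for column, cell in row.items()
        ]
        lines.append("+++")
        pages[f"{slug}.md"] = "\n".join(lines) + "\n"
    return pages


sheet = "slug,title,city\nalice,Alice,Paris\nbob,Bob,Oslo\n"
pages = pages_from_csv(sheet)
```

Each value in `pages` can be written into the section's directory under its key, after which Zola treats the rows like any hand-written markdown page.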