Migrating from another blog engine: zola could be more lenient with the frontmatter, especially around taxonomies

jpcaruana · January 6, 2023, 2:01pm

Hi!
I don’t know if anyone has faced this issue (or is it a feature request?). Zola behaves very differently from other blog engines (Jekyll and Hugo, to name the ones I know well).

My front matter looks like this in my (over 1000) md files:

+++
date = 2020-09-08T16:18:51+02:00
title = "Lire le cycle de l'Assassin Royal, c'est compliqué"
tags = ["livre"]
copy = "https://twitter.com/jpcaruana/status/1303356472705921026"
really_any_key_I_want = "any value"
+++

But with Zola I have to introduce the [taxonomies] and [extra] tables, which seems unnecessary (and will prevent me from transitioning to Zola, as I’d have to update over 1000 front matters):

+++
date = 2020-09-08T16:18:51+02:00
title = "Lire le cycle de l'Assassin Royal, c'est compliqué"
[taxonomies]
tags = ["livre"]
[extra]
copy = "https://twitter.com/jpcaruana/status/1303356472705921026"
really_any_key_I_want = "any value"
+++

Did I miss something here? If not, why is zola so different here when compared to jekyll or hugo?

Thanks

keats · January 6, 2023, 4:00pm

It just happened initially to keep things simple but overall I would keep it that way. It’s cleaner imo to look at and easier internally in the codebase: everything is typed (except for the extra section obviously) immediately and you get the errors directly from parsing rather than zola trying to guess what it is. I’ve seen some sites with 100s of taxonomies, including common words, that were sometimes reused in extra as keys. With Jekyll/Hugo you would need to rename the extra field.

jpcaruana · January 6, 2023, 4:09pm

I understand
It makes migrating bigger code bases more challenging: I can’t imagine myself browsing over all my “hugo” content to fix the frontmatter.

I don’t know if there is some tool/script that would help me in doing that?

$ find content/ -name "*.md" | wc -l
    2638

dustinknopoff · January 6, 2023, 8:03pm

This is a simplified version of a Deno script I’ve written which edits in place, moving keys into the extra and taxonomies tables

https://gist.github.com/dustinknopoff/0913e25d059f111f57045c904de25980

migrateToTaxonomies.ts

import {
    extract,
    test,
} from "https://deno.land/std@0.170.0/encoding/front_matter/any.ts";
import { walk } from "https://deno.land/std@0.170.0/fs/mod.ts";
import { stringify } from "npm:yaml@2.1.3"

async function writeFile(path: string, attrs: { [key: string]: any }, body: string) {
    await Deno.writeTextFile(path, `---\n${stringify(attrs)}\n---\n\n${body}`)
}

This file has been truncated.

jpcaruana · January 7, 2023, 3:10pm

Great @dustinknopoff ! Thank you so much for sharing.

Sadly, it does not detect TOML front matter delimited by +++ rather than --- (as described in Front Matter | Hugo), and I don’t see where I could change your script to make it read between +++ (it is easy to make it write +++ again).

dustinknopoff · January 7, 2023, 3:50pm

I don’t know if you saw, but I commented an adjustment to that script which should work

jpcaruana · January 7, 2023, 4:17pm

Thank you, I will give it a try (even if I now prefer TOML over YAML for frontmatter, for no reason)

adrianboston · January 17, 2023, 12:41am

a solution is to use sed with find

it will be along the lines of the following:

find . -name "*.md" -exec sed -i.saved 's/tags =/[taxonomies]\ntags =/' {} \;

find . -name "*.md" -exec sed -i.saved 's/copy =/[extra]\ncopy =/' {} \;

I would run this only in a backed-up, git-tracked, or copied folder to ensure it doesn’t corrupt the originals, even though sed -i saves a copy with the saved extension.

sed does the new line using \n

jpcaruana · January 25, 2023, 3:27pm

Hi @adrianboston

Thanks for the hint (and sorry for the late reply). I tried it, but my front matter is not ordered (for Hugo it is only a flat key/value context), so this simple sed command is not enough.

adrianboston · January 25, 2023, 10:02pm

This is a basic substitution jitsu. All frontmatter has to be in the exact order/style/format you posted.

Basically it places the string [taxonomies] before the string tags and [extra] before copy.

If it is as posted then it should work depending on your system OS.

One could do a more complex regex, but you would need to post further examples. Are you still even interested in swapping to Zola? As a hint, I’ve had no problem with Zola. I would never use any of the Ruby, Python, or Node based ones that fill your machine with gems and perls.

jpcaruana · January 27, 2023, 10:54am

I am!

Sorry, I didn’t get what you mean here.

keats · January 27, 2023, 12:57pm

I would write a script (Python) that goes through your pages, extracts the front matter with a regex, and loads it using a TOML library. You would have to hardcode the basic fields (date, title, etc.) and the taxonomies in your script; everything else goes to extra. Then it can just regenerate the page with the new front matter.

jpcaruana · January 27, 2023, 1:08pm

OK, that’s what was on my mind too.

adrianboston · January 27, 2023, 7:55pm

try this which puts [taxonomies] before tags and [extra] after it. of course the assumption is that tags is the last entry prior to the non standard frontmatter.

sed -E 's/(tags .+)/[taxonomies]\n\1\n[extra]/' content/test.md

just copy paste your example above into a file called test.md. if it works as expected on the output then you can go about saving files.

im on macos. not sure what you are running.

Try this site out, it’s great for testing regex.

jpcaruana · January 28, 2023, 10:53am

I ended up writing a Python script: jpcaruana/frontmatter_switcher on GitHub. It works great for me. I had over 2500 md files to convert, so it had to do everything at once.
Thank you for your help.

adrianboston · January 29, 2023, 10:45pm

ok

BobRocke · April 1, 2025, 12:25pm

I’m coming into this conversation pretty late, and I’m a little bit sorry. But I’m glad I found it. The portability of markdown files has been on my mind lately as I consider switching my blog platform yet again (from Hugo to maybe Zola, this time).

Using scripts to convert ‘standard’ front matter to Zola front matter and potentially back to ‘standard’ format one day in the future has really cooled my jets.

A year later, have better solutions been developed?
