Migrating from another blog engine: zola could be more lenient with the frontmatter, especially around taxonomies

Hi!
I don’t know if anyone else has faced this issue (or is it a feature request?). Zola behaves very differently from the other blog engines (Jekyll and Hugo, to name the ones I know well).

My front matter looks like this in my (over 1000) md files:

+++
date = 2020-09-08T16:18:51+02:00
title = "Lire le cycle de l'Assassin Royal, c'est compliqué"
tags = ["livre"]
copy = "https://twitter.com/jpcaruana/status/1303356472705921026"
really_any_key_I_want = "any value"
+++

But Zola requires me to introduce the [taxonomies] and [extra] tables, which seems unnecessary (and will prevent me from transitioning to Zola, as I’d have to update over 1000 front matters):

+++
date = 2020-09-08T16:18:51+02:00
title = "Lire le cycle de l'Assassin Royal, c'est compliqué"
[taxonomies]
tags = ["livre"]
[extra]
copy = "https://twitter.com/jpcaruana/status/1303356472705921026"
really_any_key_I_want = "any value"
+++

Did I miss something here? If not, why is zola so different here when compared to jekyll or hugo?

Thanks

It just happened initially to keep things simple but overall I would keep it that way. It’s cleaner imo to look at and easier internally in the codebase: everything is typed (except for the extra section obviously) immediately and you get the errors directly from parsing rather than zola trying to guess what it is. I’ve seen some sites with 100s of taxonomies, including common words, that were sometimes reused in extra as keys. With Jekyll/Hugo you would need to rename the extra field.

I understand
It makes migrating bigger code bases more challenging: I can’t imagine myself browsing over all my “hugo” content to fix the frontmatter.

I don’t know if there is some tool/script that would help me in doing that?

$ find content/ -name "*.md" | wc -l
    2638

This is a simplified version of a Deno script I’ve written which edits in place, moving keys into the extra and taxonomies tables


Great @dustinknopoff ! Thank you so much for sharing.

Sadly it does not detect TOML frontmatter contained within +++ and not --- (as described in Front Matter |Hugo), and I don’t see where I could change your script to make it read between +++ (easy to make it write +++ again)
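For anyone adapting a script to this, the delimiter check itself is small. Here is a rough Python sketch (not the Deno script above) that looks at the first line of a file to tell TOML front matter (+++) from YAML front matter (---). The function name `front_matter_format` is made up for this illustration.

```python
def front_matter_format(text):
    """Guess the front matter format from the opening delimiter line.

    Returns "toml" for +++, "yaml" for ---, and None for anything else.
    Hypothetical helper, written only to illustrate the check.
    """
    # Drop a possible byte-order mark, then look at the first line only.
    first_line = text.lstrip("\ufeff").split("\n", 1)[0].strip()
    if first_line == "+++":
        return "toml"
    if first_line == "---":
        return "yaml"
    return None
```

A script can branch on the returned value to decide which closing delimiter to search for, and which one to write back.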

I don’t know if you saw, but I commented an adjustment to that script which should work

Thank you, I will give it a try (even if I now prefer TOML over YAML for frontmatter, for no reason)

a solution is to use sed with find

it will be along the lines of the following:

find . -name "*.md" -exec sed -i.saved 's/tags =/[taxonomies]\ntags =/' {} \;

find . -name "*.md" -exec sed -i.saved 's/copy =/[extra]\ncopy =/' {} \;

I would run this only in a backed-up, git-tracked, or copied folder to ensure it doesn’t corrupt the originals, even though sed -i saves a copy with the saved extension.

sed does the new line using \n

Hi @adrianboston

Thanks for the hint (and sorry for the late reply). I tried it, but as my front matter is not ordered (because for Hugo it is only a set of key/value pairs), this simple sed command is not enough :frowning:

This is a basic substitution jitsu. All frontmatter has to be in the exact order/style/format you posted.

Basically, it places the string [taxonomies] before the string tags and [extra] before copy.

If it is as posted then it should work depending on your system OS.

One could write a more complex regex, but you would need to post further examples. Are you still even interested in swapping to Zola? As a hint, I’ve had no problems with Zola. I would never use any of the Ruby-, Python-, or Node-based ones that fill your machine with gems and perls.

I am! :slight_smile:

Sorry, I didn’t get what you mean here.

I would write a script (Python) that goes through your pages, extracts the front matter with a regex, and loads it using a TOML library. You would have to hardcode the basic fields (date, title, etc.) and the taxonomies in your script; everything else goes to extra. Then it can just regenerate the page with the new front matter.
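A minimal sketch of that idea, under some loud assumptions. Instead of loading the front matter with a TOML library, it regroups the raw key = value lines as text, so every value must fit on a single line. The key sets `PAGE_KEYS` and `TAXONOMY_KEYS` and the function names `convert` and `convert_tree` are illustrative choices, not taken from any script mentioned in this thread.

```python
import re
from pathlib import Path

# Assumed key sets: adjust them to the fields your own site uses.
PAGE_KEYS = {"title", "date", "description", "draft", "slug",
             "aliases", "weight", "template"}
TAXONOMY_KEYS = {"tags", "categories"}

# Front matter is the text between an opening and a closing +++ line.
FRONT_MATTER = re.compile(r"\A\+\+\+\n(.*?)\n\+\+\+[ \t]*(?:\n|\Z)", re.DOTALL)
KEY = re.compile(r"^\s*([A-Za-z0-9_-]+)\s*=")


def convert(text):
    """Return text with flat TOML front matter regrouped into
    top-level keys, a [taxonomies] table, and an [extra] table."""
    match = FRONT_MATTER.match(text)
    if match is None:
        return text  # no +++ front matter: leave the file untouched
    page, taxonomies, extra = [], [], []
    for line in match.group(1).splitlines():
        if line.lstrip().startswith("["):
            return text  # already contains tables: assume it was converted
        key = KEY.match(line)
        name = key.group(1) if key else None
        if name in TAXONOMY_KEYS:
            taxonomies.append(line)
        elif name is None or name in PAGE_KEYS:
            page.append(line)  # known fields, blank lines, comments
        else:
            extra.append(line)  # any other key goes to [extra]
    out = list(page)
    if taxonomies:
        out += ["[taxonomies]"] + taxonomies
    if extra:
        out += ["[extra]"] + extra
    return "+++\n" + "\n".join(out) + "\n+++\n" + text[match.end():]


def convert_tree(root):
    """Rewrite every .md file under root in place; try it on a copy first."""
    for path in Path(root).rglob("*.md"):
        original = path.read_text(encoding="utf-8")
        converted = convert(original)
        if converted != original:
            path.write_text(converted, encoding="utf-8")
```

Because key order does not matter here, this handles the unordered front matter that the sed one-liners cannot. Multi-line arrays or multi-line strings would need a real TOML parser instead of this line-based grouping.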

OK, that’s what was on my mind too :slight_smile:

try this which puts [taxonomies] before tags and [extra] after it. of course the assumption is that tags is the last entry prior to the non standard frontmatter.

sed -E 's/(tags .+)/[taxonomies]\n\1\n[extra]/' content/test.md

just copy paste your example above into a file called test.md. if it works as expected on the output then you can go about saving files.

im on macos. not sure what you are running.

Try this site out its great for testing regex


I ended up writing a Python script: GitHub - jpcaruana/frontmatter_switcher. Works great for me :slight_smile: I had over 2500 md files to convert, so it had to do everything at once.
Thank you for your help.


ok