Hi, gigantic site guy here. I’ve been quiet for a while but zola has been happily churning away on our massive site. The site is up to 575,000 pages and it takes about 2 minutes for a fresh build, and I’m very happy with that performance. (No, it’s not an AI generated site, if anyone was wondering.)
Memory consumption is getting pretty high, though, topping 20 GB towards the end of the build. It’s fine on workstations, but pipelines with lower memory quotas are starting to be a problem.
What I’ve tried so far:
- Analyze memory usage. I tried heaptrack and massif. Both went OOM trying to measure the memory usage. I sort of got heaptrack working by saving raw allocations and writing a basic streaming analyzer, since heaptrack’s own analyzer (heaptrack_interpret) loads everything in bulk.
- Look for optimizations in tera. I found some enums in tera with large variants, and a Rust enum is the size of its largest variant. I tried boxing the larger variants, e.g. `Math(MathExpr)` becoming `Math(Box<MathExpr>)`. This reduced tera’s memory by about 44% in tera-only tests I did, but had very little effect on the total memory used in our zola build.
- Look for optimizations in zola. I think the main thing that would help is less `collect()` and more streaming, but overhauling the build architecture just to help me, probably the only zola user on earth with this problem, would be silly. I tried converting a few isolated areas to streaming, like `get_all_orphan_pages` (we have a LOT of orphans), but it didn’t move the needle.
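For anyone curious what the boxing change buys, here’s a self-contained sketch. `MathExpr` below is a made-up stand-in, not tera’s real type; the point is just that every enum variant pays for the largest one, and boxing the big variant collapses it to pointer size:

```rust
use std::mem::size_of;

// Hypothetical stand-in for a large AST node; tera's real MathExpr differs.
struct MathExpr {
    _data: [u64; 8], // pretend this is a big payload
}

// Every variant of this enum is as large as the biggest payload.
enum ExprUnboxed {
    Ident(u32),
    Math(MathExpr),
}

// Boxing the large variant shrinks the whole enum, at the cost of a
// heap allocation and an indirection when the Math variant is used.
enum ExprBoxed {
    Ident(u32),
    Math(Box<MathExpr>),
}

fn main() {
    println!("unboxed: {} bytes", size_of::<ExprUnboxed>());
    println!("boxed:   {} bytes", size_of::<ExprBoxed>());
    assert!(size_of::<ExprBoxed>() < size_of::<ExprUnboxed>());
}
```

The trade-off is one extra allocation per `Math` node, which is usually a good deal when those nodes are rare relative to the cheap variants.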
After trying that stuff, I’m here looking for advice.
Can you (keats et al) suggest areas of zola that I could look at optimizing?
Maybe a few tactical places that would benefit from replacing a `Vec<T>` with an `impl Iterator<Item = T>`? Or places where I could add a manual `drop()` for certain parts of the site tree once they are no longer needed (the raw file content, for example, could be freed after rendering)?
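To make the second idea concrete, here’s a minimal sketch of the “free the raw content after rendering” pattern. `Page` and `render` here are invented stand-ins, not zola’s actual structs, and the “rendering” is a placeholder; the point is replacing the raw buffer with an empty `String` (zero capacity) once it has been consumed:

```rust
// Hypothetical page type; zola's real Page struct is different.
struct Page {
    raw_content: String,     // the markdown source, potentially large
    rendered: Option<String>,
}

impl Page {
    fn render(&mut self) {
        // Placeholder for the real markdown + template rendering.
        self.rendered = Some(self.raw_content.to_uppercase());
        // Drop the raw source now that it's no longer needed.
        // Assigning String::new() frees the old buffer immediately.
        self.raw_content = String::new();
    }
}

fn main() {
    let mut p = Page {
        raw_content: "hello".into(),
        rendered: None,
    };
    p.render();
    assert_eq!(p.rendered.as_deref(), Some("HELLO"));
    assert_eq!(p.raw_content.capacity(), 0); // buffer was released
}
```

If the field is needed by type but not by value, `std::mem::take(&mut self.raw_content)` does the same swap while handing you the old buffer.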
Thanks for any pointers.