
Making a simple yet fast static site builder

A week ago, I came across a very interesting article by Roman Zolotarev about a little script called ssg. I very much liked the way Roman approached the problem.

As a system developer, I always appreciate seeing people make the effort to keep their tools, scripts, and binaries low on dependencies, disk space, and memory. Folks these days tend to forget that you don't need 2 GB of RAM to display a picture of a cat (looking at you, Slack!).

To be honest, I had never been interested in static site builders before Roman's post. I did see their utility, but I never needed one myself (though this blog is the 'living' proof that this is no longer true). So I have never used Jekyll, Hugo, or any other common site builder.

I was going to clone ssg and use it as my blog builder without even thinking about it, but then I read this about ssg's performance:

100 pps. On modern computers ssg generates a hundred pages per second. Half of a time for markdown rendering and another half for wrapping articles into the template. I heard good static site generators work—twice as fast—at 200 pps, so there's lots of performance that can be gained. ;)

I was astonished by this figure. A hundred pages per second is very slow for modern computing. Then again, ssg is written in shell, so there is a lot of fork(2) and dup2(2) involved, and I assume it was written for simplicity, not for speed. Still, Roman tells us that a 'good' static site generator runs at around 200 pps, which is still very slow. As a challenge, and out of personal curiosity, I tried to write a simple static site generator in C to see what kind of performance could be achieved.
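
To give an idea of where the time goes in a shell script, here is a rough sketch of what a shell has to do every time it captures the output of an external command. This is not code from ssg or from any particular shell; run_and_capture is a hypothetical helper written for illustration only. Each call pays for a pipe, a fork(2), a dup2(2), an exec and a wait, which adds up quickly when it happens several times per page.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with its arguments and capture its
 * stdout into buf. Returns the number of bytes captured, or -1 on error.
 * Output longer than cap - 1 bytes is treated as an error here. */
ssize_t run_and_capture(char *const argv[], char *buf, size_t cap)
{
    int fds[2];

    assert(buf != NULL && cap > 0);
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();              /* cost 1: duplicate the process */
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: make the pipe's write end our stdout, then exec. */
        close(fds[0]);
        if (dup2(fds[1], STDOUT_FILENO) == -1)
            _exit(127);
        close(fds[1]);
        execvp(argv[0], argv);       /* cost 2: load a new program image */
        _exit(127);                  /* only reached if exec failed */
    }

    /* Parent: read until the buffer is full or the child closes the pipe. */
    close(fds[1]);
    size_t total = 0;
    ssize_t n;
    while (total < cap - 1 &&
           (n = read(fds[0], buf + total, cap - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) == -1)   /* cost 3: wait for the child */
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return (ssize_t)total;
}
```

Calling it with an argument vector such as { "echo", "hello", NULL } fills the buffer with the command's output. A C program that parses markdown in-process skips all of these steps.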

shayla

shayla is the result of this work. It's a small C binary (5k LoC) that reads and parses markdown and generates HTML from it. The resulting binary is about 83 KB stripped and comes with no dependencies (besides the obvious libc). It has been developed and tested under GNU/Linux, but should perform equally well under OSX or FreeBSD. I will talk about how to use shayla at the end of this post, after all the parts about performance.

Test process

The test platform is my Thinkpad T480s, with a shiny new Arch Linux (4.17.4) and a 500 GB Samsung NVMe SSD. The write speed of my disk is about 600 MB/s, and the read speed is about 1.5 GB/s:

$> dd if=/dev/zero of=output bs=1024 count=1000k
1024000+0 records in
1024000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 1.77229 s, 592 MB/s

$> dd if=output of=/dev/null bs=1024 count=1000k
1024000+0 records in
1024000+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 0.71487 s, 1.5 GB/s

The actual test is to build a lot of pages into a website. There are 305 MB of pages (around 25k files), and each page is 'complex' markdown in order to exercise the parser fully. Each page contains the following header:

For Shayla:

---
title: article_X
summary: This is article X
route: article-X
---

For Hugo:

---
title: "Article"
date: 2018-08-23T20:20:56+02:00
---

For Jekyll:

---
title: "Article number 0"
layout: post
---

The actual content is this markdown file. All times were measured with /usr/bin/time.
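
For anyone who wants to reproduce a similar corpus, here is a minimal sketch of how the Shayla variant of each test page could be generated. This is not the script I used; format_shayla_header and write_article are hypothetical names, and the body argument stands in for the markdown content mentioned above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format the Shayla header for article number n.
 * Returns the header length, as reported by snprintf. */
int format_shayla_header(char *buf, size_t cap, unsigned n)
{
    assert(buf != NULL && cap > 0);
    return snprintf(buf, cap,
                    "---\n"
                    "title: article_%u\n"
                    "summary: This is article %u\n"
                    "route: article-%u\n"
                    "---\n",
                    n, n, n);
}

/* Hypothetical helper: write one complete test page (header + body)
 * to an already opened stream. Returns 0 on success, -1 on error. */
int write_article(FILE *out, unsigned n, const char *body)
{
    char header[128];
    int len = format_shayla_header(header, sizeof header, n);

    if (len < 0 || (size_t)len >= sizeof header)
        return -1;                       /* formatting failed or truncated */
    if (fputs(header, out) == EOF || fputs(body, out) == EOF)
        return -1;
    return 0;
}
```

Looping over n and opening one file per article gives a corpus of any size. The Hugo and Jekyll variants only differ in the header fields.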

Note on the benchmark

The purpose of this benchmark is not to prove that X is better than Y. Jekyll and Hugo have far more features than ssg or shayla, and I think they are very good for building and testing a static website. For example, the live server feature is pretty useful during development, and the templating systems of those tools are far superior to shayla's. The purpose of this benchmark is to test the actual speed at which those tools transform markdown into HTML, apply a template and generate an index, which is precisely all I need for this blog.

TL;DR I do not claim that these benchmarks are representative of the software tested. It is simply a pps benchmark, so take the results with a grain of salt.
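
For reference, the pps figure is nothing more than the number of pages divided by the wall-clock time reported by /usr/bin/time. The sketch below shows that arithmetic; the function names are mine, and the numbers in the note that follows are an illustration, not measurements.

```c
#include <assert.h>

/* Pages per second for a build of `pages` files that took `seconds`
 * of wall-clock time. */
double pages_per_second(unsigned long pages, double seconds)
{
    assert(seconds > 0.0);
    return (double)pages / seconds;
}

/* How many times faster tool A is than tool B, given both pps figures. */
double speedup(double pps_a, double pps_b)
{
    assert(pps_b > 0.0);
    return pps_a / pps_b;
}
```

As an illustration, a tool that needs 250 seconds for 25000 pages runs at 100 pps, and a tool running at 200 pps is 2 times faster than that.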

Benchmark

Benchmark chart

Let's begin with the obvious: shayla is fast. It is 663 times faster than Jekyll and 13 times faster than Hugo. But again, shayla is simple, and surely too simple for projects other than blogs. Running those tests gave me the pleasure of trying Hugo for the first time, and I must say I'm impressed with the software. The code looks clean, the template system is very simple, and the performance is really nice. For a more complete static site project, I'll go with Hugo without hesitation.

But for a simple blog, I'll stick with ssg or Shayla. There is no real difference between those two tools, besides what you want to learn from using them. ssg is a simple shell script, easy to hack for your needs, and a good way to learn a thing or two about your system. If that does not matter to you, maybe Shayla is the right tool for you.

Using Shayla

Usage: shayla -[vhtsldrfuit]
Generate an HTML static site for markdown sources.
If used with no options, shayla will look for directory in the current path.

Options:
  -v, --version            Print software version
  -h, --help               Show this message
  -t, --title=TITLE        Title to be used in the final site
  -s, --src=DIR            Markdown sources directory
  -c, --style=DIR          Style sources directory
  -l, --layouts=DIR        Layouts directory
  -d, --dest=DIR           Destination directory
  -r, --root=ROOT          Root URL of the website
  -f, --favicon=FILE       Favicon to use
  -u, --url=URL            Url of the website
  -i, --img=DIR            Images directory
  -t, --threads=NUM        Number of threads to launch
      --debug              Print more information
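
To illustrate what a --threads option implies, here is one possible way to split a list of markdown files between workers. This is a sketch under my own assumptions, not a description of how shayla actually schedules its work; struct slice and slice_for are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open range [begin, end) of file indexes handled by one worker. */
struct slice {
    size_t begin;
    size_t end;
};

/* Split `total` files as evenly as possible between `threads` workers
 * and return the range owned by worker number `index`. When the split
 * is not exact, the first (total % threads) workers get one extra file. */
struct slice slice_for(size_t total, size_t threads, size_t index)
{
    assert(threads > 0 && index < threads);

    size_t base = total / threads;
    size_t rem = total % threads;
    struct slice s;

    s.begin = index * base + (index < rem ? index : rem);
    s.end = s.begin + base + (index < rem ? 1 : 0);
    return s;
}
```

Each worker can then parse and render its own range without any locking, since no two ranges overlap.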

Tree

Here's the 'required' tree for Shayla:

├── img
├── layouts
│   ├── footer.html
│   ├── header.html
│   └── intro.html
├── markdown
└── styles

'required' is quoted because the directories can have any name and can live anywhere on your filesystem. This is just the default setup.

There is no shayla init. I think you can manage creating 4 directories by yourself.

Post

A little header is required at the beginning of every post:

---
title: My first Article
summary: This is my first article
---
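
Reading such a header takes very little code. The sketch below is not shayla's actual parser; struct post_meta and parse_header are hypothetical names, and the field sizes are arbitrary. It only assumes the format shown above: a line of three dashes, some "key: value" lines, and a closing line of three dashes, with title and summary both present.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the two header fields shown above. */
struct post_meta {
    char title[128];
    char summary[256];
};

/* Copy at most cap - 1 bytes of src into dst and terminate it. */
static void copy_field(char *dst, size_t cap, const char *src, size_t len)
{
    if (len >= cap)
        len = cap - 1;
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Parse a "---" delimited header at the start of text.
 * Returns a pointer to the first byte of the post body, or NULL if the
 * header is malformed or if title or summary is missing. */
const char *parse_header(const char *text, struct post_meta *meta)
{
    assert(text != NULL && meta != NULL);
    meta->title[0] = '\0';
    meta->summary[0] = '\0';

    if (strncmp(text, "---\n", 4) != 0)
        return NULL;
    const char *line = text + 4;

    for (;;) {
        const char *eol = strchr(line, '\n');
        if (eol == NULL)
            return NULL;                      /* header never closed */
        size_t len = (size_t)(eol - line);

        if (len == 3 && strncmp(line, "---", 3) == 0) {
            if (meta->title[0] == '\0' || meta->summary[0] == '\0')
                return NULL;                  /* a required field is missing */
            return eol + 1;
        }

        const char *colon = memchr(line, ':', len);
        if (colon != NULL) {
            size_t klen = (size_t)(colon - line);
            const char *val = colon + 1;
            while (val < eol && *val == ' ')  /* skip spaces after the colon */
                val++;
            size_t vlen = (size_t)(eol - val);

            if (klen == 5 && strncmp(line, "title", 5) == 0)
                copy_field(meta->title, sizeof meta->title, val, vlen);
            else if (klen == 7 && strncmp(line, "summary", 7) == 0)
                copy_field(meta->summary, sizeof meta->summary, val, vlen);
            /* Unknown keys are silently ignored in this sketch. */
        }
        line = eol + 1;
    }
}
```

The returned pointer is where the markdown body starts, so the rest of the file can be handed to the markdown parser without copying it.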

The title and summary fields are required for every post. Here's a complete list of all the options:

Building the website

$> shayla --title "My site title" \
    --dest /var/www/htdocs/blog \
    --favicon ~/Pictures/blog_favicon.ico \
    --url "https://blog.ne02ptzero.me"

Shayla Terminal GIF

You can find the sources and build instructions of shayla here

No copyright - louis at ne02ptzero dot me
Any and all opinions listed here are my own and not representative of my employers; future, past and present.