31st October 2021

Simplified Saaze

1. Introduction

Simplified Saaze is a static site generator, i.e., it takes Markdown files as input and generates fixed HTML files. Simplified Saaze is a simplified version of Saaze from Gilbert Pellegrom. Large parts of this document are taken from the Saaze documentation. Simplified Saaze is roughly 90% compatible with Saaze. Simplified Saaze is built on the principles below.

1. Easy to run. Simplified Saaze is built in PHP with some small parts in C. PHP is used by roughly 80% of all web-sites on the internet. Simplified Saaze needs no other PHP framework and only one PECL library.

2. Easy to host. Static sites are great for being fast and easy to deploy. However, sometimes you need dynamic aspects to your site (e.g., contact forms, custom scripts, etc). Simplified Saaze gives you the choice depending on what makes most sense.

3. Easy to edit. Markdown has become the de-facto way to edit content for the internet. It's simple to understand and write. So Simplified Saaze uses Markdown with a sprinkle of Yaml frontmatter to manage your content.

4. Easy to theme. Simplified Saaze uses plain PHP/HTML to theme. Any PHP code is a valid theme and can be checked with php -l.

5. Fast and secure. Simplified Saaze works with ordinary files in your filesystem. No database required. This means less setup and maintenance, better security and more speed. Simplified Saaze is way faster than Hugo or Zola, see Performance Comparison Saaze vs. Hugo vs. Zola.

6. Simple to understand. Simplified Saaze deliberately has a stupidly simple architecture: Everything is a collection of entries. Pages, posts, docs, recipes, whatever. It all works in the same, simple way. Supports multiple blogs under the same URL out of the box.

7. All-inclusive. Developing your site should be painless. No external tools required.

2. Installation

1. Simplified Saaze requires PHP version 8 as a minimum, as it uses FFI. To be exact, PHP version 7.4 would be sufficient for FFI, but Simplified Saaze also makes use of "union types", which are only present in PHP 8. Please note that PHP 7 active support ends November 2021, and security support for PHP version 7 ends November 2022.

[Figure: timeline of supported PHP versions 5.6 to 8.0 over the years 2018 to 2025. Legend: "Active support" (reported bugs and security issues are fixed, regular point releases are made), "Security fixes only" (critical security issues only, releases on an as-needed basis), "End of life" (no longer supported; users should upgrade as soon as possible).]

Check the PHP version with

php -v

and should show something like

PHP 8.0.11 (cli) (built: Sep 25 2021 07:52:29) ( NTS )
Copyright (c) The PHP Group
Zend Engine v4.0.11, Copyright (c) Zend Technologies

2. Your PHP needs the Yaml extension. Download it from PECL. See PECL's Yaml Way Faster Than Symfony's Yaml. Installation boils down to phpize, configure, make. Finally, check with php -m whether yaml is enabled; php.ini must contain:

extension=yaml

3. Installation with composer: Create a directory of your liking, change into it, then run

composer create-project eklausme/saaze-example

This will download and install an example blog, and also the actual Simplified Saaze software.

4. The MD4C library must be installed. For example, on Arch Linux you check with

pacman -Qs md4c

FFI must be enabled in PHP. Check with phpinfo() or

php -m | grep FFI

To compile the FFI wrapper you need a C compiler, for example GCC. To compile it into a shared object (so-file) use

cc -fPIC -Wall -O2 -shared php_md4c_toHtml.c -o php_md4c_toHtml.so -lmd4c-html

C program file php_md4c_toHtml.c is located in vendor/eklausme/saaze.

5. Go to directory vendor/eklausme/saaze and edit Config.php to supply the correct location of php_md4c_toHtml.so in self::$H hash, key global_ffi. Example:

'global_ffi' => \FFI::cdef("char *md4c_toHtml(const char*);","/srv/http/php_md4c_toHtml.so"),

The so-file can be placed "anywhere".

Double check that FFI is enabled in php.ini:

extension=ffi

6. General remark: All these prerequisites are only required to generate the static HTML files. Once the HTML files are generated, your web-server does not need PHP, nor MD4C, etc.

Your web-server needs PHP and all the above prerequisites only if you want to use the dynamic mode of Simplified Saaze.

7. Source code is on GitHub: eklausme/saaze. Changing or adapting the source code to one's own requirements should be painless. The entire source code is less than 2 kLines.
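
The prerequisites from steps 1, 2 and 4 can also be checked in one go. The following is a minimal sketch, not part of Simplified Saaze; the function name missingPrerequisites is made up for this illustration. It only checks the PHP version and the two extensions, not the MD4C library or the compiled so-file.

```php
<?php
// Hypothetical helper, not part of Simplified Saaze:
// returns a list of prerequisites that are missing.
function missingPrerequisites(int $phpVersionId, array $loadedExtensions): array {
    $missing = [];
    if ($phpVersionId < 80000) {          // PHP 8 is required (FFI + union types)
        $missing[] = 'PHP >= 8.0';
    }
    // Extension names are compared case-insensitively ("FFI" vs. "ffi").
    $loaded = array_map('strtolower', $loadedExtensions);
    foreach (['yaml', 'ffi'] as $ext) {
        if (!in_array($ext, $loaded, true)) {
            $missing[] = "extension=$ext";
        }
    }
    return $missing;
}

$missing = missingPrerequisites(PHP_VERSION_ID, get_loaded_extensions());
echo $missing ? 'Missing: ' . implode(', ', $missing) . "\n" : "All prerequisites found\n";
```

Run it with the same php binary that you use for php saaze, as the CLI and the web-server may use different php.ini files.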

3. Directory structure

Assume you have created a directory ssaaze, i.e., mkdir ssaaze. Then composer would have created

ssaaze/
├── build/
├── content/
│   ├── blog/
│   │   └── example-page.md
│   └── blog.yml
├── public/
│   └── index.php
└── templates/
    ├── blog/
    │   ├── entry.php
    │   └── index.php
    ├── index.php
    ├── entry.php
    ├── error.php
    ├── top-layout.php
    └── bottom-layout.php

The directories serve the following:

  1. build will contain the result of the run when generating static files.
  2. content is where all your Markdown files reside.
  3. public is used for dynamic content. It should show the same content as in build, just without any static files lying around.
  4. templates contains PHP files which are used to generate static or dynamic pages. They usually contain common HTML elements, which are present on all your web pages. For example, they contain your company logo, Google Analytics, etc.

It is very likely that you will have additional directories, for example, for images or PDF documents. They are not touched by Simplified Saaze.

4. Basic usage

4.1 Static site generator

Go to your content directory or any subdirectory therein and create your Markdown file with frontmatter in the beginning. An example is here:

--- 
title: An Example Post
date: "2021-10-30"
--- 
This is an **example** with some _markdown_ formatting.

Then run

php saaze

That's it. This will populate the build directory with HTML files. Either point your web-server document root directly to this directory, or copy/move files in build to your web-server's document root.

All your Markdown files must have the suffix .md. That's what Simplified Saaze is processing. The file name can be arbitrary, except that the name index.md is special. File index.md serves as a transparent section, i.e., the content of the index.md file is shown when the directory is the last part of the URL. For example, if directory a/b/c contains index.md, then the URL https://.../a/b/c shows the Simplified Saaze'd output of index.md. This is usually used for table-of-contents-like pages. For example, let's assume blog/2021 contains a number of Markdown files. Then index.md in blog/2021 can serve as a table of contents for this directory.

4.2 Dynamic content generation

Put index.php from the public directory into the document root of your web-server.

For testing you can also use PHP's built-in web-server:

php -S 0:8000 -t ~/.../public

This will present your content at URL localhost:8000/.

For this dynamic web page generation to work you must have URL rewriting enabled in your web server! E.g., in Hiawatha you must use something like this:

UrlToolkit {
        ToolkitID = PHP_Routing
        RequestURI isfile Return
        Match ^/*$ Rewrite /index.php?/blog/
        Match ^/(.+) Rewrite /index.php?$1
}

The important part is the last line with Match in above configuration, telling the web server to redirect the URL https://example.com/abc/uvw to https://example.com/index.php?/abc/uvw. Without this, dynamic content generation will not work. Rewriting the empty string to /blog/ is just a convenience.

For Lighttpd the configuration is:

server.modules += ( "mod_openssl", ..., "mod_rewrite" )

url.rewrite-if-not-file = (
        "^/*$"  => "/index.php?/blog/",
        "^/(.*)" => "/index.php?$1" 
)

As before, rewriting the empty string to /blog/ is just a convenience.

4.3 Single file generation

Simplified Saaze allows you to generate a single file, instead of all files in the content directory. Use command-line option -s for this and specify the input Markdown file. E.g.,

php saaze -s content/blog/2021/new-post.md

will just build this single file. This is important when you don't want to run Simplified Saaze for your entire web-site, but rather just insert or update a single post.

This single file generation can be integrated into a Makefile to just generate updated files.
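
If you prefer to stay in PHP instead of writing a Makefile, the same idea can be sketched as below. This is only an illustration under assumptions: the helper changedMarkdownFiles and the time stamp file build/.last-run are made up and not part of Simplified Saaze. The script merely prints the php saaze -s commands for all Markdown files changed since the time stamp; it does not run them.

```php
<?php
// Hypothetical helper, not part of Simplified Saaze:
// find all *.md files below $contentDir modified after time $since.
function changedMarkdownFiles(string $contentDir, int $since): array {
    $changed = [];
    $it = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($contentDir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($it as $file) {
        // Only files with suffix .md are processed by Simplified Saaze.
        if ($file->isFile() && $file->getExtension() === 'md' && $file->getMTime() > $since) {
            $changed[] = $file->getPathname();
        }
    }
    sort($changed);
    return $changed;
}

// The time stamp file is an assumption; touch it after each successful run.
$stamp = 'build/.last-run';
$since = is_file($stamp) ? filemtime($stamp) : 0;
if (is_dir('content')) {
    foreach (changedMarkdownFiles('content', $since) as $path) {
        echo 'php saaze -s ' . escapeshellarg($path) . "\n";
    }
}
```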

4.4 Specifying an alternate build directory

When you add the command-line option -b you can specify the directory where the static files will be placed. E.g.,

php saaze -b /tmp/ramdisk/

This will generate the static files in /tmp/ramdisk instead of build.

4.5 Turning excerpt file generation on

In the single file mode you sometimes also want the excerpt file, such that you can update some table of content file with this excerpt. If you want the excerpt file generated, then add -e.

php saaze -es content/blog/2021/another-post.md

The above example generates the HTML for the single file content/blog/2021/another-post.md and, in addition, the file excerpt.txt. So obviously, the excerpt file only makes sense for a single file.

4.6 Draft mode

Any blog post which contains draft: true in the frontmatter will not be shown in the generated static HTML. Though, these draft posts will be shown in dynamic mode, see 4.2. If you want draft posts to be generated then specify -f:

php saaze -f

Having draft posts mixed with your normal content allows you to work on some still unfinished posts, without having them on your "productive site". The reason for this disparity between static and dynamic mode is that dynamic mode cannot take command-line arguments.

4.7 Categories and tags

If you have more than, say, 50 blog posts, then organizing them via categories and/or tags helps the reader to find related content. If you want to use categories and tags you add

categories: ["category1", "category2", "category3"]

to your frontmatter. In the same way you add tags to your frontmatter at the top of your post:

tags: ["tag1", "tag2", "tag3"]

Now you use

php saaze -t

to generate a cat_and_tag.json file in the content directory. This file can then be used in your templates. This file looks like this:

{
    "categories": {
        "Android": [
            [
                "\/blog\/2013\/03-13-screenshots-on-nexus-4-android-4-x",
                "2013-03-13 21:54:03",
                "Screenshots on Nexus 4 (Android 4.x)"
            ],
            [
                "\/blog\/2013\/08-04-google-now-emergency-alert",
                "2013-08-04 18:22:55",
                "Google Now Emergency Alert"
            ]
        ]
    }
}

This cat_and_tag variable is a three-dimensional array: $cat_and_tag[$i][$j][$k].

  1. $i is either categories or tags
  2. $j is the category or tag, e.g., "Android" or "hardware", etc.
  3. $k is either 0, 1, or 2, i.e., each entry is a list of 3 elements
    • URL
    • date
    • title

This cat_and_tag.json can only be generated during static site generation, and not during dynamic mode.

One important caveat: the cat_and_tag.json file is generated at the end of the generation. So the very first time, when you use categories and tags, the file will be empty, and you won't see categories and tags. Only after the second time, when you generate your static site, will you see your categories and tags. This is similar to the way TeX or LaTeX works with regard to table of contents or indexes.
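
How such a file could be consumed in a template is sketched below. This is an illustration only: the helper entriesFor is made up, and the JSON is given inline instead of being read from the content directory, so that the example is self-contained.

```php
<?php
// Hypothetical helper: return the list of [URL, date, title] triples
// for one category or tag, or an empty list if there is none.
function entriesFor(array $catAndTag, string $kind, string $name): array {
    return $catAndTag[$kind][$name] ?? [];
}

// Inline copy of the example above; in a template you would read cat_and_tag.json instead.
$json = <<<'JSON'
{
    "categories": {
        "Android": [
            ["\/blog\/2013\/03-13-screenshots-on-nexus-4-android-4-x", "2013-03-13 21:54:03", "Screenshots on Nexus 4 (Android 4.x)"],
            ["\/blog\/2013\/08-04-google-now-emergency-alert", "2013-08-04 18:22:55", "Google Now Emergency Alert"]
        ]
    }
}
JSON;
$catAndTag = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

// $k = 0, 1, 2 are unpacked into URL, date, and title.
foreach (entriesFor($catAndTag, 'categories', 'Android') as [$url, $date, $title]) {
    printf("<li><a href=\"%s\">%s</a> (%s)</li>\n", $url, htmlspecialchars($title), substr($date, 0, 10));
}
```

The ?? operator makes the very first run harmless: with an empty file, or a missing category, the loop simply produces no output.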

4.8 RSS XML Feed

Passing -r as command-line flag will produce a file called feed.xml in the build directory. This file is an RSS 2.0 feed. The actual generation of feed.xml is done by the template code in rss.php. This rss.php looks something like this:

<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
    <title>Elmar Klausmeier's Blog</title>
    <description>Elmar Klausmeier's Blog</description>
    <lastBuildDate><?=gmdate("r")?></lastBuildDate>
    <link>https://eklausmeier.goip.de</link>
    <atom:link href="https://eklausmeier.goip.de/feed.xml" rel="self" type="application/rss+xml" />
    <generator>Simplified Saaze</generator>
<?php
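$rssRelevant = array();
foreach ($collections as $collection) {
    if (!array_key_exists('rss',$collection->data) || $collection->data['rss'] === false)
        continue;
    // assumed: collect entries keyed by date+title, matching the krsort() comment below
    foreach ($collection->entries as $entry)
        $rssRelevant[$entry->data['date'] . $entry->data['title']] = $entry;
}
krsort($rssRelevant);	// sort on key=date+title in reverse order
$maxRss = 50;	// number of items in RSS XML feed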
$timeZone = new \DateTimeZone('Europe/Berlin');
foreach ($rssRelevant as $entry) {
    if ($maxRss-- <= 0) break;
    $d = date_create($entry->data['date'],$timeZone);
    printf("\t<item>\n"
        . "\t\t<link>https://eklausmeier.goip.de%s</link>\n"
        . "\t\t<guid>https://eklausmeier.goip.de%s</guid>\n"
        . "\t\t<title>%s</title>\n"
        . "\t\t<pubDate>%s</pubDate>\n"
        . "\t\t<description><![CDATA[\n%s\n"
        . "\t\t]]></description>\n"
        . "\t</item>\n",
        $entry->data['url'], $entry->data['url'], $entry->data['title'], date_format($d,"r"), $entry->data['content']);
}
?>
</channel>
</rss>

For the RSS feed generation to work you need templates/rss.php. Without that, no RSS.
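
The selection logic of the template, i.e., sort on date+title in reverse order and stop after $maxRss items, can be tried in isolation. The sketch below uses plain arrays instead of the entry objects of Simplified Saaze; the function name newestEntries is made up for this illustration.

```php
<?php
// Hypothetical helper mirroring the krsort()/$maxRss logic of the template.
// $entries is a list of arrays with keys 'date' and 'title'.
function newestEntries(array $entries, int $max): array {
    $byKey = [];
    foreach ($entries as $entry) {
        // Key = date + title, so two entries with the same date do not overwrite each other.
        $byKey[$entry['date'] . $entry['title']] = $entry;
    }
    krsort($byKey, SORT_STRING);    // newest date first
    return array_slice(array_values($byKey), 0, $max);
}

$entries = [
    ['date' => '2021-10-30 17:30:00', 'title' => 'Blog post'],
    ['date' => '2022-01-23 09:00:00', 'title' => 'Draft mode'],
    ['date' => '2021-10-30 17:30:00', 'title' => 'Another post'],
];
foreach (newestEntries($entries, 2) as $entry) {
    echo $entry['date'], ' ', $entry['title'], "\n";
}
```

Sorting on the date string works because the format yyyy-mm-dd HH:mi:ss sorts chronologically when compared as text.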

4.9 Sitemap

Command-line flag -m creates a file called sitemap.html. This file covers all collections. The template file for the generation is usually as below:

<?php $url='/sitemap.html'; ?>
<?php require SAAZE_PATH . "/templates/head.php"; ?>
<title>Sitemap</title>
</head>
<body>
<h1>Sitemap</h1>
<ol>
<?php
foreach ($collections as $collection) {
    sort($collection->entries);
    foreach ($collection->entries as $entry) {
        $href = isset($collection->data['uglyURL']) ? $entry->data['url'] . '.html' : $entry->data['url'];
        printf("\t<li><a href=\".%s\">%s</a></li>\n", $href, $entry->data['url']);
    }
}
?>
</ol>
</body>
</html>

For the sitemap generation to work you need templates/sitemap.php. Without that, no sitemap.

Generating this blog, for example, goes like this:

: time php saaze -mrtb /tmp/build
Building static site in /tmp/build...
        execute(): filePath=/home/klm/php/sndsaaze/content/aux.yml, nentries=5, totalPages=1, entries_per_page=20
        execute(): filePath=/home/klm/php/sndsaaze/content/blog.yml, nentries=370, totalPages=19, entries_per_page=20
        execute(): filePath=/home/klm/php/sndsaaze/content/gallery.yml, nentries=4, totalPages=1, entries_per_page=20
        execute(): filePath=/home/klm/php/sndsaaze/content/music.yml, nentries=25, totalPages=2, entries_per_page=20
        execute(): filePath=/home/klm/php/sndsaaze/content/error.yml, nentries=1, totalPages=1, entries_per_page=20
Finished creating 5 collections, 4 with index, and 419 entries (0.11 secs / 11.66MB)
#collections=5, YamlParser=0.0058/425-5, md2html=0.0099, MathParser=0.0057/419, renderEntry=419, content=419/0, excerpt=0/0
        real 0.13s
        user 0.08s
        sys 0
        swapped 0
        total space 0

It uses command-line flags, -m for sitemap, -r for RSS XML feed, -t for categories+tags, and -b for writing results into build directory /tmp/build.

4.10 Environment variables

The following environment variables are read:

  1. CONTENT_PATH: directory of content path, i.e., where your markdown files are
  2. PUBLIC_PATH: path for dynamic mode
  3. TEMPLATES_PATH: where to find template files, which are just PHP files
  4. ENTRIES_PER_PAGE: number of entries per index-page

5. Collections and entries

5.1 Collections

One of the core concepts of Simplified Saaze is that everything is a collection of entries: pages, blog posts, navigation menus, users, everything.

Collections are defined by Yaml files in the content directory of your site. A collection will define not only the ID and title of the collection, but also the routes for the collection and how entries are sorted in the collection.

For example, say you wanted to create a blog in Simplified Saaze. You could create a collection file called posts.yml with the following content:

title: Blog
index_route: "/blog"
entry_route: "/blog/{slug}"
sort_field: date
sort_direction: desc

The ID of the collection is defined by the file name, e.g., posts. Below is a description of the available fields in a collection and what they do. With the exception of entry_route, all of these fields are optional if you don't need them.

  1. title: The title of the collection.
  2. index_route: The route of the index for this collection. Normally this page will show a collection archive (a paginated list of entries) but it can also be a single entry if the collection has an index.md file. The index_route field can be omitted. In that case no index will be shown.
  3. entry_route: The route of an individual entry for this collection. This value should always contain {slug} which will be replaced by the entry ID when serving your site. This field is mandatory.
  4. sort_field: The entry field used to sort the collection.
  5. sort_direction: The direction to sort entries (either asc or desc). Default is asc.
  6. entries_per_page: The number of excerpts shown in index page. Default is 20.
  7. excerpt_length: Number of characters for the excerpt in each index entry. Default is 300 characters. If you set this parameter very high, e.g., 900, then you will probably want to reduce entries_per_page to keep your page from becoming too crowded. Though, both parameters can be set freely.
  8. uglyURL: boolean. True, if ugly URLs should be generated, false if no ugly URLs should be generated. Ugly URLs look like {slug}.html. Non-ugly URLs create a separate directory for each file and an index.html in it, i.e., {slug}/index.html. Currently only implemented for static site generation, not for dynamic generation. Default is false.
  9. rss: boolean. True if this collection should be part of RSS feed, false if it is ignored in RSS. Default is false.
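
The difference between ugly and non-ugly URLs is easiest to see from the resulting file names. The following sketch only illustrates the naming scheme described for uglyURL; the function outputPath is made up and is not the code used by Simplified Saaze.

```php
<?php
// Hypothetical helper: file name of the static HTML file for an entry URL.
function outputPath(string $buildDir, string $url, bool $uglyURL): string {
    $base = rtrim($buildDir, '/') . '/' . trim($url, '/');
    // Ugly: {slug}.html. Non-ugly: a directory {slug} with an index.html in it.
    return $uglyURL ? $base . '.html' : $base . '/index.html';
}

echo outputPath('build', '/blog/an-example-post', false), "\n"; // build/blog/an-example-post/index.html
echo outputPath('build', '/blog/an-example-post', true), "\n";  // build/blog/an-example-post.html
```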

5.2 Entries

Entries are just Markdown files with frontmatter. Below is an example:

---
title: Your title goes here
date: "2021-10-31 10:15:30"
---
Here is the usual Markdown.

Markdown is parsed with MD4C, which is

  1. CommonMark-compliant
  2. Very fast
  3. Handles tables
  4. Provides strikethrough with ~

As is generally known, Markdown can contain verbatim HTML code. In contrast, Hugo's Goldmark does not render raw HTML by default.
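
To illustrate what "Markdown file with frontmatter" means technically, here is a minimal sketch that separates the two parts. It is not the parser of Simplified Saaze, which uses the PECL Yaml extension; the function splitFrontmatter is made up for this illustration, and it does not interpret the Yaml itself.

```php
<?php
// Hypothetical helper: split an entry into its frontmatter text and its Markdown body.
function splitFrontmatter(string $text): array {
    // Frontmatter is everything between the first two lines consisting of "---".
    $re = '/\A---[ \t]*\r?\n(.*?)\r?\n---[ \t]*(?:\r?\n|\z)(.*)\z/s';
    if (!preg_match($re, $text, $m)) {
        return ['frontmatter' => '', 'body' => $text];   // no frontmatter found
    }
    return ['frontmatter' => $m[1], 'body' => $m[2]];
}

$text = "---\ntitle: Your title goes here\ndate: \"2021-10-31 10:15:30\"\n---\nHere is the usual Markdown.\n";
$parts = splitFrontmatter($text);
echo $parts['frontmatter'], "\n";
echo $parts['body'];
```

With the Yaml extension installed, the frontmatter text can then be turned into a PHP array with yaml_parse($parts['frontmatter']).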

Frontmatter in entries is handed over to the template verbatim. So any key/value pair in the frontmatter can be checked in template code. For example:

---
title: Blog post
date: "2021-10-30 17:30:00"
prismjs: true
MathJax: true
---
Your blog post

Here title, date, prismjs, and MathJax can be used in template code like this

<?php if (isset($entry['MathJax'])) { ?>

or like this

<?php if (isset($entry['prismjs'])) { ?>
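
One thing to keep in mind with isset(): it tests whether the key is present, not whether its value is true. Assuming the frontmatter is handed over unchanged, an entry with MathJax: false would therefore still pass the isset() test. A stricter test is sketched below; the helper name flag is made up.

```php
<?php
// Hypothetical helper: true only if the frontmatter key exists and has a truthy value.
function flag(array $entry, string $key): bool {
    return !empty($entry[$key]);
}

$entry = ['title' => 'Blog post', 'prismjs' => true, 'MathJax' => false];
var_dump(isset($entry['MathJax']));   // bool(true): the key exists, although its value is false
var_dump(flag($entry, 'MathJax'));    // bool(false)
var_dump(flag($entry, 'prismjs'));    // bool(true)
var_dump(flag($entry, 'Mermaid'));    // bool(false): key is absent
```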

Entries can have the following "variables" in frontmatter:

  1. title: string containing the title
  2. date: string in format yyyy-mm-dd HH24:mi:ss
  3. draft: boolean
  4. Mermaid: boolean, indicating whether Mermaid graphics are used
  5. MathJax: boolean, indicating whether MathJax CSS and JavaScript is used
  6. prismjs: boolean, indicating whether PrismJS CSS and JavaScript is used
  7. categories: JSON
  8. tags: JSON

The following additional "variables" are used in entries, mostly only used in templates:

  1. url
  2. content
  3. gallery_css + gallery_js
  4. markmap_css + markmap_js

5.3 Special tags

Simplified Saaze defines some special tags for various social media or graphing. Furthermore, MathJax is fully supported.

Nr  Function         Syntax                                                  Example
1   YouTube          [youtube] xxx [/youtube]                                [youtube]nvlAW6P5PmE[/youtube]
2   Vimeo            [vimeo] xxx [/vimeo]                                    [vimeo]126529871[/vimeo]
3   Twitter          [twitter] xxx [/twitter]                                [twitter]https://twitter.com/eklausmeier/status/1352896936051937281[/twitter]
4   CodePen          [codepen] user/hash [/codepen]                          [codepen] thebabydino/eJrPoa [/codepen]
5   WordPress Video  [wpvideo code w=x h=y]                                  [wpvideo RLkLgz2V w=400 h=224]
6   Mermaid          [mermaid] xxx [/mermaid], where xxx is the Mermaid code  [mermaid]flowchart LR Start --> Stop[/mermaid]
7   Gallery          [gallery] dir /regex/ [/gallery]                        [gallery] /img/gallery /IMG_20220107_14\.+\.jpg [/gallery]
8   markmap Mindmap  [markmap] Headings [/markmap]                           [markmap] # H1 [/markmap]
9   Inline math      $ formula $                                             $a^2+b^2=c^2$
10  Display math     $$ display formula $$                                   $$ \int_1^\infty {1\over x^2} $$
For Mermaid to work: You have to set Mermaid: true in the frontmatter so that the required JavaScript is loaded.

For math: You have to set MathJax: true in the frontmatter to load JavaScript for MathJax.

5.4 Routing

In a Simplified Saaze site, all of the routes are defined by collections. The index_route and entry_route of each collection will be used to determine how an entry can be accessed by URL. For example, let's say we have a posts collection:

title: Blog
index_route: "/blog"
entry_route: "/blog/{slug}"

When you create an entry in a collection, the name of the file (the entry ID) is used as the "slug" for the entry. For example, say we have an entry file at content/posts/an-example-post.md. This post will be accessible at the URL:

https://mysite.com/blog/an-example-post

Subdirectories work too. For example, say we have an entry file at content/posts/marketing/an-example-post.md. This post will be accessible at the URL:

https://mysite.com/blog/marketing/an-example-post

Index entries: If the ID of an entry is index, this entry will be shown at the index_route instead of the default collection archive page. For example, the entry file content/posts/index.md will be accessible at the URL:

https://mysite.com/blog

This works for subdirectories too. For example, say we have an entry file at content/posts/marketing/index.md. This post will be accessible at the URL:

https://mysite.com/blog/marketing
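
The four examples above can be summarized in a few lines of PHP. This is a sketch of the mapping as described in this section, not the routing code of Simplified Saaze; the function entryUrl and its parameters are made up for this illustration.

```php
<?php
// Hypothetical helper: map a file path relative to the collection directory
// (e.g. content/posts/) to the URL path under which the entry is served.
function entryUrl(string $relativePath, string $indexRoute, string $entryRoute): string {
    $slug = preg_replace('/\.md$/', '', $relativePath);  // entry ID = path without suffix .md
    if ($slug === 'index') {
        return $indexRoute;                              // index.md replaces the archive page
    }
    if (substr($slug, -6) === '/index') {
        $slug = substr($slug, 0, -6);                    // index.md in a subdirectory
    }
    return str_replace('{slug}', $slug, $entryRoute);
}

$files = ['an-example-post.md', 'marketing/an-example-post.md', 'index.md', 'marketing/index.md'];
foreach ($files as $file) {
    printf("%-30s => %s\n", $file, entryUrl($file, '/blog', '/blog/{slug}'));
}
```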

6. Templates

The entry-template has the following variables at its disposal: see Entries for a list of variables present in the so called "entries".

The collection- or index-template has the following variables in the PHP array pagination:

  1. currentPage, the page number of the current index page
  2. prevPage, the page number of the previous index page
  3. nextPage, the page number of the next index page
  4. prevUrl, the URL of the previous index page
  5. nextUrl, the URL of the next index page
  6. perPage, the number of entries per index page; this is $H['global_config_entries_per_page']
  7. totalEntries, number of entries (= blog posts)
  8. totalPages, the number of index pages
  9. entries, a PHP array of the entries (=blog posts) for the current index page
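
How these values relate to each other is sketched below. This is not the code of TemplateManager; the function paginate is made up, prevUrl and nextUrl are left out because their format depends on the routes, and using null for a missing previous or next page is an assumption.

```php
<?php
// Hypothetical helper: compute the pagination values for one index page.
function paginate(array $entries, int $currentPage, int $perPage = 20): array {
    $totalEntries = count($entries);
    $totalPages   = max(1, (int) ceil($totalEntries / $perPage));
    $currentPage  = min(max(1, $currentPage), $totalPages);   // clamp to valid range
    return [
        'currentPage'  => $currentPage,
        'prevPage'     => $currentPage > 1 ? $currentPage - 1 : null,           // null is an assumption
        'nextPage'     => $currentPage < $totalPages ? $currentPage + 1 : null, // null is an assumption
        'perPage'      => $perPage,
        'totalEntries' => $totalEntries,
        'totalPages'   => $totalPages,
        'entries'      => array_slice($entries, ($currentPage - 1) * $perPage, $perPage),
    ];
}

// 370 entries with 20 entries per page, as in the build log of the blog collection.
$p = paginate(range(1, 370), 19);
printf("%d pages, last page holds %d entries\n", $p['totalPages'], count($p['entries']));
// prints: 19 pages, last page holds 10 entries
```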

7. Internal data structure

Overall logic for building all static pages for all collections:

public function buildAllStatic(string $dest) : void {
    $this->clearBuildDirectory(...);
    $collections = $this->collectionArray->getCollections();

    foreach ($collections as $collection) {
        $entries    = $collection->getEntries();
        $nentries   = count(...);
        $entries_per_page = ...;
        $totalPages = ceil($nentries / $entries_per_page);

        $this->buildCollectionIndex($collection, ...);

        for ($page=1; $page <= $totalPages; $page++)
            $this->buildCollectionIndex($collection, $page, $dest);

        foreach ($entries as $entry)
            $this->buildEntry($collection, $entry, ...);
    }
}

The ER diagram below shows all PHP classes which are held in memory during runtime of Simplified Saaze. Of these, the data in entry_data['content_raw'] is the Markdown, and entry_data['content'] is the HTML after conversion via MD4C.

erDiagram
    CollectionArray ||--o{ Collection : "has multiple"
    CollectionArray {
        array collections
        bool draftOverride
    }
    Collection ||--|| collection_data : contains
    Collection ||--o{ Entry : "has multiple"
    Collection {
        string filePath
        array collection_data
        string slug
        bool draftOverride
        array entries
        array entriesSansIndex
    }
    collection_data {
        string title
        string index_route
        string entry_route
        string sort_field
        string sort_direction
        int entries_per_page
        int excerpt_length
        bool uglyURL
        bool rss
    }
    Entry ||--|| entry_data : contains
    Entry {
        Collection collection
        string filePath
        array entry_data
    }
    entry_data {
        string title
        string author
        string date
        string template
        bool draft
        bool Mermaid
        bool MathJax
        bool prismjs
        JSON categories
        JSON tags
        string url
        string content_raw
        string content
        string gallery_css
        string gallery_js
        string markmap_css
        string markmap_js
    }
    Config {
        string global_rbase
        string global_path_base
        string global_path_public
        string global_public
        string global_path_templates
        string global_config_entries_per_page
        string global_excerpt_length
        string global_ffi
    }
    Entry ||--|| MarkdownContentParser : uses
    MarkdownContentParser {
        function toHtml
    }
    SaazeCli ||--|| BuildCommand : uses
    SaazeCli {
        function run
    }
    BuildCommand ||--|| TemplateManager : has
    BuildCommand ||--|| CollectionArray : has
    BuildCommand {
        string defaultName
        string buildDest
        CollectionArray collectionArray
        TemplateManager templateManager
        function buildAllStatic
        function buildSingleStatic
    }
    Saaze ||--|| TemplateManager : has
    Saaze ||--|| CollectionArray : has
    Saaze {
        CollectionArray collectionArray
        TemplateManager templateManager
        function run
    }
    TemplateManager ||--|| pagination : creates
    TemplateManager {
        function renderCollection
        function renderEntry
        function renderError
        function renderGeneral
    }
    pagination ||--o{ entry_data : contains
    pagination {
        int currentPage
        int prevPage
        int nextPage
        string prevUrl
        string nextUrl
        int perPage
        int totalEntries
        int totalPages
        entry_data entries
    }

8. Release history

Working on Saaze started in May 2021.

  1. v1.0: 02-Nov-2021, removed unnecessary directories
  2. v1.1: 08-Nov-2021, fixed PHPStan messages, transparent sections in dynamic mode
  3. v1.2: 15-Nov-2021, QUERY_STRING handling in dynamic mode => honors web-server rewriting rules
  4. v1.3: 05-Dec-2021, reduce warning messages in case of dynamic mode
  5. v1.4: 16-Jan-2022, 404 status, url variable in template
  6. v1.5: 23-Jan-2022, added draft-mode: enable/disable generation of drafts
  7. v1.6: 26-Jan-2022, special handling for transparent sections, i.e., index.md
  8. v1.7: 20-Apr-2022, added gallery support
  9. v1.8: 03-May-2022, added markmap support for Mindmaps
  10. v1.9: 25-Jun-2022, support uglyURL and entries_per_page per yaml file
  11. v1.10: 09-Jul-2022, cope for index_route in Saaze.php
  12. v1.11: 29-Jul-2022, added parameter excerpt_length in collection.yml
  13. v1.12: 01-Aug-2022, removed unused variables
  14. v1.13: 07-Aug-2022, moved most of EntryManager to Collection, moved CollectionManager to CollectionArray
  15. v1.14: 14-Aug-2022, generate cat_and_tag.json
  16. v1.15: 16-Aug-2022, added RSS XML feed generation
  17. v1.16: added sitemap-generation for static+dynamic mode