Generate RSS from Markdown

I wanted an RSS feed for this blog. Saaze does not provide this functionality. Saaze is supposed to be "stupidly simple" by design, which I consider a plus. Since 15-Aug-2022, Simplified Saaze, the successor of Saaze, can generate an RSS XML feed.

This post shows how you can generate an RSS feed without Simplified Saaze, using just plain Perl.

Generating an RSS feed is simple. The feed starts with a header of fixed XML. Then each post is printed as a so-called "item" with

  1. link / URL
  2. publication date
  3. title
  4. an excerpt or even the full blog post

Finally come the required closing XML tags. That's it.
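
As a minimal sketch, the four pieces of an item can be assembled like this. The names `xml_escape` and `rss_item`, the hash keys, and the example.org URL are made up for illustration and are not part of mkdwnrss.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Escape the characters that are special in XML text nodes.
sub xml_escape {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    return $s;
}

# Build one <item> element from link, date, title and excerpt.
# rss_item() and its hash keys are hypothetical names.
sub rss_item {
    my (%post) = @_;
    return "\t<item>\n"
         . "\t\t<link>" . xml_escape($post{link}) . "</link>\n"
         . "\t\t<pubDate>$post{date}</pubDate>\n"
         . "\t\t<title>" . xml_escape($post{title}) . "</title>\n"
         . "\t\t<description><![CDATA[$post{excerpt}]]></description>\n"
         . "\t</item>\n";
}

print rss_item(
    link    => "https://example.org/blog/2021/05-30-generate-rss/",
    date    => "Sun, 30 May 2021 20:00:00 +0200",
    title   => "Generate RSS from Markdown",
    excerpt => "For this blog I wanted an RSS feed.",
);
```

Escaping the title is an extra precaution here: a title containing `&` or `<` would otherwise produce invalid XML.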

Taking this information directly from the Markdown files and their frontmatter seems to be the easiest approach. For example, the frontmatter for this blog post is:

---
date: "2021-05-30 20:00:00"
title: "Generate RSS from Markdown"
draft: false
categories: ["www"]
tags: ["RSS", "feed", "Markdown"]
author: "Elmar Klausmeier"
prismjs: true
---
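
Reading such a block takes only a few lines of Perl. The following is a sketch under the assumption that only quoted scalar values such as `date` and `title` are needed; `parse_frontmatter` is a hypothetical helper, not a function of mkdwnrss.

```perl
use strict;
use warnings;

# Read simple `key: "value"` pairs between the two `---` lines.
# parse_frontmatter() is a hypothetical helper for illustration.
sub parse_frontmatter {
    my ($text) = @_;
    my %meta;
    my $sep = 0;
    for my $line (split /\n/, $text) {
        if ($line =~ /^---\s*$/) {
            last if ++$sep == 2;   # second separator ends the frontmatter
            next;
        }
        next unless $sep == 1;
        # Only quoted scalar values are handled; unquoted values and
        # lists such as tags are skipped.
        $meta{$1} = $2 if $line =~ /^(\w+):\s+"(.*)"\s*$/;
    }
    return %meta;
}

my %meta = parse_frontmatter(<<'EOT');
---
date: "2021-05-30 20:00:00"
title: "Generate RSS from Markdown"
draft: false
---
Body text.
EOT

print "$meta{title} ($meta{date})\n";
```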

The Perl script mkdwnrss below implements this. As input it takes the blog posts that should be part of the RSS feed, so you will usually "generate" the list of files. Implementing this in PHP would be equally simple.

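The excerpt is built from the first non-empty lines of Markdown after the frontmatter. It takes up to eight lines unconditionally and then keeps appending lines until it is at least 500 characters long.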
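
The excerpt loop can be tried on its own. This sketch uses the same condition as the script; the function name `make_excerpt` is hypothetical.

```perl
use strict;
use warnings;

# Join non-empty body lines into one excerpt string, using the same
# condition as the script. make_excerpt() is a hypothetical name.
sub make_excerpt {
    my (@lines) = @_;
    my ($linecnt, $excerpt) = (0, "");
    for my $line (@lines) {
        next if length($line) == 0;          # skip blank lines
        if ($linecnt++ == 0) {
            $excerpt = $line;                # first line starts the excerpt
        } elsif ($linecnt < 9 || length($excerpt) < 500) {
            $excerpt .= " " . $line;         # keep appending
        } else {
            last;                            # enough lines and enough text
        }
    }
    return $excerpt;
}

# Twelve short lines: all are kept, because the excerpt stays under 500 characters.
my @short = map { "line $_" } 1 .. 12;
print make_excerpt(@short), "\n";

# Twelve 100-character lines: only the first eight are kept, since the
# excerpt is by then already longer than 500 characters.
my @long = map { "x" x 100 } 1 .. 12;
print length(make_excerpt(@long)), "\n";     # 8 * 100 + 7 spaces = 807
```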

#!/bin/perl -W
# Create RSS XML file ("feed") based on Markdown files
#
# Input: List of Markdown files (order of files determines order of <item>)
# Output: RSS (description with the first lines of Markdown as excerpt)
#
# Example:
#      mkdwnrss `find blog/2021 -type f | sort -r`

use strict;

my $dt = localtime();
print <<"EOT";
<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
<channel>
    <title>Elmar Klausmeier's Blog</title>
    <description>Elmar Klausmeier's Blog</description>
    <lastBuildDate>$dt</lastBuildDate>
    <link>https://eklausmeier.goip.de</link>
    <atom:link href="https://eklausmeier.goip.de/feed.xml" rel="self" type="application/rss+xml" />
    <generator>mkdwnrss</generator>

EOT


sub item(@) {
    my $f = $_[0];
    open(F, "<", $f) || die("Cannot open $f: $!");

    my $link = $f;
    $link =~ s/\.md$/\//;
    print "\t<item>\n"
    . "\t\t<link>https://eklausmeier.goip.de/$link</link>\n"
    . "\t\t<guid>https://eklausmeier.goip.de/$link</guid>\n";

    my ($sep,$linecnt,$excerpt) = (0,0,"");
    while (<F>) {
        chomp;
        if (/^\-\-\-$/) { $sep++ ; next; }
        if ($sep == 1) {
            if (/^title:\s+"(.+)"$/) {
                printf("\t\t<title>%s</title>\n",$1);
            } elsif (/^date:\s+"(.+)"$/) {
                printf("\t\t<pubDate>%s</pubDate>\n",$1);
            }
        } elsif ($sep >= 2) {
            next if (length($_) == 0);
            if ($linecnt++ == 0) {
                print "\t\t<description><![CDATA[";
                $excerpt = $_;
            } elsif ($linecnt < 9 || length($excerpt) < 500) {
                $excerpt .= " " . $_;
            } else {
                last;
            }
        }
    }
    print $excerpt . "]]></description>\n" if ($linecnt > 0);
    print "\t</item>\n";

    close(F) || die("Cannot close $f");
}


foreach (@ARGV) {    # plain loop; <@ARGV> would glob and break on names with spaces
    item($_);
}


print "</channel>\n</rss>\n";

Source code for mkdwnrss is on GitHub.

During development I checked whether my RSS looks similar to the RSS feed in WordPress: feed. I also checked Alex Le's blog post on RSS feeds: Create An RSS Feed From Scratch.

Added 08-Jul-2021: When I checked the RSS with the W3C Feed Validation Service, the dates and descriptions were marked as non-compliant. This is now corrected. Checking now gives: Valid RSS
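
RSS 2.0 expects dates in RFC 822 format, while the listing above prints the frontmatter date unchanged. One possible way to convert a frontmatter date is sketched below. This is an assumption about how such a fix could look, not necessarily the change made in mkdwnrss; `rfc822_date` is a hypothetical name, and `%z` is assumed to yield a numeric offset such as `+0200`, as it does on Linux.

```perl
use strict;
use warnings;
use Time::Local qw(timelocal);
use POSIX qw(strftime);

# English names are spelled out so the result does not depend on the locale.
my @DAY = qw(Sun Mon Tue Wed Thu Fri Sat);
my @MON = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);

# Convert "YYYY-MM-DD hh:mm:ss" (local time) into an RFC 822 date.
# rfc822_date() is a hypothetical helper for illustration.
sub rfc822_date {
    my ($stamp) = @_;
    my ($y, $m, $d, $H, $M, $S) =
        $stamp =~ /^(\d{4})-(\d\d)-(\d\d) (\d\d):(\d\d):(\d\d)$/
        or die "Unexpected date format: $stamp";
    my $epoch = timelocal($S, $M, $H, $d, $m - 1, $y);
    my @t     = localtime($epoch);
    my $zone  = strftime("%z", @t);          # numeric UTC offset
    return sprintf("%s, %02d %s %04d %02d:%02d:%02d %s",
        $DAY[$t[6]], $d, $MON[$m - 1], $y, $H, $M, $S, $zone);
}

print rfc822_date("2021-05-30 20:00:00"), "\n";
```

The time zone part of the output depends on the machine's local time zone. The same conversion would apply to `lastBuildDate` in the header.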

Added 17-Jan-2022: Also see Generate RSS from HTML.

Added 15-Aug-2022: Since today Simplified Saaze can generate RSS directly with the -r flag. The PHP code used for that is described here: RSS XML Feed.