{"id":163,"date":"2023-12-18T23:28:53","date_gmt":"2023-12-18T22:28:53","guid":{"rendered":"https:\/\/www.softwolves.com\/wolfblog\/?p=163"},"modified":"2024-03-23T21:37:48","modified_gmt":"2024-03-23T20:37:48","slug":"adding-a-mastodon-feed-to-a-static-html-site","status":"publish","type":"post","link":"https:\/\/www.softwolves.com\/wolfblog\/2023\/12\/18\/adding-a-mastodon-feed-to-a-static-html-site\/","title":{"rendered":"Adding a Mastodon feed to a static HTML site"},"content":{"rendered":"\n<p>I do not update my private web page that often, and while I do post in <em>this<\/em> blog every now and then, it can be several months without activity. I have links to the latest few posts from my private web site, using the <a href=\"https:\/\/www.softwolves.com\/wolfblog\/feed\/\">RSS feed<\/a> and a script that converts it to HTML, but as I am more active in other places, I wanted to include that information as well.<\/p>\n\n\n\n<p>I did have a limited presence on Twitter, but <a href=\"https:\/\/edition.cnn.com\/2023\/11\/17\/business\/elon-musk-reveals-his-actual-truth\/index.html\">when it turned all nazi<\/a> last year, I, and many others, left. A number of the people I follow left for Mastodon, the decentralized &#8220;Twitter alternative&#8221;, including <a href=\"https:\/\/universeodon.com\/@georgetakei\">George Takei<\/a> and <a href=\"https:\/\/wandering.shop\/@cstross\">Charles Stross<\/a>, and I <a href=\"https:\/\/social.vivaldi.net\/@nafmo\">did so too<\/a>. I had been toying with the idea of including the last few &#8220;toots&#8221; on my website for a while, but never got around to it.<\/p>\n\n\n\n<p>At first I looked at using an easy solution like <a href=\"https:\/\/sampsyo.github.io\/emfed\/\">emfed<\/a>, which adds some scripting that downloads and displays the posts in the page. 
But as I mentioned in the <a href=\"https:\/\/www.softwolves.com\/wolfblog\/2023\/12\/18\/moving-a-website-to-a-new-host\/\" data-type=\"post\" data-id=\"160\">previous post<\/a>, my website is old-school with static HTML, so I wanted something that matched that, including the content as text and links, not as singing and dancing stuff, so I wrote my own. I ended up with a shell script that downloads the last few posts, and a Python script that converts the posts into an HTML snippet that I can include in the HTML using Apache <a href=\"https:\/\/httpd.apache.org\/docs\/current\/howto\/ssi.html\">server-side includes<\/a>.<\/p>\n\n\n\n<p>Everything in the Mastodon API is JSON, which is the hype nowadays (I&#8217;m old enough to remember when XML was new and all the hype, so I don&#8217;t think JSON is the solution to everything, either, but it does the job). To parse JSON in my shell script I found <a href=\"https:\/\/github.com\/jqlang\/jq\">jq<\/a>, which was already installed on my hosting service and packaged for all systems I am running. While I know my user ID doesn&#8217;t change, I made the script resilient to that by first looking up the user ID and then downloading the feed:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: bash; auto-links: false; title: ; notranslate\" title=\"\">\n#!\/bin\/sh\nSERVER=mastodon.example.com\nUSERNAME=myusername\nMAX=10\nUSERID=&quot;$(curl --silent &quot;https:\/\/$SERVER\/api\/v1\/accounts\/lookup?acct=$USERNAME&quot; | jq -r .id)&quot;\n# jq prints &quot;null&quot; when the response has no id field (e.g. an error reply)\nif &#x5B; -z &quot;$USERID&quot; ] || &#x5B; &quot;$USERID&quot; = &quot;null&quot; ]; then\n\techo &quot;Failed getting user ID&quot; 1&gt;&amp;2\n\texit 1\nfi\ncurl --silent -o output.json &quot;https:\/\/$SERVER\/api\/v1\/accounts\/$USERID\/statuses?limit=$MAX&quot;\n<\/pre><\/div>\n\n\n<p>This script writes the file as output.json, which I then feed into a simple Python script that reads the latest (i.e. first in the file) posts and writes a short HTML snippet that I can include. 
Since toots do not have headings like blog posts, there&#8217;s no clean markup-free text that can be copied to the page; everything is provided as HTML, so I have added some code to strip the markup and just give me the text. I also completely ignore any attachments, so you have to click through to see media yourself:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; auto-links: false; title: ; notranslate\" title=\"\">\n#!\/usr\/bin\/python3\n\nimport sys\nimport json\nfrom io import StringIO\nfrom html.parser import HTMLParser\n\n# https:\/\/stackoverflow.com\/a\/925630\nclass MLStripper(HTMLParser):\n    def __init__(self):\n        super().__init__()\n        self.reset()\n        self.strict = False\n        self.convert_charrefs = True\n        self.text = StringIO()\n    def handle_data(self, d):\n        self.text.write(d)\n    def get_data(self):\n        return self.text.getvalue()\n\n# Strip markup from HTML input\ndef strip_tags(html):\n    s = MLStripper()\n    s.feed(html)\n    return s.get_data()\n\ndef latest(file, url):\n    &quot;&quot;&quot;Fetch entries from JSON and print them&quot;&quot;&quot;\n    # Slurp JSON\n    try:\n        with open(file, &#039;rb&#039;) as jsondata:\n            data = json.load(jsondata)\n    except (OSError, ValueError):\n        # Missing, unreadable or malformed file: print nothing\n        return\n\n    # Output headers\n    print(&quot;&lt;ul&gt;&quot;)\n\n    # Print the latest five\n    num = 0\n    for item in data:\n        # Hide sensitive and unlisted toots\n        if item&#x5B;&#039;sensitive&#039;]:\n            continue\n        if item&#x5B;&#039;visibility&#039;] == &#039;unlisted&#039;:\n            continue\n        # Extract information\n        link = url + &#039;\/&#039; + item&#x5B;&#039;id&#039;]\n        date = item&#x5B;&#039;created_at&#039;]\n        is_reblog = &#039;reblog&#039; in item and item&#x5B;&#039;reblog&#039;] is not None\n        is_reply = &#039;in_reply_to_id&#039; 
in item and item&#x5B;&#039;in_reply_to_id&#039;] is not None\n        html = item&#x5B;&#039;content&#039;]\n        text = &#039;Toot&#039;\n        if is_reply:\n            text = &#039;Reply in thread&#039;\n        if is_reblog:\n            text = &#039;Boost @&#039; + item&#x5B;&#039;reblog&#039;]&#x5B;&#039;account&#039;]&#x5B;&#039;acct&#039;]\n            html = item&#x5B;&#039;reblog&#039;]&#x5B;&#039;content&#039;]\n\n        content = strip_tags(html.replace(&#039;&lt;\/p&gt;&lt;p&gt;&#039;, &#039;&lt;\/p&gt;\\n&lt;p&gt;&#039;)).replace(&#039;\\n&#039;, &#039;&lt;br&gt;&#039;)\n\n        # Truncate date to YYYY-MM-DD\n        datestr = date&#x5B;0:10]\n        outhtml = &#039; &lt;li&gt;&lt;a href=&quot;%s&quot;&gt;%s&lt;\/a&gt; (%s):&lt;br&gt;%s&lt;\/li&gt;&#039; % (link, text, datestr, content)\n        print(outhtml.encode(&#039;ascii&#039;, &#039;xmlcharrefreplace&#039;).decode(&#039;utf-8&#039;))\n        num += 1\n        if num == 5:\n            break\n\n    print(&quot;&lt;\/ul&gt;&quot;)\n\n# Only run when called with the argument &quot;output&quot;\nif len(sys.argv) &gt; 1 and sys.argv&#x5B;1] == &#039;output&#039;:\n    latest(&#039;output.json&#039;, &#039;https:\/\/mastodon.example.com\/@myusername&#039;)\n<\/pre><\/div>\n\n\n<p>In both scripts, replace &#8220;mastodon.example.com&#8221; with the actual host name of your instance and &#8220;myusername&#8221; with your handle.<\/p>\n\n\n\n<p>All scripts are included without any warranties. If it breaks, you get to keep both parts.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I do not update my private web page that often, and while I do post in this blog every now and then, it can be several months without activity. 
I have links to the latest few posts from my private web site, using the RSS feed and a script that converts it to HTML, but [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[28,59],"tags":[61,63,62,60],"class_list":["post-163","post","type-post","status-publish","format-standard","hentry","category-internet","category-mastodon","tag-bash","tag-html","tag-mastodon","tag-python"],"_links":{"self":[{"href":"https:\/\/www.softwolves.com\/wolfblog\/wp-json\/wp\/v2\/posts\/163","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.softwolves.com\/wolfblog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.softwolves.com\/wolfblog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.softwolves.com\/wolfblog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.softwolves.com\/wolfblog\/wp-json\/wp\/v2\/comments?post=163"}],"version-history":[{"count":5,"href":"https:\/\/www.softwolves.com\/wolfblog\/wp-json\/wp\/v2\/posts\/163\/revisions"}],"predecessor-version":[{"id":197,"href":"https:\/\/www.softwolves.com\/wolfblog\/wp-json\/wp\/v2\/posts\/163\/revisions\/197"}],"wp:attachment":[{"href":"https:\/\/www.softwolves.com\/wolfblog\/wp-json\/wp\/v2\/media?parent=163"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.softwolves.com\/wolfblog\/wp-json\/wp\/v2\/categories?post=163"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.softwolves.com\/wolfblog\/wp-json\/wp\/v2\/tags?post=163"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}