Until now, Astro hasn’t had a built-in way to dump image links in straight Markdown content files and have Astro generate optimized images and responsive HTML for them. This caused me a problem, which I partially solved by using MDX instead of Markdown for blog posts, and importing and calling Astro Image inside the MDX post files. This SOUNDS great, because this is the whole purpose of MDX, in MDX’s own words:
MDX allows you to use JSX in your markdown content. You can import components, such as interactive charts or alerts, and embed them within your content. This makes writing long-form content with components a blast. 🚀
There are, however, a couple of problems with this. One of them I’ve spoken about on this site: MDX makes it very hard to generate full-content RSS feeds with Astro (part 1 and part 2 of that saga here and here).
Also, using the Astro Image component directly in my content means mixing writing and implementation details, something I strongly dislike. When I’m writing a blog post, I don’t WANT to have to remember Astro Image syntax, and I don’t WANT to have to remember exactly what widths I like to specify and what media-query-ish styling I put in the sizes attribute. I just want to write and to let my system handle all that by itself. That’s what computers are for.
Here’s what it looks like when I want to put an image in one of my posts using MDX as my content file format and the Astro Image component directly inside my blog post:
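It's something along these lines (a hypothetical sketch assuming the @astrojs/image integration; the file names, widths, and sizes values are illustrative, not my real ones):

```mdx
import { Image } from "@astrojs/image/components";
import coverImage from "../../images/posts/cover.png";

<Image
  src={coverImage}
  alt="A description of the image"
  widths={[400, 800, 1200]}
  sizes="(max-width: 800px) 100vw, 800px"
/>
```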
I don’t want to remember that. I never want to think about that at all. I want to put an image link in using standard markdown and have Astro do all that for me.
I have two pieces of good news for you if you’re in the same boat as me:
The wonderful people at Astro are building an Astro Assets integration that can create optimized versions of and responsive img tags for images linked to in Markdown.
In the meantime, you can use the really nice and fully functional Astro Markdown Eleventy Image Astro component by CJ Ohanaja. As you may have guessed, it uses Eleventy Image to do the work of intercepting Markdown image links and replacing them with responsive ones (and generating the responsive images themselves, of course).
The Astro Assets integration loudly proclaims itself as experimental, and that’s not self-deprecation: it won’t build. It runs great in the dev server, but it throws all kinds of wacky errors when you try to build. Still, just using it in dev mode is enough to see the future, and it’s great.
As for Astro Markdown Eleventy Image, it works great in build, but it doesn’t bother to optimize anything in dev mode. That means if you use the browser inspector tools to look at your images while testing in dev mode, you’ll see gigantic original file sizes. You’ll have to build and run preview to serve up the built pages locally to see its handiwork.
But the good news is, you can quit or never start using MDX right this minute, and you can still have optimized images from Markdown image links with Astro.
By the way, in case you’ve forgotten my RSS story at the start of this, now that I’m using straight Markdown files for my posts again, I can just straight up go back to using Astro RSS and generate an RSS feed with full post content, and not have to do my hacky custom nonsense anymore.
That’s such good news for me, because that hack only generated the RSS file in dev mode, so every time I did a build I had to copy the RSS.xml to the dist folder, AND remember to change all the link prefixes from http://localhost:3000 to https://scottwillsey.com.
Another annoying implementation detail I never want to think about again, vanquished!
You’ll notice I said transcript functionality. I’m weasel-wording it a bit there because now I need to generate transcripts for all the episodes. So far I have them for episodes 1, 22, and 26 (the current podcast episode as of this writing).
On the Friends with Brews homepage, click the Transcripts link under the podcast description paragraph, and you’ll see a list of available transcripts.
In addition, any episode with an available transcript will show a transcript link under its episode title.
The transcript pages themselves have links to the episode page, to the transcripts index, and to the episodes index, in addition to an episode description followed by the transcript.
I still have more work to do on this feature. I plan to make the raw transcripts downloadable, and also to integrate them into the RSS feed with srt formatting, at the suggestion of John Chidgey.
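For reference, srt is just a series of numbered cues with timecodes. A transcript chunk would look something like this (the timings and text here are invented for illustration):

```
1
00:00:00,000 --> 00:00:04,500
Hello, and welcome to Friends with Brews.

2
00:00:04,500 --> 00:00:09,000
Today we're talking about transcripts.
```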
One thing you’ll notice right away is that the transcripts are not perfect. I haven’t done any A/B testing yet, but I think the transcript separates speaker transitions better if I only output the transcript as a raw text file and don’t simultaneously output the srt file and the raw text file. At any rate, Whisper.cpp doesn’t know about individual speakers, so there are no names showing who is saying what.
Also, Whisper gets some things wrong, and there will occasionally be some confusing text that doesn’t exactly match what we were saying at that point. Overall, though, I think they’re pretty good, and at least you can search the site and find which episode contained a specific mention of something. Again, it’s not perfect: if you search for Shaquille O’Neal (mentioned in episode 1), you won’t find him, because the transcript butchered the spelling of his name and I didn’t fix every typo that Whisper.cpp made.
Still, I think having transcripts, even imperfect ones, is a net gain for the site and the podcast. It adds more work for me as I have to generate them and then clean them up, but now that I have the functionality built into my Astro code, getting newly generated transcripts published is a snap.
I’ve written so much about images and image optimization and yet the reality is I still have no clue exactly how it works.
Case in point: I installed Christian Ohanaja’s Astro Remark Eleventy Image plugin to parse my Friends with Brews show notes markdown files and replace any markdown images with responsive images (it both generates the image sources and creates the responsive HTML, as with any real image optimizer).
In the version I installed at the time, I immediately found that because the large source image’s width and height were included in the img element’s width and height properties, the browser ignored the size directives in the sources, and displayed the image at the x and y dimensions specified in the img tag.
Kind of defeats the point of size directives.
This is NOT an issue with Astro Remark Eleventy Image. In fact Christian now allows custom HTML markup to override this. This happens with any Picture element that includes an img tag with width and height properties. It doesn’t matter if it’s handwritten, generated by this plugin, generated directly using eleventy-img, or generated using some other image optimization plugin or scheme.
The biggest issue with NOT including them, so that the browser respects the size directives instead, is that now you’re subject to Cumulative Layout Shift (CLS), because the browser doesn’t know in advance how large the image will be.
If anyone knows of a way to use Picture element sizes without overriding them unintentionally with img height and width but still managing to avoid CLS, I’d love to hear more about it. Tell me!
I’ve written a bunch of words on this site about programming stuff in Astro, but there are bunches of other things that can be scripted too. Literal Bunches in fact – enter Bunch, a Mac automation app for launching apps and running commands with just a click. It’s written by Brett Terpstra, which is a name any Mac automation geek will know.
Bunch works as a menubar app that lists your Bunches. Click on a Bunch in the list, and it executes whatever is inside that Bunch, be it names of apps to launch or to close, or commands that can include system tasks, AppleScripts, Automator workflows, or even Bash scripts.
By default, Bunches are toggles – the first time you click on a Bunch name in the menubar list, the Bunch opens. Any apps or commands that are set to open or run do so. The next time you click the Bunch in the menubar list, it does the reverse. It closes any apps that are not explicitly set to remain open when the Bunch is toggled off (or “closed”, in Bunch parlance). It also runs any commands you have set up specifically to run when the Bunch is closed.
Talking about it in the abstract isn’t super helpful. So here’s a podcast Bunch of mine! Please note that I’m still not super fluent in Bunch, and this is void where prohibited, etc, etc, etc.
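To give you a feel for the format before I break it down, here’s a stripped-down illustration along the lines of my Bunch. This is NOT my actual file: the device names are placeholders, the Finder and On Close sections are omitted, and Bunch’s documentation is the authority on exact syntax.

```
---
title: Podcasting
---

call_app = ?[FaceTime, Discord, Skype, Zoom] "Which calling app?"

Farrago
${call_app}

%Safari
- https://docs.google.com/

Audio Hijack^

(audio output AirPods Pro)
(audio input Loopback Audio)
```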
That’s a lot. Here’s how it works:
This top section is frontmatter and just determines what this Bunch is called in the menubar.
call_app = ?[FaceTime, Discord, Skype, Zoom] "Which calling app?" pops up a dialog box with a menu to choose which app I’m talking to cohosts on. Truthfully, it’s going to almost always be FaceTime for Friends with Brews, but other people on other podcasts use different ones. Slight aside: I’m a firm believer in podcasters always recording their own ends locally and the editor using all the original (better sounding) tracks, but not everyone does this.
The next couple lines open my soundboard app Farrago and then whichever communication app I selected from the menu mentioned above.
These two lines open Safari and then load Google Docs, which we use for show notes. The %Safari notation with the percent sign means that when I close the Bunch, Safari is not closed along with the other apps in the Bunch, but stays open.
This section opens a Finder window and opens tabs for me with some podcast-related file locations.
Audio Hijack^ just opens Audio Hijack and makes it the active (focused) program.
These illustrate one of Bunch’s coolest features, the ability to call system level commands. These lines do just what they look like: They set my Mac to output audio through my AirPods Pro and to use my virtual Loopback device that combines my mic and my soundboard as my audio in. This means I can set FaceTime or Zoom (or whatever app we’re talking on) to use this as its audio input, and my cohosts can hear whatever I play on the soundboard.
I’m going to cover the rest of this Bunch all at once.
Basically the first line of this says “hey, when this Bunch is closed (toggled off), run the #On Close snippet”. The On Close snippet is in a special section at the bottom that is reserved for any snippets or snippet fragments you want to include.
My On Close snippet just runs a shell script located in the same folder as the Bunch to see if I’m connected to my Studio Display or not, and if I am, sets the output back to the Studio Display speakers. Otherwise, it sets the output to the Mac’s internal speakers.
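The script itself isn’t shown here, but the logic is simple enough to sketch. This hypothetical version (not my actual script) assumes the SwitchAudioSource utility from the switchaudio-osx Homebrew package, and the speaker device names are placeholders:

```shell
#!/bin/bash
# Hypothetical reconstruction: choose speakers based on whether
# a Studio Display is currently connected.
if system_profiler SPDisplaysDataType | grep -q "Studio Display"; then
  SwitchAudioSource -s "Studio Display Speakers"
else
  SwitchAudioSource -s "Internal Speakers"  # device name varies by Mac
fi
```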
Because this only runs when the Bunch is closed, meaning I’m done podcasting, this is exactly what I want.
This looks confusing, and I’m not going to lie – it took me a while to get this working the way I wanted. Part of my issue was that I didn’t understand how Bunches work by default, and I thought I had to make a “Start Podcasting” Bunch and a “Stop Podcasting” Bunch, not realizing that it was set to toggle and that just by choosing “Podcasting” again it would close any apps I didn’t explicitly say not to close. The rest of it was just learning the syntax. Fortunately, Brett has written excellent documentation for Bunch.
The fact that you can use conditional logic, use the output from shell scripts, set system settings, and do so many other things makes this a super flexible, powerful automation tool for the Mac. I used to open all these programs and set my audio settings manually, and now it’s that many fewer clicks every time I want to podcast.
By the way, Brett has many more amazing utilities. Check out Gather CLI, for example, which lets you fetch the contents of a web page and have them converted to markdown syntax. It’s amazing and it’s perfect for doing things like saving information to Obsidian.
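Typical usage is a one-liner (the URL and output filename here are just examples):

```shell
gather https://example.com/some-article > some-article.md
```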
I’m glad he did too, because looking at the remark part of the code, I feel confusion more than anything. I guess I have another rabbit hole to pop down to learn about THAT.
Anyway, being the true jerk that I am, instead of just being grateful and using his plugin, I forked it to add some additional options (such as the ability for the inline image to link to the high resolution version) and to add an Astro component for image optimization. That way I can have one plugin that provides image optimization in both markdown files and inside Astro components.
I’m assuming that I can technically combine a plugin and an Astro component in one project. I actually have no idea, but I’ll find out.
In what feels like a lifetime ago, I had a podcast called Pocket Sized Podcast, talking about iOS apps and devices, mostly. At some point, for reasons I can’t even begin to recall, I joined up with a fledgling podcast network called Fiat Lux, which was later rebranded Constellation by the two fairly angry guys running it. The whole thing was a giant fiasco full of insane stories, but it’s relevant to me now because podcast transcription is having a moment.
Fiat Lux/Constellation decided that the core feature of the podcast network would be incredibly detailed show notes on all podcasts. Unfortunately they had some really bad ideas about exactly what those show notes should be like.1 None of us wanted anything to do with their plan, mainly because of how they presented it and the amount of shouting involved in their attempts to convince us.
If you’re going to try to herd cats, you’d better be a cat person is what I’m saying.
But the idea of making podcast episodes available in text IS a good idea, and several very popular podcasters I know of are looking at all kinds of options for creating good transcripts without spending hours and hours on them.
There are several paid and soon-to-be-paid options such as Adobe Podcast (currently in beta, pricing to be determined), and Otter. But the completely free option that got my attention is a Mac port of OpenAI’s Whisper, called Whisper.cpp.
Whisper runs locally on your own machine, and Whisper.cpp jettisons the Python runtime for C and C++, which has obvious positive performance implications. Better yet, it’s even optimized for Apple Silicon.
I heard about Whisper.cpp while listening to Rebuild from Tatsuhiko Miyagawa, a very enjoyable Japanese language tech podcast. It’s actually one of my favorite podcasts in any language. Anyway, at the time Miyagawa-san was experimenting with Whisper.cpp on his new Apple Silicon Mac, and I filed that information away in my brain, figuring it would be some time before I got an Apple Silicon Mac of my own. It was, but now I have, and so I recently jumped into performing Whisper.cpp experiments of my own.2
Whisper.cpp has several models you can download, depending on what kind of quality vs time tradeoffs you want to make. I’ve tested Whisper.cpp on Friends with Brews episodes using the ggml-base.en.bin, ggml-medium.en.bin, and ggml-large.bin models, with interestingly varying results.
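For reference, downloading a model and transcribing an episode looks roughly like this. The commands follow the Whisper.cpp README (including the conversion to 16 kHz WAV that it expects); the episode filename is made up:

```shell
# fetch a model
bash ./models/download-ggml-model.sh medium.en

# Whisper.cpp wants 16 kHz mono WAV input
ffmpeg -i episode-21.mp3 -ar 16000 -ac 1 -c:a pcm_s16le episode-21.wav

# transcribe, writing both .txt and .srt output
./main -m models/ggml-medium.en.bin -f episode-21.wav -otxt -osrt
```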
The first thing I found is that the base model is FAST. I transcribed a 50 minute podcast episode in about a minute with decent results. I had to fix a few names and technical terms, but otherwise it was quite good.
I couldn’t tell a huge difference between the medium and large models with the two particular episodes I experimented with, but the time difference between all three models was noticeable. I tested all three models on Friends with Brews episode 21, which is 45 minutes and 29 seconds long and roughly 24 MB in size.
Even with the large model, transcribing a podcast at 2x speed is pretty good.
The end result is that I think it’s worth going with the medium or large models. It’ll cost you disk space – the base English model is 142MB, the medium English model is 1.5GB, and the large model is 2.9GB. But I think it’s worth it in terms of results.
You may have to do some testing to decide between the medium and large though, even if you’re convinced that the base model isn’t the way to go. Generally I think I like the large model results better, but there are some instances where the medium transcribed something more accurately.
Personally I’m using the large model, but that’s because I’m actually using yet another port, which I’ll talk about in another post very soon.
By the way, if you want to hear the Fiat Lux/Constellation stories, just slip @Vichudson1@appdot.net a nice glass of whiskey.3 He has a much better memory than I do about pretty much anything in the past, and especially about the saga of the world’s unhappiest podcast network.
Footnotes
Including wanting markdown format in Google Docs specifically as opposed to any plaintext document format (like, I don’t know, .md?), but whatever. ↩
Whisper.cpp actually runs on Intel Silicon too, but I didn’t realize it at the time. But my late 2015 iMac probably would have barfed up a lung on it anyway. ↩
Fair warning, he’ll probably try to get you to buy him more than one. ↩
Step 1 of the Great Show Note Images odyssey is generating optimized versions of any images to be included in episode show notes. Figuring out which images those are is easy – I have a directory named src/images/episodes, and I’ll just dump my images in there.
From there it’s a matter of reading all the files in the directory, generating the desired sizes, and dumping them in public/images/episodes, which in the published site will be located at /images/episodes.
Because I’m not doing this inside an Astro file with pre-imported or pre-linked images, I can’t use the Astro Image component like I do for all the other images on the Friends with Brews website. I need something I can call from a JavaScript function. Fortunately, as I noted last time, I can use the eleventy-img plugin this way. Ben Holmes details how in his Picture perfect image optimization for any web framework article.
If you look at section 4 of his post, Using 11ty image with any framework, you can see a script Ben wrote to look in a directory and generate optimized images for each image file in the directory in the specified widths and formats using the Node package @11ty/eleventy-img.
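The gist of that approach looks something like this. This is a from-memory sketch rather than Ben’s exact code, and the widths and formats are placeholders, not necessarily what anyone actually uses:

```javascript
import Image from "@11ty/eleventy-img";
import { readdir } from "node:fs/promises";
import path from "node:path";

const inputDir = "src/images/episodes";

// generate resized AVIF/WebP/JPEG copies of every image in the
// input directory, writing them to public/images/episodes
for (const file of await readdir(inputDir)) {
  await Image(path.join(inputDir, file), {
    widths: ["auto", 600, 1000, 1400], // placeholder widths
    formats: ["avif", "webp", "jpeg"],
    outputDir: "public/images/episodes",
    urlPath: "/images/episodes",
  });
}
```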
I modified the script a little bit as shown below. I would have included it as a code block instead of an image, but for some reason it triggers modsecurity on my server and blocks the IP of whoever tries to load this page. Not exactly ideal.
If I run this script with the following images in src/images/episodes
the script generates the following images in public/images/episodes:
This is good news. First of all, Ben did all of my work for me. Second, I can generate optimized images without having to know anything about them in advance. Now I just have the very little problem of replacing image links in my episode markdown file with picture elements that contain the sources for the different file types and srcsets for each of the different image sizes.
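A picture element of that sort looks roughly like this (the file names, widths, and sizes values are invented for illustration):

```html
<picture>
  <source
    type="image/avif"
    srcset="/images/episodes/pic-600.avif 600w, /images/episodes/pic-1000.avif 1000w"
    sizes="(max-width: 600px) 100vw, 600px" />
  <source
    type="image/webp"
    srcset="/images/episodes/pic-600.webp 600w, /images/episodes/pic-1000.webp 1000w"
    sizes="(max-width: 600px) 100vw, 600px" />
  <img src="/images/episodes/pic-600.jpeg" alt="Episode image" loading="lazy" />
</picture>
```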
By the way, check out the file size differences on the optimized versions versus the originals in those file listings! Even the full width and height optimized images are quite a bit smaller in terms of file size than the originals, and the smaller ones are minute compared to the images I started with.
A couple of things I’d like to note about working with eleventy-img here:
Eleventy-img doesn’t stupidly try to generate image sizes larger than the original. If you specify widths: ["auto", 600, 1000, 1400] and one of the images is only 677 pixels wide, it will only generate the 600 and 677 pixel width versions (the “auto” option tells eleventy-img to also make a copy at the original size).
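That no-upscale behavior is easy to model. Here’s a tiny hypothetical re-implementation of the width selection rule (not eleventy-img’s actual source), just to make it concrete:

```javascript
// Hypothetical model of eleventy-img's width selection: "auto" means
// the original width, and requested widths larger than the original
// are dropped rather than upscaled.
function selectWidths(requested, originalWidth) {
  const resolved = requested.map((w) => (w === "auto" ? originalWidth : w));
  const noUpscale = resolved.filter((w) => w <= originalWidth);
  // de-duplicate and sort ascending
  return [...new Set(noUpscale)].sort((a, b) => a - b);
}

console.log(selectWidths(["auto", 600, 1000, 1400], 677)); // [ 600, 677 ]
```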
As you must have guessed by now, even though eleventy-img was developed as a plugin for the Eleventy SSG framework, it’s just JavaScript and can be installed with npm and used with any other Node.js compatible framework. That includes Astro.
Next time I write anything on this site, it may be completely incoherent depending on what Rube Goldberg mechanism I come up with for getting the responsive html for images into my show notes markdown files in the correct location. My writing workflow allows for a few different possibilities since I already know in advance the default width I want these images. Maybe next time I’ll detail that workflow and what options I think might be available to me, and then we can get into implementation.