Regular Expressions and Shortcuts, Part 1

Today as I worked on finishing up a major revision to my Blog Post Publish shortcut, it occurred to me that, in a relatively short period of time, I went from struggling to understand regular expressions in general, and how regular expressions worked in Shortcuts in particular, to using them all the time in my shortcuts.

I decided I’d start a series about regular expressions and how to use them productively with the Shortcuts app, not because I’m a genius with regular expressions, but to show that anyone can learn them and that they are indeed useful and powerful when used for creating shortcuts.

Regular Expressions

The term “regular expressions” sounds a little odd, but basically regular expressions are just patterns used for searching text. They are extremely useful for things like extracting specific bits of information from text, for replacing specific things in text, or for validating text input for an app or web page.

To be more precise, regular expressions let you find sequences of characters that fit a specific pattern, even if you don’t know in advance exactly what the matching text will be.

Here’s a pretty simple example of what I’m talking about: this website has a page of links to sites and pages I like. There are several categories of links, each with an icon, a category name, and links applicable to the category.

The information for each category is contained in its own JSON file. The JSON files live on the server under the /hugo-files/data/links directory path.

I also use the name of the file as the name of the link category. For example, the links for the section titled “Apple” are in a file called “apple.json”. This means that at some point, I need to grab the name of the file without the extension.

In order to update these .json files to add new links to a section, I use a shortcut called Publish Links. One of the steps in the shortcut lists the contents of hugo-files/data/links/ on the server and returns the results. It returns the results with the full file paths, because of how the third-party app I use from within Shortcuts to connect to my server lists directory contents:

Shortcut steps for listing my link JSON files directory


The output from shortcut steps listing my link JSON files directory


But that’s ok. Given the full file paths as shown above, I can extract the part of the file names I want pretty easily with the following regular expression:

^(?:.+\/){1,}(.+)\.json$

That weird looking string does the following:

  • Looks for a line of text starting with one or more characters followed by a forward slash, repeated one or more times (meaning hugo-files/ and hugo-files/data/links/ will both match, for example).
  • Looks for one or more characters after that, followed by .json at the end of the line of text.
  • Captures the characters between the last forward slash and the .json file extension (or in other words, the name of the file without the extension).
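If you want to try the pattern outside of Shortcuts, here’s a minimal sketch in Python. It is only an approximation of what happens in the shortcut: Python’s re module is a different regex engine from the one Shortcuts uses, the re.MULTILINE flag is there so that ^ and $ apply to each line of the listing, and the blogs.json and tools.json paths are made-up stand-ins for a real directory listing (only apple.json comes from my actual site).

```python
import re

# The pattern from the post, as a raw string so the backslashes reach the
# regex engine unchanged. re.MULTILINE makes ^ and $ match at each line.
PATTERN = re.compile(r"^(?:.+\/){1,}(.+)\.json$", re.MULTILINE)

# Hypothetical directory listing, one full path per line.
# Only apple.json is named in the post; the other two are invented.
listing = "\n".join([
    "hugo-files/data/links/apple.json",
    "hugo-files/data/links/blogs.json",
    "hugo-files/data/links/tools.json",
])


def category_names(text):
    """Return the captured file name (no directories, no .json) for each matching line."""
    # With exactly one capture group, findall returns a list of that group's text.
    return PATTERN.findall(text)


names = category_names(listing)
print(names)  # ['apple', 'blogs', 'tools']
```

Note that a bare file name like apple.json with no directory in front of it would not match, because the pattern requires at least one forward slash.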

A picture is worth 1000 words (or special characters):

A regular expression for capturing file names from a file path

Each of those lines of text in the lower text box is a match to the regular expression (so they are highlighted in dark blue), but only the file names excluding the file extensions are captured in a capture group (so they have a red underline).

What this means is that while each of those entire lines matches my regular expression pattern, the file names without the .json extensions are put into a special group known as a capture group for me to use without having to strip away the directory names or file extensions.
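The difference between the whole match and the capture group can be seen in a short Python sketch. Again, this is Python’s re module rather than Shortcuts, so treat it as an illustration of the idea and not of the exact Shortcuts actions.

```python
import re

# Same pattern as in the post, applied to a single path.
pattern = re.compile(r"^(?:.+\/){1,}(.+)\.json$")

path = "hugo-files/data/links/apple.json"
m = pattern.match(path)

whole_match = m.group(0)  # everything the pattern matched: the entire line
captured = m.group(1)     # only the first capture group: the bare file name

print(whole_match)  # hugo-files/data/links/apple.json
print(captured)     # apple
```

Group 0 is always the full match. The (?:...) part around the directories is a non-capturing group, which is why the file name ends up as group 1.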

In the next part of this series, I’ll show you how this regular expression works!