SIGPIPE 13

Programming, automation, algorithms, macOS, and more.

UTI Problems

I was excited to use the “new” Uniform Type Identifiers (UTIs), but excitement turned to confusion and a bit of disappointment. I will share my findings in this article.

Problems With Declaring Supported UTIs

Let’s start with the clearest advantage of using UTIs:

[…] if your application can open any type of plain text document, then you simply specify public.plain-text […] within your application’s Info.plist.

So I did that, and while I can drag types conforming to public.plain-text to my application, there are two problems:

  1. If I bring up the action menu for test.cc, my application is not listed in the Open With submenu. It is there for test.txt, so my application is shown when my declared type exactly matches the file’s type, but not when the file’s type merely conforms to my declared type.
  2. If I drag test.cc to the application icon in the Dock, I am not allowed to drop the file there, although test.txt works fine. This is the same problem as above.

I am almost certain that the second problem is a bug, and I hope the first one is a bug as well.
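To make the distinction concrete, here is a minimal, portable sketch of exact matching versus conformance. It does not call Launch Services. The kHierarchy table is a small hand-written excerpt of the public type hierarchy (an assumption made for illustration), and sketch_conforms_to and sketch_exact_match are hypothetical helper names.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* One "type conforms to parent" edge. A type may appear in several rows
   because UTIs allow multiple inheritance. */
struct conformance { const char *type; const char *parent; };

/* Hand-written excerpt of the hierarchy, for illustration only. */
static const struct conformance kHierarchy[] = {
    { "public.c-plus-plus-source", "public.source-code" },
    { "public.source-code",        "public.plain-text"  },
    { "public.plain-text",         "public.text"        },
    { "public.text",               "public.data"        },
    { "public.text",               "public.content"     },
};

/* True if `type` equals `target` or reaches it by following parent edges. */
bool sketch_conforms_to(const char *type, const char *target)
{
    if(strcmp(type, target) == 0)
        return true;

    for(size_t i = 0; i < sizeof(kHierarchy) / sizeof(kHierarchy[0]); ++i)
    {
        if(strcmp(kHierarchy[i].type, type) == 0
            && sketch_conforms_to(kHierarchy[i].parent, target))
            return true;
    }
    return false;
}

/* The behavior observed for Open With and Dock drops: identity only. */
bool sketch_exact_match(const char *type, const char *declared)
{
    return strcmp(type, declared) == 0;
}
```

With this table, sketch_conforms_to("public.c-plus-plus-source", "public.plain-text") is true while sketch_exact_match on the same pair is false, which mirrors the test.cc versus test.txt behavior described above.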

Problems With Dynamic UTIs

Not all extensions are registered. Apple’s solution to this problem is to create a UTI in the dyn namespace and provide an API to get a UTI from a file extension. That way, all applications which reference an unregistered extension will refer to the same UTI.

The API function is UTTypeCreatePreferredIdentifierForTag and it takes inConformingToUTI as the third parameter for which the documentation states:

You can pass a UTI in the inConformingToUTI parameter as a hint, in case the given tag appears in more than one UTI declaration. For example, if you know that the filename extension tag is associated with a file, not a directory, you can pass public.data here, which causes the function to ignore any types with the same extension that conform to public.directory. Pass NULL for this parameter if you have no hints.

So let’s get a UTI for the textClipping file extension. I used this code:

CFStringRef textClippingUTI = UTTypeCreatePreferredIdentifierForTag(
    kUTTagClassFilenameExtension,
    CFSTR("textClipping"),
    NULL);

Now to test if a file is a text clipping we do:

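// fsRef is assumed to already refer to the file being tested
CFStringRef fileUTI = NULL;
OSStatus err = LSCopyItemAttribute(&fsRef,
    kLSRolesAll, kLSItemContentType, (CFTypeRef*)&fileUTI);

// only compare when the lookup succeeded
if(err == noErr && UTTypeConformsTo(fileUTI, textClippingUTI))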
    …

Unfortunately it turns out my file does not conform, even though it definitely is a text clipping.

The problem is with how I created the dynamic UTI for the textClipping extension. If I give public.data as the inConformingToUTI parameter, then I get the proper UTI. But as I read the API documentation, this parameter should only matter if there is an ambiguity finding the UTI for the type passed in.

Another bug?

Limitations

When working with files it is possible to provide a better type than just the extension. For example, some files are executable, some are symbolic links, etc.

Unfortunately both a shebang executable (without extension) and a compiled binary are public.unix-executable. It would have been nice if the former were a type that conforms to both public.unix-executable and public.plain-text.
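Until the type system makes that distinction, an application has to inspect the file itself. The following is a minimal sketch of such a check, assuming POSIX stat is available and that “shebang executable” means a regular file with at least one executable bit set whose first two bytes are "#!". The helper name sketch_is_shebang_script is hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: true if `path` is a regular file with an executable
   bit set that starts with the two bytes "#!". */
bool sketch_is_shebang_script(const char *path)
{
    struct stat sb;
    if(stat(path, &sb) != 0 || !S_ISREG(sb.st_mode))
        return false;

    /* Require an executable bit for user, group, or others. */
    if((sb.st_mode & (S_IXUSR | S_IXGRP | S_IXOTH)) == 0)
        return false;

    FILE *fp = fopen(path, "rb");
    if(!fp)
        return false;

    /* Read the first two bytes and compare them with the shebang marker. */
    char magic[2];
    bool res = fread(magic, 1, sizeof(magic), fp) == sizeof(magic)
        && magic[0] == '#' && magic[1] == '!';

    fclose(fp);
    return res;
}
```

A caller could treat files for which this returns true as both executable and plain text, and fall back to the system-provided type otherwise.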

Furthermore, the procedural API to get the UTI for a file requires an FSRef, but creating an FSRef from a symbolic link results in a reference to the pointed-to file, not the symbolic link itself.
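One possible workaround, sketched here under the assumption that a POSIX path is available for the item, is to ask lstat about the path before creating the FSRef. Unlike stat, lstat reports on the link itself rather than the file it points to. The helper name sketch_is_symlink is hypothetical.

```c
/* Request the POSIX declarations (lstat, S_ISLNK) under strict compilers. */
#define _POSIX_C_SOURCE 200809L

#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical helper: true if `path` itself is a symbolic link.
   lstat does not follow the link, so the mode describes the link. */
bool sketch_is_symlink(const char *path)
{
    struct stat sb;
    return lstat(path, &sb) == 0 && S_ISLNK(sb.st_mode);
}
```

If this returns true, the application can handle the item as a link and skip the UTI lookup, which would otherwise describe the pointed-to file.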

Still, having different types for README and relaunch (a script) is nice. Technically this should allow us to associate different applications with the two types (public.data versus public.unix-executable). But using Get Info in Finder to pick another viewer for either file results in an error which states that “not enough info is available”, presumably because Launch Services is still working with extensions.

Type Hierarchy Confusion

Looking at public.url-name we see that this is a base type, i.e. it does not conform to anything. I would think it should conform to public.plain-text, but then I would also think that there should be a type like com.apple.xml-property-list which conforms to both com.apple.property-list and public.xml.

I can sort of follow why Apple decided against the XML property list type, as it would require inspecting the file on disk to know the format and for “in memory” data, a property list is just a property list (i.e. Foundation/CoreFoundation data types).

But public.url-name does not even conform to public.data, yet public.ostype (which technically is an unsigned 32-bit integer) does conform to public.plain-text.

So whatever the reasoning, I find it inconsistent.

URL Oversights

In the public type hierarchy we find only one specialization of public.url, namely public.file-url, so there are no types for http, mailto, ftp, or similar.

Today, when custom URL schemes are common (itms, pcast, feed, torrent, aim, irc, etc.), it is surprising that URLs do not seem to have been considered for this new type system. Including them would allow URL and document types to be declared, and their handlers configured, in a unified way.

Practical Problems

The reason I embraced UTIs is that I want to provide a general content filter system in TextMate. That is, whether the user drops data on TextMate or pastes it from the clipboard, the type of the data (and the current context) should decide the action, and that should of course be configurable by the user.

Different data types combined with different contexts can lead to different actions, for example pasting a color into a CSS file, pasting PHP code into an HTML document, dropping a URL on a Markdown document, etc.

I had hoped that UTIs would bring me closer to the goal of making it completely configurable by the user, but unfortunately the system doesn’t really make it any simpler to expose this level of configurability to the user.

URL Content Filter

Let’s first look at allowing the user to set up a custom action for URL data. The user would specify public.url as the type to handle.

There are a few problems with that:

  1. Pasteboard data from Finder is tagged with public.file-url, so even for regular files, the user action is triggered, which is probably not expected. This even includes dragging a text clipping to the document!
  2. public.url holds a single URL. If we drag a URL from Safari there is also public.url-name, which holds the name of the URL. Many user actions want this data as well.
  3. If dragging multiple URLs, the public.url type only holds the first. WebKit additionally has the WebURLsWithTitlesPboardType pasteboard type, for which there is no (public) UTI, but which holds all the URLs dragged including their titles. That is what the user action wants (if available), yet it cannot be specified as a UTI.
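The three points above suggest an order of precedence when deciding where to read URL data from. The following is a minimal, portable sketch of one such heuristic, not an actual pasteboard API. The function name sketch_pick_url_source, the url_source enum, and the precedence order itself are assumptions made for illustration. The caller is expected to pass in the list of type names found on the pasteboard.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Where a URL action should read its data from (hypothetical). */
enum url_source {
    URL_SOURCE_NONE,        /* no URL data present */
    URL_SOURCE_FILE,        /* a file from Finder, not a "real" URL drop */
    URL_SOURCE_SINGLE,      /* public.url, possibly with public.url-name */
    URL_SOURCE_TITLED_LIST  /* WebURLsWithTitlesPboardType: all URLs and titles */
};

static bool has_type(const char *const *types, size_t count, const char *wanted)
{
    for(size_t i = 0; i < count; ++i)
    {
        if(strcmp(types[i], wanted) == 0)
            return true;
    }
    return false;
}

/* Assumed precedence: file URLs are set aside first, then the richest
   URL representation available wins. */
enum url_source sketch_pick_url_source(const char *const *types, size_t count)
{
    if(has_type(types, count, "public.file-url"))
        return URL_SOURCE_FILE;
    if(has_type(types, count, "WebURLsWithTitlesPboardType"))
        return URL_SOURCE_TITLED_LIST;
    if(has_type(types, count, "public.url"))
        return URL_SOURCE_SINGLE;
    return URL_SOURCE_NONE;
}
```

Note that the second case cannot be expressed by the user as a UTI, which is exactly the limitation described in the third point.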

Image Content Filter

Another likely user action is handling images, e.g. when dragged to an HTML document.

Here the user would set up an action for the public.image type. But as mentioned above, everything dragged from Finder comes as public.file-url, not as the actual file type.

I can understand the problem though, i.e. Finder is delivering a file URL, it is not delivering the content of the file. So it seems that I should look for the NSFilenamesPboardType type and ask for the UTI of each file dragged, then use that to pick the proper user action.

The only problem here is that when dragging a text clipping, web location, or similar, Finder provides the actual content but also provides the path to the meta-file storing this info via NSFilenamesPboardType.

One could give precedence to the UTI over NSFilenamesPboardType, but if one uses Edit → Copy instead of dragging a file, Finder will place the name of the file on the clipboard tagged with the public.plain-text type.

Color Content Filter

There is no UTI for color data…

Closing Thoughts

While my intended use of the system is probably a little different from the goal the designers had in mind, I think the challenges described above apply equally well to other applications. That is, relying solely on the UTI for receiving URL drops is not advisable, and heuristics are required to properly decipher the data from Finder.

The latter problem I think comes from unifying URLs, files, and content without any indication of which is which and tagging data with only one type.

For example, if I have data of type public.file-url then all I know is that the data is a file URL. I can extract the path from the URL and query the system. Let’s say the URL is file:///Users/duff/bin/Markdown.pl. The system will tell me that the file data is public.perl-script, but if the script had instead been called markdown the type would be public.unix-executable, which refers to the file itself rather than the data. This file property was ignored when the file had the .pl extension.

The inability to express which level we are interested in limits the usefulness of the type system when exchanging data. For example, there is no way to register for image files (via drag’n’drop), nor is it possible to register for folders or application bundles, since Finder will deliver these as file URLs.
