SIGPIPE 13
Programming, automation, algorithms, macOS, and more.
Posted by Allan Odgaard
macOS 10.15: Slow by Design
In episode 379 of ATP both Marco Arment and John Siracusa described noticeable delays and stalls after upgrading to macOS 10.15.
I have been struggling with this issue myself and have found several system operations that can cause these delays, which I will detail below.
One way to solve the delays is to disable your internet connection. This is tough medicine, but if you notice these delays, try it for an hour just to verify that indeed the issue is resolved by disabling internet connectivity.
Another way to reduce the delays is by disabling System Integrity Protection. I say reduce, because I still do get some delays even with SIP disabled, but the system does overall feel much faster, and I would strongly recommend anyone who thinks their system is sluggish to do the same.
Posted by Allan Odgaard
Creating a Faster Jekyll
Jekyll is a static site generator which we recently adopted for most of https://macromates.com, motivated by its nice design and large userbase.
We did, however, run into performance issues, so we wrote a replacement which is semi-compatible with Jekyll but with better speed and some additional features.
Posted by Allan Odgaard
Run Command Every Other Week
I run a few things via cron, and some of them need to run at intervals that cannot be expressed in a crontab, for example biweekly or every 8th month.
As a general solution I created the `every` command available here.
The supported usage is:
every [-n number] command [argument ...]
This will run `command` every `number`th time it’s invoked. For example to send an email the third, sixth, ninth, etc. time we call it, use:
every -n3 mail -s"Water the plants" me@example.org <<< "It’s time again!"
Using this in a crontab to remind us every second Wednesday could be done as:
# m h dom mon dow command
00 12 * * wed every -n2 mail -s"Water the plants" me@example.org <<< "It’s time again!"
How it Works
The command uses a guard file written to `$XDG_DATA_HOME/every`. If `XDG_DATA_HOME` is unset then it defaults to `$HOME/.local/share`.
The name of the guard file is derived from the arguments passed to `every` (using sha1) and the content of the guard file is a counter to keep track of how many times we have been called. As a convenience we also write the command to the guard file.
Once the counter reaches the value given via `-n` then `every` will remove the guard file and `exec` your command.
The command is implemented as a bash script and should work on both OS X and GNU/Linux.
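The steps above can be sketched roughly as follows. This is not the actual script: the function name `every_sketch` is made up for illustration, it assumes the hash covers only the command and its arguments (not `-n`), and it runs the command directly instead of using `exec` so that it can live in a function.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the guard-file logic; not the real `every` script.

every_sketch() {
    local number=1
    # Accept both "-n3" and "-n 3".
    if [[ ${1-} == -n ]]; then
        number=$2; shift 2
    elif [[ ${1-} == -n* ]]; then
        number=${1#-n}; shift
    fi

    local dir="${XDG_DATA_HOME:-$HOME/.local/share}/every"
    mkdir -p "$dir"

    # Derive the guard file name from the remaining arguments via SHA-1
    # (assumption: the real script may hash a different set of arguments).
    local hash
    if command -v sha1sum >/dev/null 2>&1; then
        hash=$(printf '%s\0' "$@" | sha1sum | cut -d' ' -f1)
    else
        hash=$(printf '%s\0' "$@" | shasum -a 1 | cut -d' ' -f1)
    fi
    local guard="$dir/$hash"

    # First line of the guard file is the counter, second line the command.
    local count=0
    [[ -f $guard ]] && read -r count < "$guard"
    count=$((count + 1))

    if (( count >= number )); then
        # Threshold reached: reset by removing the guard file, then run the command.
        rm -f "$guard"
        "$@"
    else
        printf '%s\n%s\n' "$count" "$*" > "$guard"
    fi
}
```

Calling `every_sketch -n3 echo hello` three times in a row prints nothing for the first two calls and runs `echo hello` on the third, after which the count starts over.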
Alternative Solution
If the external guard file is undesired or readability is not a concern, then an alternative approach is to use modular arithmetic with the UNIX epoch returned by `date +%s`. For an example see this post.
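A minimal sketch of the idea, assuming weeks are counted as 7-day blocks since the epoch. The function name `is_even_week` is hypothetical and not taken from the linked post.

```shell
#!/usr/bin/env bash
# Succeeds (exit status 0) when the given epoch timestamp falls in an
# even-numbered week. Weeks are counted in 7-day blocks starting at
# 1970-01-01 00:00 UTC, which was a Thursday.
is_even_week() {
    local now=${1:-$(date +%s)}      # default to the current time
    local week=$(( now / 604800 ))   # 604800 = 7 * 24 * 60 * 60 seconds
    (( week % 2 == 0 ))
}

# Possible crontab usage, written inline since cron does not know the function.
# Note that `%` must be escaped as `\%` inside a crontab entry:
# 00 12 * * wed [ $(( $(date +\%s) / 604800 \% 2 )) -eq 0 ] && mail -s"Water the plants" me@example.org <<< "It’s time again!"
```

Since a job scheduled for the same weekday and time always lands at the same offset within a 7-day block, the parity of the week number alternates from one run to the next.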
Posted by Allan Odgaard
Path Completion (bash)
If you upgraded to Mountain Lion and often want to `cd` into `~/Library/Application Support` you might be a little annoyed by the new `Application Scripts` directory that makes the normal `~/Library/Ap⇥` stop at `~/Library/Application S‸` to have you disambiguate the path.
To avoid this you can set the `FIGNORE` variable. From `man bash`:
FIGNORE
A colon-separated list of suffixes to ignore when
performing filename completion (see READLINE below). A
filename whose suffix matches one of the entries in
FIGNORE is excluded from the list of matched file-
names. A sample value is ".o:~".
So if you set this in your bash startup file:
FIGNORE=".o:~:Application Scripts"
Then it will completely ignore that folder and do the full expansion.
Some other useful variables you can set in `~/.inputrc` that (IMHO) improve the default behavior of filename completion:
completion-ignore-case (Off)
If set to On, readline performs filename matching and
completion in a case-insensitive fashion.
mark-symlinked-directories (Off)
If set to On, completed names which are symbolic links
to directories have a slash appended (subject to the
value of mark-directories).
show-all-if-ambiguous (Off)
This alters the default behavior of the completion
functions. If set to On, words which have more than one
possible completion cause the matches to be listed
immediately instead of ringing the bell.
So my recommendation is to go with this:
set completion-ignore-case on
set mark-symlinked-directories on
set show-all-if-ambiguous on
The ignore case allows you to type `~/l⇥` and still get `~/Library/`.
Marking symlinked directories is useful for `/tmp`, `/etc`, and `/var`.
Showing all when ambiguous instead of ringing the bell… who came up with these defaults?
Posted by Allan Odgaard
Beating Binary Search
Jay from LinkedIn’s SNA team writes:
Quick, what is the fastest way to search a sorted array?
Binary search, right?
Wrong. There is actually a method called interpolation search
Posted by Allan Odgaard
Accessing Protected Data
Whenever I see something that intrigues me, my mind makes a note of it and then subconsciously works toward finding a use-case for my newfound knowledge.
An example is that I recently learned how protected member data (C++) is actually not safe from outside pryers (even in clean code that does not use typecasts).
Posted by Allan Odgaard
GCC 4.5 & C++0x
GCC 4.5.0 is out and their progress on implementing C++0x features is coming along nicely.
If you are on OS X and want to try it out you can install it via MacPorts:
sudo port install gcc45
The binary installed is named `g++-mp-4.5` and you must use the `-std=c++0x` argument to enable the new features.
Of the supported C++0x features here are some of those that I find the most interesting (for my use of C++).
Posted by Allan Odgaard
Parallel BZip2
I ran some benchmarks which included PBZip2, a multi-threaded implementation of BZip2 (which is slow yet effective, so my preferred choice of compressor for basically everything).
Running the Burrows–Wheeler transform over the input blocks is a task well suited for being parallelized and the benchmarks show that Jeff Gilchrist did a great job at this:
Compressor | Time | Archive Size |
---|---|---|
None (cat) | 2.3s | 50 MB |
GZip | 4.0s | 34 MB |
BZip2 | 16.3s | 29 MB |
PBZip2 | 3.0s | 29 MB |
LZip | 41.8s | 24 MB |
The timings were produced by running the code below 4 times and taking the average of the last 3 runs (for each compressor).
This was executed on a 2 × 2.8 GHz Quad Core Mac Pro where `PBZip2` (correctly) auto-detected 8 cores.
I am running PBZip2 version 1.1.0 from MacPorts (`sudo port install pbzip2`).
for Z in cat gzip bzip2 pbzip2 lzip; do
    time tar -cf "${Z}.res" --use-compress-prog="${Z}" Avian
done
Update: Added test with LZip (an LZMA based compressor). There is a multi-threaded implementation of this (`plzip`) but a quick `./configure && make` did not cut it.
Posted by Allan Odgaard
Search Path for CD
I just learned this neat thing about the `cd` shell command:
The variable `CDPATH` defines the search path for the directory containing «dir». Alternative directory names in `CDPATH` are separated by a colon (`:`). A null directory name is the same as the current directory. If «dir» begins with a slash (`/`), then `CDPATH` is not used.
For example:
% export CDPATH=$HOME/Source:$HOME/Library/Application\ Support/TextMate
% cd Avian/
/Users/duff/Source/Avian
% cd Bundles/
/Users/duff/Library/Application Support/TextMate/Bundles
% cd Support/lib/
/Users/duff/Library/Application Support/TextMate/Support/lib
% cd Avian/Frameworks/
/Users/duff/Source/Avian/Frameworks
This works with tab completion (using bash 4.1.2) so regardless of the current directory, I can generally do `cd Av⇥↩` to reach `~/Source/Avian`.
Posted by Allan Odgaard
Build Automation Part 2
This is part 2 of what I think will end up as four parts. This might be a bit of a rehash of the first part, but I skimmed lightly over why it actually is that I am so fond of `make` compared to most other build systems, so I will elaborate with some examples.
Part 3 will be a general post about declarative systems, not directly related to build automation. Part 4 should be about auto-generating the make files (which is part of the motivation for writing about declarative systems first).
Posted by Allan Odgaard
Build Automation Part 1
A blog post about Ant vs. Maven concludes that “the best build tool is the one you write yourself” and the Programmer Competency Matrix has “can setup a script to build the system” as requirement for reaching the higher levels in the “build automation” row.
I have looked at a lot of build systems myself, and while I agree that the best build system is the one you create yourself I am also a big fan of `make` and believe that the best approach is to use generated Makefiles.
This post is a “getting started with `make`”. I plan to follow up with a part 2 about how to handle auto-generated self-updating Makefiles.
Posted by Allan Odgaard
Self-balancing Trees
In a previous blog post I describe a data structure which requires the use of a self-balancing binary search tree.
Posted by Allan Odgaard
Cuckoo Hashing
The Achilles’ heel of hashing is collision: When we want to insert a new value into the hash table and the slot is already filled, we use a fallback strategy to find another slot, for example linear probing.
The fallback strategy can affect lookup time since we need to do the same probing when a lookup results in an entry with wrong key, turning the nice O(1) time complexity into (worst case) O(n).
Posted by Allan Odgaard
Maintaining a Layout
TextMate works with fixed-width fonts both because of the simplicity and because it is the immediate difference between a plain text editor and a word processor.
For version 2.0, though, I want it to do a richer layout, e.g. larger headings in markup languages, indented soft wrap, proper support for Unicode, etc. So I had to bite the bullet and figure out how to allow this with reasonable performance. This article explains the problem and the data structure I picked.
Posted by Allan Odgaard
Blog Spam Filtering Ideas
I have previously detailed how I fight comment spam using a JavaScript challenge.
I host two blogs, a wiki, and a ticket system, all targets for spam, so I have since generalized the system by using `mod_rewrite` to redirect all POSTs without a cookie to a page which uses JavaScript to set this cookie and resubmit the request (which is then no longer caught by `mod_rewrite` due to the cookie being set). This means “blocking” spam doesn’t require a plug-in written specifically for the particular web application.
Despite this JS challenge some spam still gets through, and that’s what this post is about.
Posted by Allan Odgaard
Get OS Version From Scripts
It is sometimes useful to have a script check the OS version. For example, getting the user’s full name was previously done using `niutil`, but Apple removed that command in Leopard (it can now be done using `dscl`).
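As a rough sketch of such a check (not necessarily the approach the full post takes), a script could compare the output of `sw_vers -productVersion` against a minimum version. The helper name `version_at_least` is made up for this example, and treating 10.5 as the Leopard cut-off is an assumption of the sketch.

```shell
#!/usr/bin/env bash
# version_at_least CURRENT MINIMUM  (hypothetical helper)
# Succeeds when the dotted version CURRENT is >= MINIMUM, comparing the
# components numerically so that 10.15 counts as newer than 10.5.
version_at_least() {
    local IFS=.
    local -a cur=($1) min=($2)   # split both versions on the dots
    local i c m
    for (( i = 0; i < ${#cur[@]} || i < ${#min[@]}; i++ )); do
        c=${cur[i]:-0}           # missing components count as 0
        m=${min[i]:-0}
        (( 10#$c > 10#$m )) && return 0
        (( 10#$c < 10#$m )) && return 1
    done
    return 0                     # all components equal
}

# Only attempt the lookup where sw_vers exists (i.e. on Mac OS X / macOS).
if command -v sw_vers >/dev/null 2>&1; then
    if version_at_least "$(sw_vers -productVersion)" 10.5; then
        dscl . -read "/Users/$USER" RealName
    else
        echo "pre-Leopard system: fall back to niutil here" >&2
    fi
fi
```

A plain string comparison would get this wrong for versions such as 10.15 versus 10.5, which is why the components are compared one by one as numbers.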
Posted by Allan Odgaard
Optimizing Path Normalization
One of my path functions is `normalize`. It removes (redundant) slashes and references to directory meta entries (current and parent directory).
A lot of other path functions use or rely on `normalize`, for example my version of `dirname()` is simply: `return normalize(path + "/..");`.
I was recently tasked with rewriting `normalize` to be more efficient and it proved to be a bit of a challenge, so I’ll share what I came up with.
Posted by Allan Odgaard
Worker Thread Protocol
When two components are used together, let’s call them A and B, it is a good approach to figure out who is using whom, and if A is using B then B should not know about A and vice versa.
This rule of thumb lowers complexity and makes both refactoring and re-use of code easier.
One scenario where it might be appealing to ignore this rule is when outsourcing computation to a worker thread, but here it is actually more important to stick with it.
Posted by Allan Odgaard
Simplifying Boolean Expressions
I recently had a boolean expression of the following form:
a || (x && b) || (x && y && c) || (x && y && z && d)
It looked redundant with 10 instances of only 7 different variables.
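One way to cut the repetition is to factor out the shared prefixes, which gets down to 7 variable instances. The factored form below is an illustration, not necessarily the result the full post arrives at. The loop is a brute-force check over all 128 combinations of the 7 variables.

```shell
#!/usr/bin/env bash
# Exhaustively compare the original expression with a factored form.
mismatches=0
for (( bits = 0; bits < 128; bits++ )); do
    # Each variable takes one bit of the loop counter (0 or 1).
    a=$(( bits >> 0 & 1 )); b=$(( bits >> 1 & 1 ))
    c=$(( bits >> 2 & 1 )); d=$(( bits >> 3 & 1 ))
    x=$(( bits >> 4 & 1 )); y=$(( bits >> 5 & 1 ))
    z=$(( bits >> 6 & 1 ))

    original=$(( a || (x && b) || (x && y && c) || (x && y && z && d) ))
    factored=$(( a || (x && (b || (y && (c || (z && d))))) ))

    (( original == factored )) || mismatches=$(( mismatches + 1 ))
done
echo "mismatches: $mismatches"
```

The factoring relies only on distributivity, `(x && b) || (x && y && c)` being the same as `x && (b || (y && c))`, applied once per nesting level.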
Posted by Allan Odgaard
UTI Problems
I was excited to use the “new” Universal Type Identifiers but excitement turned to confusion and a bit of disappointment. I will share my findings in this article.
Posted by Allan Odgaard
Automatic Storage
One of the things I like about C++ is the ability to have the compiler create code for me that does actual work.
What do I mean? I am thinking about implicit conversions (wrapping) of data types and constructing/destructing data types when they go in/out of scope.
I will focus on the latter in this blog post, show how it can be used with Objective-C and how it can track leaks in C++ code.
Posted by Allan Odgaard
Objective-C++ Tips
C++ Objects as Instance Data
Say you create a custom view with arbitrarily many tracking rectangles (i.e. dynamically added). Each time you add a rectangle you get back an identifier for this rectangle which can’t be stored in an `NSArray` as-is since it is of the primitive type `NSTrackingRectTag` (an integer).
If you use Objective-C++ then you can use a `std::vector<NSTrackingRectTag>` to avoid having to box/unbox your identifiers, but if you have tried to put non-POD in the interface declaration of your Objective-C class you have probably seen that `gcc` does not like that.
Well, starting with 10.4 (so actually, some time ago) Apple added a switch to `gcc` which allows C++ objects as part of the instance data, and it will call both constructor and destructor for your C++ objects when allocating/deallocating the Objective-C object.
The flag you need to set is `-fobjc-call-cxx-cdtors`.
C++ Objects as Method Arguments
Occasionally it is convenient to pass a C++ object to an Objective-C method. For example I have an `NSString` initializer that takes a `std::string` as argument.
This works as long as you pass the object as a reference (i.e. pass a pointer), but you can use the “reference of” operator in the method signature rather than at the call-site. By using a `const` reference it will work for temporary/implicit objects.
So with the following method:
+ (NSString*)stringWithCxxString:(std::string const&)cxxString
{
    return [[[NSString alloc] initWithBytes:cxxString.data()
                                     length:cxxString.size()
                                   encoding:NSUTF8StringEncoding] autorelease];
}
We can have code like this:
std::string dir = get_some_dir();
std::string file = get_some_file();
NSString* str = [NSString stringWithCxxString:dir + file];
Posted by Allan Odgaard
Message Catalogs on Darwin
I wanted to localize a shell command to give Danish output and decided to look into the message catalog functions described in/by XPG4.
Posted by Allan Odgaard
Clipboard Access From Shell (UTF-8)
Update 2011-01-27: Recent versions of Mac OS X make these replacements obsolete.
Two very nice shell commands that Apple has given us are the `pbcopy` and `pbpaste` commands. These allow stdin to go to the clipboard and the clipboard to be written to stdout.
Unfortunately the commands seem to use a combination of MacRoman and question marks for non-ASCII characters, which often makes them unusable for me, since I work with non-ASCII characters.
So today I decided to write a replacement for the two commands (yes, I did also file an enhancement report). You can download them here.
There’s just one source file; it compiles to a command which works as `pbcopy` when called under that name, and otherwise as `pbpaste`.
What I’ve done is place the command in `~/bin` and added a symbolic link from `pbpaste` to `pbcopy`, like this:
ln -s pbcopy ~/bin/pbpaste
And in addition ensured that my PATH contains `~/bin` before anything else, i.e. by placing the following in my `~/.bash_profile` (well, actually `~/.zshrc`):
export PATH="$HOME/bin:/opt/local/bin:$PATH:/Developer/Tools"
The source is included in the archive, and it’s very simple. No usage instructions etc., and it links with the Application Kit, since NSPasteboard is under that and not Foundation Kit.
Posted by Allan Odgaard
Progress Indicator for Unarchiving
I added a software updater to my application, and one of the steps was uncompressing the archive (after downloading it). Since the archive size is a few megabytes, and I use bzip2 as compression, this step takes a few seconds, and thus I want to show a determinate progress indicator while it is working on this.