Build Automation Part 2

January 23rd, 2010

This is part 2 of what I think will end up as four parts. This might be a bit of a rehash of the first part, but I skimmed lightly over why it actually is that I am so fond of make compared to most other build systems, so I will elaborate with some examples.

Part 3 will be a general post about declarative systems, not directly related to build automation. Part 4 should be about auto-generating the make files (which is part of the motivation for writing about declarative systems first).

Read the rest of this entry »

Build Automation Part 1

January 15th, 2010

A blog post about Ant vs. Maven concludes that “the best build tool is the one you write yourself” and the Programmer Competency Matrix has “can setup a script to build the system” as requirement for reaching the higher levels in the “build automation” row.

I have looked at a lot of build systems myself, and while I agree that the best build system is the one you create yourself I am also a big fan of make and believe that the best approach is to use generated Makefiles.

This post is a “getting started with make”. I plan to follow up with a part 2 about how to handle auto-generated self-updating Makefiles.

Read the rest of this entry »

Self-balancing Trees

August 22nd, 2009

In a previous blog post I describe a data structure which require the use of a self-balancing binary search tree.

Read the rest of this entry »

Cuckoo Hashing

August 18th, 2009

The Achilles’ heel of hashing is collision: When we want to insert a new value into the hash table and the slot is already filled, we use a fallback strategy to find another slot, for example linear probing.

The fallback strategy can affect lookup time since we need to do the same probing when a lookup results in an entry with wrong key, turning the nice O(1) time complexity into (worst case) O(n).

Read the rest of this entry »

Maintaining a Layout

August 13th, 2009

TextMate works with fixed-width fonts both because of the simplicity and because it is the immediate difference between a plain text editor and a word processor.

Though for version 2.0 I want it to do a richer layout, e.g. larger headings in markup languages, indented soft wrap, proper support for unicode, etc. So I had to bite the bullet and figure out how to allow this with reasonable performance, this article explains the problem and data structure I picked.

Read the rest of this entry »

Blog Spam Filtering Ideas

August 11th, 2009

I have previously detailed how I fight comment spam using a JavaScript challenge.

I host two blogs, a wiki, and a ticket system, all targets for spam, so I have since generalized the system by using mod_rewrite to redirect all POSTs without a cookie to a page which uses JavaScript to set this cookie and resubmit the request (which is then no longer catched by mod_rewrite due to the cookie being set). This means “blocking” spam doesn’t require a plug-in written specifically for the particular web application.

Despite this JS challenge some spam still gets through, and that’s what this post is about.

Read the rest of this entry »

Get OS Version From Scripts

August 1st, 2009

It is sometimes useful to have a script check the OS version, for example the way to get the user’s full name was previously done using niutil but Apple removed that command in Leopard (it can now be done using dscl).

Read the rest of this entry »

Optimizing Path Normalization

August 1st, 2009

One of my path functions is normalize. It removes (redundant) slashes and references to directory meta entries (current and parent directory).

A lot of other path functions use or rely on normalize, for example my version of dirname() is simply: return normalize(path + "/..");.

I was recently tasked with rewriting normalize to be more efficient and it proved to be a bit of a challenge, so I’ll share what I came up with.

Read the rest of this entry »

Worker Thread Protocol

July 30th, 2009

When two components are used together, let’s call them A and B, it is a good approach to figure out who is using whom, and if A is using B then B should not know about A and vice versa.

This rule of thumb lowers complexity and makes both refactoring and re-use of code easier.

One scenario where it might be appealing to ignore this rule is when outsourcing computation to a worker thread, but here it is actually more important to stick with it.

Read the rest of this entry »

Simplifying Boolean Expressions

July 27th, 2009

I recently had a boolean expression of the following form:

a || (x && b) || (x && y && c) || (x && y && z && d)

It looked redundant with 10 instances of only 7 different variables.

Read the rest of this entry »

UTI Problems

March 9th, 2009

I was excited to use the “new” Universal Type Identifiers but excitement turned to confusion and a bit of disappointment. I will share my findings in this article.

Read the rest of this entry »

Automatic Storage

September 19th, 2008

One of the things I like about C++ is the ability to have the compiler create code for me that does actual work.

What do I mean? I am thinking about implicit conversions (wrapping) of data types and constructing/destructing data types when they go in/out of scope.

I will focus on the latter in this blog post, show how it can be used with Objective-C and how it can track leaks in C++ code.

Read the rest of this entry »

Objective-C++ Tips

May 22nd, 2008

C++ Objects as Instance Data

Say you create a custom view with arbitrary many tracking rectangles (i.e. dynamically added).

Each time you add a rectangle you get back an identifier for this rectangle which can’t be stored in an NSArray as-is since it is of the primitive type NSTrackingRectTag (an integer).

If you use Objective-C++ then you can use a std::vector<NSTrackingRectTag> to avoid having to box/unbox your identifiers but if you have tried to put non-POD in the interface declaration of your Objective-C class you have probably seen that gcc does not like that.

Well, starting with 10.4 (so actually, some time ago) Apple added a switch to gcc which allows C++ objects as part of the instance data, and it will call both constructor and destructor for your C++ objects when allocating/deallocating the Objective-C object.

The flag you need to set is -fobjc-call-cxx-cdtors.

C++ Objects as Method Arguments

Occasionally it is convenient to pass a C++ object to an Objective-C method. For example I have an NSString initializer that takes a std::string as argument.

This works as long as you pass the object as a reference (i.e. pass a pointer), but you can use the “reference of” operator in the method signature rather than at the call-site. By using a const reference it will work for temporary/implicit objects.

So with the following method:

+ (NSString*)stringWithCxxString:(std::string const&)cxxString
{
   return [[[NSString alloc] initWithBytes:cxxString.data()
                                    length:cxxString.size()
                                  encoding:NSUTF8StringEncoding] autorelease];
}

We can have code like this:

std::string dir  = get_some_dir();
std::string file = get_some_file();

NSString* str    = [NSString stringWithCxxString:dir + file];

Message Catalogs on Darwin

April 20th, 2006

I wanted to localize a shell command to give danish output and decided to look into the message catalog functions described in/by XPG4.

Read the rest of this entry »

Clipboard access from shell (utf-8)

October 11th, 2005

Two very nice shell commands that Apple has given us are the pbcopy and pbpaste commands. These allow stdin to go to the clipboard and the clipboard to be written to stdout.

Unfortunately the commands seem to use a combination of MacRoman and question marks for non-ASCII characters, which often makes them unusable for me, since I work with non-ASCII characters.

So today I decided to write a replacement for the two commands (yes, I did also file an enhancement report). You can download them here.

There's just one source, it compiles to a command which works as pbcopy, when called under that name, otherwise pbpaste.

What I've done is place the command in ~/bin and added a symbolic link from pbpaste to pbcopy, like this:

  ln -s pbcopy ~/bin/pbpaste

And in addition ensured that my PATH contains ~/bin before anything else, i.e. by placing the following in my ~/.bash_profile (well, actually ~/.zshrc):

  export PATH="$HOME/bin:/opt/local/bin:$PATH:/Developer/Tools"

The source is included in the archive, and it's very simple. No usage instructions etc., and it links with the Application Kit, since NSPasteboard is under that and not Foundation Kit.

Determinate progress indicator for uncompressing a tar + bzip2 archive

October 2nd, 2005

I added a software updater to my application, and one of the steps was uncompressing the archive (after downloading it). Since the archive size is a few megabytes, and I use bzip2 as compression, this step takes a few seconds, and thus I want to show a determinate progress indicator while it is working on this.

Read the rest of this entry »

Fighting comment spam

September 25th, 2005

Update 2007-07-17: Since I installed the JS challenge almost two years ago it has blocked 83,837 POSTs. Roughly a dozen spam POSTs did defeat the challenge. Looking at the access log for these they do seem to be from actual humans (based on the initial hit having a google referrer, all resources (CSS and images) being fetched, and the delay from last GET to the POST), but it could also be a cleverly scripted browser (not sure of the “economy” of either though).

Recently I've received a lot of comment spam, which is fake comments posted to a blog or wiki (for me once every hour) with the purpose of increasing the page rank for a website.

Looking at the comment spam I have received, I see that more than 90% of the IP addresses are unique (infected Windows machines used as proxies?) so for the challenge I decided to run sha-1 on the visitors IP (plus a constant) and ask for that back when he submits the form.

Read the rest of this entry »

Deciphering an NSEvent

September 24th, 2005

Since I need to handle key equivalents myself, I need to mimic how the system does it.

After having spent some time investigating this, it seems to be impossible to do with the info provided by NSEvent, but I ended up creating a heuristic which works fairly well (but isn't perfect).

Read the rest of this entry »

Shell calculator

July 26th, 2005

They say that old habits die hard, and the following is probably an example of such.

Read the rest of this entry »

Metadata workarounds on Tiger

July 20th, 2005

Tiger introduced functions for file metadata in the BSD layer (see man setxattr for more info).

File metadata is something most programmers have wet dreams about, and naturally I started to use these new functions in TextMate 1.1b12. After this I got several reports of kernel panics when saving files to AFP mounted volumes (personal file sharing).

Read the rest of this entry »