SIGPIPE 13

Oniguruma C++ Wrapper

May 19th, 2005

I've recently partially switched to the Oniguruma regular expression library.

Since I also use regular expressions in my own source code, I've created a simple C++ wrapper that makes the API friendlier for my tasks. I generally work with iterators, and there are four tasks I perform often:

  1. Create a pattern object. This is done using:

     ptrn_t ptrn(first, last);
    

    Here first, last is the iterator sequence that contains the regular expression.

  2. Test if a sequence matches a pattern, done using:

     if(find(first, it, last, ptrn))
         ...;
    

    Here it, last is the sequence that is searched; first is the start of the buffer, which is needed in case the pattern uses look-behind, starts with a word/line boundary, or similar.

  3. Move an iterator to the first occurrence of a pattern (or end-of-sequence if no match):

     it = find(first, it, last, ptrn);
    
  4. Examine the captures of a match, if there was one:

     if(match_t const& m = find(first, it, last, ptrn))
     {
        for(int i = 1; i < m.size(); i++)
        {
           if(!m.empty(i))
              cout << string(m.begin(i), m.end(i)) << endl;
        }
     }
    

The wrapper (fewer than 100 lines) with an example can be downloaded from here. The nice thing about the above API is that a) you don't have to allocate/release resources yourself (the match_t object is reference counted in case you make copies) and b) all the cases use the same STL-inspired find() function, so there's little to remember (the match_t class is also STL-inspired, with its begin/end and size member functions).
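The wrapper's source isn't reproduced in this post, so the following is only a minimal sketch of how an API with this shape could be put together. It is not the original code: std::regex (C++11) stands in for Oniguruma so the sketch builds with the standard library alone, and every member body as well as the captures() helper is an assumption. Since std::regex has no look-behind, first is only used here to tell the matcher that the character before it may be inspected.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <regex>
#include <string>
#include <vector>

// Assumption: std::regex replaces Oniguruma. Names mirror the post,
// the implementation is a guess.
struct ptrn_t
{
   ptrn_t(char const* first, char const* last) : re(first, last) { }
   std::regex re; // compiled pattern, released automatically
};

class match_t
{
public:
   explicit match_t(char const* last) : last_(last) { }   // no match
   match_t(char const* last, std::cmatch const& m)
      : last_(last), m_(std::make_shared<std::cmatch>(m)) { }

   // Task 2: usable as a condition.
   explicit operator bool() const { return static_cast<bool>(m_); }
   // Task 3: usable as an iterator (start of match, or end-of-sequence).
   operator char const*() const { return m_ ? (*m_)[0].first : last_; }

   // Task 4: STL-like access to the captures (only valid after a match).
   std::size_t size() const                { return m_ ? m_->size() : 0; }
   bool empty(std::size_t i) const         { return (*m_)[i].length() == 0; }
   char const* begin(std::size_t i) const  { return (*m_)[i].first; }
   char const* end(std::size_t i) const    { return (*m_)[i].second; }

private:
   char const* last_;
   std::shared_ptr<std::cmatch> m_; // copies of match_t share one result
};

match_t find(char const* first, char const* it, char const* last, ptrn_t const& ptrn)
{
   std::regex_constants::match_flag_type flags = std::regex_constants::match_default;
   if(it != first) // text precedes `it`, so e.g. \b may look one char back
      flags = flags | std::regex_constants::match_prev_avail;

   std::cmatch m;
   if(std::regex_search(it, last, m, ptrn.re, flags))
      return match_t(last, m);
   return match_t(last);
}

// Hypothetical helper: collect the non-empty captures of the first match.
std::vector<std::string> captures(char const* first, char const* it, char const* last, ptrn_t const& ptrn)
{
   std::vector<std::string> res;
   if(match_t const& m = find(first, it, last, ptrn))
   {
      for(std::size_t i = 1; i < m.size(); ++i)
      {
         if(!m.empty(i))
            res.push_back(std::string(m.begin(i), m.end(i)));
      }
   }
   return res;
}
```

The two conversions on match_t are what let a single find() cover all the cases: the explicit bool conversion is picked in conditions, while the implicit pointer conversion is picked when the result is assigned to an iterator.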

The supplied wrapper uses char sequences and expects them to be UTF-8 encoded. Unfortunately Oniguruma operates on raw character pointers, so the wrapper can't work with real STL iterators.
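In practice the pointer requirement means the text has to sit in contiguous memory. The sketch below shows one way to get a pointer pair out of the usual containers; char_range, range_of and to_buffer are hypothetical helper names, not part of the wrapper. std::string and std::vector<char> can hand out pointers directly, whereas something like std::list<char> has to be copied into a contiguous buffer first.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Hypothetical helper type: a half-open range of char pointers, the form
// a pointer-based regex API can consume.
struct char_range
{
   char const* first;
   char const* last;
};

// Contiguous containers expose their storage without copying.
inline char_range range_of(std::string const& s)
{
   char_range r = { s.data(), s.data() + s.size() };
   return r;
}

inline char_range range_of(std::vector<char> const& v)
{
   char_range r = { v.data(), v.data() + v.size() };
   return r;
}

// Non-contiguous sequences must be copied into a contiguous buffer first.
template <typename Iter>
std::string to_buffer(Iter first, Iter last)
{
   return std::string(first, last);
}
```

The pointers returned by range_of() are only valid while the container is alive and unmodified, so the buffer has to outlive any match that refers to it.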

Btw: by adding a char* constructor to ptrn_t it's possible to write e.g.:

  it = find(first, it, last, "(foo|bar)");

This advances it to the first occurrence of either “foo” or “bar” in the it, last sequence. Almost like using a high-level language.
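The reason this works is that a constructor taking a single char const* and not marked explicit acts as an implicit conversion, so a string literal can be passed wherever a ptrn_t const& is expected. Here is a minimal sketch of just that mechanism, again with std::regex standing in for Oniguruma and with a simplified find() that returns a plain pointer (both are assumptions, not the wrapper's actual code):

```cpp
#include <cassert>
#include <cstring>
#include <regex>

struct ptrn_t
{
   ptrn_t(char const* first, char const* last) : re(first, last) { }
   ptrn_t(char const* str) : re(str) { } // not explicit: enables find(..., "regex")
   std::regex re;
};

// Simplified stand-in: returns the start of the first match, or `last`.
// `first` is kept only to mirror the post's signature.
char const* find(char const* first, char const* it, char const* last, ptrn_t const& ptrn)
{
   (void)first;
   std::cmatch m;
   return std::regex_search(it, last, m, ptrn.re) ? m[0].first : last;
}
```

One trade-off worth knowing about: with a literal, a temporary ptrn_t is constructed on every call, so the pattern is compiled each time. For a pattern used in a loop it is probably better to create the ptrn_t once and reuse it.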

[by Allan Odgaard]


2 Responses to “Oniguruma C++ Wrapper”

  1. Anonymous Says:
    May 23rd, 2005 at 14:52

    have you seen OgreKit?

  2. Allan Odgaard Says:
    June 28th, 2005 at 11:56

    I'm aware of OgreKit, but that's a wrapper for Objective-C/Foundation (i.e. NSString, though I think it also has a find panel, making it AppKit-dependent).

    What I needed was a C++ wrapper, and not to work with NSStrings.

