SIGPIPE 13

Simplifying Boolean Expressions

July 27th, 2009

I recently had a boolean expression of the following form:

a || (x && b) || (x && y && c) || (x && y && z && d)

It looked redundant with 10 instances of only 7 different variables.

Pretty much the only rule I know for manipulating boolean logic is how to change AND to OR so I didn’t have much luck working on the above until I realized that AND can be seen as multiply and OR as plus (with false being 0 and everything else being true).

Using that insight we instead have:

a + (x · b) + (x · y · c) + (x · y · z · d)

It is now possible to use the algebraic properties with which most of us should be somewhat familiar. For example we can use the distributivity property to only have one instance of x:

a + x · (b + y · c + y · z · d)

We can do exactly the same with y:

a + x · (b + y · (c + z · d))

We now only list each variable once, so this is optimal (assuming that all variables can affect the result of the expression). Going back to boolean operations we end with:

a || x && (b || y && (c || z && d))

Another thing I gained from this exercise is that now I will always be able to remember that AND has higher precedence than OR which is useful if you want to cut down on your use of parenthesis.

[by Allan Odgaard]


11 Responses to “Simplifying Boolean Expressions”

  1. Harry Jordan Says:
    July 27th, 2009 at 14:47

    But you an never have too many parenthesis! :)

  2. Gerd Knops Says:
    August 11th, 2009 at 21:54

    The problem is that the previous form was a heck of a lot easier to read (and therefore maintain) than the end result. So unless it gets executed a trillion times (I tend to leave the more readable version.

  3. Allan Odgaard Says:
    August 11th, 2009 at 22:08

    The motivation for the optimization was due to number of variables not being fixed, and what I had didn’t scale linearly with number of variables — so it wasn’t exactly a performance optimization.

  4. Sebastian Says:
    August 18th, 2009 at 05:43

    You can always use a "Karnaugh map" that will gurantee to give you the minimal expression

  5. jpc Says:
    August 18th, 2009 at 08:45

    The fun part is that OR is also distributive over AND (and that is quite difficult to see for a normal person when you start switching && and || to dots and pluses). :) Compare: (A || B) && (A || C) == A || (B && C) to (A + B) * (A + C) == A + (B * C)

  6. Stephan Says:
    August 30th, 2009 at 03:22

    @Sebastian: Karnaugh maps give you a easy way to build a dis- or conjuctive normal form; with already is given at the beginning. It is limited to "two levels", which helps to keep the propagation time short in circuits, but doesn't matter in programming. There may always be a shorter form if one uses more parenthesis. Furthermore, a karnaugh map is not that easy to build for more than 4 variables; or maybe 6 if you use 3d. Other roles can be found on http://en.wikipedia.org/wiki/Boolean_algebra_(structure)

    The final expression uses 6 instead of 9 operators, which roughly should need only 1/3 less time to execute. The compiler probably optimizes both cases to the short form, because it's mathematically exactly the same.

    So much about optimizing. But simplifying? In my opinion, the first one was simpler. And save on parenthesis can bite back, too. Just like using the form "c || z && d" which is confusing while reading from left to right instead of "d && z || c". The second one doesn't change it's meaning while reading.

  7. Allan Odgaard Says:
    September 2nd, 2009 at 11:30

    Stephan: As mentioned above, the simplifying was in the context of scalability. The expression I show has 7 different variables but > 7 uses of variables. In the context where this is used, the number of variables is not fixed, so non-linear scaling is undesired.

    Harry / Stephan: Parenthesis are good when they aid in readability. In some cases I’d argue they don’t, for example a shell script like:

    make || exit 1
    

    But assume we want to tag on an error message:

    make || echo "Error building product" && exit 1
    

    Or it could have been:

    make clean && make || exit 1
    

    I say parenthesis doesn’t help in readability, but in reality it is the unfamiliarity with what exactly parenthesis do in a shell script, is it grouping? is it a sub shell?

    Anyway, remembering operator precedence is always better than not remembering it, as you read code written by other people who may not have added parenthesis, I would e.g. never use parenthesis in something like: 32*x + 11*y and I expect everyone reading such expression to know that * binds stronger than +. Someone who writes boolean logic all day probably feels similar about AND and OR :)

  8. Allan Odgaard Says:
    September 2nd, 2009 at 11:35

    jpc: Good observation, btw.

  9. swt Says:
    September 11th, 2009 at 22:09

    about the “c || z && d” vs “d && z || c”

    sometimes first form may be better if you know something about the variables such as if c may be frequently true, or z/d are lengthy operations. First one will short circuit rest of the expression if c is true without bothering to evaluate z or d. There are other various short circuits in different orders, so one may be better than others, depending on what variables are doing.

    bob

  10. Yan Says:
    November 19th, 2009 at 09:09

    I think this is just a simple mathematical problem if take AND as set intersection, OR as set union, and negation as the set complement. Using some set operations one can get a uniformly simplest form for these problems (software like Mathematica can do this even for very complicated expression). Don't take my comment too series, I promise my idea is correct but I don't promise it is practical. -:)

  11. Daniel Yankowsky Says:
    December 5th, 2009 at 06:55

    That's a good insight; you have discovered half of the distributive law of boolean algebra (http://en.wikipedia.org/wiki/Boolean_algebra_(structure)#Definition). The other laws listed there are also useful. I have occasionally been hacking an expression and, because of the absorption law, noticed that some cases were redundant.

    I'm curious about what you mean about the number of variables not being fixed. Doesn't that imply that you are using a loop of some sort? Or do you mean that the example was a trimmed-down version of something more complicated in the real code?


Leave a Reply