The second version (test against zero) benefits from the fact that the subtracti...

tsahyt · on Sept 4, 2012

The compiler optimizing out the loop completely is one of the reasons why I didn't turn on optimizations here, although there was a loop body. gcc is smart enough to optimize loops with a fixed outcome away in some cases. I probably should have written a more complex loop body.

That said, it's very likely that the compiler will recognize such an optimization on its own and produce the quicker assembly code, like what you just wrote.

What I wanted to say with my post was really that things like these are heavily dependent on architecture and that comparisons could be optimized in hardware.

exDM69 · on Sept 4, 2012

> The compiler optimizing out the loop completely is one of the reasons why I didn't turn on optimizations here

Use a non-constant value for the loop bound and put a dummy load inside to stop the compiler from doing loop unrolling and/or dead code elimination.

I often use time(), rand() or even argc to get dummy values when looking at assembly from compiler optimizations.
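
A minimal sketch of that idea, assuming a volatile global stands in for the "dummy load". The names `dummy` and `count_down` are made up for illustration. Because the bound arrives as a parameter and the read is volatile, the compiler should not be able to fold the loop into a constant or drop the body, though it may still unroll it:

```c
#include <stdint.h>

/* Hypothetical dummy value: volatile forces a real load on every read. */
volatile uint32_t dummy = 1;

/* Counts down from n to zero (the "test against zero" loop form).
   n is meant to come from a runtime source such as argc, time() or rand(). */
uint32_t count_down(uint32_t n)
{
    uint32_t total = 0;
    for (uint32_t i = n; i != 0; --i) {
        total += dummy;   /* volatile read, cannot be hoisted or removed */
    }
    return total;
}
```

To look at the generated assembly, call it from `main` with something like `count_down((uint32_t)argc * 1000000u)` and compile with `gcc -O2 -S`.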

dbaupp · on Sept 4, 2012

One can just write the loop as

  for (..; ..; ..) asm("");

GCC doesn't introspect the asm statements, and so it leaves the loop there. (At least, it did for me, even with -O3.)
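
Spelled out, that could look like the sketch below. The function name `spin` is made up for illustration, and `__asm__` is the GNU extension spelling that also works in strict `-std=c99` mode. Whether the loop survives depends on the compiler and version, so check the assembly output rather than relying on it:

```c
/* Hypothetical helper: runs n iterations with an empty asm statement
   as the body. GCC treats an asm with no outputs as volatile, so it
   keeps one copy of it per iteration instead of deleting the loop. */
unsigned spin(unsigned n)
{
    unsigned i;
    for (i = 0; i < n; ++i)
        __asm__ volatile("");   /* emits no instructions, but is opaque to the optimizer */
    return i;
}
```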