C++ In Action

Assembler Digression

"Beautiful it is not, but who cares, if it’s more efficient." Well, not really. For those of you who understand x86 assembler, here’s what the optimizing 16-bit compiler produced for the two implementations of StrLen:

Index implementation

?StrLen@@YAHPAD@Z      PROC NEAR
      push     bp
      mov      bp,sp
      push     di
;     pStr = 4
;     register bx = i
      mov      di,WORD PTR [bp+4]
      xor      bx,bx
      cmp      BYTE PTR [di],bl
      je       $FB1596
$F1594:
      inc      bx
      cmp      BYTE PTR [bx][di],0
      jne      $F1594
$FB1596:
      mov      ax,bx
      pop      di
      mov      sp,bp
      pop      bp
      ret
?StrLen@@YAHPAD@Z      ENDP

Pointer implementation

?StrLen@@YAHPAD@Z      PROC NEAR
      push     bp
      mov      bp,sp
;     register bx = p
;     pStr = 4
      mov      dx,WORD PTR [bp+4]
      mov      bx,dx
$FC1603:
      inc      bx
      cmp      BYTE PTR [bx-1],0
      jne      $FC1603
      mov      ax,bx
      sub      ax,dx
      dec      ax
      mov      sp,bp
      pop      bp
      ret
?StrLen@@YAHPAD@Z      ENDP

In the first implementation the compiler decided to use two register variables, bx and di, hence the additional push and pop of di. The loop is essentially the same; only the addressing mode is different. Under close scrutiny it turns out that the compare instruction in the second loop is one byte longer than the one in the first:

80 39 00 cmp BYTE PTR [bx][di],0 ; first loop
80 7f ff 00 cmp BYTE PTR [bx-1],0 ; second loop

So for really long strings the index implementation beats the pointer one. Or does it? A lot depends on the alignment of the instructions. On my old machine, an 80486, the second loop turned out to be better aligned and therefore ran faster.

The overhead outside the loop differs, too. The pointer implementation does some additional pointer arithmetic at the end. The index implementation does a test before entering the loop, but then the loop is executed one fewer time. Again, on my machine, the overhead of the index solution turned out to be smaller than that of the pointer one, so for strings of up to 3 characters indexes beat pointers.
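For reference, here is roughly what the two source versions look like. This is a sketch reconstructed from the listings, not a quotation of the original code. The names StrLenIndex and StrLenPointer are made up for illustration (the compiler output calls both functions StrLen), and the parameter is declared const char* rather than the char* the mangled name implies, so that string literals can be passed.

```cpp
// Hypothetical reconstruction of the two StrLen variants.
// Function names are illustrative; both listings are simply called StrLen.

// Index version: test the current character, then advance the index.
// The compiler turned this into a test before the loop plus a test
// at the bottom of the loop.
int StrLenIndex(const char* pStr)
{
    int i = 0;
    while (pStr[i] != '\0')
        ++i;
    return i;
}

// Pointer version: the post-increment moves p past the terminating
// zero as well, which is why the result needs the extra "- 1"
// (the sub/dec pair after the loop in the assembly).
int StrLenPointer(const char* pStr)
{
    const char* p = pStr;
    while (*p++ != '\0')
        continue;
    return static_cast<int>(p - pStr - 1);
}
```

Both functions return 3 for "abc" and 0 for an empty string. The only difference is where the bookkeeping happens: before the loop in the index version, after it in the pointer version.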

Frankly, it’s six of one, half a dozen of the other. Is it worth the complication? Have you actually compared the assembly instructions and timings for all your favorite tricks? Maybe it’s time to throw away all these idioms from the great era of hacking in C and learn some new tricks that are focused on the understandability and maintainability of code. You don’t want to end up penny-wise but pound-foolish.
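If you do want to do the timing comparison yourself, something along these lines is enough to get started. It is only a sketch: TimeStrLen is a hypothetical helper, not the harness behind the 80486 numbers above, and it assumes a compiler with the standard chrono header, which a 16-bit compiler of that era would not have. The volatile sink is there to discourage the optimizer from throwing the calls away, but a clever optimizer may still inline or hoist them, so look at the generated assembly before believing the numbers.

```cpp
#include <chrono>

// Hypothetical helper: calls fn(str) `reps` times and returns the elapsed
// time in nanoseconds. Each result is stored into `sink`, a volatile int
// supplied by the caller, so the calls have an observable effect.
template <typename LenFn>
long long TimeStrLen(LenFn fn, const char* str, int reps, volatile int& sink)
{
    auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < reps; ++r)
        sink = fn(str);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Call it once with each implementation, on a short string and on a long one, and repeat the whole measurement a few times. A single run tells you very little.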

Don't use pointers where an index will do.