GRRLib Forum / [DONE] Modulo arithmetic is expensive

BlueChip · 2009-07-23 04:40:25

Code:

a % b

...is a very expensive operation, as it is essentially a division.

For values of b that are a power of 2 (e.g. 2, 4, 8, 16, 32) and non-negative values of a, the same effect can be achieved with a very cheap AND (&) operation:

Code:

a % b  ==  a & (b - 1)
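
To make the identity concrete, here is a minimal sketch in C. The helper names is_pow2 and mod_pow2 are made up for this example and are not part of GRRLIB. The identity only holds for unsigned (or non-negative) a and a power-of-two b, which is what the assert guards.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: nonzero when b is a power of two (1, 2, 4, 8, ...). */
int is_pow2(uint32_t b)
{
    return b != 0 && (b & (b - 1)) == 0;
}

/* Hypothetical helper: computes a % b with a single AND.
   Only valid for unsigned a and power-of-two b. */
uint32_t mod_pow2(uint32_t a, uint32_t b)
{
    assert(is_pow2(b));
    return a & (b - 1);
}
```

For example, mod_pow2(13, 4) gives the same result as 13 % 4, because b - 1 is a mask covering exactly the low bits that the division would leave as the remainder.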

Therefore

Code:

    offset = (((y >> 2)<<4)*tex->w) + ((x >> 2)<<6) + (((y%4 << 2) + x%4 ) << 1); // Fuckin equation found by NoNameNo ;)

as used by GRRLIB_GetPixelFromtexImg and GRRLIB_SetPixelTotexImg ...can be rewritten as:

Code:

    offset = (((y >> 2)<<4)*tex->w) + ((x >> 2)<<6) + ((((y&3) << 2) + (x&3)) << 1); // Fuckin equation found by NoNameNo ;)
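
One thing to watch when swapping % for & is operator precedence: in C, & binds more loosely than << and +, so each mask needs its own parentheses. The sketch below compares the two forms over a small grid. The function names offset_mod, offset_and and check_offsets are invented for this example, and the plain parameter w stands in for tex->w.

```c
#include <assert.h>

/* Original form: x and y reduced with % 4. */
unsigned int offset_mod(unsigned int x, unsigned int y, unsigned int w)
{
    return (((y >> 2) << 4) * w) + ((x >> 2) << 6)
         + ((((y % 4) << 2) + (x % 4)) << 1);
}

/* AND form: (y & 3) and (x & 3) are parenthesised explicitly,
   because & has lower precedence than << and +. */
unsigned int offset_and(unsigned int x, unsigned int y, unsigned int w)
{
    return (((y >> 2) << 4) * w) + ((x >> 2) << 6)
         + ((((y & 3) << 2) + (x & 3)) << 1);
}

/* Asserts that both forms agree for every (x, y) in a w-by-h grid. */
void check_offsets(unsigned int w, unsigned int h)
{
    unsigned int x, y;
    for (y = 0; y < h; y++)
        for (x = 0; x < w; x++)
            assert(offset_mod(x, y, w) == offset_and(x, y, w));
}
```

Calling check_offsets(64, 64) should pass silently if the two forms really are interchangeable for unsigned coordinates.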

Also

Code:

    u8 r, g, b, a;
    a=*(truc+offset);
    r=*(truc+offset+1);
    g=*(truc+offset+32);
    b=*(truc+offset+33);
    return ((r<<24) | (g<<16) | (b<<8) | a);

...is (sort of) readable code, but unless the compiler realises the CPU has enough registers for the whole operation and optimises accordingly, it will probably store all those values in RAM for a while.

This can be replaced by a slightly less friendly, but more efficient:

Code:

  return (*(truc+offset+1) <<24) | (*(truc+offset+32) <<16) | (*(truc+offset+33) <<8) | *(truc+offset) ;
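
For reference, here is a self-contained sketch of the single-expression read, assuming the intent is to keep the byte order of the readable version (r, g, b, a from high byte to low byte). The function name read_pixel is made up for this example, and the casts to uint32_t are an addition so that shifting a byte by 24 bits does not run into the sign bit of int.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: reads one pixel from the buffer truc and packs it as
   r<<24 | g<<16 | b<<8 | a, without intermediate variables. */
uint32_t read_pixel(const uint8_t *truc, uint32_t offset)
{
    assert(truc != 0);
    return ((uint32_t)truc[offset + 1]  << 24) |  /* r */
           ((uint32_t)truc[offset + 32] << 16) |  /* g */
           ((uint32_t)truc[offset + 33] << 8)  |  /* b */
            (uint32_t)truc[offset];               /* a */
}
```

Whether this is actually faster than the version with local variables depends on the compiler and optimisation level, so it is worth checking the generated code before relying on it.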

These optimisations will help many functions (including the new composition function) run faster.

BC

Crayon · 2009-07-23 18:06:34

Thanks for the information. On a forum I found these assembly listings showing what each operation compiles to. I don't know which architecture they are for, but it's interesting.

For input & 0xFF:

Code:

    movzx   eax, BYTE PTR [esp + 4]
    ret

For input % 256:

Code:

    mov eax, DWORD PTR [esp + 4]
    and eax, -2147483393
    jge .B2.4
    sub eax, 1
    or  eax, -256
    inc eax
.B2.4:
    ret
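
The extra instructions in the % 256 listing look like what a compiler emits when the input is a signed int: in C, the remainder of a negative number is negative (or zero), so a plain AND is not enough and the negative case has to be patched up. The sketch below shows the difference. The function names are invented for this example, and the masked result for negative input assumes the usual two's complement representation.

```c
#include <assert.h>

/* Signed remainder: the result takes the sign of v (C99 and later). */
int mod256_signed(int v)
{
    return v % 256;
}

/* Mask: always yields a value in 0..255. */
int mask255(int v)
{
    return v & 0xFF;
}

/* Unsigned remainder: a compiler is free to turn this into a single AND. */
unsigned int mod256_unsigned(unsigned int v)
{
    return v % 256u;
}

/* Demonstrates where the signed remainder and the mask agree and differ. */
void check_signed_vs_masked(void)
{
    assert(mod256_signed(300) == mask255(300));  /* same for non-negative input */
    assert(mod256_signed(-1) != mask255(-1));    /* differ for negative input */
    assert(mod256_unsigned(300u) == 44u);
}
```

So the replacement is safe as long as the values are unsigned or known to be non-negative, which should be the case for pixel coordinates.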

The changes have been made in revision 103.
