Speed optimization tips for Hi-Tech C and PIC16Fx core

General Optimization Tips for the PIC16Fx microcontrollers

  • To save bank switching, move variables that sit in different banks but are used together into the same bank.
  • In initialization code, at startup of the program, look at the order of initialization: first all variables in bank0, then bank1, then bank2, then bank3.
  • Check whether every variable really needs initialization; some may not.
  • Where possible, reorder operations to let the compiler avoid redundant loads of the W register or temporary locations.
  • Use variables in the same bank in arithmetic expressions to avoid bank switching.
  • If possible, use byte arithmetic instead of word arithmetic.
  • If possible, use pointers to array elements instead of an index. Note, however, that in small loops the loop overhead cancels out the saving from using pointers, so the two are about equivalent.
  • A series of:

    if
    else if
    else if ...

    often generates smaller code than the equivalent case statement.
  • In a switch - case, change the constants to be sequential numbers, without gaps.
  • Depending on the bank switching required:

    var = value1;
    if (!flag)
        var = value2;

    generates more optimal code than:

    if (flag)
        var = value1;
    else
        var = value2;

    Just make sure that var won't be used in an interrupt while this code executes, because var briefly holds value1 even when flag is clear.
  • Clearing, incrementing, and decrementing a byte are single-instruction operations. Assigning a value to a byte requires 2 instructions (value -> W, and W -> byte).
  • Use bits instead of unsigned chars whenever possible. Bit sets, clears, and test-and-skips are all single instructions. Since you can't declare bits in a function, you may benefit from a globally declared bit.
  • There is overhead to making function calls. Try replacing some of your smaller functions with macros.
  • Large blocks of duplicated code should be replaced with a function and function calls, if stack space allows.
  • Optimization of existing logic. I have yet to be given a project with non-changing requirements, so I try to write my code to be very flexible. As it gets closer to the end of the project, I find some of the flexibility isn't needed, and may be removed at a code savings.

    Thanks to Ivan Cenov and Michael Dipperstein.
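As an illustration of the pointer tip above, here is a minimal sketch of the same byte sum written both ways. The function names sum_by_index and sum_by_pointer are made up for this example, and whether the pointer form is actually smaller depends on the compiler version and the loop, so compare the generated assembly before committing to either.

```c
/* Index form: the compiler may recompute the address of buf[i] on each pass. */
unsigned char sum_by_index(const unsigned char *buf, unsigned char len)
{
    unsigned char i;
    unsigned char sum = 0;

    for (i = 0; i != len; i++)
        sum += buf[i];
    return sum;
}

/* Pointer form: the pointer is advanced directly and len counts down to zero. */
unsigned char sum_by_pointer(const unsigned char *buf, unsigned char len)
{
    unsigned char sum = 0;

    while (len != 0) {
        sum += *buf++;
        len--;
    }
    return sum;
}
```

Both functions return the same result for the same input; only the generated code differs.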

Optimization Tip 1: Signed vs. Unsigned variables

Compare the assembly generated for signed and unsigned variables, and you will find that comparisons on signed variables take a few more instructions.

Conclusion 1:

Use unsigned integers and/or chars if possible.
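A quick way to see the difference is to compile two otherwise identical functions and compare their listings. The sketch below is only a test case for that purpose; the function names are hypothetical, and the exact instruction counts will depend on your compiler version.

```c
/* Signed comparison: the compiler has to take the sign bit into account. */
unsigned char below_limit_signed(signed char value, signed char limit)
{
    return value < limit;
}

/* Unsigned comparison: typically a subtract followed by a test of the carry flag. */
unsigned char below_limit_unsigned(unsigned char value, unsigned char limit)
{
    return value < limit;
}
```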

Optimization Tip 2: Byte Loops

OK, here are two pieces of code that do exactly the same thing. Yet one of them finishes about 23% sooner (2.5ms against 3.25ms) and takes less program memory! Can you pick which one?
unsigned char i;
for(i=0;i<250;i++) do_func(); //executes do_func() 250 times, in 3.25ms

for(i=250;i!=0;i--) do_func(); //executes do_func() 250 times, in 2.5ms

To figure this out, have a look at the assembly produced.

for(i=0;i<250;i++) do_func();

//executes 250 times in 3251 cycles

1617 01B8 clrf 0x38
1618 260F call 0x60F
1619 0AB8 incf 0x38
161A 30FA movlw 0xFA
161B 0238 subwf 0x38,W
161C 1C03 btfss 0x3,0x0
161D 2E18 goto 0x618

for(i=250;i!=0;i--) do_func();

//executes 250 times in 2502 cycles

1621 30FA movlw 0xFA
1622 00B8 movwf 0x38
1623 260F call 0x60F
1624 0BB8 decfsz 0x38
1625 2E23 goto 0x623

Conclusion 2:

Make your loops decrement to zero, if possible. It's fast to check a RAM variable against zero.

However, note that in the incrementing loop, do_func() was called one instruction cycle earlier, because clearing the counter takes one instruction while loading 250 takes two. If you want speed of entry, choose the incrementing loop.
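To check that the two loop forms really are interchangeable, the sketch below counts the calls each one makes. The counter call_count and the stub body of do_func() are assumptions added for illustration; in real code do_func() would do the actual work.

```c
static unsigned int call_count;   /* hypothetical counter, for illustration only */

static void do_func(void)
{
    call_count++;
}

/* Counts up: needs a compare against 250 on every pass. */
unsigned int run_count_up(void)
{
    unsigned char i;

    call_count = 0;
    for (i = 0; i < 250; i++)
        do_func();
    return call_count;
}

/* Counts down: the test against zero can map onto a single decfsz. */
unsigned int run_count_down(void)
{
    unsigned char i;

    call_count = 0;
    for (i = 250; i != 0; i--)
        do_func();
    return call_count;
}
```

Both functions call do_func() 250 times. Note that the count-down form only works as a drop-in replacement when the loop body does not depend on the value of i.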

Optimization Tip 3: Integer Timeout Loops

If you want to poll a port, or execute a function a certain number of times before timing out, you need a timeout loop.

unsigned int timeout;
#define hibyte(x) ((unsigned char)((x)>>8))
#define lobyte(x) ((unsigned char)((x)&0xff))
//the optimizer takes care of using the correct hi/lo byte of the integer

  • Loops to avoid with timeouts: 320000 to 380000 cycles for 20000 iterations.

    for(timeout=0;timeout<20000;timeout++) do_func(); //380011 cycles
    for(timeout=20000;timeout!=0;timeout--) do_func(); //320011 cycles

  • Best loop for a timeout: 295000 cycles for 20000 iterations.

    //we want to execute do_func() approx. 20000 times before timing out
    timeout=(20000/0x100)*0x100; //keeps lobyte(timeout)==0, which speeds up assignments
    for(;hibyte(timeout)!=0;timeout--) do_func();
    //295704 cycles

    Notice the features of the loop shown above.

    1. It only tests the high byte of the integer each time around the loop.
    2. It checks this byte against zero, very fast.
    3. When initializing the variable timeout, it takes advantage of the fact that clearing a RAM byte to zero takes one instruction, whereas assigning it a nonzero number takes two.

Conclusion 3:

  • Make your loops decrement to zero, if possible; it's easy to check a RAM variable against zero.
  • Only test the high byte of an integer in a timeout loop; it's faster.
  • When assigning integers, it's faster to assign zero to a RAM variable than a nonzero number.
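The "approx." in the high-byte loop is worth quantifying. The sketch below runs the same loop on a 16-bit counter and counts the iterations; the counter variable and the stub do_func() are assumptions added for illustration, and uint16_t stands in for the PIC's 16-bit unsigned int so the result is the same on a desktop compiler.

```c
#include <stdint.h>

#define hibyte(x) ((unsigned char)((x) >> 8))

static uint16_t iterations;   /* hypothetical counter, for illustration only */

static void do_func(void)
{
    iterations++;
}

uint16_t run_timeout_loop(void)
{
    uint16_t timeout;

    iterations = 0;
    timeout = (20000 / 0x100) * 0x100;   /* 0x4E00 = 19968, low byte is zero */
    for (; hibyte(timeout) != 0; timeout--)
        do_func();                        /* stops once timeout drops to 0x00FF */
    return iterations;
}
```

The loop body runs for timeout values 19968 down to 256, which is 19713 iterations rather than exactly 20000. That is usually fine for a timeout, but it is the price of testing only the high byte.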

Optimization Tip 4: Timeout loops using built in timers

Of course, the fastest form of timeout is to use the built-in PIC timers and check the timer's interrupt flag. This is typically 70% faster than using your own timeout loops.

//set up tmr0 to set flag T0IF high when it rolls over
while(RA0==0 && !T0IF); //wait until port goes high, or tmr0 rolls over

Conclusion 4:

Use the built in timers and/or interrupt flags whenever possible.
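The behaviour of that wait loop can be tried out on a desktop compiler with simulated registers. Everything prefixed sim_ below is a hypothetical stand-in for TMR0, T0IF and RA0, and sim_tick() plays the role of the hardware; on a real PIC the loop body would be empty and the timer would advance on its own.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the PIC's TMR0 register, T0IF flag and RA0 pin. */
static uint8_t  sim_tmr0;
static uint8_t  sim_t0if;
static uint8_t  sim_ra0;
static unsigned sim_ticks;
static unsigned sim_pin_high_after;   /* tick at which the simulated pin goes high */

/* Simulates one timer tick: advance the 8-bit timer and update the pin. */
static void sim_tick(void)
{
    sim_ticks++;
    if (++sim_tmr0 == 0)
        sim_t0if = 1;                 /* 8-bit timer rolled over */
    if (sim_ticks >= sim_pin_high_after)
        sim_ra0 = 1;
}

/* Returns 1 if the pin went high before the timer rolled over, 0 on timeout. */
int wait_for_pin(unsigned pin_high_after)
{
    sim_tmr0 = 0;
    sim_t0if = 0;
    sim_ra0 = 0;
    sim_ticks = 0;
    sim_pin_high_after = pin_high_after;

    while (sim_ra0 == 0 && !sim_t0if)   /* same test as the loop above */
        sim_tick();
    return sim_ra0 != 0;
}
```

After the loop exits, checking the pin (rather than the flag) tells you whether the wait succeeded or timed out.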

Optimization Tip 5: Case statements

Slow and Inefficient

  case 'A':
    do something;
  case 'H':
    do something;

  case 'Z':
    do something;

Fast and Efficient

  case 0:
    do something;
  case 1:
    do something;

  case 2:
    do something;

The Hi-Tech C optimizer turns the switch statement into a computed goto if possible.

Conclusion 5:

  • Use sequential numbers in case statements whenever possible.
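One way to get sequential case values is to define the command codes yourself instead of switching on arbitrary characters. The sketch below assumes a hypothetical set of three commands and arbitrary return values; whether the compiler actually emits a computed goto should be confirmed in the listing.

```c
/* Hypothetical command codes, numbered 0, 1, 2 with no gaps. */
enum command { CMD_START = 0, CMD_STOP = 1, CMD_RESET = 2 };

/* Returns an arbitrary marker value per command so the dispatch can be checked. */
unsigned char handle_command(unsigned char cmd)
{
    switch (cmd) {
    case CMD_START:
        return 10;
    case CMD_STOP:
        return 20;
    case CMD_RESET:
        return 30;
    default:
        return 0;   /* unknown command */
    }
}
```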

Optimization Tip 6: Division in Hi-Tech C

If you use Hi-Tech C, and there is any mathematical division at all in the entire program, this uses up between 13 and 23 bytes in bank0, and some EPROM/flash.

This occurs even if the variables used are not in bank0.

| Occurrence | RAM usage | ROM/flash usage | Fix/Explanation |
|---|---|---|---|
| Any mathematical division at all in the entire program using a variable of type 'long', even if none of the variables reside in bank0. | 23 bytes in bank0 | Large; it has to include the ldiv routines | Use combinations of bit shifts, e.g. x=x*6 is replaced by x1=x; x2=x; x=(x1<<2) + (x2<<1) |
| Any mathematical division at all in the entire program using a variable of type 'unsigned int', even if none of the variables reside in bank0. | 13 bytes in bank0 | Large; it has to include the ldiv routines | Use combinations of bit shifts |
| Any mathematical division involving a divisor that is a power of 2, e.g. x=x/64; | - | Low | Use combinations of bit shifts |
| Any mathematical division involving a divisor that is not a power of 2, e.g. x=x/65; | - | High | Make your divisors a power of 2, e.g. 2^5=32 |

Conclusion 6:

If necessary, make it easy on the C compiler and use bit shifts, and divisors that are a power of 2. Divisors that are a power of 2, such as 256=2^8, can be optimized into a bit shift by the C compiler.

If you don't use any division at all in the program, you will save up to 23 bytes in bank0, plus the ROM taken by the ldiv() routines.
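A minimal sketch of the shift replacements, assuming unsigned operands (the function names are made up for this example). Note the parentheses around each shift: in C, + binds more tightly than <<, so x<<2 + x<<1 would not compute x*6.

```c
/* x / 64 for unsigned x: 64 = 2^6, so shift right by 6. */
unsigned int div_by_64(unsigned int x)
{
    return x >> 6;
}

/* x * 6 = x*4 + x*2, built from two shifts and an add. */
unsigned int times_6(unsigned int x)
{
    return (x << 2) + (x << 1);
}
```

For signed values a right shift is not guaranteed to match division for negative numbers, so keep these tricks to unsigned types.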



We welcome any suggestions or comments! Send them to Shane Tolmie. This site is completely separate from, and is maintained independently of, Microchip Ltd., manufacturers of the PIC micro. All code on this site is free for non-commercial use, unless stated otherwise. Commercial use is normally free, but is prohibited without first contacting the author for permission. All content on this site created by Shane Tolmie is copyrighted by Shane Tolmie 1999-2009.