Speed optimization tips for Hi-Tech C and PIC16Fx core
General Optimization Tips for the PIC16Fx microcontrollers
To save bank switching, move variables that are used together into the same bank.
In initialization code, at startup of the program, look at the order of initialization: first all variables in bank0, then bank1, then bank2, then bank3.
Some variables may not need initialization at all.
Where possible, reorder operations to let the compiler avoid redundant loads of the W register or temp locations.
Use variables in the same bank in arithmetic expressions to avoid bank switching.
If possible, use byte arithmetic instead of word arithmetic.
If possible, use pointers to array elements instead of an index. Note, however, that in small loops the overhead of manipulating the pointer cancels out the saving, so the two are about equivalent.
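As a rough illustration of the pointer tip above, the two functions below sum a byte array once by index and once by walking a pointer. The names `sum_indexed` and `sum_pointer` are made up for this sketch, and whether the pointer form actually wins depends on the compiler and the loop size, so compare the generated assembly before committing to either.

```c
#include <stdint.h>

/* Index form: the element address may be recomputed from base + i each pass. */
uint8_t sum_indexed(const uint8_t *buf, uint8_t len)
{
    uint8_t i, sum = 0;
    for (i = 0; i < len; i++)
        sum += buf[i];          /* byte arithmetic, wraps modulo 256 */
    return sum;
}

/* Pointer form: the address is advanced instead of recomputed,
   and the length is counted down to zero. */
uint8_t sum_pointer(const uint8_t *buf, uint8_t len)
{
    uint8_t sum = 0;
    while (len != 0) {
        sum += *buf++;
        len--;
    }
    return sum;
}
```

Both functions use only byte-sized variables, which also follows the byte-arithmetic tip.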
A series of:
    if
    else if
    else if ...
often generates smaller code than the equivalent switch/case statement.
In a switch/case statement, change the case constants to be sequential numbers, without gaps.
Depending on the bank switching required:
    var = value1;
    if (!flag)
        var = value2;
generates more optimal code than:
    if (flag)
        var = value1;
    else
        var = value2;
Just make sure that var won't be used in an interrupt while this code executes, because var briefly holds value1 even when flag is clear.
Clearing, incrementing, and decrementing a byte are single-instruction operations. Assigning a value to a byte requires 2 instructions (value -> W, then W -> byte).
Use bits instead of unsigned chars whenever possible. Bit sets, bit clears, and bit test-and-skips are all single instructions. Since you can't declare bits inside a function, you may benefit from a globally declared bit.
There is overhead in making function calls. Try replacing some of your smaller functions with macros.
Large blocks of duplicated code should be replaced with a function and function calls, if stack space allows.
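To make the function-versus-macro trade-off concrete, here is a minimal sketch of the same small operation written both ways. `low_nibble_fn` and `LOW_NIBBLE` are hypothetical names chosen for this example only.

```c
#include <stdint.h>

/* Function form: every use costs a call and a return, and uses one
   level of the call stack. The body exists once in program memory. */
uint8_t low_nibble_fn(uint8_t x)
{
    return (uint8_t)(x & 0x0F);
}

/* Macro form: expanded inline at every use, so there is no call
   overhead, but the code is repeated wherever it is used.
   The argument is parenthesized so that expressions such as a+b
   expand correctly. */
#define LOW_NIBBLE(x) ((uint8_t)((x) & 0x0F))
```

The macro suits small, frequently executed operations; the function suits larger bodies used in many places, where the duplicated code would cost more than the call.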
Optimization of existing
logic. I have yet to be given a project with non-changing requirements,
so I try to write my code to be very flexible. As it gets closer to the
end of the project, I find some of the flexibility isn't needed, and may
be removed at a code savings.
Thanks to Ivan Cenov
[imc@okto7.com] and Michael Dipperstein [mdippers@harris.com].
Optimization Tip 1: Signed vs. Unsigned variables
Compare the assembly generated for signed and unsigned variables, and you will find that comparisons on signed variables take a few more instructions.
Conclusion 1:
Use unsigned integers and/or chars if possible.
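One way to try this comparison yourself is to compile a matched pair of functions and read the listing for each. The pair below is only a sketch with made-up names; it differs in nothing but the signedness of the data being compared.

```c
/* Signed version: the comparison v[n] < limit must account for the sign. */
unsigned char count_below_signed(const signed char *v, unsigned char n,
                                 signed char limit)
{
    unsigned char hits = 0;
    while (n != 0) {
        n--;
        if (v[n] < limit)
            hits++;
    }
    return hits;
}

/* Unsigned version: identical logic, unsigned comparison. */
unsigned char count_below_unsigned(const unsigned char *v, unsigned char n,
                                   unsigned char limit)
{
    unsigned char hits = 0;
    while (n != 0) {
        n--;
        if (v[n] < limit)
            hits++;
    }
    return hits;
}
```

For values that are never negative, both functions return the same count, so the unsigned version can be substituted freely.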
Optimization Tip 2: Byte Loops
OK, here are two pieces of code that do exactly the same thing. Yet one of them finishes roughly 25% faster, using less memory space! Can you pick which one?
    unsigned char i;
    for(i=0;i<250;i++) do_func();   //executes do_func() 250 times, in 3.25ms
    for(i=250;i!=0;i--) do_func();  //executes do_func() 250 times, in 2.5ms
To figure this out, have a look at the assembly produced.
Have your loops decrementing to zero, if possible. It's fast to check a RAM variable against zero.
However, note that in the incrementing loop, do_func() was called one clock cycle earlier. If you want speed of entry, choose the incrementing loop.
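The sketch below wraps the two loops in functions so they can be checked on any host compiler. It only confirms that both forms call `do_func()` the same number of times; the timing difference quoted above shows up only in the code generated for the PIC. The counter `calls` and the wrapper names are assumptions made for this example.

```c
unsigned int calls;                 /* counts do_func() invocations */

void do_func(void)
{
    calls++;
}

/* Incrementing loop: each pass compares i against the constant 250. */
unsigned int run_count_up(void)
{
    unsigned char i;
    calls = 0;
    for (i = 0; i < 250; i++)
        do_func();
    return calls;
}

/* Decrementing loop: each pass only tests i against zero. */
unsigned int run_count_down(void)
{
    unsigned char i;
    calls = 0;
    for (i = 250; i != 0; i--)
        do_func();
    return calls;
}
```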
Optimization Tip 3: Integer Timeout Loops
If you want to poll a port, or execute a function a certain number of times before timing out, you need a timeout loop.
    unsigned int timeout;
    #define hibyte(x) ((unsigned char)((x)>>8))
    #define lobyte(x) ((unsigned char)((x)&0xff))
    //the optimizer takes care of using the correct hi/lo byte of the integer
Loops to avoid with timeouts: 320000 to 380000 cycles for 20000 iterations.
Best loop for a timeout: 295000 cycles for 20000 iterations.
    //we want to execute do_func() approx. 20000 times before timing out
    timeout=(20000/0x100)*0x100;  //keeps lobyte(timeout)==0, which speeds up assignments
    for(;hibyte(timeout)!=0;timeout--) do_func();  //295704 cycles
Notice the features of the loop shown above.
1. It only tests the high byte of the integer each time around
the loop.
2. It checks this byte against zero, which is very fast.
3. When initializing variable timeout, it takes advantage of the fact that
clearing a RAM variable to zero takes one instruction,
whereas assigning it a nonzero number takes two instructions.
Conclusion 3:
Have your loops decrementing to zero, if possible; it's easy to check a RAM variable against zero.
Only test the high byte of an integer in a timeout loop; it's faster.
When assigning integers, it's faster to assign zero to a RAM variable than a nonzero number.
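The timeout loop from this tip can be sketched as a polling helper, shown below. The function `wait_ready`, the `poll_fn` callback type, and the two stand-in poll functions are hypothetical and exist only so the sketch can be tried on a host. `uint16_t` is used on the assumption that `unsigned int` is 16 bits wide on the target.

```c
#include <stdint.h>

#define hibyte(x) ((unsigned char)((x) >> 8))

/* Hypothetical poll callback: returns nonzero once the awaited event occurs. */
typedef unsigned char (*poll_fn)(void);

/* Polls until ready() succeeds or roughly 20000 attempts have been made.
   Returns 1 if the event occurred, 0 on timeout. */
unsigned char wait_ready(poll_fn ready)
{
    uint16_t timeout = (20000 / 0x100) * 0x100;  /* 0x4E00: low byte is zero */

    for (; hibyte(timeout) != 0; timeout--) {    /* test the high byte only */
        if (ready())
            return 1;
    }
    return 0;
}

/* Stand-in poll functions for exercising the loop on a host. */
uint16_t polls;

unsigned char never_ready(void)
{
    polls++;
    return 0;
}

unsigned char ready_on_tenth(void)
{
    polls++;
    return polls >= 10;
}
```

Because only the high byte is tested, the loop stops when the counter drops below 0x100, so the actual number of attempts is somewhat under 20000. For a timeout that is only approximate anyway, that is usually acceptable.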
Optimization Tip 4: Timeout loops using built-in timers
Of course, the fastest form of timeout is to use the built-in PIC timers and check the timer's interrupt flag. This is typically 70% faster than using your own timeout loops.
    //set up tmr0 to set flag T0IF high when it rolls over
    while(RA0==0 && !T0IF);  //wait until port goes high, or until the timer rolls over
Conclusion 4:
Use the built-in timers and/or interrupt flags whenever possible.
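The following is a host-side simulation of that wait loop, intended only to show how the two exit conditions interact. On a real PIC, `RA0`, `TMR0` and `T0IF` come from the device header and are updated by the hardware; here they are plain variables, and `sim_tick` and `wait_for_pin` are invented for the sketch.

```c
#include <stdint.h>

/* Stand-ins for the PIC registers, assumed for this sketch only. */
volatile uint8_t RA0;    /* simulated input pin */
volatile uint8_t TMR0;   /* simulated 8-bit timer register */
volatile uint8_t T0IF;   /* simulated timer overflow flag */

/* Simulated hardware: advance the timer by one tick and raise the pin
   when tick number pin_high_at is reached. */
static void sim_tick(unsigned int now, unsigned int pin_high_at)
{
    TMR0++;
    if (TMR0 == 0)
        T0IF = 1;        /* flag latches on rollover from 0xFF to 0x00 */
    if (now == pin_high_at)
        RA0 = 1;
}

/* Returns 1 if the pin went high before the timer rolled over, else 0. */
int wait_for_pin(unsigned int pin_high_at)
{
    unsigned int now = 0;

    RA0 = 0;
    TMR0 = 0;
    T0IF = 0;
    while (RA0 == 0 && !T0IF)          /* same test as the loop in the text */
        sim_tick(++now, pin_high_at);  /* on hardware the body is empty */
    return RA0 != 0;
}
```

After the loop exits, checking the pin again tells you whether the wait succeeded or timed out, since either condition can end the loop.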
Optimization Tip 5: Case statements
Slow and Inefficient
    c=getch();
    switch(c)
    {
        case 'A':
        {
            /* do something */
            break;
        }
        case 'H':
        {
            /* do something */
            break;
        }
        case 'Z':
        {
            /* do something */
            break;
        }
    }
Fast and Efficient
    c=getch();
    switch(c)
    {
        case 0:
        {
            /* do something */
            break;
        }
        case 1:
        {
            /* do something */
            break;
        }
        case 2:
        {
            /* do something */
            break;
        }
    }
The Hi-Tech C optimizer turns the switch statement into a computed goto if possible.
Conclusion 5:
Use sequential numbers in case statements whenever possible.
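One way to keep case constants sequential without scattering magic numbers through the code is to name them with an enum, as in the sketch below. The command names and the `handle_command` function are hypothetical. The return values are markers that make the dispatch easy to check, standing in for the real work of each case.

```c
/* Hypothetical command codes, numbered 0, 1, 2 with no gaps. */
enum command {
    CMD_START = 0,
    CMD_HALT  = 1,
    CMD_ZERO  = 2
};

/* Dispatches on a sequential command code. */
unsigned char handle_command(unsigned char c)
{
    switch (c) {
    case CMD_START:
        return 'S';
    case CMD_HALT:
        return 'H';
    case CMD_ZERO:
        return 'Z';
    default:
        return '?';      /* unknown command */
    }
}
```

If the commands arrive as characters such as 'A', 'H' and 'Z', they would need to be mapped to these codes first, and whether that extra step pays off depends on the program.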
Optimization Tip 6: Division in Hi-Tech C
If you use Hi-Tech C, and there is any mathematical division at all in the entire program, this uses up between 13 and 23 bytes in bank0, and some EPROM/flash. This occurs even if the variables used are not in bank0.
| Occurrence | RAM usage | ROM/flash usage | Fix/Explanation |
|---|---|---|---|
| Any mathematical division at all in the entire program using a variable of type 'long', even if none of the variables reside in bank0. | 23 bytes in bank0 | large, it has to include ldiv routines | Use combinations of bit shifts, ie: x=x*6 is replaced by x1=x;x2=x;x=(x1<<2)+(x2<<1) |
| Any mathematical division at all in the entire program using a variable of type 'unsigned int', even if none of the variables reside in bank0. | 13 bytes in bank0 | large, it has to include ldiv routines | Use combinations of bit shifts |
| Any mathematical division involving a divisor that is a power of 2, ie: x=x/64; | - | low | Use combinations of bit shifts |
| Any mathematical division involving a divisor that is not a power of 2, ie: x=x/65; | - | high | Make your divisors a power of 2, ie: 2^5=32. |
Conclusion 6:
If necessary, make it easy on the C compiler and use bit shifts, and divisors that are a power of 2. Divisors that are a power of 2, such as 256=2^8, can be optimized into a bit shift by the C compiler.
If you don't use any division at all in the program, you will save up to 23 bytes in bank0 and the ROM taken by the ldiv() routines.
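The shift replacements mentioned in this tip can be written out as below. This is a minimal sketch with made-up function names, assuming unsigned 16-bit operands. Note the parentheses around each shift: in C, + binds more tightly than <<, so leaving them out changes the result.

```c
#include <stdint.h>

/* x*6 rewritten as 4x + 2x using shifts. */
uint16_t times6(uint16_t x)
{
    return (uint16_t)((x << 2) + (x << 1));
}

/* Unsigned division by 64 (2^6) is a right shift by 6. */
uint16_t div64(uint16_t x)
{
    return (uint16_t)(x >> 6);
}
```

The right-shift form matches division only for unsigned values; for negative signed values the two round differently.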