Another example for recursive call is calculating the power of a number. The recursion requires that formal params,local variables and return address are stored onto the stack for each recursive call. However, having in mind that RAM is a limited resources in AVR devices it is conceivable why recursive algorithms are not widely accepted in avr assembly programming. There is a need for optimization or twisting the stack frame build up requirement. The following rules will be used in building a recursive avr algorithm
Two methods of finding the power of a number are presented. Both assume unsigned byte size input - word size output.
Naive solution
The most straightforward solution to the problem would be to implement x2=x.x.x.x...x , multiplying x n times with itself. Following implementation does not even use local stack frame variables. All calculations are done over global ones.
;************************************Power of a number***************
;* Naive slow approach stack usage N*4
;* Unsigned arithmetic
;*@INPUT: xx -> number
;* n -> power
;*************************************************************************
naive_power:
//EXIT CONDITION
tst n
brne nextframe
ret
nextframe:
//BUILD NEXT STACK FRAME
dec n
rcall naive_power
//CALCULATE REWINDING THE STACK
clr ah
mov al,xx
movw bh:bl,output3:output4
rcall mul16x16_32 ;no stack manipulation
ret
Optimized solution
Suppose we want to compute xn, where x is a real number and n is any integer. It's easy if n is 0, since x0=1 no matter what x is. That's a good base condition case. Only positive numbers are considered so when you multiply powers of x, you add the exponents: xa+xb=xa+b for any base x and any exponents a and b. Therefore, if n is positive and even, then xn=xn/2.xn/2. If we were to compute y=xn/2 recursively, then we could compute xn=y.y . What if n is positive and odd? Then xn=xn-1.x and and n-1 either is 0 or is positive and even. We just saw how to compute powers of x when the exponent either is 0 or is positive and even. Therefore, we could compute xn-1 recursively , and then use this result to compute xn=xn-1.x
What about when n is negative? Then xn=1/x-n , and the exponent −n is positive. So we can compute x-n recursively and take its reciprocal.
Putting these observations together, we get the following recursive algorithm for computing xn.
The following acorn kernel task is based on positive numbers only. Default task stack is 20 and each stack frame keeps 1 byte and return call of 2 bytes. There is however an internal call to multiply function in each frame after 1 byte is popped off the stack. So total stack frame usage is 4 bytes. Calculating 85 will require (5/2+1)*4=12. The default task stack size is less than 20 bytes so NO increase of the variable TASK_STACK_DEPTH in kernel.inc is required.
;************************************Power of a number***************
;* Optimized fast approach
;* Unsigned arithmetic
;*@INPUT: xx -> number
;* n -> power
;*************************************************************************
power:
//EXIT CONDITION
tst n
brne nextframeex
ret
nextframeex:
//BUILD NEXT STACK FRAME
push n
lsr n
rcall power
//CALCULATE REWINDING THE STACK
pop n
sbrc n,0 ;is n odd
rjmp odd
even:
;result*result
movw ah:al,output3:output4
movw bh:bl,output3:output4
rcall mul16x16_32
ret
odd:
;result*result
movw ah:al,output3:output4
movw bh:bl,output3:output4
rcall mul16x16_32
;x*result
clr ah
mov al,xx
movw bh:bl,output3:output4
rcall mul16x16_32
ret
;****************MULTIPLY unsigned 16x16=32bit
;* INPUT= r23:r22 * r21:r20
;* USAGE=r0,r1,r2
;* STACK=0
;* OUTPUT = r19:r18
;* r17:r16
;******************************************
mul16x16_32:
clr r2
mul r23, r21 ; ah * bh
movw r19:r18, r1:r0
mul r22, r20 ; al * bl
movw r17:r16, r1:r0
mul r23, r20 ; ah * bl
add r17, r0
adc r18, r1
adc r19, r2
mul r21, r22 ; bh * al
add r17, r0
adc r18, r1
adc r19, r2
ret
.def output1=r19 ;not used
.def output2=r18 ;not used
.def output3=r17
.def output4=r16
.def bh=r23
.def bl=r22
.def ah=r21
.def al=r20
.def n=r24
.def xx=r25
Task_16:
ldi bl,0
ldi bh,0
ldi output4,1
clr ah
clr al
ldi n,5
ldi xx,8 ;3^6
rcall power
main16:
nop
rjmp main16
ret