Another example for recursive call is calculating the power of a number. The recursion requires that formal params,local variables and return address are stored onto the stack for each recursive call. However, having in mind that RAM is a limited resources in AVR devices it is conceivable why recursive algorithms are not widely accepted in avr assembly programming. There is a need for optimization or twisting the stack frame build up requirement. The following rules will be used in building a recursive avr algorithm
- Stack frame limit - calculate the maximum stack size in accordance with input params. RAM is a limited resource so beware how deep the rabbit hole may go. Depending on AVR device you may have different stack frame’s size.
- Save in stack frame only the minimum amount of data, use global data whenever possible. This reduce code usage and stack frame size.
- CPU saves the return address on each recursive call so apart from this, frame build up and maintenance is our code responsibility.
Two methods of finding the power of a number are presented. Both assume unsigned byte size input - word size output.
Naive solution
The most straightforward solution to the problem would be to implement x2=x.x.x.x...x , multiplying x n times with itself. Following implementation does not even use local stack frame variables. All calculations are done over global ones.
;************************************Power of a number***************
;* Naive slow approach stack usage N*4
;* Unsigned arithmetic
;*@INPUT: xx -> number
;* n -> power
;*************************************************************************
naive_power:
//EXIT CONDITION
tst n
brne nextframe
ret
nextframe:
//BUILD NEXT STACK FRAME
dec n
rcall naive_power
//CALCULATE REWINDING THE STACK
clr ah
mov al,xx
movw bh:bl,output3:output4
rcall mul16x16_32 ;no stack manipulation
ret
Optimized solution
Suppose we want to compute xn, where x is a real number and n is any integer. It's easy if n is 0, since x0=1 no matter what x is. That's a good base condition case. Only positive numbers are considered so when you multiply powers of x, you add the exponents: xa+xb=xa+b for any base x and any exponents a and b. Therefore, if n is positive and even, then xn=xn/2.xn/2. If we were to compute y=xn/2 recursively, then we could compute xn=y.y . What if n is positive and odd? Then xn=xn-1.x and and n-1 either is 0 or is positive and even. We just saw how to compute powers of x when the exponent either is 0 or is positive and even. Therefore, we could compute xn-1 recursively , and then use this result to compute xn=xn-1.x
What about when n is negative? Then xn=1/x-n , and the exponent −n is positive. So we can compute x-n recursively and take its reciprocal.
Putting these observations together, we get the following recursive algorithm for computing xn.
- The base case is when n=0,and x0=1.
- If n is positive and even, recursively compute y=xn/2 and then xn=y.y. Notice that you can get away with making just one recursive call in this case, computing xn/2 just once, and then you multiply the result of this recursive call by itself.
- If n is positive and odd, recursively compute xn-1, so that the exponent either is 0 or is positive and even. Then, xn=xn-1.x.
- If n is negative, recursively compute x-n, so that the exponent becomes positive. Then, xn=1/x-n.
The following acorn kernel task is based on positive numbers only. Default task stack is 20 and each stack frame keeps 1 byte and return call of 2 bytes. There is however an internal call to multiply function in each frame after 1 byte is popped off the stack. So total stack frame usage is 4 bytes. Calculating 85 will require (5/2+1)*4=12. The default task stack size is less than 20 bytes so NO increase of the variable TASK_STACK_DEPTH in kernel.inc is required.
;************************************Power of a number***************
;* Optimized fast approach
;* Unsigned arithmetic
;*@INPUT: xx -> number
;* n -> power
;*************************************************************************
power:
//EXIT CONDITION
tst n
brne nextframeex
ret
nextframeex:
//BUILD NEXT STACK FRAME
push n
lsr n
rcall power
//CALCULATE REWINDING THE STACK
pop n
sbrc n,0 ;is n odd
rjmp odd
even:
;result*result
movw ah:al,output3:output4
movw bh:bl,output3:output4
rcall mul16x16_32
ret
odd:
;result*result
movw ah:al,output3:output4
movw bh:bl,output3:output4
rcall mul16x16_32
;x*result
clr ah
mov al,xx
movw bh:bl,output3:output4
rcall mul16x16_32
ret
;****************MULTIPLY unsigned 16x16=32bit
;* INPUT= r23:r22 * r21:r20
;* USAGE=r0,r1,r2
;* STACK=0
;* OUTPUT = r19:r18
;* r17:r16
;******************************************
mul16x16_32:
clr r2
mul r23, r21 ; ah * bh
movw r19:r18, r1:r0
mul r22, r20 ; al * bl
movw r17:r16, r1:r0
mul r23, r20 ; ah * bl
add r17, r0
adc r18, r1
adc r19, r2
mul r21, r22 ; bh * al
add r17, r0
adc r18, r1
adc r19, r2
ret
.def output1=r19 ;not used
.def output2=r18 ;not used
.def output3=r17
.def output4=r16
.def bh=r23
.def bl=r22
.def ah=r21
.def al=r20
.def n=r24
.def xx=r25
Task_16:
ldi bl,0
ldi bh,0
ldi output4,1
clr ah
clr al
ldi n,5
ldi xx,8 ;3^6
rcall power
main16:
nop
rjmp main16
ret
|