Another example for recursive call is calculating the power of a number. The recursion requires that formal params,local variables and return address are stored onto the stack for each recursive call. However, having in mind that RAM is a limited resources in AVR devices it is conceivable why recursive algorithms are not widely accepted in avr assembly programming. There is a need for optimization or twisting the stack frame build up requirement. The following rules will be used in building a recursive avr algorithm
 Stack frame limit  calculate the maximum stack size in accordance with input params. RAM is a limited resource so beware how deep the rabbit hole may go. Depending on AVR device you may have different stack frame’s size.
 Save in stack frame only the minimum amount of data, use global data whenever possible. This reduce code usage and stack frame size.
 CPU saves the return address on each recursive call so apart from this, frame build up and maintenance is our code responsibility.
Two methods of finding the power of a number are presented. Both assume unsigned byte size input  word size output.
Naive solution
The most straightforward solution to the problem would be to implement x^{2}=x.x.x.x...x , multiplying x n times with itself. Following implementation does not even use local stack frame variables. All calculations are done over global ones.
;************************************Power of a number***************
;* Naive slow approach stack usage N*4
;* Unsigned arithmetic
;*@INPUT: xx > number
;* n > power
;*************************************************************************
naive_power:
//EXIT CONDITION
tst n
brne nextframe
ret
nextframe:
//BUILD NEXT STACK FRAME
dec n
rcall naive_power
//CALCULATE REWINDING THE STACK
clr ah
mov al,xx
movw bh:bl,output3:output4
rcall mul16x16_32 ;no stack manipulation
ret
Optimized solution
Suppose we want to compute x^{n}, where x is a real number and n is any integer. It's easy if n is 0, since x^{0}=1 no matter what x is. That's a good base condition case. Only positive numbers are considered so when you multiply powers of x, you add the exponents: x^{a}+x^{b}=x^{a+b} for any base x and any exponents a and b. Therefore, if n is positive and even, then x^{n}=x^{n/2}.x^{n/2}. If we were to compute y=x^{n/2} recursively, then we could compute x^{n}=y.y . What if n is positive and odd? Then x^{n}=x^{n1}.x and and n1 either is 0 or is positive and even. We just saw how to compute powers of x when the exponent either is 0 or is positive and even. Therefore, we could compute x^{n1} recursively , and then use this result to compute x^{n}=x^{n1}.x
What about when n is negative? Then x^{n}=1/x^{n} , and the exponent −n is positive. So we can compute x^{n} recursively and take its reciprocal.
Putting these observations together, we get the following recursive algorithm for computing x^{n}.
 The base case is when n=0,and x^{0}=1.
 If n is positive and even, recursively compute y=x^{n/2} and then x^{n}=y.y. Notice that you can get away with making just one recursive call in this case, computing x^{n/2} just once, and then you multiply the result of this recursive call by itself.
 If n is positive and odd, recursively compute x^{n1}, so that the exponent either is 0 or is positive and even. Then, x^{n}=x^{n1}.x.
 If n is negative, recursively compute x^{n}, so that the exponent becomes positive. Then, x^{n}=1/x^{n}.
The following acorn kernel task is based on positive numbers only. Default task stack is 20 and each stack frame keeps 1 byte and return call of 2 bytes. There is however an internal call to multiply function in each frame after 1 byte is popped off the stack. So total stack frame usage is 4 bytes. Calculating 8^{5 }will require (5/2+1)*4=12. The default task stack size is less than 20 bytes so NO increase of the variable TASK_STACK_DEPTH in kernel.inc is required.
;************************************Power of a number***************
;* Optimized fast approach
;* Unsigned arithmetic
;*@INPUT: xx > number
;* n > power
;*************************************************************************
power:
//EXIT CONDITION
tst n
brne nextframeex
ret
nextframeex:
//BUILD NEXT STACK FRAME
push n
lsr n
rcall power
//CALCULATE REWINDING THE STACK
pop n
sbrc n,0 ;is n odd
rjmp odd
even:
;result*result
movw ah:al,output3:output4
movw bh:bl,output3:output4
rcall mul16x16_32
ret
odd:
;result*result
movw ah:al,output3:output4
movw bh:bl,output3:output4
rcall mul16x16_32
;x*result
clr ah
mov al,xx
movw bh:bl,output3:output4
rcall mul16x16_32
ret
;****************MULTIPLY unsigned 16x16=32bit
;* INPUT= r23:r22 * r21:r20
;* USAGE=r0,r1,r2
;* STACK=0
;* OUTPUT = r19:r18
;* r17:r16
;******************************************
mul16x16_32:
clr r2
mul r23, r21 ; ah * bh
movw r19:r18, r1:r0
mul r22, r20 ; al * bl
movw r17:r16, r1:r0
mul r23, r20 ; ah * bl
add r17, r0
adc r18, r1
adc r19, r2
mul r21, r22 ; bh * al
add r17, r0
adc r18, r1
adc r19, r2
ret
.def output1=r19 ;not used
.def output2=r18 ;not used
.def output3=r17
.def output4=r16
.def bh=r23
.def bl=r22
.def ah=r21
.def al=r20
.def n=r24
.def xx=r25
Task_16:
ldi bl,0
ldi bh,0
ldi output4,1
clr ah
clr al
ldi n,5
ldi xx,8 ;3^6
rcall power
main16:
nop
rjmp main16
ret
