Factorial algorithm

In its basic definition recursive function is a function which calls itself(self-references) in its body of execution. Each time it calls itself a new stack frame is added on to the stack keeping track of local variables and result calculation. The recursion requires that formal params,local variables and return address are stored onto the stack for each recursice call. However, having in mind that RAM is a limited resources in AVR devices it is conceivable why recursive algorithms are not widely accepted in avr assembly programming. There is a need for optimization or twisting the stack frame build up requirement.

Consequently, a limit on the depth of recursion to avoid stack overflows is required. Apart from this a manual stack build up is required which makes recursive algorithms quite awkward to handle. The last but not least is we keep in stack only the variables that change and keep the result of calculation as global variable thus reducing the stack frame size for each call.

A classic example of a recursive procedure is the function used to calculate the factorial of a natural number:

Let us see how to go about building a recursive algorithm in avr assember for finding a factorial of a number.

**Building the stack**

Preparation of stack frame includes saving the local variables onto the stack one by one , leaving the return address saving to the CPU. In factorial example there is only one variable (n) to frame up. We will use the result of calculations in a global variable to save stack space and code size as well. As a result each stack frame will look like

A simple implementation of stack build up is

factorial:

//EXIT CONDITION

…….

ret

nextframe:

//BUILD NEXT STACK FRAME

push al

dec al

rcall factorial

//CALCULATE REWINDING THE STACK

…….

ret

**Exit recursion condition**

**Each recursive algorithm must have an exit or stop recursion condition to check against in order to stop building up the stack. In our sample factorial function the condition to stop is when (n) equates to 0. Adding the condition to sample factorial function:**

factorial:

//EXIT CONDITION

cpi al,0

brne nextframe

ret

nextframe:

//BUILD NEXT STACK FRAME

push al

dec al

rcall factorial

//CALCULATE REWINDING THE STACK

…….

ret

**Calculate rewinding stack**

The last step to consider is the calculation for each stack frame. In our case it is multiplying each stack variable (n) with the result of the previous stack frame. The result of multiplication is kept in a global register pair variable to make the sample as simple as possible.

factorial:

//EXIT CONDITION

cpi al,0

brne nextframe

ret

nextframe:

//BUILD NEXT STACK FRAME

push al

dec al

rcall factorial

//CALCULATE REWINDING THE STACK

pop al

movw resulth:resultl,output3:output4

rcall mul16x16_32 ;no stack manipulation

ret

**Implementation**

The sample implementation puts a limit to word size result which limits the input to 8!, which equates to 40320. The size for each frame is 3 bytes so total recursion stack size is 24.

Mind how much RAM size is allocated for each task in order to run the task. The default size is less than 20 bytes so an increase of the variable TASK_STACK_DEPTH in kernel.inc is required.

.def output1=r19

.def output2=r18

.def output3=r17

.def output4=r16

.def resulth=r23

.def resultl=r22

.def ah=r21

.def al=r20

Task:

ldi resultl,0

ldi resulth,0

ldi output4,1

clr ah

ldi al,8

rcall factorial

main:

rjmp main

ret

/*

Reqursive factorial function f()=9!=1*2*3*4*5*6*7*8

*/

factorial:

//EXIT CONDITION

cpi al,0

brne nextframe

ret

nextframe:

//BUILD NEXT STACK FRAME

push al

dec al

rcall factorial

//CALCULATE REWINDING THE STACK

pop al

movw resulth:resultl,output3:output4

rcall mul16x16_32 ;no stack manipulation

ret

;**********************

;* Input= r23:r22 * r21:r20

;* Usage=r0,r1,r2

;* Output = r19:r18

;* r17:r16

;****MULTIPLY unsigned 16x16=32bit

mul16x16_32:

clr r2

mul r23, r21 ; ah * bh

movw r19:r18, r1:r0

mul r22, r20 ; al * bl

movw r17:r16, r1:r0

mul r23, r20 ; ah * bl

add r17, r0

adc r18, r1

adc r19, r2

mul r21, r22 ; bh * al

add r17, r0

adc r18, r1

adc r19, r2

ret