Code using RISC-V a program that gives the largest value in a sequence

眠りネロク 2020-02-02 22:53

I think your intended use of the registers is:

x5 contains the current value read from the sequence.
x6 contains the index into the sequence.
x7 contains the maximum value seen so far.

There is no need to store x7 in memory since the maximum value is a single value at any moment. The register x7 should be initialized at the very beginning.

x7 could be initialized with the lowest possible value it can hold, i.e., -2⁶³:

addi x7, x0, 1
slli x7, x7, 63

Any value read from the sequence other than this lowest possible value would cause the current maximum to be updated.

Alternatively, you can directly load the first element of the sequence into x7 since there is always one element available for loading (the terminating 0 if it's an empty sequence):

ld x7, sr(x6)

The following branch instruction in your code:

        bge x5, x7, skip
skip:   addi x6, x6, 8

Regardless of whether or not the condition holds (x5 >= x7), the next instruction executed is always addi x6, x6, 8. What is missing between these two instructions, after the skip label, is the code for updating the current maximum value seen so far, i.e., an instruction for moving the contents from x5 to x7. The operands x5 and x7 of the bge instruction must also be swapped since you want to skip the code for updating the maximum when x7 >= x5 holds (i.e., the maximum isn't updated when the maximum is already greater or equal to the current value):

        bge x7, x5, skip # skip the update of the maximum?
        addi x7, x5, 0   # update new maximum value
skip:   addi x6, x6, 8

Therefore, if the branching condition does hold, i.e., if x7 (the maximum) is greater or equal to x5 (the current value read), the code for updating the maximum is skipped.

Instead of having two branch instruction in the loop: the beq x5, x0, end that terminates the loop if the zero value has been reached in the sequence, and the unconditional jump beq x0, x0, loop as the last instruction of your loop for repeating the loop, you could rearrange your code so that the loop's last instruction is:

bne x5, x0, loop   # is the end of the sequence not reached yet?

This would replace both beq x5, x0, end and beq x0, x0, loop: if the terminating value of the sequence (i.e., zero) has been reached it falls through, otherwise, it iterates the loop again.

Keeping all these things in mind, your code could look like:

src:    DD -1, 5, -3, 7, 0

        add x6, x0, x0      # initialize the index
        ld x7, sr(x6)       # initialize the maximum
        addi x5, x7, 0      # initialize with the first value

        beq x5, x0, end     # is the sequence empty?

loop:   bge x7, x5, skip    # skip the update of the maximum?
        addi x7, x5, 0      # update the maximum with the new value read
skip:   addi x6, x6, 8      # update the index
        ld x5, sr(x6)       # load the next value from the sequence
        bne x5, x0, loop    # is the end of the sequence not reached yet?        
end:    
        addi x5, x7, 0      # place the final result in x5 (your problem's assignment)
        ebreak x0, x0, 0

Peter Cordes 2020-02-02 21:27:27

The array is implicit-length so it's always safe to load the first element. You could use that instead of constructing INT64_MIN with 2 regs. The return value is meaningless anyway if there are no elements before the terminator. That might also be a good setup for peeling the first iteration so you can put the loop-exit conditional branch at the bottom of the loop where it belongs.

Peter Cordes 2020-02-02 22:34:16

I think you introduced a bug. You now check x5 for being a new max before you check it for being a terminator. I'd put ld + bne near the bottom, with a jump to there to enter the loop. Or peel part of the first iteration so you can just fall into the loop. (i.e. check if the main part of the loop body needs to run 0 times, and skip the loop if that's the case. Like normal for a loop that might need to run 0 times.)

Peter Cordes 2020-02-02 22:59:30

Yup, that's what I was describing in the last part of my comment: branch to skip the whole loop before falling into a normal do{}while loop structure. Now you also avoid loading the first element twice.

Hani Salehi 2020-02-03 04:45:14

thanks for the help it really cleared things up for me

Peter Cordes 2020-02-03 15:37:07

BTW, for performance you'd probably want to write the loop so the normal case (no new maximum) is a not-taken branch. Taken branches can have more costs than not-taken even when predicted correctly, especially on a superscalar CPU. So for example you could conditionally jump to another block (after the end of the function) which updates x7 and jumps back. Possibly duplicate the loop tail so you can fall through (to a jump to end, or to a duplicated end of the whole function that returns). Or jump directly to loop after setting up for the next iteration if we didn't load the terminator.

Related issues

Understanding Syntax of ARM Vector Table Assembly Declaration

Using LEA on values that aren't addresses / pointers?

Writing x86_64 linux kernel module in assembler

Assembly x86 Programming Debugging (GDB): How to print out data through advancing indexing

(mips64) lw and sw not adding value to registers

Is right shift undefined behavior if the count is larger than the width of the type?

Is it possible to implement pipelining in a accumulator based machine/ISA?

x86 Assembly: Prologue of recursive function messes up parameters

Cortex M4: cmp instruction broken?

Binary Bomb Phase 5 issue (my phase 5 seems to be different from everyone elses)