How exactly does the x86 LOOP instruction work?

Posted on 2017-10-23 02:49:51

            mov    ecx, 16
looptop:    .
            .
            .
            loop looptop

How many times will this loop execute?

What happens if ecx = 0 to start with? Does loop jump or fall-through in that case?

Questioner
Hannah Duncan
Peter Cordes 2019-10-10 04:39:59

loop is exactly like dec ecx / jnz, except it doesn't set flags.

It's like the bottom of a do{} while(--ecx != 0); in C. If execution enters the loop with ecx = 0, wrap-around means the loop will run 2^32 times. (Or 2^64 times in 64-bit mode, because it uses RCX).
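To make the do{}while shape concrete, here is a minimal C sketch of the counter behaviour. It only models the semantics, not how a CPU implements the instruction. The function name loop_iterations_cx is made up for this example, and it models the 16-bit CX form purely so the wrap-around case finishes quickly; the ECX and RCX forms behave the same way with 2^32 and 2^64.

```c
#include <stdint.h>

/* Hypothetical helper: counts how many times a loop body runs when the
   loop ends with `loop label` and the body never touches CX. */
unsigned long loop_iterations_cx(uint16_t cx)
{
    unsigned long body_runs = 0;
    do {
        body_runs++;      /* stand-in for the loop body */
        cx--;             /* loop: decrement first, wrapping 0 -> 0xFFFF */
    } while (cx != 0);    /* then jump back only if the counter is non-zero */
    return body_runs;
}
```

Under this model, a starting value of 16 gives 16 passes through the body, and a starting value of 0 gives 65536 (2^16), which is the 16-bit analogue of the 2^32 case described above.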

Unlike rep movsb/stosb/etc., it doesn't check for ECX=0 before decrementing, only after.
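For contrast, a rep-prefixed string instruction can be modelled as a while loop that tests the count before each step. This is a rough sketch under the same assumptions as before (16-bit count, hypothetical function name rep_iterations_cx), and the "step" is just a placeholder for one movsb/stosb-style operation.

```c
#include <stdint.h>

/* Hypothetical model of how a rep-prefixed string instruction treats
   its count register. */
unsigned long rep_iterations_cx(uint16_t cx)
{
    unsigned long steps = 0;
    while (cx != 0) {     /* count is checked before each step */
        steps++;          /* stand-in for one movsb/stosb-style step */
        cx--;
    }
    return steps;
}
```

With a count of 0 this does nothing at all, whereas the do{}while shape of loop would run the body once and then keep going until the counter wraps back around to zero.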

The address-size determines whether it uses CX, ECX, or RCX. So in 64-bit code, addr32 loop is like dec ecx / jnz, while a regular loop is like dec rcx / jnz. Or in 16-bit code, it normally uses CX, but an address-size prefix (0x67) will make it use ecx. As Intel's manual says, it ignores REX.W, because that sets the operand-size, not the address-size.
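The effect of the address size can be sketched in C as a mask on the counter. This is only an illustration: loop_branch_taken and its addr_bits parameter are names invented here, and the sketch deliberately models nothing but the decrement-and-test on the low bits (it says nothing about what happens to the rest of the register).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: does `loop` jump back, given the full 64-bit register
   value before the instruction and the effective address size in bits
   (16, 32, or 64)? Only the low `addr_bits` bits take part in the
   decrement and the zero test. */
bool loop_branch_taken(uint64_t rcx, unsigned addr_bits)
{
    uint64_t mask = (addr_bits == 64) ? UINT64_MAX
                                      : ((UINT64_C(1) << addr_bits) - 1);
    uint64_t counter = (rcx - 1) & mask;  /* decrement CX, ECX, or RCX */
    return counter != 0;                  /* jump only if still non-zero */
}
```

For example, with RCX = 0x100000001, a regular loop in 64-bit code sees a non-zero RCX after the decrement and jumps, while addr32 loop only looks at ECX, which goes from 1 to 0, so it falls through.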

Related: Why are loops always compiled into "do...while" style (tail jump)? for more about loop structure in asm, while(){} vs. do{}while() and how to lay them out.


Extra debugging tips

If you ever want to know the details on an instruction, check the manual: either Intel's official vol.2 PDF instruction set reference manual, or an html extract with each entry on a different page (http://felixcloutier.com/x86/). But note that the HTML leaves out the intro and appendices that have details on how to interpret stuff, like when it says "flags are set according to the result" for instructions like add.

And you can (and should) also just try stuff in a debugger: single-step and watch registers change. Use a smaller starting value for ecx so you get to the interesting ecx=1 part sooner. See also the x86 tag wiki for links to manuals, guides, and asm debugging tips at the bottom.


And BTW, if the instructions inside the loop that aren't shown modify ecx, it could loop any number of times. For the question to have a simple and unique answer, you need a guarantee that the instructions between the label and the loop instruction don't modify ecx. (They could save/restore it, but if you're going to do that it's usually better to just use a different register as the loop counter. push/pop inside a loop makes your code hard to read.)
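As one made-up illustration of how a body that touches the counter changes the answer, suppose the hidden instructions include a single dec cx. The C sketch below models that case with a 16-bit counter; the function name and the limit parameter (a safety cap so the model always returns) are assumptions for this example only.

```c
#include <stdint.h>

/* Hypothetical model: the loop body itself executes `dec cx` once per
   iteration before the `loop` instruction runs. Stops after `limit`
   iterations so the model terminates even when the real loop would not. */
unsigned long iterations_with_body_dec(uint16_t cx, unsigned long limit)
{
    unsigned long n = 0;
    do {
        n++;
        cx--;                            /* the body clobbers the counter */
    } while (--cx != 0 && n < limit);    /* loop: decrement, then test */
    return n;
}
```

Starting from 16, this runs only 8 times. Starting from an odd value such as 15, the counter is always odd right after loop's decrement, so it never equals zero at the test and the loop would spin forever; the model just stops at the cap.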


Rant about over-use of LOOP even when you already need to increment something else in the loop. LOOP isn't the only way to loop, and usually it's the worst.

You should normally never use the loop instruction unless optimizing for code-size at the expense of speed, because it's slow. Compilers don't use it. (So CPU vendors don't bother to make it fast; catch-22.) Use dec / jnz, or an entirely different loop condition. (See also http://agner.org/optimize/ to learn more about what's efficient.)

Loops don't even have to use a counter; it's often just as good if not better to compare a pointer to an end address, or to check for some other condition. (Pointless use of loop is one of my pet peeves, especially when you already have something in another register that would work as a loop counter.) Using cx as a loop counter often just ties up one of your precious few registers when you could have used cmp/jcc on another register you were incrementing anyway.
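Here is roughly what the pointer-vs-end-address idea looks like in C. It's a sketch, not a recommendation for any particular code: sum_until_end is a hypothetical name, and the function assumes the array has at least one element so the do{}while shape (test at the bottom, like cmp / jne after the body) is valid.

```c
#include <stddef.h>

/* Sums n ints (n must be at least 1), using the end pointer as the loop
   condition so no separate counter is needed. */
long sum_until_end(const int *p, size_t n)
{
    const int *end = p + n;   /* one past the last element */
    long total = 0;
    do {
        total += *p;
        p++;                  /* the pointer increment we needed anyway */
    } while (p != end);       /* plays the role of cmp / jne on the pointer */
    return total;
}
```

The only loop-carried values are the pointer and the running total; nothing plays the role of cx, which is the point being made above.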

IMO, loop should be considered one of those obscure x86 instructions that beginners shouldn't be distracted with. Like stosd (without a rep prefix), aam or xlatb. It does have real uses when optimizing for code size, though. (That's sometimes useful in real life for machine code (like for boot sectors), not just for stuff like code golf.)

IMO, just teach / learn how conditional branches work, and how to make loops out of them. Then you won't get stuck thinking there's something special about a loop that uses loop. I've seen an SO question or comment that said something like "I thought you had to declare loops", from someone who didn't realize that loop was just an instruction.

</rant>. Like I said, loop is one of my pet peeves. It's an obscure code-golfing instruction, unless you're optimizing for an actual 8086.