movzx and cwd

zx485 2020-02-01 11:38

Whether the bare result is equivalent depends on the value. The description of CWD states:

Doubles the size of the operand in register AX, EAX, or RAX (depending on the operand size) by means of sign extension and stores the result in registers DX:AX, EDX:EAX, or RDX:RAX, respectively. The CWD instruction copies the sign (bit 15) of the value in the AX register into every bit position in the DX register.

So if the value in AX is at most 32,767 (0x7FFF, i.e. bit 15 is clear), the result of CWD is equivalent to both MOVZX (zero extend) and MOVSX (sign extend). But if bit 15 is set (0x8000 or greater), it is only equivalent to MOVSX. Usually zero extension (MOVZX) is used in combination with DIV (unsigned division) and sign extension (MOVSX) in combination with IDIV (signed division).
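To make the value dependence concrete, here is a minimal sketch that models the three instructions in Python rather than running them on hardware. The helper names (`cwd`, `movzx32`, `movsx32`, `combine`) are made up for illustration, and `combine` simply treats DX:AX as one 32-bit number.

```python
def cwd(ax):
    """Model CWD: return (dx, ax), with DX filled with copies of bit 15 of AX."""
    ax &= 0xFFFF
    dx = 0xFFFF if ax & 0x8000 else 0x0000
    return dx, ax


def movzx32(ax):
    """Model MOVZX r32, r16: zero-extend a 16-bit value to 32 bits."""
    return ax & 0xFFFF


def movsx32(ax):
    """Model MOVSX r32, r16: sign-extend a 16-bit value to 32 bits."""
    ax &= 0xFFFF
    return (ax | 0xFFFF0000) if ax & 0x8000 else ax


def combine(dx, ax):
    """Read the register pair DX:AX as a single 32-bit value."""
    return ((dx & 0xFFFF) << 16) | (ax & 0xFFFF)


# Compare the three results for values on both sides of the bit-15 boundary.
for value in (0x1234, 0x7FFF, 0x8000, 0xFFFF):
    print(
        f"AX={value:#06x}  "
        f"CWD={combine(*cwd(value)):#010x}  "
        f"MOVZX={movzx32(value):#010x}  "
        f"MOVSX={movsx32(value):#010x}"
    )
```

In this model all three agree up to 0x7FFF, and from 0x8000 upward only CWD and MOVSX agree.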

But there remains the problem of where the result will be stored:
CWD stores the 32-bit result in two 16-bit registers DX:AX, while the MOVZX/MOVSX instructions store it in the 32-bit register EAX.

This has consequences on the following DIV instruction. The first part of your code uses the 32-bit value in DX:AX as input, while the second approach assumes EAX to be the input of a 16-bit DIV:

F7 /6   DIV r/m16   M   Valid   Valid   Unsigned divide DX:AX by r/m16, with result stored in AX ← Quotient, DX ← Remainder.

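which makes the result unpredictable: MOVZX/MOVSX do not write DX, so it still holds whatever value was there before, and the upper half of EAX is not used by the division. If the stale DX makes the quotient too large to fit in AX, DIV raises a divide error (#DE) instead of returning a result.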
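The effect of a stale DX can be illustrated with a small Python model of the 16-bit DIV described in the manual entry quoted above. This is a sketch, not real hardware behaviour in every detail: `div16` is a hypothetical helper, and the #DE fault is represented by Python exceptions.

```python
def div16(dx, ax, divisor):
    """Model DIV r/m16: divide DX:AX by a 16-bit divisor.

    Returns (quotient, remainder) as they would land in AX and DX.
    The #DE fault is modelled with Python exceptions.
    """
    divisor &= 0xFFFF
    if divisor == 0:
        raise ZeroDivisionError("#DE: divide by zero")
    dividend = ((dx & 0xFFFF) << 16) | (ax & 0xFFFF)
    quotient, remainder = divmod(dividend, divisor)
    if quotient > 0xFFFF:
        raise OverflowError("#DE: quotient does not fit in AX")
    return quotient, remainder


# Intended computation: 1000 / 10 with DX cleared first.
print(div16(0x0000, 1000, 10))

# Same AX, but DX still holds a leftover value from earlier code.
print(div16(0x0003, 1000, 10))

# A larger leftover value makes the quotient overflow AX.
try:
    div16(0x0010, 1000, 10)
except OverflowError as fault:
    print(fault)
```

Only the first call computes 1000 / 10; the other two differ solely in what happened to be left in DX.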

Peter Cordes 2020-02-01 11:25:07

You should probably mention that you normally want to zero-extend before div (i.e. xor edx,edx), and only sign-extend before idiv. See "When and why do we sign extend and use cdq with mul/div?" and "Why should EDX be 0 before using the DIV instruction?"
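As a rough illustration of that pairing, the following Python sketch models 16-bit DIV and IDIV on the same bit pattern in AX (0xFFF6, which is 65526 unsigned or -10 signed). All helper names are hypothetical, and faults are represented by Python exceptions.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def div16(dx, ax, divisor):
    """Model unsigned DIV r/m16 on DX:AX; returns (AX, DX) after the divide."""
    divisor &= 0xFFFF
    if divisor == 0:
        raise ZeroDivisionError("#DE: divide by zero")
    dividend = ((dx & 0xFFFF) << 16) | (ax & 0xFFFF)
    quotient, remainder = divmod(dividend, divisor)
    if quotient > 0xFFFF:
        raise OverflowError("#DE: quotient does not fit in AX")
    return quotient, remainder


def idiv16(dx, ax, divisor):
    """Model signed IDIV r/m16 on DX:AX; returns (AX, DX) as 16-bit patterns."""
    dividend = to_signed(((dx & 0xFFFF) << 16) | (ax & 0xFFFF), 32)
    d = to_signed(divisor, 16)
    if d == 0:
        raise ZeroDivisionError("#DE: divide by zero")
    # x86 truncates toward zero; the remainder takes the sign of the dividend.
    quotient = abs(dividend) // abs(d)
    if (dividend < 0) != (d < 0):
        quotient = -quotient
    remainder = dividend - quotient * d
    if not -0x8000 <= quotient <= 0x7FFF:
        raise OverflowError("#DE: quotient does not fit in AX")
    return quotient & 0xFFFF, remainder & 0xFFFF


ax = 0xFFF6

# Unsigned: zero DX (xor edx,edx), then DIV.
print(div16(0x0000, ax, 3))

# Signed: sign-extend into DX (cwd), then IDIV.
print(tuple(hex(r) for r in idiv16(0xFFFF, ax, 3)))

# Mismatch: sign-extended DX followed by unsigned DIV faults in this model.
try:
    div16(0xFFFF, ax, 3)
except OverflowError as fault:
    print(fault)
```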
