You can use the shift instructions (SHR, SHL, SAR, and SAL) for multiplication and division. Shifting an integer right by one bit has the effect of dividing by two; shifting left by one bit has the effect of multiplying by two. You can take advantage of shifts to do fast multiplication and division by powers of two. For example, shifting left twice multiplies by four, shifting left three times multiplies by eight, and so on.
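Use SHR (Shift Right) to divide unsigned numbers. You can use SAR (Shift Arithmetic Right) to divide signed numbers, but the two instructions round negative results differently: SAR rounds negative numbers down (toward negative infinity), whereas IDIV always rounds them up (toward 0). For example, dividing -7 by 2 yields -4 with SAR but -3 with IDIV. Division using SAR must adjust for this difference. Multiplication by shifting is the same for signed and unsigned numbers, so you can use either SAL or SHL.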
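One common way to make that adjustment is to add the divisor minus one to negative dividends before shifting. The fragment below is a minimal sketch of this idea for division by 4, not a routine taken from this guide. It assumes the signed dividend is already in AX and that DX can be overwritten.

```asm
; Signed divide of AX by 4, rounding toward 0 as IDIV does
        cwd                 ; DX = 0FFFFh if AX is negative, else 0
        and   dx, 3         ; DX = 3 (divisor - 1) if negative, else 0
        add   ax, dx        ; Bias negative dividends only
        sar   ax, 1         ; AX = biased dividend / 2
        sar   ax, 1         ; AX = dividend / 4, rounded toward 0
```

With AX = -7, the bias gives -4 and the two shifts give -1, which matches IDIV. Without the bias, SAR alone would give -2.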
Summary: Use shifts instead of MUL or DIV to optimize your code.
Since the multiply and divide instructions are very slow on the 8088 and 8086 processors, using shifts instead can often speed operations by a factor of 10 or more. For example, on the 8088 or 8086 processor, these statements take only four clocks:
sub ah, ah ; Clear AH
shl ax, 1 ; Multiply byte in AL by 2
The following statements produce the same results, but take between 74 and 81 clocks on the 8088 or 8086. The same statements take 15 clocks on the 80286 and between 11 and 16 clocks on the 80386.
mov bl, 2 ; Multiply byte in AL by 2
mul bl
You can put multiplication and division operations in macros so they can be changed if the constants in a program change, as shown in the two macros below.
mul_10 MACRO factor ; Factor must be unsigned
mov ax, factor ; Load into AX
shl ax, 1 ; AX = factor * 2
mov bx, ax ; Save copy in BX
shl ax, 1 ; AX = factor * 4
shl ax, 1 ; AX = factor * 8
add ax, bx ; AX = (factor * 8) + (factor * 2)
ENDM ; AX = factor * 10
div_512 MACRO dividend ; Dividend must be unsigned
mov ax, dividend ; Load into AX
shr ax, 1 ; AX = dividend / 2 (unsigned)
xchg al, ah ; XCHG is like rotate right 8
; AL = (dividend / 2) / 256
cbw ; Clear upper byte (AL is at most 7Fh, so CBW sets AH to 0)
ENDM ; AX = (dividend / 512)
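As a brief illustration, the macros above might be invoked as follows. The variable names items and nbytes are hypothetical and appear only in this sketch.

```asm
        .DATA
items   WORD    25
nbytes  WORD    2048
        .CODE
        mul_10  items       ; AX = 250 (BX is overwritten)
        div_512 nbytes      ; AX = 4
```

Because mul_10 uses BX as scratch space, save BX first if its value is still needed.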
Summary: Since RCR and RCL shift the carry flag into the operand, make sure the flag holds the value you want (usually clear) before multiple-register shifts.
If you need to shift a value that is too large to fit in one register, you can shift each part separately. The RCR (Rotate through Carry Right) and RCL (Rotate through Carry Left) instructions carry values from the first register to the second by passing the leftmost or rightmost bit through the carry flag.
This example shifts a multiword value.
.DATA
mem32 DWORD 500000
.CODE
; Divide 32-bit unsigned by 16 (500000 / 16 = 31250)
mov cx, 4 ; Shift right 4
again: shr WORD PTR mem32[2], 1 ; Shift into carry
rcr WORD PTR mem32[0], 1 ; Rotate carry in
loop again
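The same technique works in the other direction. As a sketch, assuming the mem32 variable defined above, the following fragment multiplies the 32-bit value by 8. It starts with the low word so that each bit shifted out can be rotated into the high word. The label next is chosen only for this illustration.

```asm
; Multiply 32-bit unsigned by 8
        mov   cx, 3                     ; Shift left 3
next:   shl   WORD PTR mem32[0], 1      ; Low word: high bit goes into carry
        rcl   WORD PTR mem32[2], 1      ; High word: rotate carry into bit 0
        loop  next
```

The LOOP instruction does not change the flags, and in any case SHL sets the carry again at the start of each pass.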
Since the carry flag is treated as part of the operand (it's like using a 9-bit or 17-bit operand), the flag value before the operation is crucial. The carry flag can be set by a previous instruction, but you can also set it directly by using the CLC (Clear Carry Flag), CMC (Complement Carry Flag), and STC (Set Carry Flag) instructions.
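For instance, if a multiple-register shift begins with RCR rather than SHR, whatever is in the carry flag becomes the new high bit. The fragment below is a minimal sketch that assumes a 32-bit unsigned value held in DX:AX. It clears the carry first so that a 0 is shifted in.

```asm
; Shift the 32-bit unsigned value in DX:AX right by one bit
        clc                 ; Carry = 0, so a 0 enters bit 15 of DX
        rcr   dx, 1         ; High word: carry -> bit 15, bit 0 -> carry
        rcr   ax, 1         ; Low word: carry from DX -> bit 15
```

Replacing CLC with STC would shift a 1 into the top bit instead.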
On the 80386 and 80486, an alternate method for multiplying quickly by constants takes advantage of the LEA (Load Effective Address) instruction and the scaling of indirect memory operands. By using the same 32-bit register as the scaled index and, for the odd constants, also as the base register in an indirect memory operand, you can multiply by the constants 2, 3, 4, 5, 8, and 9 more quickly than you can by using the MUL instruction. LEA calculates the offset of the source operand and stores it in the destination register (EBX in this example):
lea ebx, [eax*2] ; EBX = 2 * EAX
lea ebx, [eax*2+eax] ; EBX = 3 * EAX
lea ebx, [eax*4] ; EBX = 4 * EAX
lea ebx, [eax*4+eax] ; EBX = 5 * EAX
lea ebx, [eax*8] ; EBX = 8 * EAX
lea ebx, [eax*8+eax] ; EBX = 9 * EAX
Section 3.2.4.3, “Indirect Memory Operands with 32-Bit Registers,” discusses scaling of 80386 indirect memory operands, and Section 3.3.3.2, “Loading Addresses into Registers,” introduces LEA.
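Constants outside this set can sometimes be reached by combining LEA with one more instruction. The following fragment is a sketch of that idea, not a sequence taken from this guide. It multiplies EAX by 10 and assumes 32-bit instructions are enabled (for example, with the .386 directive).

```asm
; Multiply EAX by 10 (assumes 80386 instructions are enabled)
        lea   eax, [eax*4+eax]      ; EAX = 5 * EAX
        add   eax, eax              ; EAX = 10 * EAX
```

LEA leaves the flags unchanged, but the ADD in this sequence modifies them.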
This chapter has covered the integer operations you use in your MASM programs. The next chapter looks at more complex data types—arrays, strings, structures, unions, and records. Many of the operations presented in this chapter can also be applied to the data structures discussed in Chapter 5, “Defining and Using Complex Data Types.”