### Part 5: MIPS Instruction Set

In this section, we will describe the encoding format of MIPS assembly instructions, list the most common MIPS instructions, and discuss the anatomy of pseudo-instructions.

#### MIPS Instruction Formats

In Part 1: Introduction to MIPS Assembly, we saw that assembly instructions are mnemonics for the patterns of 1's and 0's that make up machine-code instructions.

MIPS Instructions are always 4 bytes (32 bits) in size. To distinguish one instruction from another, several bits out of the 32 are assigned to represent the operation code (opcode), while other bits are assigned to represent the source and destination registers.

These combinations of bits make up several different types of MIPS instruction formats.

They are:

- R Instructions
- I Instructions
- J Instructions

Each instruction format has its own syntax and encoding, described below in **big-endian** format (most significant bit first). The MIPS Assembly Wikibook covers this in greater detail.

#### R Instruction Format

[**R**]**egister** instructions have operands that are registers.

An R instruction has the machine-code format:

[ **opcode** (6 bits) ] [ **Rs** (5 bits) ] [ **Rt** (5 bits) ] [ **Rd** (5 bits) ] [ **shift** (5 bits) ] [ **function code** (6 bits) ]

The **opcode** identifies the instruction. Related instructions can share the same opcode, in which case the **function code** bits tell them apart.

For example, add and addu have the same opcode but different function codes.

**Rs**, **Rt**, and **Rd** represent the **source register**, **target register**, and **destination register** respectively.

The **shift** bits are used by the shift instructions and give the number of bit positions to shift by.

R instructions that do not use every field of the machine-code format, for example **Rd** or the **shift** bits, encode the unused bits as all 0's.

For example, **jr** **Rs** has the encoding **0000 00ss sss0 0000 0000 0000 0000 1000**

An incomplete list of R-type instructions is

- add
- addu
- and
- or
- sll
- jr

An example of an R-type instruction in binary format would be: **0000 0010 0011 0010 1000 0000 0010 0000**

And the equivalent assembly instruction is: **add** **$s0**, **$s1**, **$s2**
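
The field layout above can be checked with a few shifts and masks. The following is a minimal Python sketch, not part of any assembler; the helper name `encode_r` is made up for illustration. It packs the six R-format fields into a 32-bit word and reproduces the **add** example.

```python
def encode_r(rs, rt, rd, shamt, funct, opcode=0):
    """Pack the six R-format fields into one 32-bit instruction word.

    Hypothetical helper for illustration only. Each field is masked to
    its width (6/5/5/5/5/6 bits) before being shifted into place.
    """
    return (
        (opcode & 0x3F) << 26
        | (rs & 0x1F) << 21
        | (rt & 0x1F) << 16
        | (rd & 0x1F) << 11
        | (shamt & 0x1F) << 6
        | (funct & 0x3F)
    )


# add $s0, $s1, $s2: Rd = $s0 (16), Rs = $s1 (17), Rt = $s2 (18)
# The function code for add is 10 0000.
word = encode_r(rs=17, rt=18, rd=16, shamt=0, funct=0b100000)
print(f"{word:032b}")   # 00000010001100101000000000100000
print(f"0x{word:08x}")  # 0x02328020
```

Note that the operand order in assembly (Rd, Rs, Rt) differs from the field order in the encoding (Rs, Rt, Rd).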

#### I Instruction Format

[**I**]**mmediate** instructions take an immediate value (a constant embedded in the instruction itself) as one operand and a register as the other.

An I instruction has the machine-code format:

[ **opcode** (6 bits) ] [ **Rs** (5 bits) ] [ **Rt** (5 bits) ] [ **Immediate** (16 bits) ]

Just as in R-type instructions, the **opcode** is the binary representation of the instruction and **Rs** and **Rt** represent **source register** and **target register** respectively.

For load, store, and branch instructions, the **immediate** value is also called the **offset**.

An incomplete list of I-type instructions is

- beq
- bne
- addi
- lb
- lui

An example of an I instruction in binary format would be: **0010 0011 1011 1101 0000 0000 0000 0100**

And the equivalent assembly instruction is: **addi** **$sp**, **$sp**, **4**
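
The same approach works for the I format. This is a small Python sketch under the same assumptions as before (the helper name `encode_i` is hypothetical). Masking the immediate to 16 bits also shows how a negative constant ends up in two's complement form.

```python
def encode_i(opcode, rs, rt, immediate):
    """Pack the four I-format fields into one 32-bit instruction word.

    Hypothetical helper for illustration only. The immediate is masked
    to 16 bits, so negative values are stored in two's complement.
    """
    return (
        (opcode & 0x3F) << 26
        | (rs & 0x1F) << 21
        | (rt & 0x1F) << 16
        | (immediate & 0xFFFF)
    )


# addi $sp, $sp, 4: opcode 0010 00, Rs = Rt = $sp (register 29)
word = encode_i(opcode=0b001000, rs=29, rt=29, immediate=4)
print(f"{word:032b}")   # 00100011101111010000000000000100
print(f"0x{word:08x}")  # 0x23bd0004

# addi $sp, $sp, -4: the immediate field becomes 0xfffc
print(f"0x{encode_i(0b001000, 29, 29, -4):08x}")  # 0x23bdfffc
```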

#### J Instruction Format

[**J**]**ump** instructions encode a jump to an absolute address. Jumps and branches will be described in greater detail in Part 6: Jumps and Branches.

A J instruction has the machine-code format:

[ **opcode** (6 bits) ] [ **absolute-address** (26 bits) ]

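The **absolute address** is a 26-bit shortened memory address that is the destination of the jump. Because instructions are 4 bytes long and word-aligned, the lowest two bits of the destination are always 0 and are not stored; the processor shifts the 26-bit value left by 2 and takes the upper 4 bits from the program counter.

An example of a J instruction in binary format would be: **0000 1000 0001 0000 0000 0000 0000 0011**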

And the equivalent assembly instruction is: **j** **0x0040000c**
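
To see how a 32-bit destination fits into 26 bits, here is a minimal Python sketch. The helper names `encode_j` and `jump_target` are hypothetical, and the target calculation follows the formula given for **j** in the instruction table of this section (upper 4 bits from `$pc`, address field shifted left by 2).

```python
def encode_j(opcode, destination):
    """Pack a J-format instruction (hypothetical helper).

    The destination is a byte address; its two low bits are dropped and
    only the next 26 bits are stored in the instruction.
    """
    return (opcode & 0x3F) << 26 | ((destination >> 2) & 0x03FFFFFF)


def jump_target(word, pc):
    """Rebuild the destination: upper 4 bits of pc, then address << 2."""
    return (pc & 0xF0000000) | ((word & 0x03FFFFFF) << 2)


# j 0x0040000c: opcode 0000 10
word = encode_j(0b000010, 0x0040000C)
print(f"{word:032b}")   # 00001000000100000000000000000011
print(f"0x{word:08x}")  # 0x08100003

# Decoding from a program counter in the same 256 MB region
print(f"0x{jump_target(word, pc=0x00400000):08x}")  # 0x0040000c
```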

#### MIPS Registers Encoding

Rs, Rt, and Rd are to be substituted with the corresponding MIPS registers $0-$31 in binary.

For example, the instruction: add $s0, $s1, $s2

Using the equivalent register numbers which can be viewed in the table from Part 3: MIPS Registers, the instruction can be read as: add $16, $17, $18

Converting decimal 16, 17, and 18 to 5-bit binary (10000, 10001, and 10010) gives the encodings for Rd, Rs, and Rt.
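
The name-to-number lookup can be sketched in Python as follows. This assumes the conventional MIPS register naming ($zero = 0, $at = 1, ..., $ra = 31), which should match the table in Part 3; `REGISTER_NAMES` and `register_bits` are names made up for this example.

```python
# Conventional register names, listed in order of register number 0-31.
# This ordering is an assumption based on the usual MIPS calling convention.
REGISTER_NAMES = (
    "zero at v0 v1 a0 a1 a2 a3 "
    "t0 t1 t2 t3 t4 t5 t6 t7 "
    "s0 s1 s2 s3 s4 s5 s6 s7 "
    "t8 t9 k0 k1 gp sp fp ra"
).split()


def register_bits(name):
    """Return the 5-bit binary encoding of a register such as '$s0'."""
    number = REGISTER_NAMES.index(name.lstrip("$"))
    return f"{number:05b}"


for reg in ("$s0", "$s1", "$s2"):
    print(reg, register_bits(reg))
# $s0 10000
# $s1 10001
# $s2 10010
```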

#### MIPS Instruction Set Table

Below is a table with the most common MIPS-32 instructions adapted from the MIPS Assembly Wikibook and from here

To learn the instruction set, I recommend setting up a lab (either MARS, SPIM, or QEMU) to test instructions and see what they do.

| Instruction Name | Description | Syntax | Operation | Instruction Type | Encoding |
| --- | --- | --- | --- | --- | --- |
| add | add (with overflow) | add Rd, Rs, Rt | Rd = Rs + Rt | R | 0000 00ss ssst tttt dddd d000 0010 0000 |
| addi | add immediate (with overflow) | addi Rt, Rs, Immediate | Rt = Rs + Immediate | I | 0010 00ss ssst tttt iiii iiii iiii iiii |
| addiu | add immediate unsigned (no overflow) | addiu Rt, Rs, Immediate | Rt = Rs + Immediate | I | 0010 01ss ssst tttt iiii iiii iiii iiii |
| addu | add unsigned (no overflow) | addu Rd, Rs, Rt | Rd = Rs + Rt | R | 0000 00ss ssst tttt dddd d000 0010 0001 |
| and | bitwise AND | and Rd, Rs, Rt | Rd = Rs & Rt | R | 0000 00ss ssst tttt dddd d000 0010 0100 |
| andi | bitwise AND immediate | andi Rt, Rs, immediate | Rt = Rs & immediate | I | 0011 00ss ssst tttt iiii iiii iiii iiii |
| beq | branch on equal | beq Rs, Rt, offset | (Rs == Rt) ? $pc + (offset << 2) : $pc + 4 | I | 0001 00ss ssst tttt iiii iiii iiii iiii |
| bne | branch on not equal | bne Rs, Rt, offset | (Rs != Rt) ? $pc + (offset << 2) : $pc + 4 | I | 0001 01ss ssst tttt iiii iiii iiii iiii |
| blez | branch on less than or equal to zero | blez Rs, offset | (Rs <= 0) ? $pc + (offset << 2) : $pc + 4 | I | 0001 10ss sss0 0000 iiii iiii iiii iiii |
| bltz | branch on less than zero | bltz Rs, offset | (Rs < 0) ? $pc + (offset << 2) : $pc + 4 | I | 0000 01ss sss0 0000 iiii iiii iiii iiii |
| bltzal | branch on less than zero and link (saves return address) | bltzal Rs, offset | (Rs < 0) ? $ra = $pc + 8; $pc + (offset << 2) : $pc + 4 | I | 0000 01ss sss1 0000 iiii iiii iiii iiii |
| bgez | branch on greater than or equal to zero | bgez Rs, offset | (Rs >= 0) ? $pc + (offset << 2) : $pc + 4 | I | 0000 01ss sss0 0001 iiii iiii iiii iiii |
| bgtz | branch on greater than zero | bgtz Rs, offset | (Rs > 0) ? $pc + (offset << 2) : $pc + 4 | I | 0001 11ss sss0 0000 iiii iiii iiii iiii |
| bgezal | branch on greater than or equal to zero and link (saves return address) | bgezal Rs, offset | (Rs >= 0) ? $ra = $pc + 8; $pc + (offset << 2) : $pc + 4 | I | 0000 01ss sss1 0001 iiii iiii iiii iiii |
| div | divides Rs by Rt and stores quotient in $Lo and remainder in $Hi | div Rs, Rt | $Lo = Rs / Rt; $Hi = Rs % Rt | R | 0000 00ss ssst tttt 0000 0000 0001 1010 |
| divu | divides (unsigned) Rs by Rt and stores quotient in $Lo and remainder in $Hi | divu Rs, Rt | $Lo = Rs / Rt; $Hi = Rs % Rt | R | 0000 00ss ssst tttt 0000 0000 0001 1011 |
| j | jump to 26 bit absolute-address | j absolute-addr | $pc = next-$pc; next-$pc = ($pc & 0xf0000000) \| (absolute-addr << 2); | J | 0000 10aa aaaa aaaa aaaa aaaa aaaa aaaa |
| jal | jump and link (stores return address) | jal absolute-addr | $ra = $pc + 8; $pc = next-$pc; next-$pc = ($pc & 0xf0000000) \| (absolute-addr << 2); | J | 0000 11aa aaaa aaaa aaaa aaaa aaaa aaaa |
| jr | (jump register) jump to 4-byte address contained in register Rs | jr Rs | $pc = next-$pc; next-$pc = Rs; | R | 0000 00ss sss0 0000 0000 0000 0000 1000 |
| lb | (load byte) load one byte into target register from specified address | lb Rt, offset(Rs) | Rt = Memory[Rs + offset] | I | 1000 00ss ssst tttt iiii iiii iiii iiii |
| lui | (load upper immediate) load 2 byte immediate value into upper 2 bytes of a register. Lower 2 bytes are zeroed out | lui Rt, immediate | Rt = immediate << 16 | I | 0011 11-- ---t tttt iiii iiii iiii iiii |
| lw | (load word) load 4 bytes into target register from memory | lw Rt, offset(Rs) | Rt = Memory[Rs + offset] | I | 1000 11ss ssst tttt iiii iiii iiii iiii |
| mfhi | (move from $Hi) contents of register $Hi are moved to destination register | mfhi Rd | Rd = $Hi | R | 0000 0000 0000 0000 dddd d000 0001 0000 |
| mflo | (move from $Lo) contents of register $Lo are moved to destination register | mflo Rd | Rd = $Lo | R | 0000 0000 0000 0000 dddd d000 0001 0010 |
| mult | (multiply) multiply Rs by Rt and store the 64-bit result with the upper 32 bits in $Hi and the lower 32 bits in $Lo | mult Rs, Rt | $Hi:$Lo = Rs * Rt | R | 0000 00ss ssst tttt 0000 0000 0001 1000 |
| multu | (multiply unsigned) multiply Rs by Rt and store the 64-bit result with the upper 32 bits in $Hi and the lower 32 bits in $Lo | multu Rs, Rt | $Hi:$Lo = Rs * Rt | R | 0000 00ss ssst tttt 0000 0000 0001 1001 |
| noop | no operation - CPU does nothing. Most instructions with $zero as the destination register can act as a noop. Most assemblers spell the mnemonic nop. | noop | This particular encoding is implemented as sll $zero, $zero, 0 | R | 0000 0000 0000 0000 0000 0000 0000 0000 |
| or | bitwise OR | or Rd, Rs, Rt | Rd = Rs \| Rt | R | 0000 00ss ssst tttt dddd d000 0010 0101 |
| ori | bitwise OR immediate | ori Rt, Rs, immediate | Rt = Rs \| immediate | I | 0011 01ss ssst tttt iiii iiii iiii iiii |
| sb | (store byte) store least significant byte of Rt to memory | sb Rt, offset(Rs) | Memory[Rs + offset] = (0xff & Rt) | I | 1010 00ss ssst tttt iiii iiii iiii iiii |
| sw | (store word) store 4 bytes at a specified address | sw Rt, offset(Rs) | Memory[Rs + offset] = Rt | I | 1010 11ss ssst tttt iiii iiii iiii iiii |
| sll | (shift left logical) shift register value left with zeroes by specified number of bits | sll Rd, Rt, x | Rd = Rt << x | R | 0000 00-- ---t tttt dddd dxxx xx00 0000 |
| sllv | (shift left logical variable) shift register value left with zeroes by specified number of bits in source register | sllv Rd, Rt, Rs | Rd = Rt << Rs | R | 0000 00ss ssst tttt dddd d--- --00 0100 |
| slt | (set on less than - signed) set destination register to 0x01 if source register is less than target register, else set destination register to 0x00. | slt Rd, Rs, Rt | (Rs < Rt) ? Rd = 1 : Rd = 0 | R | 0000 00ss ssst tttt dddd d000 0010 1010 |
| slti | (set on less than immediate - signed) set target register to 0x01 if source register is less than immediate value, else set target register to 0x00. | slti Rt, Rs, immediate | (Rs < immediate) ? Rt = 1 : Rt = 0 | I | 0010 10ss ssst tttt iiii iiii iiii iiii |
| sra | (shift right arithmetic) shift a register value right with sign bit shifted in by the specified number of bits and place the value in destination register | sra Rd, Rt, x | Rd = Rt >> x (arithmetic) | R | 0000 00-- ---t tttt dddd dxxx xx00 0011 |
| srl | (shift right logical) shift a register value right with zeroes in by the specified number of bits and place the value in destination register | srl Rd, Rt, x | Rd = Rt >> x (logical) | R | 0000 00-- ---t tttt dddd dxxx xx00 0010 |
| srlv | (shift right logical variable) shift a register value right with zeroes in by the specified number of shifts in the source register and place the value in destination register | srlv Rd, Rt, Rs | Rd = Rt >> Rs | R | 0000 00ss ssst tttt dddd d000 0000 0110 |
| sub | subtract two registers and store the result in the destination register | sub Rd, Rs, Rt | Rd = Rs - Rt | R | 0000 00ss ssst tttt dddd d000 0010 0010 |
| syscall | Generate a software interrupt and perform appropriate system call based on value in $v0 | syscall 0x40404 | Operation dependent on syscall number. An example syscall is socket(2, 2, 0) | R | 0000 00-- ---- ---- ---- ---- --00 1100 |
| xor | bitwise exclusive OR | xor Rd, Rs, Rt | Rd = Rs ^ Rt | R | 0000 00ss ssst tttt dddd d--- --10 0110 |
| xori | bitwise exclusive OR to immediate value | xori Rt, Rs, immediate | Rt = Rs ^ immediate | I | 0011 10ss ssst tttt iiii iiii iiii iiii |
#### Pseudo-Instructions

The MIPS instruction set is very minimal, so many assemblers support macros, also known as pseudo-instructions, and translate them into the corresponding real instructions.

An example pseudo-instruction is **move**, which in MIPS is actually achieved using the **add** instruction.

So **move $s0, $s2** translates to **add $s0, $s2, $zero**

Another common pseudo-instruction often seen in disassembly is: **la $a0, 0x7fffffff**

The **load address (la)** instruction is actually represented by two MIPS instructions:

- lui $a0, 0x7fff - (load upper 2 bytes of address)
- ori $a0, $a0, 0xffff - (load lower 2 bytes of address)
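
The split into upper and lower halves can be sketched in Python. This is only an illustration of the lui/ori expansion shown above; the function name `expand_la` is hypothetical, and a real assembler may choose a different sequence.

```python
def expand_la(register, address):
    """Expand 'la register, address' into a lui/ori pair (illustrative)."""
    upper = (address >> 16) & 0xFFFF  # top 2 bytes, loaded by lui
    lower = address & 0xFFFF          # bottom 2 bytes, OR-ed in by ori
    return [
        f"lui {register}, 0x{upper:04x}",
        f"ori {register}, {register}, 0x{lower:04x}",
    ]


for line in expand_la("$a0", 0x7FFFFFFF):
    print(line)
# lui $a0, 0x7fff
# ori $a0, $a0, 0xffff
```

Because lui zeroes the lower 2 bytes of the register, the ori that follows fills them in without disturbing the upper half.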

For more examples of pseudo-instructions, visit the link here

#### Further Reading

1. MIPS Instruction Reference (UIdaho)

2. Programmed Introduction to MIPS Assembly Language (Central Connecticut State University)