System software
School of Electrical Engineering
University of Belgrade
Copyright © 2017 Nikola Bebić
This is a school project for the system software class. The goal of the project is to write an assembler for the MicroRISC language and an emulator that executes the programs produced by the assembler.
Just run the build.sh script and everything should be ready to go
Run the run.sh script with the -f option followed by the name of the .ss input file. The script will run the assembler with the input file, generate the output file, and run the emulator with that output file.
Example:
run.sh -f helloworld.ss
Input file:
org 0x0
.rodata.0
dd stack - 4
dd print_string_interrupt
dd 30 dup dummy
org 128
.data.0
out:
dd 0
in:
dd 0
.bss
stack:
DW 0x100 DUP ?
.text
dummy: ret
.global _start
_start:
call hello
load r0, #0
int r0
PRINT_STRING_INTERRUPT def 1
hello:
load r0, #hello_string
load r15, #PRINT_STRING_INTERRUPT
int r15 ; calls print_string_interrupt
ret
print_string:
loadub r1, [r0]
jz r1, skip
store r1, out
load r1, #1
add r0, r0, r1
jmp print_string
skip:
ret;
print_string_interrupt:
call print_string
ret
.rodata.1
hello_string:
DB 'H'
DB 'e'
DB 'l'
DB 'l'
DB 'o'
DB ','
DB ' '
DB 'W'
DB 'o'
DB 'r'
DB 'l'
DB 'd'
DB '!'
DB 10 ; LF
DB 13 ; CR
DB 0 ; end of string
Output:
Hello, World!
32-bit RISC processor
32-bit virtual address space, addressable unit - byte, little-endian
No floating point arithmetic
16 32-bit general purpose registers, R0 - R15
32-bit program counter: PC
32-bit stack pointer: SP. Stack grows towards higher addresses, stack pointer points to the word at the top of the stack
Constant terms can contain the following:
- Literals
- Arithmetic operators (+, -, *, /)
- Subexpressions with parentheses
Literals are signed decimal, binary or hexadecimal integers, or ASCII characters, as well as named constants or labels
Labels can contain letters, digits, and the symbol _, and cannot start with a digit
There is a predefined symbol $, which represents the address of the current instruction
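The rules for constant terms above can be sketched as a small recursive-descent evaluator. This is only an illustration, not the project's assembler: the function names, the 0b prefix for binary literals, and division rounding toward zero are assumptions, since the README does not specify them.

```python
import re

# Hex, binary, decimal, quoted character, identifier, $, or operator.
# The 0b prefix for binary literals is an assumption of this sketch.
TOKEN = re.compile(
    r"\s*(0[xX][0-9a-fA-F]+|0[bB][01]+|\d+|'.'|[A-Za-z_]\w*|\$|[-+*/()])"
)


def tokenize(text):
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def trunc_div(a, b):
    """Integer division rounding toward zero (assumed; not stated in the README)."""
    quotient = abs(a) // abs(b)
    return quotient if (a < 0) == (b < 0) else -quotient


def evaluate(text, symbols=None):
    """Evaluate a constant term; symbols maps labels, named constants and $ to values."""
    symbols = symbols or {}
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of constant term")
        pos += 1
        return tokens[pos - 1]

    def primary():
        token = take()
        if token == "(":
            value = expression()
            if take() != ")":
                raise ValueError("expected ')'")
            return value
        if token == "-":  # unary minus, for signed literals
            return -primary()
        if token[0] == "'":
            return ord(token[1])
        if token[0].isdigit():
            if token[:2].lower() == "0x":
                return int(token[2:], 16)
            if token[:2].lower() == "0b":
                return int(token[2:], 2)
            return int(token, 10)
        return symbols[token]  # label, named constant or $

    def product():
        value = primary()
        while peek() in ("*", "/"):
            if take() == "*":
                value *= primary()
            else:
                value = trunc_div(value, primary())
        return value

    def expression():
        value = product()
        while peek() in ("+", "-"):
            if take() == "+":
                value += product()
            else:
                value -= product()
        return value

    result = expression()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result
```

For instance, with a symbol table that places the label stack at 0x200, evaluate("stack - 4", {"stack": 0x200}) yields 0x1FC, which is the kind of value the first IVT entry in the example program would hold.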
- Immediate: #constant_term
- Register direct: Ri
- Register indirect: [Ri]
- Register indirect with offset: [Ri + offset]. offset is a constant term
- PC relative: $constant_term. This is treated as register indirect with offset. Constant term must contain at least one label
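As a rough illustration of how an assembler might tell these operand forms apart, the sketch below matches an operand string against one pattern per addressing mode. The function name classify and the patterns are hypothetical. Two assumptions are made: only R0 to R15 are accepted as Ri, and a bare constant term is treated as memory direct, a mode that the instruction tables name but this list does not describe.

```python
import re

# Assumption: only the general purpose registers R0..R15 are accepted as Ri.
REGISTER = r"[Rr](?:1[0-5]|[0-9])"

# Patterns are tried in order; the first full match names the addressing mode.
MODES = [
    ("immediate", re.compile(r"#\s*(?P<term>.+)")),
    ("register direct", re.compile(rf"(?P<reg>{REGISTER})")),
    ("register indirect", re.compile(rf"\[\s*(?P<reg>{REGISTER})\s*\]")),
    ("register indirect with offset",
     re.compile(rf"\[\s*(?P<reg>{REGISTER})\s*\+\s*(?P<term>.+?)\s*\]")),
    ("pc relative", re.compile(r"\$\s*(?P<term>.+)")),
    # Fallback (assumption): a bare constant term is a memory direct operand.
    ("memory direct", re.compile(r"(?P<term>.+)")),
]


def classify(operand):
    """Return (mode name, captured parts) for one operand string."""
    operand = operand.strip()
    for name, pattern in MODES:
        match = pattern.fullmatch(operand)
        if match:
            return name, match.groupdict()
    raise ValueError(f"cannot classify operand {operand!r}")
```

With these patterns, classify("[R3 + 8]") returns the mode "register indirect with offset" together with the register R3 and the offset term 8, while classify("out") falls through to memory direct.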
Instruction format:
[label:] instruction [operand0, operand1, operand2] [; comment]
Instruction | Address modes | Comment |
INT op | Register direct | Generates a software interrupt. Interrupt entry is in the register |
JMP op | Memory direct, register indirect, register indirect with offset | Jumps to given address |
CALL op | Memory direct, register indirect, register indirect with offset | Calls a subroutine. PC is pushed to the stack |
RET | None | Returns from subroutine |
JZ reg, op | reg : Register direct, op : Memory direct, register indirect, register indirect with offset | Jumps to op if reg == 0 |
JNZ reg, op | reg : Register direct, op : Memory direct, register indirect, register indirect with offset | Jumps to op if reg != 0 |
JGZ reg, op | reg : Register direct, op : Memory direct, register indirect, register indirect with offset | Jumps to op if reg > 0 |
JGEZ reg, op | reg : Register direct, op : Memory direct, register indirect, register indirect with offset | Jumps to op if reg >= 0 |
JLZ reg, op | reg : Register direct, op : Memory direct, register indirect, register indirect with offset | Jumps to op if reg < 0 |
JLEZ reg, op | reg : Register direct, op : Memory direct, register indirect, register indirect with offset | Jumps to op if reg <= 0 |
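The conditional jumps compare a register against zero. A minimal sketch of that decision follows, assuming the register is interpreted as a signed two's-complement value (otherwise JLZ could never be taken). The helper names to_signed and next_pc are hypothetical and not taken from the emulator.

```python
def to_signed(value):
    """Interpret a 32-bit register value as a two's-complement signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


# One predicate per mnemonic, applied to the signed register value.
CONDITIONS = {
    "JZ": lambda v: v == 0,
    "JNZ": lambda v: v != 0,
    "JGZ": lambda v: v > 0,
    "JGEZ": lambda v: v >= 0,
    "JLZ": lambda v: v < 0,
    "JLEZ": lambda v: v <= 0,
}


def next_pc(mnemonic, reg_value, target, fallthrough):
    """Return the jump target if the condition holds, else the next instruction."""
    taken = CONDITIONS[mnemonic.upper()](to_signed(reg_value))
    return target if taken else fallthrough
```

In the example program, jz r1, skip leaves the print loop once loadub has read the terminating zero byte into r1.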
Load, sizes of operands:
- Unsigned byte, suffix: UB
- Signed byte, suffix: SB
- Unsigned word, suffix: UW
- Signed word, suffix: SW
- Double word, no suffix
Store, sizes of the operands:
- Byte, suffix: B
- Word, suffix: W
- Double word, no suffix
Size of word is 2 bytes, and size of double word is 2 words
Instruction | Address modes | Comment |
LOAD reg, op | reg : Register direct, op : All | Loads the data into the register |
STORE reg, op | reg : Register direct, op : All except immediate | Stores the data from the register |
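The operand sizes and the little-endian byte order can be illustrated with Python's struct module. This is a sketch of the semantics only: the load and store helpers are hypothetical, memory is modelled as a bytearray, and the mnemonics are formed by appending the suffix to LOAD or STORE, as loadub does in the example program.

```python
import struct

# Mnemonic -> struct format; "<" selects little-endian byte order.
LOAD_FORMATS = {
    "LOADUB": "<B",  # unsigned byte, zero-extended
    "LOADSB": "<b",  # signed byte, sign-extended
    "LOADUW": "<H",  # unsigned 2-byte word
    "LOADSW": "<h",  # signed 2-byte word
    "LOAD": "<I",    # 4-byte double word
}
STORE_FORMATS = {"STOREB": "<B", "STOREW": "<H", "STORE": "<I"}


def load(memory, address, mnemonic="LOAD"):
    """Read a value from memory and widen it to a 32-bit register value."""
    (value,) = struct.unpack_from(LOAD_FORMATS[mnemonic.upper()], memory, address)
    return value & 0xFFFFFFFF  # a negative value becomes its two's-complement form


def store(memory, address, value, mnemonic="STORE"):
    """Write the low byte, word or double word of a register value to memory."""
    fmt = STORE_FORMATS[mnemonic.upper()]
    mask = (1 << (8 * struct.calcsize(fmt))) - 1
    struct.pack_into(fmt, memory, address, value & mask)
```

Loading the byte 0x80 with LOADUB gives 0x00000080, while LOADSB gives 0xFFFFFF80, because the sign bit is copied into the upper 24 bits.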
- 32-bit double word is always pushed to the stack, and popped from the stack
Instruction | Address modes | Comment |
PUSH reg | Register direct | Pushes the register to the stack |
POP reg | Register direct | Pops the register from the stack |
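Because the stack grows towards higher addresses and SP points to the word at the top of the stack, a push has to advance SP before writing and a pop has to read before moving SP back. The sketch below shows that ordering. The Stack class is hypothetical, and the memory is modelled as a bytearray.

```python
import struct


class Stack:
    """Upward-growing stack of 32-bit double words; sp addresses the top word."""

    def __init__(self, memory, sp):
        self.memory = memory
        self.sp = sp

    def push(self, value):
        self.sp += 4  # move to the next free word first
        struct.pack_into("<I", self.memory, self.sp, value & 0xFFFFFFFF)

    def pop(self):
        (value,) = struct.unpack_from("<I", self.memory, self.sp)
        self.sp -= 4  # the word below becomes the new top
        return value
```

This ordering is consistent with the example program, where the initial stack pointer is stack - 4, so the first push lands exactly at the label stack.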
- Work only on 32-bit operands
- Signed arithmetic
Instruction | Address modes | Comment |
ADD reg0, reg1, reg2 | Register direct | reg0 = reg1 + reg2 |
SUB reg0, reg1, reg2 | Register direct | reg0 = reg1 - reg2 |
MUL reg0, reg1, reg2 | Register direct | reg0 = reg1 * reg2 |
DIV reg0, reg1, reg2 | Register direct | reg0 = reg1 / reg2 |
MOD reg0, reg1, reg2 | Register direct | reg0 = reg1 % reg2 |
AND reg0, reg1, reg2 | Register direct | reg0 = reg1 & reg2 |
OR reg0, reg1, reg2 | Register direct | reg0 = reg1 | reg2 |
XOR reg0, reg1, reg2 | Register direct | reg0 = reg1 ^ reg2 |
NOT reg0, reg1 | Register direct | reg0 = ~reg1 |
ASL reg0, reg1, reg2 | Register direct | reg0 = reg1 << reg2 |
ASR reg0, reg1, reg2 | Register direct | reg0 = reg1 >> reg2 |
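The table above can be read as signed 32-bit arithmetic with results wrapped back into 32 bits. The sketch below illustrates that reading. The alu function is hypothetical, and several details are assumptions not stated here: DIV and MOD round toward zero as in C, shift counts use only their low 5 bits, and division by zero is simply left to raise a Python exception.

```python
MASK = 0xFFFFFFFF


def to_signed(value):
    """Interpret a 32-bit register value as a two's-complement signed integer."""
    value &= MASK
    return value - (1 << 32) if value & 0x80000000 else value


def trunc_div(a, b):
    """Integer division rounding toward zero (assumed behaviour of DIV)."""
    quotient = abs(a) // abs(b)
    return quotient if (a < 0) == (b < 0) else -quotient


OPERATIONS = {
    "ADD": lambda a, b: a + b,
    "SUB": lambda a, b: a - b,
    "MUL": lambda a, b: a * b,
    "DIV": trunc_div,
    "MOD": lambda a, b: a - b * trunc_div(a, b),
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
    "NOT": lambda a, b: ~a,               # second operand is ignored
    "ASL": lambda a, b: a << (b & 31),    # shift count masking is an assumption
    "ASR": lambda a, b: a >> (b & 31),    # >> on a negative int keeps the sign
}


def alu(mnemonic, reg1, reg2=0):
    """Compute the value written to reg0, wrapped to 32 bits."""
    result = OPERATIONS[mnemonic.upper()](to_signed(reg1), to_signed(reg2))
    return result & MASK
```

Under these assumptions alu("ASR", 0x80000000, 4) gives 0xF8000000, since an arithmetic shift right replicates the sign bit.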
Format:
[label:] definition data_specifier [, ...] [; comment]
Possible definitions:
DB - defines a byte
DW - defines a word
DD - defines a double word
Data specifiers:
constant_term [ DUP constant_term | ? ]
DUP - First constant term denotes how many times the second constant term will occur
? - Undefined value
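To make the effect of DUP and ? concrete, the sketch below expands already-parsed data specifiers into the bytes they would occupy. The emit function and its (count, value) input shape are hypothetical. Representing ? as None and filling it with zero bytes is an assumption, since an undefined value could legitimately be anything.

```python
# Element size in bytes for each definition keyword.
SIZES = {"DB": 1, "DW": 2, "DD": 4}


def emit(definition, specifiers):
    """Expand (count, value) pairs into little-endian bytes.

    count is 1 for a plain constant term and the first term for DUP;
    value None stands for ? and is filled with zeros here (assumption).
    """
    size = SIZES[definition.upper()]
    mask = (1 << (8 * size)) - 1
    out = bytearray()
    for count, value in specifiers:
        fill = 0 if value is None else value & mask
        out += fill.to_bytes(size, "little") * count
    return bytes(out)
```

For example, the line DW 0x100 DUP ? from the example program corresponds to emit("DW", [(0x100, None)]), which reserves 512 bytes for the stack.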
Named constant definition:
symbol DEF constant_expression [; comment]
Origin directive:
ORG constant_expression [; comment]
.text[.number] - section containing the program code
.data[.number] - section containing initialized data
.rodata[.number] - section containing read only data
.bss[.number] - section containing uninitialized data
- IV table starts at the address 0 and has 32 entries
- During the interrupt execution, no hardware interrupt can happen
- Executing INT 0 will end the program
- Entry 0 in the IVT contains the starting value of the stack pointer
- Entry 3 in the IVT contains the address of the error interrupt routine
- Entry 4 in the IVT contains the address of the timer interrupt routine. This routine is called every 0.1 s
- Entry 5 in the IVT contains the address of the keyboard interrupt routine. This routine is called every time a key is pressed
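A minimal sketch of how the IV table entries listed above could be read, assuming each of the 32 entries is a 4-byte little-endian double word (32 entries of 4 bytes end at address 128, which matches where the example program places its data with org 128). The helper names are hypothetical, and how the emulator saves state on interrupt entry is not covered here.

```python
import struct

IVT_BASE = 0      # the table starts at address 0
IVT_ENTRIES = 32
ENTRY_SIZE = 4    # assumption: one 32-bit double word per entry


def ivt_entry(memory, index):
    """Return the 32-bit value stored in IVT entry index."""
    if not 0 <= index < IVT_ENTRIES:
        raise IndexError(f"IVT entry {index} out of range")
    (value,) = struct.unpack_from("<I", memory, IVT_BASE + index * ENTRY_SIZE)
    return value


def initial_sp(memory):
    """Entry 0 holds the starting stack pointer rather than a routine address."""
    return ivt_entry(memory, 0)


def interrupt_target(memory, number):
    """Return the routine address for INT number, or None when INT 0 ends the program."""
    if number == 0:
        return None
    return ivt_entry(memory, number)
```

In the example program, entry 1 holds the address of print_string_interrupt, so int r15 with r15 equal to 1 would resolve to that routine.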
Two registers are mapped in the address space, right after the IV table.
The first register is mapped to the address 128 and is the stdout register. Every time a value is written to this register, it will be written on the standard output stream
The second register is mapped to the address 132 and is the stdin register. Every time a keyboard interrupt happens, this register will contain the ASCII code of the hit character. The value can be read more than once. New interrupts will not happen until the value is read at least once
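The behaviour of the two mapped registers can be sketched as a small memory bus that intercepts accesses to addresses 128 and 132. The Bus class and its method names are hypothetical. Writing only the low byte as a character and dropping keys that arrive before the previous one is read are assumptions of this sketch, not documented behaviour.

```python
import io
import struct

STDOUT_REGISTER = 128
STDIN_REGISTER = 132


class Bus:
    """Byte-addressable memory with the stdout and stdin registers mapped in."""

    def __init__(self, size, out):
        self.memory = bytearray(size)
        self.out = out            # any object with a write(str) method
        self.key_pending = False  # True until the stdin register is read

    def write32(self, address, value):
        struct.pack_into("<I", self.memory, address, value & 0xFFFFFFFF)
        if address == STDOUT_REGISTER:
            self.out.write(chr(value & 0xFF))  # low byte as a character (assumption)

    def read32(self, address):
        if address == STDIN_REGISTER:
            self.key_pending = False  # a read allows the next keyboard interrupt
        (value,) = struct.unpack_from("<I", self.memory, address)
        return value

    def key_pressed(self, char):
        """Latch a key; return True if a keyboard interrupt should be raised."""
        if self.key_pending:
            return False  # previous key not read yet; this key is dropped (assumption)
        struct.pack_into("<I", self.memory, STDIN_REGISTER, ord(char))
        self.key_pending = True
        return True
```

A quick way to try it is to pass an io.StringIO as out, write the codes of a few characters to address 128, and inspect the buffer afterwards.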
MIT License
Copyright (c) 2017 Nikola Bebic
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.