tiny-os

A tiny operating system written in C and Assembly.

iso/boot

The iso directory contains several sub-directories that are packed into a single ISO file using the genisoimage binary. It contains the necessary "grub" subdirectory for the bootloader (the program responsible for loading the operating system from secondary storage into main memory) with a configuration file, as well as the stage2_eltorito bootloader file required for building the directory into a bootable ISO file. The compiled kernel.elf binary, which is the entry point of the operating system, is also placed inside iso/boot.

Source files

The operating system consists of multiple files:

io.s - an assembly file (along with its header file) that exposes outb and inb to allow the functions to be used inside of the kernel C code
kmain.c - the kernel "main" file where all of the "higher level" OS logic is defined
loader.s - an assembly file with the actual entrypoint of the kernel. This is where all the memory segments get initialised, along with the stack. External symbols from the kmain.c file are referenced and called (like the main function).
link.ld - a linker script, mostly used for aligning all the memory segments and placing them after 1MB (as everything before that is used by grub)
makefile - the makefile used for automating the build process

Overview

OS development begins by using the loader.s file to put an arbitrary value into one of the registers, running the OS in an emulator (qemu in this case), and passing in a log file path to save all the logs to. After running the OS, the log file is checked for that value to see whether the code in loader.s has successfully been executed.

After this is finished, the stack is set up to allow C code to be run. Functions in C can be called from assembly by first using the extern keyword to tell the NASM assembler that external symbols are going to be used, then putting the function's parameters onto the stack (or into registers) according to the calling convention (in this case cdecl, which pushes arguments onto the stack from right to left), and finally using the "call" instruction on the function.

The framebuffer is just a place in memory where every 2 bytes represent one character cell. It is an 80x25 text display (so 2000 cells), and there are 16 colours that can be used. The first byte of a cell (the one at the lower address) is the ASCII code of the character to be printed, and the following one is split into two 4-bit values: one holds the foreground colour and the other the background colour.

The framebuffer is stored at memory address 0xB8000, as stated by the OSDev wiki.

So if this memory address is treated as a char pointer in the C code, it can simply be used like an array to iterate over each successive byte of the framebuffer.
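
As a rough illustration of the layout described above, the sketch below packs a character and its two colours into one 2-byte cell. It writes into an ordinary array so it can be run on a host machine; in the kernel the same function would be handed a pointer to 0xB8000. The name fb_put_cell and the colour values are assumptions made for this example, it takes a cell index rather than the byte offset that fb_write_cell uses, and the nibble order simply mirrors how shift_up unpacks the attribute byte later in this README. Which nibble the hardware treats as foreground is worth checking against the OSDev wiki.

```c
#define FB_COLUMNS 80
#define FB_ROWS    25

/* Standard VGA text-mode colour numbers (assumed to match the kernel's own constants). */
#define FB_BLACK 0
#define FB_GREEN 2

/*
 * Write one character cell. `cell` counts cells, not bytes, so the byte
 * offset is cell * 2. `fb` is a parameter here so the sketch can run against
 * a plain array; on real hardware it would point at 0xB8000.
 */
static void fb_put_cell(unsigned char *fb, unsigned int cell,
                        char c, unsigned char fg, unsigned char bg)
{
    /* First byte: the ASCII code of the character. */
    fb[cell * 2] = (unsigned char)c;

    /* Second byte: two 4-bit colour values packed together. */
    fb[cell * 2 + 1] = (unsigned char)(((fg & 0x0F) << 4) | (bg & 0x0F));
}
```

Cell 81 is row 1, column 1 (81 = 1 * 80 + 1), so writing to it touches bytes 162 and 163 of the buffer.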

The framebuffer also has two I/O ports: one for describing the data (0x3D4, also called the "Command Port") and one for the data itself (0x3D5, also called the "Data Port"). Reading from and writing to these ports is done via the in and out assembly instructions, for which there are wrappers that allow them to be called from C code. These ports can be used to set the blinking cursor's position.
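
The cursor position is a 16-bit cell index, but the data port only takes one byte at a time, so it has to be sent in two halves. The sketch below shows that sequence under a few assumptions: fake_outb is a host-side stand-in for the real assembly outb wrapper (it only records what would be written), and cursor_move is a hypothetical name rather than the function used in this project.

```c
#define FB_COMMAND_PORT 0x3D4
#define FB_DATA_PORT    0x3D5

/* Register indices selecting the high and low byte of the cursor position. */
#define FB_HIGH_BYTE_COMMAND 14
#define FB_LOW_BYTE_COMMAND  15

/* Host-side stand-in for the assembly outb wrapper: it records each write. */
struct port_write { unsigned short port; unsigned char data; };
static struct port_write port_log[8];
static unsigned int port_log_len = 0;

static void fake_outb(unsigned short port, unsigned char data)
{
    if (port_log_len < 8) {
        port_log[port_log_len].port = port;
        port_log[port_log_len].data = data;
        port_log_len++;
    }
}

/* Move the cursor to (column, row) on an 80-column screen. */
static void cursor_move(unsigned int column, unsigned int row)
{
    unsigned short pos = (unsigned short)(row * 80 + column);

    /* Tell the command port which half is coming, then send that half. */
    fake_outb(FB_COMMAND_PORT, FB_HIGH_BYTE_COMMAND);
    fake_outb(FB_DATA_PORT, (unsigned char)((pos >> 8) & 0xFF));
    fake_outb(FB_COMMAND_PORT, FB_LOW_BYTE_COMMAND);
    fake_outb(FB_DATA_PORT, (unsigned char)(pos & 0xFF));
}
```

In the kernel, fake_outb would be replaced by the outb wrapper exposed from io.s.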

The code also provides a "printf"-like function for writing to the screen using the framebuffer:

int main()
{
    clear();
    char buffer[] = "Hello World!";
    fb_write(buffer, sizeof(buffer) - 1); // exclude the trailing '\0'
    move_cursor(0, 0);
    return 0;
}

int fb_write(char *buf, unsigned int len)
{
    unsigned int writePosition = 0;

    for (unsigned int i = 0; i < len; i++)
    {
        if (buf[i] == '\n') {
            // Move to next line
            unsigned int cursor = writePosition / 2;
            unsigned int nextLine = ((cursor / 80) + 1) * 80;
            writePosition = nextLine * 2;
            continue;
        }
        
        fb_write_cell(writePosition, buf[i], FB_GREEN, FB_BLACK);

        writePosition += 2;  // advance one char
        fb_move_cursor(writePosition / 2);
    }

    return 0;
}

Serial Port

The serial port allows hardware devices to communicate over a serial line. The serial data port sits at the base address (usually 0x3F8 for COM1), and all the other ports are obtained by adding offsets to the base address (+ 2, + 3, + 4...). A configuration byte is sent to the line command port to describe how data is sent over the line. When data is transmitted or received via the serial port, it is placed in buffers, which are FIFO queues. If too much data is sent too fast, the buffer fills up and data is lost. The modem control register is used for simple hardware flow control via the Ready To Transmit (RTS) and Data Terminal Ready (DTR) pins, which prevent data collisions while data is already being sent or received.
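
The port offsets and configuration values mentioned above could be written down roughly as follows. This is a sketch rather than the contents of serial/serial.h: the macro and function names are assumptions, and only the numeric values (the COM1 base, the register offsets, the 8N1 line byte and the "transmit buffer empty" status bit) come from the standard PC serial port layout.

```c
#define SERIAL_COM1_BASE 0x3F8

/* Each register sits at a fixed offset from the base port. */
#define SERIAL_DATA_PORT(base)          (base)
#define SERIAL_FIFO_COMMAND_PORT(base)  ((base) + 2)
#define SERIAL_LINE_COMMAND_PORT(base)  ((base) + 3)
#define SERIAL_MODEM_COMMAND_PORT(base) ((base) + 4)
#define SERIAL_LINE_STATUS_PORT(base)   ((base) + 5)

/* Line configuration byte: 8 data bits, no parity, one stop bit. */
#define SERIAL_LINE_8N1 0x03

/* Bit 5 of the line status register is set when the transmit buffer is empty. */
#define SERIAL_TRANSMIT_EMPTY 0x20

/* The fastest rate is 115200 baud; the divisor scales it down. */
static unsigned short serial_divisor(unsigned long baud)
{
    return (unsigned short)(115200UL / baud);
}

/* Returns non-zero when a line status value says it is safe to send a byte. */
static int serial_can_transmit(unsigned char line_status)
{
    return (line_status & SERIAL_TRANSMIT_EMPTY) != 0;
}
```

Checking serial_can_transmit before every write is what avoids the data loss described above: the sender waits until the buffer has drained instead of overfilling it.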

My goal for the second part of development was to make the project as organised as I could in terms of readability and structure. I followed the same naming conventions for procedures and variables that were used in the files provided (PIC, interrupts, etc.) and kept some of the more minor details, like the header guards used in each file (instead of the usual #pragma once), which work no matter which compiler is used.

File structure

./
├── build
│   -- All The .o files go here to not create a mess in the source files
├── interrupts // Interrupt logic, handlers and programmable interrupt controller  
│   ├── hardware_interrupt_enabler.h
│   ├── hardware_interrupt_enabler.s
│   ├── interrupt_asm.s
│   ├── interrupt_handlers.s
│   ├── interrupts.c
│   ├── interrupts.h
│   ├── pic.c
│   └── pic.h
├── io // Everything to do with Input and output. Shell, commands, keyboard handling (for interrupts), std-like utilities
│   ├── frame_buffer.c
│   ├── frame_buffer.h
│   ├── io.h
│   ├── io.s
│   ├── keyboard.c
│   ├── keyboard.h
│   ├── shell
│   │   ├── shell.c
│   │   └── shell.h
│   └── std
│       ├── std.c
│       └── std.h
├── iso
│   └── boot
│       ├── grub
│       │   ├── menu.lst
│       │   └── stage2_eltorito
│       └── kernel.elf
├── kernel.elf
├── kmain.c
├── link.ld
├── loader.s
├── logQ.txt
├── makefile
├── os.iso
├── serial // Obsolete, logging utilities
│   ├── serial.c
│   └── serial.h
└── util // Utilities like type macros
    ├── type.h
    ├── util.c
    ├── util.h
    └── util.s

Each file and folder contains only the logic and functionality that its name suggests. I have also tried to keep kmain.c as simple and small as possible:

int main() {
    interrupts_install_idt(); 
    enable_hardware_interrupts();
    shell_init_commands();
    clear();
    print_shell();
   
    return 0;
}

Overview

Part 2 consisted of getting interrupts to work so that text could be shown on the screen and creating a working terminal with commands and input validation.

There are multiple layers that go into getting interrupts to work. Interrupts are handled via the IDT (Interrupt Descriptor Table), which describes a handler for each interrupt. Interrupts are numbered 0 - 255. An entry in the IDT consists of 64 bits, which define the address of the interrupt handler (a segment selector plus an offset within that segment), whether the handler is present in memory, and other information.
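
To make the 64-bit entry more concrete, here is one way it could be laid out as a C struct for 32-bit protected mode. The struct and function names are made up for this sketch and are not taken from interrupts.c; the packed attribute assumes GCC or Clang, and the selector value 0x08 used in the usage note is an assumption about where the kernel code segment lives.

```c
#include <stdint.h>

/* One 8-byte gate descriptor in the IDT (32-bit protected mode). */
struct idt_gate {
    uint16_t offset_low;   /* bits 0..15 of the handler address */
    uint16_t selector;     /* code segment selector */
    uint8_t  zero;         /* reserved, must be 0 */
    uint8_t  type_attr;    /* present bit, privilege level, gate type */
    uint16_t offset_high;  /* bits 16..31 of the handler address */
} __attribute__((packed));

/* 0x8E = present, ring 0, 32-bit interrupt gate. */
#define IDT_INTERRUPT_GATE_PRESENT 0x8E

/* Build a gate for a handler located at the 32-bit address `handler`. */
static struct idt_gate idt_make_gate(uint32_t handler, uint16_t selector)
{
    struct idt_gate gate;

    gate.offset_low  = (uint16_t)(handler & 0xFFFF);
    gate.selector    = selector;
    gate.zero        = 0;
    gate.type_attr   = IDT_INTERRUPT_GATE_PRESENT;
    gate.offset_high = (uint16_t)((handler >> 16) & 0xFFFF);

    return gate;
}
```

For example, idt_make_gate(0x00101234, 0x08) splits the address into offset_low 0x1234 and offset_high 0x0010, which is why the handler address cannot simply be stored as one field.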

To return from an interrupt handler (which ultimately is just another asm procedure), the "iret" asm instruction needs to be used. This instruction expects the stack to be the same as at the time of the interrupt, so any stack setup and cleanup needs to be performed by the handler itself.

The PIC is a programmable interrupt controller which maps signals from hardware to interrupt procedures.

There used to be only one PIC with 8 interrupts, but as more hardware was added 8 was not enough, so a second one called the "Slave" PIC was added. It is called the Slave PIC because it passes all the interrupts it receives to the "Master" PIC. On the Master PIC, interrupt line 2 is actually connected to the Slave PIC, therefore IRQ2 (interrupt request 2) can never actually happen on its own (from OSDev.org).

Every time the CPU finishes an instruction, it checks whether the PIC has signalled an interrupt. If it has, data is pushed onto the stack to preserve the current state and control flow is redirected to the interrupt handler.

In order to get the keyboard working, the interrupt raised on the keyboard's pin of PIC1 must be handled. The keyboard does not send ASCII codes; instead it sends "scan codes", which tell the programmer whether a key has been pushed down or released. This makes it easier to do things like checking whether multiple keys are held at once (for getting uppercase letters working).
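
The idea of tracking presses and releases can be sketched as below. This is not the code from io/keyboard.c: the function name is hypothetical, only a handful of keys are mapped, and it assumes scan code set 1, where a release is reported as the press code with the top bit set.

```c
#define SC_RELEASE_BIT 0x80 /* set in the scan code when a key is released */
#define SC_LEFT_SHIFT  0x2A
#define SC_RIGHT_SHIFT 0x36

/* Remembers whether a Shift key is currently held down. */
static int shift_held = 0;

/*
 * Translate one scan code (set 1) into ASCII. Returns 0 when the code does
 * not produce a character: key releases, Shift itself, or unmapped keys.
 */
static char scan_code_to_ascii(unsigned char scan_code)
{
    int released = (scan_code & SC_RELEASE_BIT) != 0;
    unsigned char key = scan_code & 0x7F;
    char c;

    if (key == SC_LEFT_SHIFT || key == SC_RIGHT_SHIFT) {
        shift_held = !released; /* held on press, cleared on release */
        return 0;
    }
    if (released)
        return 0;

    switch (key) {
    case 0x1E: c = 'a'; break;
    case 0x30: c = 'b'; break;
    case 0x2E: c = 'c'; break;
    case 0x39: return ' ';
    case 0x1C: return '\n';
    default:   return 0;
    }

    return shift_held ? (char)(c - 'a' + 'A') : c;
}
```

Because Shift is tracked as state between calls, pressing Shift (0x2A) and then B (0x30) yields 'B', and letters go back to lowercase once the Shift release (0xAA) arrives.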

After a key has been pressed on the keyboard, the CPU notices that an interrupt has occurred on one of the PIC's interrupt pins. The CPU will pause execution and pass control to the keyboard handler. The keyboard handler will read the keyboard scan code, do everything the programmer wants it to do, then acknowledge the interrupt and return to the regular execution flow. If an interrupt is not acknowledged, no other interrupts will go through.
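
Acknowledging an interrupt means sending an "end of interrupt" byte to the PIC that raised it. The sketch below shows one common way to do that; it is not the provided pic.c. It assumes the PICs have been remapped so that PIC1 serves interrupts 0x20 to 0x27 and PIC2 serves 0x28 to 0x2F, and fake_outb is a host-side stand-in that records writes instead of touching hardware.

```c
#define PIC1_COMMAND_PORT 0x20
#define PIC2_COMMAND_PORT 0xA0
#define PIC_ACK           0x20 /* end-of-interrupt command */

/* Assumed remapping: PIC1 serves interrupts 0x20..0x27, PIC2 0x28..0x2F. */
#define PIC1_START_INTERRUPT 0x20
#define PIC2_START_INTERRUPT 0x28
#define PIC2_END_INTERRUPT   (PIC2_START_INTERRUPT + 7)

/* Host-side stand-in for outb that remembers which ports were acknowledged. */
static unsigned short acked_ports[4];
static unsigned int acked_count = 0;

static void fake_outb(unsigned short port, unsigned char data)
{
    if (data == PIC_ACK && acked_count < 4)
        acked_ports[acked_count++] = port;
}

static void pic_acknowledge_sketch(unsigned int interrupt)
{
    if (interrupt < PIC1_START_INTERRUPT || interrupt > PIC2_END_INTERRUPT)
        return; /* not a hardware interrupt, nothing to acknowledge */

    if (interrupt >= PIC2_START_INTERRUPT)
        fake_outb(PIC2_COMMAND_PORT, PIC_ACK); /* the slave raised it */

    /* The master is involved in every hardware interrupt, so it always gets one. */
    fake_outb(PIC1_COMMAND_PORT, PIC_ACK);
}
```

With this mapping the keyboard (IRQ1) arrives as interrupt 0x21 and only the master needs acknowledging, while anything routed through the slave needs both.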

Interrupts

To get keyboard interrupts working, the interrupt descriptor table must first be installed via interrupts_install_idt, which calls interrupts_init_descriptor to fill in an entry of the IDT. Afterwards, enable_hardware_interrupts can be called.

The IDT can be loaded into memory using the "lidt" asm instruction for which a wrapper is created in interrupt_handlers.s.

There are lots of different reasons why an interrupt can occur, and each one needs its own handler. Because of this, a generic interrupt handler is written using NASM macros in interrupt_asm.s, which pushes all the general purpose registers, calls the interrupt handler function, and restores all the registers. These macros take one parameter, the "interrupt number", and are expanded directly into the binary wherever they are used.

The hardware_interrupt_enabler files are simply wrappers for the sti and cli instructions which enable and disable interrupts.

Some of the code that was provided by the 2nd worksheet had parts of it that could be further abstracted away. A good example of this is the interrupts.c file.

I've further abstracted the interrupt handler (which is called from interrupt_asm.s whenever an interrupt occurs):

void interrupt_handler(__attribute__((unused)) struct cpu_state cpu, unsigned int interrupt, __attribute__((unused)) struct stack_state stack) {
    switch (interrupt)
    {
    case INTERRUPTS_KEYBOARD:
        while ((inb(0x64) & 1))
        {
            keyboard_handler(); // This is part of the io/keyboard.c file
            buffer_index = (buffer_index + 1) % INPUT_BUFFER_SIZE;
        }
        pic_acknowledge(interrupt);
        break;
    default:
        break;
    }
}

The keyboard_handler procedure belongs to the io/keyboard.c file, which only handles logic related to the keyboard (i.e. if the shift key was pressed, capitalise letters. If the enter or backspace keys were pressed, execute the appropriate instructions).

IO

The framebuffer logic has been abstracted into its own file which allows for a cleaner way of interacting with the screen's content.

One of the additions to the framebuffer functionality is the ability to scroll all the screen's content up:

void shift_up() {
    // For every row except the first
    for (int row = 1; row < 25; row++) {
        for (int column = 0; column < 80; column++) {
            unsigned int cellByte = (row * 80 + column) * 2;

            short cellData = fb_read_raw_cell(cellByte);

            char c = cellData & 0x00FF;
            unsigned char at = (cellData >> 8) & 0xFF;

            unsigned char fg = (at >> 4) & 0x0F;
            unsigned char bg = at & 0x0F;

            fb_write_cell(cellByte - 160, c, fg, bg);
        }
    }

    row--;
    clear_row(24);
}

This function is called whenever the terminal line reaches the very end of the framebuffer (25th row). It is called either when the user presses enter, or when the print function prints new lines that reach the last row, so that none of the content is lost.
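
Since every cell is copied unchanged, scrolling can also be done on the raw bytes without unpacking the character and colours first. The following is an alternative sketch of that idea, not the shift_up used in this project: fb_scroll_up is a hypothetical name, the framebuffer is passed in as a parameter so the code can be tried on a plain array, and the attribute used for the blank row is left to the caller.

```c
#define FB_COLUMNS   80
#define FB_ROWS      25
#define FB_ROW_BYTES (FB_COLUMNS * 2)

/*
 * Scroll the screen up by one row. `fb` is a parameter so the sketch can be
 * run against a plain array; in the kernel it would point at 0xB8000.
 */
static void fb_scroll_up(unsigned char *fb, unsigned char blank_attribute)
{
    unsigned int i;
    unsigned int last_row = (FB_ROWS - 1) * FB_ROW_BYTES;

    /* Copying forwards is safe: the destination index is always lower
       than the source index, so no byte is overwritten before it is read. */
    for (i = 0; i < last_row; i++)
        fb[i] = fb[i + FB_ROW_BYTES];

    /* Blank the bottom row: a space character plus the chosen attribute. */
    for (i = last_row; i < last_row + FB_ROW_BYTES; i += 2) {
        fb[i]     = ' ';
        fb[i + 1] = blank_attribute;
    }
}
```

After the call, row 0 holds what used to be in row 1, and the last row is empty and ready for new output.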

I created my own printf function which does not take a length for the string; instead it gets the length within the function using a strlen utility I have written, which keeps looping through each character until it reaches a null terminator. This function can be found inside io/std/std.c:

void printf(char *buf, unsigned char fg, unsigned char bg) { // No parameter for the length as it is computed from within the printf function.
    int len = strlen(buf);

    for (unsigned int i = 0; i < len; i++) {

        if (buf[i] == '\n') {
            row++;
            column = 0;
            continue;
        }

        // write at (row, column)
        unsigned int cell = column + row * 80;
        unsigned int byte_offset = cell * 2;
        column++;

        fb_write_cell(byte_offset, buf[i], fg, bg);
        fb_write_cell(byte_offset+2, 0, FB_BLACK, FB_GREEN);
        move_cursor(column, row);

        if (column >= 80) {
            column = 0;
            row++;
        }

        if (row > 23) {
            shift_up();
        }
    }
}

int strlen(char* str) {
    int len = 0;
    for (char *c = str; *c != '\0'; c++) len++;

    return len;
}

I also have a readline function which reads only the user input (i.e. everything after the shell prompt at the beginning of the line):

char* readline() { // returns ONLY the user input
    static char line[81]; // static so that the buffer outlives the call; 80 columns + '\0'
    int i = 0;

    // Skip the prompt: everything up to the first space (stop at the end of the row)
    while (i < 80 && getc(i, row) != ' ') i++;

    if (i < 80)
        i++; // increment so we get past the space

    // Copy the rest of the row into the buffer
    int j = 0;
    for (; i < 80; i++, j++) {
        line[j] = getc(i, row);
    }
    line[j] = '\0';

    return line;
}

This lets me easily parse commands the user types in:

void handle_enter() {
    parse_command(readline());
    if (row > 23) {
        shift_up();
        row = 22;
        clear_row(23);
    } else {
        row++;
    }
    column = 0;
    print_shell(); 
}

void parse_command(char* line) {
    char command[80];
    char arguments[80];

    int i = 0;

    // Extract the command name: everything up to the first space or terminator
    for (; i < 79; i++) {
        if (line[i] == ' ' || line[i] == '\0') break;

        command[i] = line[i];
    }

    command[i] = '\0';

    if (line[i] == ' ') // skip the separator between command and arguments
        i++;

    // Extract arguments
    int j = 0;

    for (; i < 80 && j < 79; i++) {
        if (line[i] == '\0') break;

        arguments[j] = line[i];
        j++;
    }

    arguments[j] = '\0';

    // Dispatch commands
    for (int k = 0; k < NUM_OF_COMMANDS; k++) {
        if (strcmp(command, commands[k].name) == 0) {
            if (strcmp(command, "echo") == 0) { // compare contents, not pointers
                if (row > 23) shift_up();
                row++;
            }

            commands[k].function(arguments);

            return;
        }
    }

    // Anything else that is not an empty line is an unknown command
    if (strcmp(command, "") != 0) {
        row++;
        printf("Command not found!", FB_BLACK, FB_RED);
    }
}
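
parse_command relies on a strcmp utility that is not shown in this README. A freestanding kernel normally cannot use the C library's version, so the helper has to be written by hand. The version below is only a sketch of what such a helper might look like, named k_strcmp to mark it as hypothetical rather than the project's own implementation.

```c
/*
 * Compare two null-terminated strings. Returns 0 when they are equal, a
 * negative value when `a` sorts before `b`, and a positive value otherwise.
 */
static int k_strcmp(const char *a, const char *b)
{
    /* Walk both strings while they agree and `a` has not ended. */
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }

    /* Compare as unsigned char so bytes above 127 order consistently. */
    return (int)(unsigned char)*a - (int)(unsigned char)*b;
}
```

Comparing the contents this way is what makes the command lookup reliable; comparing two char pointers with == only checks whether they point at the same address.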

Printing a prompt is done with the following function:

void print_shell() {
   printf(shellString, FB_BLACK, FB_LIGHT_GREEN);
}

Commands are defined like this, which is one of the things I feel I could have done better:

struct command {
    const char* name;
    const char* description;
    void (*function)(char* args);
};

struct command commands[NUM_OF_COMMANDS];

void shell_init_commands() {
    commands[0].name = "clear";
    commands[0].function = &clear;
    commands[0].description = " - clear the screen";
    commands[1].name = "echo";
    commands[1].function = &echo;
    commands[1].description = " [text] - print out strings to the terminal";
    commands[2].name = "help";
    commands[2].function = &print_help;
    commands[2].description = " - show this message";
    commands[3].name = "version";
    commands[3].function = &print_version;
    commands[3].description = " - print version information";
    commands[4].name = "shutdown";
    commands[4].function = &shutdown;
    commands[4].description = " - shutdown the system internally";
    commands[5].name = "reboot";
    commands[5].function = &reboot;
    commands[5].description = " - reboot the system internally";
}
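
One possible way to tidy this up would be to initialise the whole table at compile time and derive the command count from it, so that adding a command is a single line. The sketch below illustrates the idea under some assumptions: the cmd_ handlers are stubs invented for the example, command_table, strings_equal and run_command are hypothetical names, and only three of the six commands are listed to keep it short.

```c
struct command {
    const char *name;
    const char *description;
    void (*function)(char *args);
};

/* Stub handlers so the sketch compiles on its own; the real ones live in the shell. */
static int last_called = -1;
static void cmd_clear(char *args) { (void)args; last_called = 0; }
static void cmd_echo(char *args)  { (void)args; last_called = 1; }
static void cmd_help(char *args)  { (void)args; last_called = 2; }

/* The whole table is initialised at compile time, one line per command. */
static const struct command command_table[] = {
    { "clear", " - clear the screen",                         cmd_clear },
    { "echo",  " [text] - print out strings to the terminal", cmd_echo  },
    { "help",  " - show this message",                        cmd_help  },
};

/* The count is derived from the table, so it cannot drift out of sync. */
#define COMMAND_COUNT (sizeof(command_table) / sizeof(command_table[0]))

/* Returns 1 when both strings have the same contents, 0 otherwise. */
static int strings_equal(const char *a, const char *b)
{
    while (*a != '\0' && *a == *b) { a++; b++; }
    return *a == *b;
}

/* Run the named command. Returns 1 if it was found, 0 otherwise. */
static int run_command(const char *name, char *args)
{
    unsigned int i;

    for (i = 0; i < COMMAND_COUNT; i++) {
        if (strings_equal(name, command_table[i].name)) {
            command_table[i].function(args);
            return 1;
        }
    }
    return 0;
}
```

With this layout there is no separate init function to call from kmain.c and no NUM_OF_COMMANDS constant to keep up to date by hand.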

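The shutdown and reboot commands are particularly interesting, as they send a shutdown/reboot trigger via the outb/outw (out word) instructions to the appropriate hardware ports. The reboot command waits for the keyboard controller's input buffer to empty and then writes 0xFE to port 0x64, which pulses the CPU reset line. The shutdown command writes 0x2000 to port 0x604, a port that newer versions of QEMU use for powering off, so it is not expected to work on real hardware. This allows the system to be shut down or rebooted from within.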

void reboot() {
    unsigned char good = 0x02;
    while (good & 0x02)
        good = inb(0x64);
    outb(0x64, 0xFE); 
}

void shutdown() {
    row++;
    printf("Shutting down...", FB_BLACK, FB_LIGHT_GREEN);
    outw(0x604, 0x2000);
}

Terminal

Makefile & debugging

I've modified the makefile using the GNU Make documentation and AI. My modified makefile allows me to compile my program normally, or with debugging symbols and zero compiler optimisation. This made it incredibly easy to debug the program using gdb.
