Building a Minimal OS from Scratch

About This Assignment

OS Assignment Overview

This assignment asks you to build a minimal but functional operating system in C and x86-32 NASM assembly. Starting from nothing but a BIOS-loaded 512-byte sector, you will add subsystems stage by stage — a scheduler, concurrency primitives, a memory manager, and finally a file system — exactly mirroring the sequence of OS lectures L07 through L12.

Every stage produces a system that boots in QEMU and passes a set of shell-level tests. Each later stage depends on the previous one; Stage 4 uses all of Stage 0–3. Read each stage completely before writing any code.

Textbook alignment All concepts in this article are covered in Stallings, Computer Organization and Architecture, 10th edition. Chapters are cited at the opening of each stage. The lecture notes for L07–L12 contain the detailed theory; this article is the implementation companion.

Toolchain Requirements

# Ubuntu / Debian
$ sudo apt install nasm gcc qemu-system-i386 make gdb gcc-multilib

# macOS (Homebrew)
$ brew install nasm qemu gdb i686-elf-gcc

# Verify
$ nasm --version && qemu-system-i386 --version

Submission

# 1. Create a private repository on your personal GitHub account
#    Go to https://github.com/new — set visibility to Private
#    Repository name: seng21213-os

# 2. Clone it and copy the starter kit in
$ git clone https://github.com/<your-username>/seng21213-os.git
$ cp -r stage0/* seng21213-os/
$ cd seng21213-os

# 3. Add a .gitignore to exclude build artefacts
$ echo "build/
seng21213.img
*.o
*.bin
*.elf" > .gitignore

# 4. Commit and push
$ git add . && git commit -m "Initial commit: Stage 0 starter kit"
$ git push origin main

# 5. Add the lecturer as a collaborator for review
#    GitHub → Settings → Collaborators → Add: tiroshanm

# 6. Tag each completed stage
$ git tag v0.1-stage0 && git push origin --tags
$ git tag v0.2-stage1 && git push origin --tags   # after Stage 1

Use your personal GitHub account. Create the repository at github.com/new and set visibility to Private. Add tiroshanm as a collaborator — this is how your submission will be reviewed. Do not commit binary artefacts: the build/ directory and seng21213.img must be in .gitignore.

Boot, VGA & Shell

Lectures 07–08 · Stallings Ch. 2, 11 · Starter kit provided

Stage 0

At power-on the CPU is in 16-bit Real Mode with a 1 MB address space. The BIOS runs POST, then loads the first 512-byte sector of the boot disk into address 0x7C00 and jumps to it. That sector — the MBR — is your bootloader. It has exactly 512 bytes including the 0xAA55 boot signature, so every byte counts.

Stage 0 gets you from that raw sector to a 32-bit Protected Mode kernel with a VGA text display and a working interactive shell — all without touching the BIOS again.

①

The x86 Boot Sequence

Real Mode → GDT → CR0 → Protected Mode → FAR JUMP

The CPU starts in Real Mode: 16-bit registers, segmented addressing (physical = segment × 16 + offset), maximum 1 MB RAM. Your bootloader must transition to Protected Mode — a 32-bit flat address space — before calling C code. The steps are strictly ordered:

#	Action	Why
1	`CLI` — disable interrupts	Prevent IRQs firing while GDT is half-set-up
2	Set up stack at `0x7C00`	C code and `CALL` instructions need a stack
3	Load kernel via `INT 0x13 AH=0x42`	BIOS disk read — last BIOS call before PM switch
4	`LGDT [gdt_descriptor]`	Tell CPU where the GDT lives
5	`OR CR0, 1` — set PE bit	Protection Enable — officially enters Protected Mode
6	`JMP 0x08:pm_entry` (FAR JUMP)	Flushes the prefetch queue; loads CS with ring-0 code selector
7	Set DS/ES/SS = `0x10`	Load data segment selector — now in 32-bit flat mode
8	`JMP kernel_main`	Hand off to C

Why the FAR JUMP? After writing to CR0, the CPU is in Protected Mode but the code-segment cache still holds Real Mode attributes. The FAR JUMP forces the CPU to reload CS from the GDT, completing the transition. Without it the next instruction may be decoded with the wrong operand size.

②

The Global Descriptor Table (GDT)

Three 8-byte descriptors: null, code (0x08), data (0x10)

The GDT defines memory segments. Each entry (descriptor) is 8 bytes and encodes a base address, a limit, and access/flag bytes. For a flat kernel we use base = 0, limit = 4 GB for both code and data so the entire 32-bit address space is accessible as a single segment.

; GDT layout in boot.asm — 3 entries × 8 bytes = 24 bytes
gdt_null:   dd 0, 0              ; Entry 0: null (required)

gdt_code:                           ; Entry 1: selector 0x08
    dw 0xFFFF                       ; Limit bits 15:0
    dw 0x0000                       ; Base  bits 15:0
    db 0x00                         ; Base  bits 23:16
    db 0x9A                         ; Access: P=1 DPL=0 S=1 Code R/X
    db 0xCF                         ; Flags: G=1(4KB) DB=1(32-bit) + Limit 19:16
    db 0x00                         ; Base  bits 31:24

gdt_data:                           ; Entry 2: selector 0x10
    dw 0xFFFF
    dw 0x0000
    db 0x00
    db 0x92                         ; Access: P=1 DPL=0 S=1 Data R/W
    db 0xCF
    db 0x00

③

VGA Text Driver

Writing characters directly to physical address 0xB8000

In Protected Mode there are no BIOS calls. To print text we write directly to the VGA text buffer at 0xB8000. The buffer is a 80×25 grid of 16-bit cells. The low byte is the ASCII character; the high byte is the attribute (background colour in bits 7–4, foreground in bits 3–0).

#define VGA_BUFFER  ((volatile uint16_t *)0xB8000)
#define VGA_COLS    80
#define VGA_ROWS    25

/* Build a VGA cell: attribute byte | ASCII char */
static inline uint16_t vga_cell(char c, uint8_t attr) {
    return (uint16_t)((attr << 8) | (uint8_t)c);
}

/* Write character at (row, col) */
static void vga_putchar_at(char c, uint8_t attr, int row, int col) {
    VGA_BUFFER[row * VGA_COLS + col] = vga_cell(c, attr);
}

/* Hardware cursor: CRT controller ports 0x3D4 / 0x3D5 */
void vga_update_hw_cursor(int row, int col) {
    uint16_t pos = (uint16_t)(row * VGA_COLS + col);
    outb(0x3D4, 0x0F); outb(0x3D5, (uint8_t)(pos & 0xFF));
    outb(0x3D4, 0x0E); outb(0x3D5, (uint8_t)(pos >> 8));
}

④

Interactive Shell

keyboard_readline() → tokenise → dispatch → repeat

The shell is a simple read-eval loop. keyboard_readline() blocks on PS/2 port 0x60 polling until Enter is pressed, echoing each character to the VGA display. The completed line is tokenised on whitespace and the first token is looked up in a command table.

/* Command table entry */
typedef struct {
    const char  *name;
    const char  *help;
    void       (*fn)(int argc, char *argv[]);
} command_t;

/* Shell main loop — never returns */
void shell_run(void) {
    char  line[256];
    char *argv[16];
    for (;;) {
        print_prompt();
        keyboard_readline(line, sizeof(line));
        int argc = tokenise(line, argv, 16);
        if (argc == 0) continue;
        dispatch(argc, argv);   /* look up commands[] table */
    }
}

The starter kit provides: help, clear, echo, version, colour, halt. Stages 1–4 add: ps, kill, meminfo, ls, touch, cat, write, rm.

⑤

Build, Run & Verify

make → make run → test all shell commands

$ make
  AS    boot/boot.asm
  CC    kernel/kernel.c
  CC    kernel/shell.c
  CC    kernel/string.c
  CC    drivers/vga/vga.c
  CC    drivers/keyboard/keyboard.c
  LD    build/kernel.elf
  BIN   build/kernel.bin
  IMG   seng21213.img  (34K)

$ make run
# Expected QEMU output:
# [  OK  ] VGA driver initialised
# [  OK  ] PS/2 keyboard driver initialised
# +=================================================+
# |   SENG 21213 - Stage 0 Kernel Shell            |
# |   Type 'help' for available commands            |
# +=================================================+
# kernel> _

Verification checklist — Stage 0 Run each command and confirm the output: help lists all commands · echo hello world prints hello world · version shows kernel name and version · colour 14 1 changes colours · clear clears the screen · halt stops the CPU (QEMU shows "halted").

Extension Ideas — Stage 0 (bonus marks)

Full ISO keyboard layout with Shift, CapsLock, and numeric keypad support
VGA scrolling ring buffer with Page-Up / Page-Down
ANSI escape-code colour support (ESC[31m … ESC[0m) in the VGA driver
Command history (up/down arrow) with a 20-entry ring buffer in keyboard_readline()
Tab-completion scanning the command table for prefix matches
Replace the custom bootloader with a GRUB2 multiboot header — load the kernel as a standard ELF binary

Process Table & Round-Robin Scheduler

Lecture 09 · Stallings Ch. 3, 9 · New files: process.h/c, scheduler.h/c, idt.h/c, switch.asm

Stage 1

Stage 1 makes the kernel multitasking. We need three pieces working together: an IDT (Interrupt Descriptor Table) so the CPU can call our C handlers, a PIT (Programmable Interval Timer) firing IRQ0 at 100 Hz, and a context switch routine that saves the current process's registers, picks the next READY process, and restores its registers. Lecture 09 covers the theory; this stage makes it run.

①

Process Control Block (PCB)

The kernel's representation of a process — pcb_t struct

Every process is described by a PCB stored in a fixed-size array — the process table. The fields that matter most for Stage 1 are the saved stack pointer (used to restore registers on context switch) and the process state.

/* include/process.h */
#define MAX_PROCS   16
#define STACK_SIZE  4096     /* 4 KB kernel stack per process */

typedef enum {
    PROC_UNUSED = 0,
    PROC_READY,
    PROC_RUNNING,
    PROC_BLOCKED,
    PROC_ZOMBIE,
} proc_state_t;

typedef struct {
    uint32_t      pid;
    proc_state_t  state;
    uint32_t      esp;          /* saved stack pointer (set on context switch) */
    uint32_t      stack_base;   /* bottom of this process's stack              */
    void        (*entry)(void); /* entry point function                        */
    char          name[32];
    uint32_t      ticks;        /* total CPU ticks consumed                    */
} pcb_t;

extern pcb_t proc_table[MAX_PROCS];
extern int   current_proc;

pcb_t *proc_create(const char *name, void (*entry)(void));
void   proc_exit(void);

②

IDT & PIT Timer (IRQ0)

256-entry IDT · i8253 PIT at 100 Hz · IRQ0 fires the scheduler

The IDT (Interrupt Descriptor Table) tells the CPU which handler to call for each interrupt vector. Vector 0–31 are CPU exceptions; vectors 32–47 are IRQs from the PIC. We remap the PIC so IRQ0 (timer) → vector 32, then program the i8253 PIT for 100 Hz.

/* kernel/idt.c */
#define PIC1_CMD   0x20
#define PIC1_DATA  0x21
#define PIC2_CMD   0xA0
#define PIC2_DATA  0xA1
#define PIT_CH0    0x40
#define PIT_CMD    0x43
#define PIT_HZ     100          /* 10 ms tick */
#define PIT_DIVISOR (1193182 / PIT_HZ)

void pit_init(void) {
    /* Remap PIC: IRQ0-7 → vectors 32-39, IRQ8-15 → 40-47 */
    outb(PIC1_CMD,  0x11); outb(PIC2_CMD,  0x11);
    outb(PIC1_DATA, 0x20); outb(PIC2_DATA, 0x28);
    outb(PIC1_DATA, 0x04); outb(PIC2_DATA, 0x02);
    outb(PIC1_DATA, 0x01); outb(PIC2_DATA, 0x01);
    outb(PIC1_DATA, 0x00); outb(PIC2_DATA, 0xFF);

    /* Program PIT channel 0: rate generator, divisor for 100 Hz */
    outb(PIT_CMD, 0x36);
    outb(PIT_CH0, (uint8_t)(PIT_DIVISOR & 0xFF));
    outb(PIT_CH0, (uint8_t)(PIT_DIVISOR >> 8));
}

③

Context Switch in NASM

PUSHAD saves all 8 GPRs · switch ESP · POPAD restores next process

The context switch must save all registers of the outgoing process and restore those of the incoming process. PUSHAD saves EAX, ECX, EDX, EBX, ESP (before PUSHAD), EBP, ESI, EDI — 32 bytes — onto the current stack. We store ESP in the PCB, swap to the new process's ESP, then POPAD restores that process's registers.

; boot/switch.asm
; context_switch(uint32_t *old_esp, uint32_t new_esp)
;   [esp+4] = pointer to save old esp into
;   [esp+8] = new esp value to load
global context_switch
context_switch:
    pushad                      ; Save EAX ECX EDX EBX ESP EBP ESI EDI
    mov  eax, [esp + 36]        ; arg0: old_esp ptr  (36 = 8 regs × 4 + ret)
    mov  [eax], esp             ; *old_esp = current ESP
    mov  esp, [esp + 40]        ; arg1: new_esp — switch stacks
    popad                       ; Restore next process's registers
    ret                         ; Return into next process's code

Stack layout after PUSHAD When context_switch() is entered, the stack (high → low) contains: ret_addr · arg0(old_esp*) · arg1(new_esp). After PUSHAD pushes 8 registers (32 bytes) the arguments are at [esp+36] and [esp+40].

④

Round-Robin Scheduler

IRQ0 handler → scheduler_tick() → pick next READY → context_switch()

/* kernel/scheduler.c */
void scheduler_tick(void) {
    proc_table[current_proc].ticks++;

    /* Find the next READY process (wrap around) */
    int next = (current_proc + 1) % MAX_PROCS;
    int searched = 0;
    while (searched < MAX_PROCS) {
        if (proc_table[next].state == PROC_READY) break;
        next = (next + 1) % MAX_PROCS;
        searched++;
    }
    if (searched == MAX_PROCS) return; /* no other process — keep running */

    int old = current_proc;
    proc_table[old].state  = PROC_READY;
    proc_table[next].state = PROC_RUNNING;
    current_proc = next;

    /* Send EOI to PIC before switching so the next tick fires */
    outb(0x20, 0x20);

    context_switch(&proc_table[old].esp, proc_table[next].esp);
}

EOI before context_switch Send the PIC End-Of-Interrupt (outb(0x20, 0x20)) before calling context_switch(), not after. After the switch you are executing in a different process — the EOI might never be sent if the original process does not run again immediately.

⑤

New Shell Commands & Verification

ps — list all processes with PID, state, name, ticks

/* Add to commands[] table in kernel/shell.c */
static void cmd_ps(int argc, char *argv[]) {
    (void)argc; (void)argv;
    vga_printf("%-4s %-12s %-10s %s\n",
               "PID", "NAME", "STATE", "TICKS");
    vga_puts("---- ------------ ---------- -----\n");
    for (int i = 0; i < MAX_PROCS; i++) {
        if (proc_table[i].state == PROC_UNUSED) continue;
        const char *states[] = {"UNUSED","READY","RUNNING","BLOCKED","ZOMBIE"};
        vga_printf("%-4d %-12s %-10s %u\n",
            proc_table[i].pid, proc_table[i].name,
            states[proc_table[i].state], proc_table[i].ticks);
    }
}

Verification — Stage 1 Create two simple processes from kernel_main() (each just loops printing a character). Run the kernel. Type ps — you should see 3 rows (idle + proc1 + proc2) with increasing tick counts each time you run ps.

Extension Ideas — Stage 1 (bonus marks)

Multi-level feedback queue (MLFQ) — 3 priority bands with ageing to prevent starvation
sleep(ms) syscall using the PIT tick counter and a wake-time-sorted sleep queue
fork() — duplicate the PCB, copy the 4 KB stack, return 0 to child / child PID to parent
CFS-style virtual runtime (vruntime) with a sorted run queue for proportional-share scheduling

Threads, Mutex & Semaphore

Lecture 10 · Stallings Ch. 4–5 · New files: thread.h/c, mutex.h/c, semaphore.h/c

Stage 2

Processes are heavyweight — each has its own address space. Threads are lightweight execution units that share the address space of their parent process but have their own stack and register state. Stage 2 adds kernel threads, a blocking mutex (binary lock), and a counting semaphore. The classic myglobal race condition is demonstrated live in the shell.

①

Kernel Threads

Thread shares process address space — new TCB extends the PCB idea

A kernel thread is almost identical to a process in Stage 1 — it has its own stack and saved ESP. The key difference is that threads belonging to the same process share all other resources (code, data, heap). For our flat-memory kernel this means we just allocate a new stack region and let the scheduler treat it the same as any other runnable unit.

/* include/thread.h */
typedef struct {
    uint32_t  tid;
    uint32_t  pid;            /* parent process */
    uint32_t  esp;            /* saved stack pointer */
    uint8_t   stack[STACK_SIZE];
    proc_state_t state;
    char      name[24];
} tcb_t;

tcb_t *thread_create(uint32_t pid, const char *name,
                      void (*fn)(void));
void   thread_exit(void);

②

The Race Condition

myglobal — demonstrating why mutual exclusion is necessary

The classic demonstration of a race condition: two threads both increment a shared counter. The increment is not atomic — it compiles to three instructions (LOAD → ADD → STORE) and the scheduler can interrupt between any two of them.

static volatile int myglobal = 0;
#define ITERS 100000

void thread_a(void) {
    for (int i = 0; i < ITERS; i++) myglobal++;  /* NOT atomic */
    vga_printf("A done. myglobal = %d\n", myglobal);
}

void thread_b(void) {
    for (int i = 0; i < ITERS; i++) myglobal++;
    vga_printf("B done. myglobal = %d\n", myglobal);
}
/* Expected: 200000 — Actual: somewhere between 100000 and 200000 */

Run this without a mutex first. Observe that myglobal is almost never 200000. Then add a mutex around the increment and observe that it is always exactly 200000. This is the empirical demonstration of Dijkstra's six mutual exclusion requirements from Lecture 10.

③

Mutex — Blocking Lock

Spin test with CAS · block if locked · unblock on release

A mutex has two states: unlocked (0) and locked (1). The key requirement is that the test-and-set must be atomic — no other thread can sneak in between the check and the set. x86 provides the XCHG instruction which is implicitly atomic (it asserts LOCK# on the bus).

/* include/mutex.h */
typedef struct {
    volatile int  locked;     /* 0 = free, 1 = held */
    int           owner;      /* TID of holder (-1 if free) */
    int           waiters[MAX_PROCS]; /* blocked thread queue */
    int           nwaiters;
} mutex_t;

/* kernel/mutex.c */
static inline int atomic_xchg(volatile int *ptr, int val) {
    int old;
    __asm__ volatile("xchgl %0, %1"
        : "=r"(old), "+m"(*ptr)
        : "0"(val) : "memory");
    return old;
}

void mutex_lock(mutex_t *m) {
    while (atomic_xchg(&m->locked, 1) == 1) {
        /* Lock is held — add self to waiters and block */
        m->waiters[m->nwaiters++] = current_tid;
        proc_table[current_tid].state = PROC_BLOCKED;
        scheduler_yield();  /* give up the CPU */
    }
    m->owner = current_tid;
}

void mutex_unlock(mutex_t *m) {
    m->owner  = -1;
    m->locked =  0;
    /* Unblock first waiter if any */
    if (m->nwaiters > 0) {
        int tid = m->waiters[--m->nwaiters];
        proc_table[tid].state = PROC_READY;
    }
}

④

Counting Semaphore

semWait blocks when count == 0 · semSignal unblocks a waiter

A semaphore generalises a mutex: the internal counter can be any non-negative integer, not just 0/1. semWait decrements the counter; if it would go negative the calling thread blocks. semSignal increments the counter and unblocks one waiter if any exist. This is the Dijkstra definition from Lecture 11.

/* include/semaphore.h */
typedef struct {
    volatile int  count;
    int           waiters[MAX_PROCS];
    int           nwaiters;
} semaphore_t;

void sem_init   (semaphore_t *s, int initial);
void sem_wait   (semaphore_t *s);   /* P / semWait / down */
void sem_signal (semaphore_t *s);   /* V / semSignal / up */

/* Producer-Consumer validation — kernel/shell.c */
#define BUF_SIZE 8
static int          buffer[BUF_SIZE];
static semaphore_t  sem_empty, sem_full, sem_mutex;

void producer_init(void) {
    sem_init(&sem_empty, BUF_SIZE); /* BUF_SIZE empty slots */
    sem_init(&sem_full,  0);        /* 0 full slots         */
    sem_init(&sem_mutex, 1);        /* binary mutex         */
}

Extension Ideas — Stage 2 (bonus marks)

Priority inheritance in the mutex: when a high-priority thread is blocked, boost the lock holder's priority to the waiter's level
Read-write lock (rwlock_t) — concurrent readers, exclusive writer, writer starvation prevention using a turn variable
Deadlock detector: periodically walk the resource-allocation graph using DFS to detect cycles
Barrier primitive (barrier_t) — N threads all call barrier_wait() and all unblock together when the last one arrives

Physical Memory Manager

Lecture 11 · Stallings Ch. 8 · New files: pmm.h/c

Stage 3

So far memory is allocated statically — stacks are fixed-size arrays in the BSS section. Stage 3 introduces a physical memory manager (PMM) that tracks which 4 KB page frames are in use using a bitmap. Every bit represents one frame: 0 = free, 1 = used. Allocation scans for the first zero bit; deallocation clears the bit.

①

BIOS E820 Memory Map

Detect available RAM at boot time using INT 0x15 EAX=0xE820

Before entering Protected Mode the bootloader calls INT 0x15, EAX=0xE820 to query the BIOS memory map. Each call returns one entry describing a region of physical memory (usable RAM, reserved, ACPI, etc.) The results are written into a known address (e.g. 0x8000) where pmm_init() reads them later.

; In boot.asm — call before LGDT/CR0 (still in Real Mode)
detect_memory:
    mov  di, 0x8004        ; Write entries starting at 0x8004
    xor  ebx, ebx           ; EBX=0 to start iteration
    xor  bp,  bp            ; BP = entry count
.e820_loop:
    mov  eax, 0xE820
    mov  ecx, 24            ; 24-byte entry
    mov  edx, 0x534D4150    ; Magic: 'SMAP'
    int  0x15
    jc   .e820_done         ; CF set = error or end of list
    inc  bp
    add  di, 24
    test ebx, ebx
    jnz  .e820_loop
.e820_done:
    mov  [0x8000], bp      ; Store count at 0x8000

②

Bitmap Frame Allocator

1 bit per 4 KB frame · pmm_alloc_frame() / pmm_free_frame()

For 32 MB of RAM there are 32 MB / 4 KB = 8192 frames. The bitmap is 8192 bits = 1 KB — tiny. We store it as an array of uint32_t, using bit manipulation to test, set, and clear individual bits efficiently.

/* kernel/pmm.c */
#define FRAME_SIZE      4096
#define BITMAP_SIZE     (32 * 1024 * 1024 / FRAME_SIZE / 32)

static uint32_t bitmap[BITMAP_SIZE];  /* 1 bit per frame */
static uint32_t total_frames;
static uint32_t free_frames;

static inline void bitmap_set  (uint32_t f) { bitmap[f/32] |=  (1u << (f%32)); }
static inline void bitmap_clear(uint32_t f) { bitmap[f/32] &= ~(1u << (f%32)); }
static inline int  bitmap_test (uint32_t f) { return (bitmap[f/32] >> (f%32)) & 1; }

uint32_t pmm_alloc_frame(void) {
    for (uint32_t i = 0; i < total_frames; i++) {
        if (!bitmap_test(i)) {
            bitmap_set(i);
            free_frames--;
            return i * FRAME_SIZE;  /* physical address */
        }
    }
    return 0;  /* OOM */
}

void pmm_free_frame(uint32_t phys) {
    uint32_t frame = phys / FRAME_SIZE;
    bitmap_clear(frame);
    free_frames++;
}

uint32_t pmm_free_frames (void) { return free_frames;  }
uint32_t pmm_total_frames(void) { return total_frames; }

③

pmm_init() — Parse E820 Map & Reserve Kernel

Mark all frames used, then free only usable regions above the kernel

void pmm_init(void) {
    /* Start with all frames marked used */
    memset(bitmap, 0xFF, sizeof(bitmap));

    uint16_t   count   = *(uint16_t *)0x8000;
    e820_entry_t *map = (e820_entry_t *)0x8004;

    for (int i = 0; i < count; i++) {
        if (map[i].type != 1) continue;   /* type 1 = usable RAM */
        uint32_t start = (uint32_t)map[i].base;
        uint32_t len   = (uint32_t)map[i].length;
        /* Free frames in this region, skipping the first 1 MB + kernel */
        uint32_t first = (start < KERNEL_END) ? KERNEL_END : start;
        for (uint32_t a = first; a < start + len; a += FRAME_SIZE) {
            bitmap_clear(a / FRAME_SIZE);
            free_frames++;
        }
    }
    total_frames = (32 * 1024 * 1024) / FRAME_SIZE;
}

meminfo shell command Add a meminfo command that calls pmm_free_frames() and pmm_total_frames() and prints used/free counts in KB. Expected output: Free: 30456 KB · Used: 1736 KB · Total: 32768 KB (values will vary by QEMU memory size).

Extension Ideas — Stage 3 (bonus marks)

Buddy allocator — splits frames into power-of-two blocks to reduce external fragmentation for contiguous multi-frame requests
Virtual memory manager — software-managed page table with a #PF (page fault) handler enabling demand paging
Slab allocator — kmalloc(n) / kfree(p) for variable-size kernel heap objects sitting on top of the frame allocator
Clock-algorithm page replacement when all frames are exhausted — swap victim pages to the RAM disk from Stage 4

RAM Disk File System

Lecture 12 · Stallings Ch. 11–12 · New files: ramdisk.h/c, fs.h/c

Stage 4

Stage 4 is the final stage. We allocate a 1 MB RAM disk (a region of physical memory treated as a block device), then build a simple inode-based file system on it. The design is inspired by ext2: a superblock describes the disk layout, bitmaps track free blocks and inodes, and each file's metadata lives in an inode. The shell gains ls, touch, cat, write, and rm.

①

RAM Disk Layout

Superblock · inode bitmap · block bitmap · inode table · data blocks

The 1 MB RAM disk is divided into fixed-size regions. All sizes are chosen to be round multiples of the 512-byte block size. The layout below uses 256-byte inodes and 512-byte data blocks.

Region	Offset	Size	Contents
Superblock	`0x000`	512 B	Magic, block/inode counts, offsets
Inode bitmap	`0x200`	512 B	1 bit per inode (supports 4096 inodes)
Block bitmap	`0x400`	512 B	1 bit per data block
Inode table	`0x600`	256 KB	1024 × 256-byte inodes
Data blocks	`0x40600`	~767 KB	512-byte blocks for file data

/* include/fs.h */
#define FS_MAGIC        0x21213F5   /* SENG21213 magic */
#define BLOCK_SIZE      512
#define MAX_INODES      1024
#define INODE_DIRECT    8          /* 8 direct block pointers per inode */

typedef struct {
    uint32_t magic;
    uint32_t total_blocks;
    uint32_t total_inodes;
    uint32_t free_blocks;
    uint32_t free_inodes;
    uint32_t inode_table_off; /* byte offset of inode table */
    uint32_t data_off;        /* byte offset of first data block */
} superblock_t;

typedef struct {
    uint32_t size;                    /* file size in bytes */
    uint32_t blocks[INODE_DIRECT];    /* direct block indices */
    uint32_t block_count;
    uint8_t  type;                    /* 0=free 1=file 2=dir */
    char     name[28];               /* flat FS: name lives in inode */
} inode_t;

②

File System API

open · read · write · close · unlink — POSIX-inspired interface

/* include/fs.h — public API */
void     fs_init    (void);
int      fs_open   (const char *name, int flags); /* returns fd or -1 */
int      fs_read   (int fd, void *buf, int n);
int      fs_write  (int fd, const void *buf, int n);
void     fs_close  (int fd);
int      fs_unlink (const char *name);
int      fs_ls     (inode_t *out, int max); /* fill array, return count */

/* Flags for fs_open */
#define O_RDONLY   0x01
#define O_WRONLY   0x02
#define O_CREAT    0x04   /* create if not exists */
#define O_TRUNC    0x08   /* truncate to zero on open */

③

New Shell Commands

ls · touch · cat · write · rm — validate the file system

/* Add to commands[] in kernel/shell.c */

static void cmd_ls(int argc, char *argv[]) {
    inode_t inodes[MAX_INODES];
    int n = fs_ls(inodes, MAX_INODES);
    vga_printf("%-24s %8s\n", "NAME", "SIZE");
    vga_puts  ("------------------------ --------\n");
    for (int i = 0; i < n; i++)
        vga_printf("%-24s %8u\n", inodes[i].name, inodes[i].size);
    vga_printf("%d file(s)\n", n);
}

static void cmd_write(int argc, char *argv[]) {
    /* write <filename> <text...> */
    if (argc < 3) { vga_puts("Usage: write <file> <text>\n"); return; }
    int fd = fs_open(argv[1], O_WRONLY | O_CREAT | O_TRUNC);
    if (fd < 0) { vga_puts("Cannot open file\n"); return; }
    for (int i = 2; i < argc; i++) {
        fs_write(fd, argv[i], strlen(argv[i]));
        if (i < argc - 1) fs_write(fd, " ", 1);
    }
    fs_close(fd);
}

Verification sequence — Stage 4 At the shell prompt run these in order and confirm each result:
ls → 0 files · touch readme → OK · write readme Hello from Stage 4 → OK · cat readme → Hello from Stage 4 · ls → readme 20 · rm readme → OK · ls → 0 files

④

Complete Project Structure — All Stages

Final directory tree after Stage 4

seng21213-os/
├── boot/
│   └── boot.asm            # MBR + E820 memory detection
│   └── switch.asm          # context_switch() — PUSHAD/POPAD
├── kernel/
│   ├── kernel.c            # kernel_main() — initialises all subsystems
│   ├── shell.c             # Shell + all commands
│   ├── string.c            # memset / strlen / strcmp etc.
│   ├── process.c           # Process table (Stage 1)
│   ├── scheduler.c         # Round-Robin scheduler (Stage 1)
│   ├── idt.c               # IDT + PIC + PIT init (Stage 1)
│   ├── thread.c            # Kernel threads (Stage 2)
│   ├── mutex.c             # Blocking mutex (Stage 2)
│   ├── semaphore.c         # Counting semaphore (Stage 2)
│   ├── pmm.c               # Physical memory manager (Stage 3)
│   ├── ramdisk.c           # 1 MB RAM disk block device (Stage 4)
│   └── fs.c                # Inode file system (Stage 4)
├── drivers/
│   ├── vga/
│   │   └── vga.c
│   └── keyboard/
│       └── keyboard.c
├── include/
│   ├── kernel.h  vga.h  keyboard.h  io.h  shell.h  string.h
│   ├── process.h  scheduler.h  idt.h
│   ├── thread.h   mutex.h      semaphore.h
│   ├── pmm.h
│   └── ramdisk.h  fs.h
├── linker.ld
├── Makefile
├── run.sh
└── README.md

Green entries are new files you add at that stage. Black entries are from Stage 0.

Extension Ideas — Stage 4 (bonus marks)

Single-indirect block pointer in the inode — expand max file from 8 × 512 B = 4 KB to 4 MB + 4 KB
Subdirectories — mkdir / cd / pwd with a per-process current-directory inode stored in the PCB
Write-ahead journaling (8 KB log) — the FS survives a simulated crash (halt at a bad moment) and replays the log on next mount
VFS abstraction layer — a file_ops_t vtable shared by the RAM disk FS and a second ROM read-only FS; mount() and umount() shell commands

Summary

What You Have Built

Stage 0 — Boot & Shell: A 512-byte NASM MBR that transitions the CPU to 32-bit Protected Mode, loads a C kernel, drives the VGA display without BIOS, reads the PS/2 keyboard by polling, and runs an interactive shell.

Stage 1 — Scheduler: An IDT, PIC remapping, a 100 Hz PIT timer, a 16-entry process table (PCB), a NASM context-switch stub using PUSHAD/POPAD, and a Round-Robin scheduler invoked on every timer tick.

Stage 2 — Threads & Concurrency: Lightweight kernel threads sharing a process address space, a blocking mutex implemented with the atomic XCHG instruction, a Dijkstra counting semaphore, and a live demonstration of the myglobal race condition with and without synchronisation.

Stage 3 — Memory Manager: A BIOS E820 memory map parsed at boot, a bitmap-based physical frame allocator tracking all 4 KB frames in a 32 MB address space, and a meminfo shell command reporting usage.

Stage 4 — File System: A 1 MB RAM disk, an inode-based file system with superblock, bitmaps, inode table, and data blocks, a complete POSIX-style open/read/write/close/unlink API, and shell commands ls/touch/cat/write/rm.

Final submission reminder Tag your final commit: git tag v0.5-stage4 && git push origin --tags. Your repository must be private on GitHub with tiroshanm added as a collaborator. Include a brief README.md describing which extensions you implemented and how to test them.