SparkKernel & Distributed Systems

A multithreaded OS kernel simulation and a from-scratch MPI MapReduce framework, built to study high-performance distributed systems.

A body of academic systems work at FAST NUCES exploring concurrency, scheduling, and distributed computation from first principles.

SparkKernel — Multithreaded OS Simulation

A simulated operating-system kernel written in C++ using multithreading to model how a real kernel schedules and manages work.

  • Custom process scheduler with First-Come, First-Served (FCFS), Round Robin, and Preemptive Priority strategies.
  • Process & thread management following the classic five-state model.
  • Kernel data structures that initialize processes and threads with their stacks and process control blocks (PCBs) and track each process's state for statistics.
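
To make the Round Robin strategy concrete, here is a minimal single-threaded sketch of the idea. It is not SparkKernel's actual scheduler: the names Proc, Completion, and round_robin are hypothetical, every process is assumed to arrive at tick 0, and context-switch cost is ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical process record; names are illustrative, not SparkKernel's own.
struct Proc {
    std::string name;
    int burst;  // total CPU ticks the process needs (assumed >= 0)
};

struct Completion {
    std::string name;
    int finish_time;  // simulated clock tick at which the process finished
};

// Simulate Round Robin with a fixed time quantum.
// Assumption: all processes are already in the ready queue at tick 0.
std::vector<Completion> round_robin(const std::vector<Proc>& procs, int quantum) {
    assert(quantum > 0);

    // Ready queue of (index into procs, remaining ticks).
    std::queue<std::pair<std::size_t, int>> ready;
    for (std::size_t i = 0; i < procs.size(); ++i) {
        ready.push({i, procs[i].burst});
    }

    int clock = 0;
    std::vector<Completion> done;
    while (!ready.empty()) {
        std::pair<std::size_t, int> current = ready.front();
        ready.pop();

        // Run for one quantum, or less if the process finishes sooner.
        int slice = std::min(quantum, current.second);
        clock += slice;
        current.second -= slice;

        if (current.second == 0) {
            done.push_back({procs[current.first].name, clock});
        } else {
            ready.push(current);  // preempted: back to the tail of the queue
        }
    }
    return done;
}
```

With bursts of 5, 3, and 1 ticks and a quantum of 2, the shortest process finishes first even though it was queued last, which is the fairness property Round Robin trades against FCFS's simplicity.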

The five-state process model:

  • NEW — being created, not yet executing.
  • READY — ready to run, waiting to be scheduled on a CPU.
  • RUNNING — currently executing on a CPU.
  • WAITING — paused, waiting on an I/O request to complete.
  • TERMINATED — completed.
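
The states above can be sketched as a small state machine. This assumes the textbook transitions (admit, dispatch, preempt, I/O wait, I/O completion, exit); the type and function names (State, Pcb, can_transition, move_to) are hypothetical, and SparkKernel's own PCB layout may differ.

```cpp
#include <cassert>

enum class State { New, Ready, Running, Waiting, Terminated };

// Returns true if the textbook five-state model allows from -> to.
bool can_transition(State from, State to) {
    switch (from) {
        case State::New:        return to == State::Ready;    // admit
        case State::Ready:      return to == State::Running;  // dispatch
        case State::Running:    return to == State::Ready     // preempted
                                    || to == State::Waiting   // blocked on I/O
                                    || to == State::Terminated;  // exit
        case State::Waiting:    return to == State::Ready;    // I/O completed
        case State::Terminated: return false;                 // final state
    }
    return false;
}

// Hypothetical, stripped-down process control block.
struct Pcb {
    int pid;
    State state;
    int transitions;  // simple statistic: number of accepted state changes

    explicit Pcb(int id) : pid(id), state(State::New), transitions(0) {
        assert(id >= 0);
    }

    // Applies the transition if it is legal; otherwise leaves the PCB untouched.
    bool move_to(State next) {
        if (!can_transition(state, next)) return false;
        state = next;
        ++transitions;
        return true;
    }
};
```

Rejecting illegal moves at this layer (for example NEW straight to RUNNING) keeps scheduler bugs from silently corrupting the per-process statistics.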

MapMPI — Distributed MapReduce

A MapReduce framework implemented from scratch using MPI (Message Passing Interface):

  • Distributed matrix multiplication and parallel quicksort and merge sort across cluster nodes.
  • Simulated Hadoop ecosystems to optimize grouping and indexing of large-scale academic-journal datasets by authorship metadata.
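
The grouping-by-author idea can be illustrated with a single-process sketch of the map, shuffle, and reduce phases. This is not MapMPI's code and it makes no MPI calls: the "workers" here are in-memory buckets standing in for ranks, and Paper, map_paper, and index_by_author are hypothetical names. In an MPI version, the shuffle step is where key-value pairs would be exchanged between processes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical input record for a journal entry.
struct Paper {
    std::string title;
    std::vector<std::string> authors;
};

typedef std::pair<std::string, std::string> KV;  // (author, title)
typedef std::map<std::string, std::vector<std::string> > Index;

// Map phase: emit one (author, title) pair per author of the paper.
std::vector<KV> map_paper(const Paper& p) {
    std::vector<KV> out;
    for (std::size_t i = 0; i < p.authors.size(); ++i) {
        out.push_back(KV(p.authors[i], p.title));
    }
    return out;
}

// Builds an author -> sorted titles index using `workers` simulated reducers.
Index index_by_author(const std::vector<Paper>& papers, std::size_t workers) {
    assert(workers > 0);

    // Shuffle phase: route each pair to a bucket chosen by hashing its key,
    // so all titles for one author land on the same simulated worker.
    std::vector<Index> buckets(workers);
    std::hash<std::string> hasher;
    for (std::size_t i = 0; i < papers.size(); ++i) {
        std::vector<KV> pairs = map_paper(papers[i]);
        for (std::size_t j = 0; j < pairs.size(); ++j) {
            std::size_t w = hasher(pairs[j].first) % workers;
            buckets[w][pairs[j].first].push_back(pairs[j].second);
        }
    }

    // Reduce phase: each worker sorts its title lists; results are then merged.
    Index index;
    for (std::size_t w = 0; w < buckets.size(); ++w) {
        for (Index::iterator it = buckets[w].begin(); it != buckets[w].end(); ++it) {
            std::sort(it->second.begin(), it->second.end());
            index[it->first] = it->second;
        }
    }
    return index;
}
```

Because each author hashes to exactly one bucket, the reducers never need to coordinate with one another, and the merged index is the same regardless of how many workers are used.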

Tech Stack

C++ · Java · MPI · Hadoop · Multithreading