CS595-03: OS and Runtime System Design for Supercomputing

Instructor: Kyle C. Hale (Office Hours: Thurs. 4:30-6:30PM, SB 237F)

Instructor e-mail: khale [AT] cs [DOT] iit [DOT] edu

TA: There will be no TA for this course

Semester: Fall 2016

Time: Tuesdays and Thursdays, 3:15PM-4:30PM

Location: SB 225

Course number: CS595-03

Overview

When one writes programs that push the limits of the hardware, the general abstractions and convenient programming environments provided by typical operating systems can actually block progress towards achieving breakneck performance. Computer systems are ultimately driven by their usage, and OS designers can hardly be expected to optimize for every use case. In high-performance and parallel computing, where no one use case can be described as the "common" one, we must consider specialization of our systems software. We must sometimes choose simplicity and speed over generality and elegance. With this theme in mind, we will approach various topics in the design of operating systems and runtime systems as they apply to supercomputing.

This is a seminar course that will expose you to current research at the intersection of several areas within computer systems, including Operating Systems, Computer Architecture, High-performance Computing, Parallel Programming, and Programming Languages. Students will apply what they learn to complete a course project in the design of OS internals for a specific runtime system, parallel language, or parallel algorithm.

Audience

This course is aimed at advanced CS undergraduates and graduate students, particularly Ph.D. students. Please contact me if you would like to take this class but are not sure if you have the required background.

Prerequisites

At a minimum, I expect you to be well-versed in C programming in a UNIX environment and to have basic familiarity with x86 systems. This essentially means you have taken CS 351 (Systems Programming) or an equivalent course. You should also have a firm grasp of basic operating systems concepts, to the level of CS 450, CS 550, or CS 551. Experience with Computer Architecture (CS 470), Parallel Systems (CS 546), or Data-Intensive Computing (CS 554) is a plus.

Books and Software

There are no required textbooks for this course. This means it is absolutely essential that you attend class, participate in discussions, and come to office hours. We will primarily be reading research papers. However, here are some books I recommend you have on your shelf for reference, especially if this is a topic you intend to pursue at a deeper level:

I expect we will be using some subset of the following software:

Development Environment

We will primarily be using virtualization for these projects. That means QEMU, QEMU+KVM, or Palacios. The setup is fairly easy, and I will go through it either in class or on Piazza.

Grading

The following components will constitute your grade in this course:

Project

You will work in groups on a research-oriented project in OS or runtime systems. I will hand out a list of potential projects at the beginning of the semester. Once you have formed groups (I encourage you to use Piazza for this if you cannot find someone in person) and selected a project, I will expect weekly status reports from you until the end of the semester, at which point you will submit a final report and give a final project presentation that includes a demo of your work.

Communication

We will be using Piazza for discussion and announcements. I will enroll you in the course page. I suggest you post your questions and comments on Piazza before resorting to direct e-mail. This way, the whole class can benefit from the answers. I will also use Piazza to post details on readings, background information, and other miscellaneous advice.

Schedule


Reviews due in class
Week Date Discussion Topic HW/Extra Reading
1 8/23 Introduction, getting set up, projects, course structure, background Required: Blue Gene's CNK
Required: What is a Lightweight Kernel?
Recommended: Hamming - You and Your Research
Optional: Mickens - The Slow Winter
8/25 Review of OS architecture and operation
Nautilus code walkthrough
2 8/30 Presentation: (given by Instructor) Blue Gene's CNK Required: Palacios and Kitten
Required: The Task of the Referee
9/1 Finish walk-through of boot process with code
Getting set up with development environment
Presentation: (Benjamin) Palacios and Kitten
Required: Boyd-Wickizer - Linux scalability
HW: Download, build, and run Nautilus on QEMU
HW: Reviews for Linux scalability paper
3 9/6 Presentation: (Sunny) Boyd-Wickizer - Linux scalability
Reviews due in class
E-MAIL ME WITH YOUR PROJECT DECISION IF YOU HAVEN'T ALREADY
Required: Hansen - RC 4000 paper
Recommended: Thompson - UNIX Implementation
9/8 Presentation: (Amal) Hansen - RC 4000 paper
Required: Liedtke - On Microkernel Construction
4 9/13 Presentation: (Alexandru) Liedtke - On Microkernel Construction
Reviews due in class
Required: Engler - Exokernel
9/15 Presentation: (Poornima) Engler - Exokernel
Required: Hand - Nemesis
Recommended: Montz - Scout
5 9/20 Presentation: (Manqi) Hand - Nemesis
Reviews due in class
Required: Cheriton - Stanford Cache Kernel
Recommended: Hunt - Singularity OS
Recommended: Anderson - Scheduler Activations
9/22 Presentation: (Imran) Cheriton - Stanford Cache Kernel
Required: Popek - Formal Requirements for Virtualization
Recommended: Barham - Xen and the Art of Virtualization
6 9/27 Presentation: (Sheshadri) Popek - Formal Requirements for Virtualization
Reviews due in class
Required: Fisher-Ogden - HW Support for Virtualization
Recommended: Waldspurger - VMware memory management
9/29 Presentation: (Goutham) Fisher-Ogden - HW Support for Virtualization
Required: Bugnion - Disco
Recommended: Kivity - KVM paper
7 10/4 Presentation: (Conghao) Bugnion - Disco
Reviews due in class
Required: Gordon - ELI
Recommended: Hand - VMMs/microkernels
10/6 Presentation: (Alex) Gordon - ELI
Required: Blumofe - Cilk
Recommended: PVM: A Framework for Parallel Distributed Computing
Recommended: The MPI Standard
8 10/11 Presentation: (Poornima) Blumofe - Cilk
Reviews due in class
Required: Blagojevic - PGAS runtimes
Recommended: OpenMP
10/13 Presentation: (Manqi) Blagojevic - PGAS runtimes
Required: Bauer - Legion runtime
Recommended: Jade
Recommended: Charm++
9 10/18 Presentation: (Imran) Bauer - Legion runtime
Reviews due in class
Required: Hillis - Data Parallel Algorithms
Recommended: Nested Data Parallelism (NESL)
10/20 Presentation: (Sunny) Hillis - Data Parallel Algorithms Required: Krieger - K42
10 10/25 Presentation: (Sheshadri) Krieger - K42
Reviews due in class
Required: Baumann - Barrelfish
10/27 Presentation: (Amal) Baumann - Barrelfish Required: Boyd-Wickizer - Corey
Recommended: NUMA
11 11/1 Presentation: (Conghao) Boyd-Wickizer - Corey
Reviews due in class
Required: Colmenares - Tessellation
11/3 Presentation: (Ben) Colmenares - Tessellation Required: Peter - Arrakis
12 11/8 Presentation: (Goutham) Peter - Arrakis
Reviews due in class
Required: Porter - Drawbridge
11/10 Presentation: (Ben) Colmenares - Tessellation Required: Madhavapeddy - Unikernels
Recommended: OSv
13 11/15 NO CLASS Madhavapeddy - Unikernels
Reviews STILL due
Required: Hale - Hybrid Runtimes
Recommended: Hale - Nautilus short paper
11/17 Presentation: (Hale) Hale - Hybrid Runtimes Required: Wisniewski - mOS
14 11/22 Presentation: (Conghao) Wisniewski - mOS Required: Ouyang - Co-kernels
11/24 Thanksgiving Break - No Class
15 11/29 Presentation: (Manqi) Ouyang - Co-kernels
NO REVIEW DUE
12/1 Final Project Presentations (PIZZA PARTY!) Written Reports Due
The Night Watch

What You Will Learn

The current list of topics includes the following. We may not be able to cover all of these, but I hope to cover most. The topics covered will be tailored to the interests of the students.

Other Useful Links and Resources