Semester: Spring 2015
Lecture Time: Monday/Wednesday, 11:25AM - 12:40PM
Lecture Location: Stuart Building 113
Professor: Dr. Ioan Raicu (iraicu@cs.iit.edu)
Office Hours Time: Wednesday 12:45PM-1:45PM
Office Hours Location: Stuart Building 237D
Teaching Assistant: Ke Wang (kwang22@hawk.iit.edu)
Office Hours Time: Monday 10:15AM-11:15AM
Office Hours Location: Stuart Building 006
Teaching Assistant: Tonglin Li (tli13@iit.edu)
Office Hours Time: Tuesday 12:45PM-1:45PM
Office Hours Location: Stuart Building 006
Teaching Assistant: Dongfang Zhao (dzhao8@hawk.iit.edu)
Office Hours Time: Thursday 12:45PM-1:45PM
Office Hours Location: Stuart Building 006
This course is a tour through various research topics in distributed data-intensive computing, covering cluster computing, grid computing, supercomputing, and cloud computing. We will explore solutions and learn design principles for building large network-based computational systems to support data-intensive computing. This course is geared toward junior/senior-level undergraduates and graduate students in computer science. Prerequisites: CS450; in addition, one or more of the following courses is recommended: CS451, CS546, CS550, CS552, CS553, or CS570.
We will be using Piazza to facilitate course discussions, at http://piazza.com/iit/spring2015/cs554/home
To highlight some of the best projects from the class this year (11 of the 27 projects), I have posted some of the final reports below (for a complete list of project titles and students, click here):
Schedule
Date | Lecture Topic | Reading (To be completed by posted date) | Assignments |
01-12-2015 | Syllabus (Slides, PDF) | ||
01-14-2015 | Introduction to Distributed Systems (Slides) | ||
01-19-2015 | NO CLASS | ||
01-21-2015 | Introduction to Distributed Systems | 1. Foreword, by Gordon Bell 2. Jim Gray on eScience: A Transformed Scientific Method | Quiz#1 |
01-26-2015 | Introduction to Data-Intensive Distributed Computing (Slides) | ||
01-28-2015 | ZHT: the Zero-hop Distributed Hash Table (Slides) -- Tonglin Li | ZHT: A Light-weight Reliable Persistent Dynamic Scalable Zero-hop Distributed Hash Table, IEEE IPDPS 2013 | |
02-02-2015 | FusionFS: the Fusion Distributed File System (Slides) -- Dongfang Zhao | FusionFS: Towards Supporting Data-Intensive Scientific Applications on Extreme-Scale High-Performance Computing Systems, IEEE BigData 2014; Optional: Virtual Chunks: On Supporting Random Accesses to Scientific Data in Compressible Storage Systems, IEEE BigData 2014; HyCache+: Towards Scalable High-Performance Caching Middleware for Parallel File Systems, IEEE/ACM CCGrid 2014; Distributed Data Provenance for Large-Scale Data-Intensive Computing, IEEE Cluster 2013; Towards High Performance Key-Value Stores through GPU-Accelerated Coding (see BB) | |
02-04-2015 | MATRIX: a Many-Task Computing Execution Fabric (Slides) -- Ke Wang | Distributed Load-Balancing with Adaptive Work Stealing for Many-Task Computing on Billion-Core Systems (see BB); Optional: Optimizing Load Balancing and Data-Locality with Data-aware Scheduling, IEEE BigData 2014; SimMatrix: Simulator for MAny-Task computing execution fabRIc at eXascales, ACM HPC 2013 | |
02-09-2015 | Slurm++: a Distributed Workload Manager for High-Performance Computing (Slides) -- Ke Wang | Slurm++: a Distributed Workload Manager for Extreme-Scale High-Performance Computing Systems (see BB); Optional: Next Generation Job Management Systems for Extreme Scale Ensemble Computing, ACM HPDC 2014 | |
02-11-2015 | FaBRiQ: a Distributed Message Queuing System (Slides) -- Iman Sadooghi | FaBRiQ: Leveraging Distributed Hash Tables towards Distributed Publish-Subscribe Message Queues (see BB); Optional: Towards In-Order and Exactly-Once Delivery using Hierarchical Distributed Message Queues, SCRAMBL 2014 | Quiz#2 |
02-16-2015 | GeMTC: ManyGPU-enabled Many-Task Computing (Slides) | Design and Evaluation of the GeMTC Framework for GPU-enabled Many-Task Computing, ACM HPDC 2014 | |
02-18-2015 | GeMTC: ManyGPU-enabled Many-Task Computing | Project Brainstorming Writeups | Project Proposal Writeup |
02-23-2015 | CloudKon: a Cloud enabled Distributed tasK executiON framework (slides) -- Iman Sadooghi | Achieving Efficient Distributed Scheduling with Message Queues in the Cloud for Many-Task Computing and High-Performance Computing, IEEE/ACM CCGrid 2014 | |
02-25-2015 | Project Brainstorming (slides) -- Dongfang Zhao, Tonglin Li, Ke Wang, Iman Sadooghi | ||
03-02-2015 | Project Brainstorming (slides) | ||
03-04-2015 | Project Brainstorming | ||
03-06-2015 | Group Formation Due; Project Proposal Due; Quiz#3 | ||
03-09-2015 | MapReduce (Slides) | MapReduce: Simplified Data Processing on Large Clusters; Optional: MapReduce: a flexible data processing tool; Apache Hadoop YARN: yet another resource negotiator; Google’s MapReduce programming model — Revisited | |
03-11-2015 | Swift Workflow System (Slides) | Swift/T: Large-scale Application Composition via Distributed-memory Dataflow Processing; Optional: Turbine: A Distributed-memory Dataflow Engine for High Performance Many-task Applications; Compiler Techniques for Massively Scalable Implicit Task Parallelism; Swift: A language for distributed parallel scripting | |
03-16-2015 | NO CLASS (Spring Break) | ||
03-18-2015 | NO CLASS (Spring Break) | ||
03-23-2015 | Swift Workflow System | ||
03-25-2015 | Swift Workflow System | ||
03-30-2015 | A Berkeley View of Resource Management (Spark, Mesos, RDD, Shark, Sparrow) (Slides #1, Slides #2) | Sparrow: distributed, low latency scheduling; Optional: Spark: cluster computing with working sets; Mesos: A platform for fine-grained resource sharing in the data center; Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing; Shark: fast data analysis using coarse-grained distributed memory | Project Midterm Report Writeup |
04-01-2015 | A Berkeley View of Resource Management | Quiz#4; Project Final Report Writeup | |
04-06-2015 | Parallel File Systems (Slides #1, Slides #2, Slides #3) | I/O Performance Challenges at Leadership Scale; Optional: GPFS: A Shared-Disk File System for Large Computing Clusters (PDF); PVFS: A Parallel File System for Linux Clusters (PDF); Lustre: Building a File System for 1,000-node Clusters (PDF); Scalable Performance of the Panasas Parallel File System (PDF) | Project Midterm Progress Report Due |
04-08-2015 | Distributed File Systems (Slides) | The Google File System | Emulated PC Meeting Instructions |
04-13-2015 | Distributed File Systems (Slides #1, Slides #2) | Ceph: A Scalable, High-Performance Distributed File System; Optional: Ceph as a scalable alternative to the Hadoop Distributed File System | |
04-15-2015 | Distributed Databases | Hive - a petabyte scale data warehouse using Hadoop; Optional: Pig latin: a not-so-foreign language for data processing; Dremel: interactive analysis of web-scale datasets; Spanner: Google's Globally-Distributed Database | |
04-20-2015 | Emulated PC Meeting | ||
04-22-2015 | Emulated PC Meeting | Quiz#5 | |
04-27-2015 8AM-8PM | NO CLASS (everyone attending GCASR 2015 at UIC) | ||
04-29-2015 10AM-4:30PM | Final Presentations | ||
05-04-2015 | NO CLASS | Project Final Reports Due |