EECS 395 / EECS 495
Hot Topics in Distributed Systems:
Data-Intensive Computing
Quarter: Winter 2010
Lecture Time: Tuesday/Thursday, 12:30PM - 1:50PM
Lecture Location: TECH L158
Office Hours Time: Thursday, 2:00PM - 3:00PM
Office Hours Location: TECH M384
Instructor: Dr. Ioan Raicu (iraicu@eecs.northwestern.edu, 1-847-491-8163)
Support for data-intensive computing is critical to advancing modern science, as the gap between the capacity of storage systems and their bandwidth has widened more than 10-fold over the last decade. There is an emerging need for advanced techniques to manipulate, visualize, and interpret large datasets. Building large-scale distributed systems that support data-intensive computing involves challenges at multiple levels, from the network (e.g., transport, routing) to the algorithmic (e.g., data distribution, resource management) and even the social (e.g., incentives). This course is a tour of research topics in distributed systems, spanning cluster computing, grid computing, supercomputing, and cloud computing. We will explore solutions and learn design principles for building large network-based computational systems that support data-intensive computing. Our readings and discussions will help us identify research problems and understand the methods and general approaches used to design, implement, and evaluate such systems. Topics include resource management (e.g., discovery, allocation, compute models, data models, data locality, virtualization, monitoring, provenance), programming models, application models, and system characterization. Our discussions will often be grounded in deployed distributed systems, such as the TeraGrid, Amazon EC2 and S3, top supercomputers (e.g., IBM BlueGene/P, Sun Constellation, Cray XT5), and software/programming platforms (e.g., Google's MapReduce, Hadoop, Dryad, Sphere/Sector, Swift/Falkon, and Parrot/Chirp). The course involves lectures, outside invited speakers, discussions of research papers, and a major project (including both a written report and an oral presentation).
Lecture topics:
Date | Lecture Topic | Reading | Assignments |
01-05-2010 | Syllabus (Slides, PDF) | Reading #1: Foreword (PDF); Jim Gray on eScience (PDF) | Reading Write-up Instructions |
01-06-2010 | | | Reading Writeup #1 due at 11:59PM (just "Summary of Paper", at least 300 words collectively) |
01-07-2010 | Data Intensive Computing Overview (Slides) | | |
01-12-2010 | Data Intensive Computing Overview Continued (Slides) | Reading #2: Cloud Computing and Grid Computing 360-Degree Compared (PDF) | Homework 1 (PDF) |
01-13-2010 | | | Homework 1 due at 11:59PM; Reading Writeup #2 due at 11:59PM (just "Summary of Paper", at least 300 words collectively) |
01-14-2010 | Distributed Systems: Clusters, Supercomputers, Grids and Clouds (Slides) | | Homework 2 |
01-18-2010 | | | Homework 2 due at 11:59PM |
01-19-2010 | Projects Brainstorming (Slides) | | Project |
01-21-2010 | Local Resource Management Systems (Slides) | | |
01-26-2010 | Storage Systems: Data Diffusion (Slides) | Reading #3: The Google File System (PDF); The Hadoop Distributed File System: Architecture and Design (PDF); Sector and Sphere: The Design and Implementation of a High Performance Data Cloud (PDF) | Project proposal due at 12PM |
01-27-2010 | | | Reading Writeup #3 due at 11:59PM; Reading Write-up Instructions |
01-28-2010 | Distributed File Systems (Slides) | Reading #4: MapReduce: Simplified Data Processing on Large Clusters (PDF) | |
02-01-2010 | | | Reading Writeup #4 due at 11:59PM; Reading Write-up Instructions |
02-02-2010 | MapReduce (Slides) | Reading #5: GPFS: A Shared-Disk File System for Large Computing Clusters (PDF); Lustre: Building a File System for 1,000-node Clusters (PDF); PVFS: A Parallel File System for Linux Clusters (PDF) | |
02-08-2010 | | | Reading Writeup #5 due at 11:59PM; Reading Write-up Instructions |
02-04-2010 | Shared and Parallel File Systems (Slides) | | |
02-09-2010 | Parallel Programming Systems and Models (Slides) | Reading #6: Reevaluating Amdahl's Law in the Multicore Era (PDF) | |
02-10-2010 | | | Reading Writeup #6 due at 11:59PM (just "Summary of Paper", at least 300 words collectively) |
02-11-2010 | Parallel Programming Systems and Models (Slides) | | |
02-16-2010 | Project mid-quarter status presentations: Vaibhav Rastogi & Yinzhi Cao, DataShed: Monitoring and Diagnosis of Large Scale P2P Video Streaming Networks (Slides); Arefin Huq, Tunebot in the Cloud (Slides) | | Project mid-quarter status presentations due at 12:30PM |
02-18-2010 | Project mid-quarter status presentations: Hongyu Gao, Automatic Parallelism Discovery (Slides); Chen Jin, Distributed File System (Slides) | Reading #7: Reactive NUCA: Near-Optimal Block Placement and Replication in Distributed Caches (PDF) | |
02-22-2010 | | | Reading Writeup #7 due at 11:59PM; Reading Write-up Instructions |
02-23-2010 | Guest Lecture: Dr. Nikos Hardavellas, Many-core Computing Era and New Challenges (Slides) | Reading #8: Parallel Scripting for Applications at the Petascale and Beyond (PDF); Workflows and e-Science: An overview of workflow system features and capabilities (PDF) | |
02-24-2010 | | | Reading Writeup #8 due at 11:59PM; Reading Write-up Instructions |
02-25-2010 | Workflow Systems (Slides) | | |
03-02-2010 | Workflow Systems (Slides) | | |
03-04-2010 | Workflow Systems (Slides) | Reading #9: A high-performance, portable implementation of the MPI message passing interface standard (PDF) | |
03-08-2010 | | | Reading Writeup #9 due at 11:59PM; Reading Write-up Instructions |
03-09-2010 | MPI (Slides) | | |
03-11-2010 | Future Research Directions: Exascale Many-Task Computing with Billions of Processors (Slides) | | Project final report write-up instructions |
03-17-2010 | | | Project final report due at 11:59PM |
03-18-2010, 12:30PM - 3:30PM | Final Presentations (Schedule) | | Project final presentations due in class |
Last modified: July 07, 2011