CS595: Hot Topics in Distributed Systems: Data-Intensive Computing
Quarter: Fall 2010
Lecture Time: Monday/Wednesday, 1:50PM - 3:15PM
Lecture Location: Stuart Building 106
Office Hours Time: Wednesday, 3:15PM - 4:15PM
Office Hours Location: Stuart Building 237D
Professor: Dr. Ioan Raicu (iraicu@cs.iit.edu)
Support for data-intensive computing is critical to advancing modern science, as the gap between the capacity of storage systems and their bandwidth has grown more than 10-fold over the last decade. There is an emerging need for advanced techniques to manipulate, visualize, and interpret large datasets. Building large-scale distributed systems that support data-intensive computing involves challenges at multiple levels, from the network (e.g., transport, routing) to the algorithmic (e.g., data distribution, resource management) and even the social (e.g., incentives). This course is a tour through research topics in distributed systems, covering cluster computing, grid computing, supercomputing, and cloud computing. We will explore solutions and learn design principles for building large network-based computational systems to support data-intensive computing. Our readings and discussions will help us identify research problems and understand methods and general approaches to designing, implementing, and evaluating distributed systems that support data-intensive computing. Topics include resource management (e.g., discovery, allocation, compute models, data models, data locality, virtualization, monitoring, provenance), programming models, application models, and system characterization. Our discussions will often be grounded in the context of deployed distributed systems, such as the TeraGrid, Amazon EC2 and S3, various top supercomputers (e.g., IBM BlueGene/P, Sun Constellation, Cray XT5), and various software/programming platforms (e.g., Google's MapReduce, Hadoop, Dryad, Sphere/Sector, Swift/Falkon, and Parrot/Chirp). The course involves lectures, outside invited speakers, discussions of research papers, and a major project (including both a written report and an oral presentation).
Lecture topics:
Date | Lecture Topic | Reading (To be completed by posted date) | Assignments |
08-23-2010 | Syllabus (Slides, PDF) | Reading Write-up Instructions | |
08-25-2010 | Data Intensive Computing Overview (Slides) | ||
08-30-2010 | Data Intensive Computing Overview Continued (Slides) | Reading #1: Foreword (PDF); Jim Gray on eScience (PDF) | Reading Writeup #1 Due at 12PM (just "Summary of Paper", 300 words); Homework 1 (PDF) |
09-01-2010 | High-Performance Computing (Slides) | Reading #2: What's next in high-performance computing? (PDF) | Reading Writeup #2 Due at 12PM (just "Summary of Paper", 300 words for each) |
09-06-2010 | Labor Day - NO CLASS | ||
09-08-2010 | Many-Task Computing (Slides) | Reading #2: Many-Task Computing for Grids and Supercomputers (PDF); Optional: Falkon: a Fast and Light-weight tasK executiON framework (PDF) | Reading Writeup #3 Due at 1:45PM (just "Summary of Paper", 300 words for each); Homework 1 due at 1:45PM |
09-13-2010 | Projects Brainstorming (Slides) | Project Proposal (PDF) | |
09-15-2010 | Cloud Computing and Grid Computing (Slides) | Reading #3: Cloud Computing and Grid Computing 360-Degree Compared (PDF); Optional: Above the clouds: A Berkeley view of cloud computing (PDF) | |
09-20-2010 | Parallel File Systems (Slides); Guest Lecture by Sam Lang | Reading #4: GPFS: A Shared-Disk File System for Large Computing Clusters (PDF); I/O Performance Challenges at Leadership Scale (PDF); Optional: PVFS: A Parallel File System for Linux Clusters (PDF) | Reading Writeup #4 Due at 1:45PM (Reading Write-up Instructions) |
09-22-2010 | Cloud Computing and Grid Computing (Slides) | Project proposal due at 1:45PM | |
09-27-2010 | Cloud Computing and Grid Computing (Slides) | ||
09-29-2010 | Parallel File Systems (Slides) | Reading #5: Lustre: Building a File System for 1,000-node Clusters (PDF) | Reading Writeup #5 Due at 1:45PM |
10-04-2010 | Distributed File Systems (Slides); Discussion Leader: Raman Verma | Reading #6: The Google File System (PDF); Sector and Sphere: The Design and Implementation of a High Performance Data Cloud (PDF) | Reading Writeup #6 Due at 1:45PM |
10-06-2010 | Distributed File Systems (Slides) | ||
10-11-2010 | Fall Break - NO CLASS | ||
10-13-2010 | Parallel Programming Systems and Models (Slides) | ||
10-18-2010 | MapReduce (Slides); Discussion Leader: Xi Yang | Reading #7: MapReduce: Simplified Data Processing on Large Clusters (PDF) | Reading Writeup #7 Due at 1:45PM |
10-20-2010 | MapReduce (Slides); Discussion Leaders: Harit Shah & Krishnaprasad Shetty | Reading #8: A comparison of approaches to large-scale data analysis (PDF); MapReduce: A Flexible Data Processing Tool (PDF); MapReduce and Parallel DBMSs: Friends or Foes? (PDF) | Reading Writeup #8 Due at 1:45PM |
10-25-2010 | MapReduce (Slides); Discussion Leader: Juan Carlos Hernández Munuera | Reading #9: MapReduce Online (PDF); MAD Skills: New Analysis Practices for Big Data (PDF); Optional: Large-scale Incremental Processing Using Distributed Transactions and Notifications (PDF) | Reading Writeup #9 Due at 1:45PM |
10-27-2010 | MapReduce (Slides); Discussion Leader: Tonglin Li | Reading #10: Building a High-Level Dataflow System on top of Map-Reduce: The Pig Experience (PDF) | Reading Writeup #10 Due at 1:45PM |
11-01-2010 | Project mid-quarter status presentations (15 min each); Xi Yang: Map/Reduce Scheduling under the voluntary computing environment (Slides); Jin-Chuan Chen: The programming framework of distributed file system: Hadoop/MapReduce, Sector/Sphere, and Windows HPC Server/Dryad (Slides); Xi Duan: The Impact of Stripe Size on Parallel Distributed File Systems (Slides); Raman Verma: Implementing Data Replication and High Availability in Parallel Virtual File System – 2 (Slides) | Project mid-quarter status presentations due at 1:45PM; Presenters: Xi Yang, Jin-Chuan Chen, Xi Duan, Raman Verma | |
11-03-2010 | Project mid-quarter status presentations (15 min each); Harit Shah & Krishnaprasad Shetty: Exascale File System: Snapshot and Record Append Operation (Slides); Juan Carlos Hernández Munuera: Adapting Distributed Hash Tables to be implemented into a Distributed File System (Slides); Zhou Zhou: D3: Distributed Data Structure (Slides); Tonglin (Tony) Li: An Initial and Experimental Approach for Fusion Distributed File System (Slides) | Project mid-quarter status presentations due at 1:45PM; Presenters: Harit Shah, Krishnaprasad Shetty, Juan Carlos Hernández Munuera, Zhou Zhou, Tonglin (Tony) Li | |
11-08-2010 | Workflow Systems (Slides); Discussion Leader: Jin-Chuan Chen | Reading #11: Swift: Fast, Reliable, Loosely Coupled Parallel Computation (PDF); Parallel Scripting for Applications at the Petascale and Beyond (PDF) | Reading Writeup #11 Due at 1:45PM |
11-10-2010 | Workflow Systems (Slides) | ||
11-15-2010 | Distributed Data Mining (Slides 1, Slides 2, Slides 3); Guest Lecture by David Grossman | Reading #12: PLANET: Massively Parallel Learning of Tree Ensembles with MapReduce (PDF); Map-Reduce for Machine Learning on Multicore (PDF) | Reading Writeup #12 Due at 1:45PM |
11-17-2010 | Query Prediction in Large Scale Data Intensive Event Stream Analysis Systems (Slides); Guest Lecture by Huaiming Song; No Office Hours | Optional Reading: Query Prediction in Large Scale Data Intensive Event Stream Analysis Systems (PDF) | |
11-22-2010 | Workflow Systems (Slides) | ||
11-24-2010 | Thanksgiving Break - NO CLASS | ||
11-29-2010 | Distributed Hash Tables; Discussion Leader: Zhou Zhou | Reading #13: Dynamo: Amazon’s Highly Available Key-value Store (PDF); Kademlia: A Peer-to-peer Information System Based on the XOR Metric (PDF) | Reading Writeup #13 Due at 1:45PM |
12-01-2010 | Many-core Computing (Slides); Discussion Leader: Xi Duan | Reading #14: Amdahl's law in the multicore era (PDF); Reevaluating Amdahl's Law in the Multicore Era (PDF) | Reading Writeup #14 Due at 1:45PM; Project final report write-up instructions (PDF) |
12-08-2010, 1:15PM - 4:15PM | Final Presentations (20 min each); Xi Yang: Map/Reduce Scheduling under the voluntary computing environment (Slides); Jin-Chuan Chen: The programming framework of distributed file system: Hadoop/MapReduce, Sector/Sphere, and Windows HPC Server/Dryad (Slides); Xi Duan: The Impact of Stripe Size on Parallel Distributed File Systems (Slides); Raman Verma: Implementing Data Replication and High Availability in Parallel Virtual File System – 2 (Slides); Harit Shah & Krishnaprasad Shetty: Exascale File System: Snapshot and Record Append Operation (Slides); Juan Carlos Hernández Munuera: Adapting Distributed Hash Tables to be implemented into a Distributed File System (Slides); Zhou Zhou: D3: Distributed Data Structure (Slides); Tonglin (Tony) Li: An Initial and Experimental Approach for Fusion Distributed File System (Slides) | Project final report due at 1:15PM; Project final presentations due in class |
Last modified: July 07, 2011