CS554: Data-Intensive Computing

Semester: Spring 2015

Lecture Time: Monday/Wednesday, 11:25AM - 12:40PM

Lecture Location: Stuart Building 113

Professor: Dr. Ioan Raicu (iraicu@cs.iit.edu)

Office Hours Time: Wednesday 12:45PM-1:45PM

Office Hours Location: Stuart Building 237D

Teaching Assistant: Ke Wang (kwang22@hawk.iit.edu)

Office Hours Time: Monday 10:15AM-11:15AM

Office Hours Location: Stuart Building 006

Teaching Assistant: Tonglin Li (tli13@iit.edu)

Office Hours Time: Tuesday 12:45PM-1:45PM

Office Hours Location: Stuart Building 006

Teaching Assistant: Dongfang Zhao (dzhao8@hawk.iit.edu)

Office Hours Time: Thursday 12:45PM-1:45PM

Office Hours Location: Stuart Building 006

This course is a tour through various research topics in distributed data-intensive computing, covering topics in cluster computing, grid computing, supercomputing, and cloud computing. We will explore solutions and learn design principles for building large network-based computational systems to support data-intensive computing. This course is geared toward junior/senior level undergraduates and graduate students in computer science. Prerequisites: CS450; in addition, one or more of the following courses is recommended: CS451, CS546, CS550, CS552, CS553, or CS570.

We will be using Piazza to facilitate course discussions, at http://piazza.com/iit/spring2015/cs554/home

To highlight some of the best projects from the class this year (11 of the 27 projects), I have posted some of the final reports below (for a complete list of project titles and students, click here):

Arvind Shekar, Arihant Raj Nagarajan, Itua Ijagbone, Shivakumar Vinayagam. "Distributed Scheduling and monitoring service leveraging FaBRiQ as a building block for CloudKon+", Illinois Institute of Technology, Department of Computer Science, Technical Report, 2015
Antonios Kougkas. "A Decoupled Execution Paradigm Programming Model", Illinois Institute of Technology, Department of Computer Science, Technical Report, 2015
Kevin Brandstatter. "FusionFS: Enabling Distributed Indexing And Text Search", Illinois Institute of Technology, Department of Computer Science, Technical Report, 2015
Tonglin Li, Chaoqi Ma, Jiabao Li, Ioan Raicu. "ZHT+: A Graph Database On ZHT", Illinois Institute of Technology, Department of Computer Science, Technical Report, 2015
Alekya Thalari, Krishnaja Kethireddy, Nirmal Kumar Ravi, Prathamesh Mantri. "T-FUSE: Improving Hadoop through FusionFS", Illinois Institute of Technology, Department of Computer Science, Technical Report, 2015
Vivek Viswanathan. "Hadoop Mapreduce OpenCL Plugin", Illinois Institute of Technology, Department of Computer Science, Technical Report, 2015
Eric Faurie, Chaitanya Reddy Chatla. "JFusionFS: A Java Implementation of FusionFS", Illinois Institute of Technology, Department of Computer Science, Technical Report, 2015
Thomas Dubucq, Tony Forlini, Virgile Landeiro Dos Reis, and Isabelle Santos. "MATRIX: Bench - Benchmarking the state-of-the-art Task Execution Frameworks of Many-Task Computing", Illinois Institute of Technology, Department of Computer Science, Technical Report, 2015
Karl Stough, Serapheim Dimitropoulos, Poornima Nookala. "Evaluating the Support of MTC Applications on Intel Xeon Phi Many-Core Accelerators", Illinois Institute of Technology, Department of Computer Science, Technical Report, 2015
Sughosh Divanji, Raghav Kapoor, Dongfang Zhao, Ioan Raicu. "PVFS simulation using CODES/ROSS simulator", Illinois Institute of Technology, Department of Computer Science, Technical Report, 2015
Gagan Munisiddha Gowda, Benjamin L. Miwa, Anirudh Sunkineni. "ZHT+: Design and Implementation of a Graph Database Using ZHT", Illinois Institute of Technology, Department of Computer Science, Technical Report, 2015

Schedule

Date	Lecture Topic	Reading (To be completed by posted date)	Assignments
01-12-2015	Syllabus (Slides, PDF)
01-14-2015	Introduction to Distributed Systems (Slides)
01-19-2015	NO CLASS
01-21-2015	Introduction to Distributed Systems	1. Foreword, by Gordon Bell 2. Jim Gray on eScience: A Transformed Scientific Method	Quiz#1
01-26-2015	Introduction to Data-Intensive Distributed Computing (Slides)
01-28-2015	ZHT: the Zero-hop Distributed Hash Table (Slides) -- Tonglin Li	ZHT: A Light-weight Reliable Persistent Dynamic Scalable Zero-hop Distributed Hash Table, IEEE IPDPS 2013
02-02-2015	FusionFS: the Fusion Distributed File System (Slides) -- Dongfang Zhao	FusionFS: Towards Supporting Data-Intensive Scientific Applications on Extreme-Scale High-Performance Computing Systems, IEEE BigData 2014 Optional: Virtual Chunks: On Supporting Random Accesses to Scientific Data in Compressible Storage Systems, IEEE BigData 2014 HyCache+: Towards Scalable High-Performance Caching Middleware for Parallel File Systems, IEEE/ACM CCGrid 2014 Distributed Data Provenance for Large-Scale Data-Intensive Computing, IEEE Cluster 2013 Towards High Performance Key-Value Stores through GPU-Accelerated Coding (see BB)
02-04-2015	MATRIX: a Many-Task Computing Execution Fabric (Slides) -- Ke Wang	Distributed Load-Balancing with Adaptive Work Stealing for Many-Task Computing on Billion-Core Systems (see BB) Optional: Optimizing Load Balancing and Data-Locality with Data-aware Scheduling, IEEE BigData 2014 SimMatrix: Simulator for MAny-Task computing execution fabRIc at eXascales, ACM HPC 2013
02-09-2015	Slurm++: a Distributed Workload Manager for High-Performance Computing (Slides) -- Ke Wang	Slurm++: a Distributed Workload Manager for Extreme-Scale High-Performance Computing Systems (see BB) Optional: Next Generation Job Management Systems for Extreme Scale Ensemble Computing, ACM HPDC 2014
02-11-2015	FaBRiQ: a Distributed Message Queuing System (Slides) -- Iman Sadooghi	FaBRiQ: Leveraging Distributed Hash Tables towards Distributed Publish-Subscribe Message Queues (see BB) Optional: Towards In-Order and Exactly-Once Delivery using Hierarchical Distributed Message Queues, SCRAMBL 2014	Quiz#2
02-16-2015	GeMTC: ManyGPU-enabled Many-Task Computing (Slides)	Design and Evaluation of the GeMTC Framework for GPU-enabled Many-Task Computing, ACM HPDC 2014
02-18-2015	GeMTC: ManyGPU-enabled Many-Task Computing	Project Brainstorming Writeups	Project Proposal Writeup
02-23-2015	CloudKon: a Cloud enabled Distributed tasK executiON framework (slides) -- Iman Sadooghi	Achieving Efficient Distributed Scheduling with Message Queues in the Cloud for Many-Task Computing and High-Performance Computing, IEEE/ACM CCGrid 2014
02-25-2015	Project Brainstorming (slides) -- Dongfang Zhao, Tonglin Li, Ke Wang, Iman Sadooghi
03-02-2015	Project Brainstorming (slides)
03-04-2015	Project Brainstorming
03-06-2015			Group formation Due Project Proposal Due Quiz#3
03-09-2015	MapReduce (Slides)	MapReduce: Simplified Data Processing on Large Clusters Optional MapReduce: a flexible data processing tool Apache Hadoop YARN: yet another resource negotiator Google’s MapReduce programming model — Revisited
03-11-2015	Swift Workflow System (Slides)	Swift/T: Large-scale Application Composition via Distributed-memory Dataflow Processing Optional Turbine: A Distributed-memory Dataflow Engine for High Performance Many-task Applications Compiler Techniques for Massively Scalable Implicit Task Parallelism Swift: A language for distributed parallel scripting
03-16-2015	NO CLASS (Spring Break)
03-18-2015	NO CLASS (Spring Break)
03-23-2015	Swift Workflow System
03-25-2015	Swift Workflow System
03-30-2015	A Berkeley View of Resource Management (Spark, Mesos, RDD, Shark, Sparrow) (Slides #1, Slides #2)	Sparrow: distributed, low latency scheduling Optional Spark: cluster computing with working sets Mesos: A platform for fine-grained resource sharing in the data center Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing Shark: fast data analysis using coarse-grained distributed memory	Project Midterm Report Writeup
04-01-2015	A Berkeley View of Resource Management		Quiz#4 Project Final Report Writeup
04-06-2015	Parallel File Systems (Slides #1, Slides #2, Slides #3)	I/O Performance Challenges at Leadership Scale Optional GPFS: A Shared-Disk File System for Large Computing Clusters (PDF) PVFS: A Parallel File System for Linux Clusters (PDF) Lustre: Building a File System for 1,000-node Clusters (PDF) Scalable Performance of the Panasas Parallel File System (PDF)	Project Midterm Progress Report Due
04-08-2015	Distributed File Systems (Slides)	The Google File System	Emulated PC Meeting Instructions
04-13-2015	Distributed File Systems (Slides #1, Slides #2)	Ceph: A Scalable, High-Performance Distributed File System Optional Ceph as a scalable alternative to the Hadoop Distributed File System
04-15-2015	Distributed Databases	Hive - a petabyte scale data warehouse using Hadoop Optional Pig latin: a not-so-foreign language for data processing Dremel: interactive analysis of web-scale datasets Spanner: Google's Globally-Distributed Database
04-20-2015	Emulated PC Meeting
04-22-2015	Emulated PC Meeting		Quiz#5
04-27-2015 8AM-8PM	NO CLASS (everyone attending GCASR 2015 at UIC)
04-29-2015 10AM-4:30PM	Final Presentations
05-04-2015	NO CLASS		Project Final Reports Due

Next Semester Fall 2015

CS550

Ioan Raicu

Illinois Institute of Technology

Argonne National Laboratory
