IIT Database Group

Courses

We teach several courses, most of which are related to databases.

CS116 - Introduction to Object-Oriented Programming II

Continuation of CS 115. Introduces more advanced elements of object-oriented programming – including dynamic data structures, recursion, searching and sorting, and advanced object-oriented programming techniques. For students in CS and CS-related degree programs.

CS331 - Data Structures and Algorithms

Implementation and application of the essential data structures used in computer science. Analysis of basic sorting and searching algorithms and their relationship to these data structures. Particular emphasis is given to the use of object-oriented design and data abstraction in the creation and application of data structures.

Course Objectives: Students should be able to ...

  • Explain, implement, and apply the following data structures: lists (unordered and ordered), stacks, queues, expression trees, binary search trees, AVL trees, hash tables, and heaps.
  • Analyze the time and space complexity of algorithms using asymptotic upper bounds (big-O notation).
  • Explain and use references and linked structures.
  • Outline basic object-oriented design concepts: composition, inheritance, polymorphism.
  • Outline basic concepts of immutable data structures and explain the trade-offs compared to mutable data.
  • Write and test recursive procedures, and explain the run-time stack concept.
  • Analyze searching and sorting algorithms, and explain their relationship to data structures.
  • Choose and implement appropriate data structures to solve an application problem.
  • Understand techniques of software development, such as unit testing and version control.
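To give a flavor of the objectives above, here is a minimal sketch of an unbalanced binary search tree with recursive insert, search, and in-order traversal. It is an illustration only, not course material: the names `Node`, `insert`, `contains`, `in_order`, and `build` are hypothetical, and Python is assumed purely for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """A linked tree node; left/right are references to child nodes."""
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], key: int) -> Node:
    """Recursively insert key and return the root of the updated subtree."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root  # duplicate keys are ignored


def contains(root: Optional[Node], key: int) -> bool:
    """Recursive search; each call descends one level of the tree."""
    if root is None:
        return False
    if key == root.key:
        return True
    return contains(root.left if key < root.key else root.right, key)


def in_order(root: Optional[Node]) -> List[int]:
    """In-order traversal returns the keys in sorted order."""
    if root is None:
        return []
    return in_order(root.left) + [root.key] + in_order(root.right)


def build(keys: List[int]) -> Optional[Node]:
    """Hypothetical helper: build a tree by inserting keys one by one."""
    root = None
    for k in keys:
        root = insert(root, k)
    return root
```

Insert and search take O(h) time for a tree of height h. Without rebalancing, h can grow to n in the worst case (for example, when keys arrive in sorted order), which is the problem that AVL trees address by keeping h in O(log n).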

CS425 - Database Organization

Database management systems are a crucial part of most large-scale industry and open-source systems. This course familiarizes students with important concepts of database systems and design. We will learn how to design a database using the Entity-Relationship model, how to query and modify a database using the declarative SQL language, and study APIs for writing application programs that use a database system to persist data. Furthermore, the course gives an overview of important database systems concepts such as indexing, query optimization and execution, concurrency control, and recovery.

Students will develop a database application in a group project. This project will cover all phases of development: assessing the application requirements, designing the database schema, and implementing the application.
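As a rough illustration of what querying and modifying a database with declarative SQL from an application program can look like, the sketch below uses Python's standard `sqlite3` module with an in-memory database. The schema (`student`, `enrollment`), the sample rows, and the function name `demo` are assumptions made up for this example; the course itself may use a different database system and API.

```python
import sqlite3


def demo():
    """Create a tiny schema, modify it, and run a declarative query."""
    conn = sqlite3.connect(":memory:")  # in-memory database, nothing is persisted

    # Hypothetical schema: students and the courses they are enrolled in.
    conn.execute(
        "CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        """CREATE TABLE enrollment (
               student_id INTEGER NOT NULL REFERENCES student(id),
               course TEXT NOT NULL,
               PRIMARY KEY (student_id, course))"""
    )

    # Parameterized statements keep application data separate from SQL text.
    conn.executemany(
        "INSERT INTO student VALUES (?, ?)", [(1, "Ada"), (2, "Grace")]
    )
    conn.executemany(
        "INSERT INTO enrollment VALUES (?, ?)",
        [(1, "CS425"), (2, "CS425"), (2, "CS525")],
    )
    conn.execute("UPDATE student SET name = ? WHERE id = ?", ("Grace H.", 2))

    # The query states what result is wanted; the DBMS decides how to compute it.
    rows = conn.execute(
        """SELECT s.name, COUNT(*)
           FROM student s JOIN enrollment e ON e.student_id = s.id
           GROUP BY s.id, s.name
           ORDER BY s.name"""
    ).fetchall()
    conn.close()
    return rows
```

Calling `demo()` returns one row per student with the number of courses that student is enrolled in.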

CS520 - Data Integration, Warehousing, and Provenance

This course introduces the basic concepts of data integration, data warehousing, and provenance. We will learn how to resolve structural heterogeneity through schema matching and mapping. The course introduces techniques for querying several heterogeneous data sources at once (data integration) and translating data between databases with different data representations (data exchange). Furthermore, we will cover the data warehouse paradigm, including the Extract-Transform-Load (ETL) process, the data cube model and its relational representations (such as snowflake and star schema), and efficient processing of analytical queries. This will be contrasted with Big Data analytics approaches that (among other differences) significantly reduce the upfront cost of analytics. When feeding data through complex processing pipelines such as data exchange transformations or ETL workflows, it is easy to lose track of the origin of data. In the last part of the course, we therefore cover techniques for representing and keeping track of the origin and creation process of data, also known as its provenance.

The course emphasizes practical skills through a series of homework assignments that help students develop a strong background in data integration systems and techniques. At the same time, it also addresses the underlying formalisms. For example, we will discuss the logic-based languages used for schema mapping and the dimensional data model as well as their practical application (e.g., developing an ETL workflow with RapidMiner and creating a mapping between two example schemata). The literature reviews will familiarize students with data integration and provenance research.

CS525 - Advanced Database Organization

Database management systems are a crucial part of most large-scale industry and open-source systems. This course provides comprehensive coverage of issues associated with database system development and an in-depth examination of structures and techniques used in contemporary database management systems (DBMSs). Students will learn about the inner workings of these exciting systems: Which algorithms are used? What are typical architectures used to build a system as complex as a DBMS? What are implementation strategies? These questions and more will be answered during the course.

The course is highly applied, emphasizing practical skills and habits through a series of programming assignments during which students will develop their own tiny DBMS-like engine. We will cover the most important aspects/components of a DBMS: storage and buffer management, indexing, query optimization, query execution, and concurrency control and recovery.

CS595 - Topics in Big Data Analytics

Big data technologies, in particular scalable distributed platforms for storage and analytics, enable the processing of massive datasets for analytics, machine learning, and other use cases. This course provides a comprehensive overview of algorithms, systems, and techniques for Big Data processing. In a semester-long project, students will extend existing big data platforms. Additionally, in the seminar component of this course, we will discuss cutting-edge research and industrial developments in the field.