CS 525 - Advanced Database Organization
Autumn, 2004
Syllabus
This course provides comprehensive coverage of the problems involved in database system implementation and an in-depth examination of contemporary structures and techniques used in modern database management systems (DBMSs). The course teaches advanced skills appropriate for DBMS architects and developers, database specialists, and the designers and developers of client/server and distributed systems. The focus of this course is on transaction management, database structures, and distributed processing. The course contents and exercises are designed to complement CS 425 (undergraduate database systems). Prerequisites are CS 425 and CS 401/402, or equivalent.
This semester's schedule includes 14 sessions of three class hours each. The syllabus is ambitious, and we may modify it as the semester progresses. The topics that we plan to cover, along with the associated reading assignments, are listed below.
The topic descriptions reference an older version of Ramakrishnan's text. All of these topics, however, are covered in every recommended text for the class at the level of detail I require.
We also reference a course manual (slides) below, which can be found at www.cs.iit.edu/~cs525/slides/dld525.html. As a side note, if you print the slides, please try to conserve paper: print double-sided, two slides per page, and so on.
- Introduction. History of database management. Goals of database system development.
Readings:
- Chapter 1 of the course manual.
- Chapter 1 of the textbook: sections 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8 and 1.9.
- Relational systems. Data models. Basic relational concepts. Integrity constraints. Relational algebra.
Readings:
- Chapter 2 of the course manual.
- Chapter 4 of the textbook: sections 4.1 and 4.2.
- Buffer management. Goals and importance of buffering. Buffer implementation. Global buffering schemes (FIFO, CLOCK, LFU, LRU, LRU-K).
Readings:
- Chapter 3 of the course manual.
- Chapter 7 of the textbook: sections 7.3 and 7.4.
- Data-store organization. Inverted-file organization. Physical representation of attributes. Physical representation of tuples. Internal organization of data pages. Record identification. Mapping short records onto pages. Mapping long records onto pages. Storage allocation and free-space management.
Readings:
- Chapter 4 of the course manual.
- Chapter 7 of the textbook: sections 7.5, 7.6 and 7.7.
- Indexing and hashing. Goals and importance of indexing. Hashing (static hashing, extensible hashing). Comparative indexing schemes (ISAM, B-trees, Prefix B-trees, prefix and trailing key compression).
Readings:
- Chapter 5 of the course manual.
- Chapter 9 of the textbook: sections 9.1, 9.2, 9.3, 9.4, 9.5, 9.6, 9.7 and 9.8.
- Chapter 10 of the textbook: sections 10.1 and 10.2.
- Database recovery. Concept of a transaction. ACID properties of transactions. Transaction recovery. Physical recovery schemes (Redo/No-Undo, Undo/No-Redo, Undo/Redo, No-Undo/No-Redo).
Readings:
- Chapter 7 of the course manual.
- Chapter 18 of the textbook: sections 18.1 and 18.5.
- Concurrency control. Transactions and concurrency. Schedules and serializability. Bad dependencies. Locking (lock modes, lock duration, lock granularity, two-phase locking, isolation levels). Lock implementation (hard locks, intention locks, lock queues, lock management and structures).
Readings:
- Chapter 8 of the course manual.
- Chapter 18 of the textbook: sections 18.2, 18.3 and 18.4.
- Chapter 19 of the textbook: sections 19.1, 19.2, 19.4 and 19.5.
- Page-level logical recovery. Brief overview. Basic structures. Recovery operations (forward processing, rollback, checkpointing, and restart processing).
Readings:
- Chapter 9 of the course manual.
- Chapter 20 of the textbook: sections 20.1 and 20.2.
- Query Processing (under construction)
- Query Optimization (under construction)
- Issues of distributed databases: Date's requirements for distributed data management. Problems of distributed database management: object naming; data-dictionary management; data fragmentation; distributed query processing and optimization; global transactions and two-phase commit.
Readings:
- Chapter 10 of the course manual.
- Chapter 21 of the textbook: sections 21.5, 21.6, 21.8, 21.9, 21.11 and 21.12.
- Data replication. Objectives and requirements of data replication. Replication schemes: synchronous replication, periodic state-based replication, asynchronous replication, symmetric replication. Evaluation of different replication schemes.
Readings:
- Chapter 21 of the textbook: sections 21.7, 21.10 and 21.13.
- System Architecture. Client/server architecture. Processes and multi-process server architectures. Threading and multi-threaded server architectures.
Readings:
- Chapter 6 of the course manual.
- Research Topics in Database Systems