Bahareh Arab, Ph.D. Student (Alumnus)
Bahareh Sadat Arab received her B.Sc. in software computer engineering from Azad University in 2006 and her M.S. degree in Computer Science from University Putra Malaysia in 2011. She has published papers on web services and QoS in SOA, and worked as a software engineer at several companies for several years. Bahareh started her Ph.D. at the IIT DBGroup in Fall 2013. She is mainly working on adding provenance support to databases using temporal database techniques. This project is a collaboration between the Oracle Corporation and the IIT DBGroup. The initial focus is computing provenance for database updates and transactions.
Awards
- GHC17 Scholarship (2017)
- WEIS Travel Grant (2017)
- IEEE S&P Travel Award (2017)
- GREPSEC Travel Award (2017)
- CRA-W Grad Cohort Workshop Scholarship (2017)
- Tsao's Scholarship (2014)
Teaching
I have been a TA for the following courses:
- 2016 Fall: CS525 - Advanced Database Organization
- 2018 Fall: CS425 - Database Organization
- 2014 Fall: CS425 - Database Organization (SQL Introduction)
- 2016 Fall: CS425 - Database Organization (Formal Relational Query Languages)
Research Projects
I have been involved in the following research projects:
- GProM - A database-independent middleware for computing the provenance of queries, updates, and transactions
- Provenance for Updates and Transactions - In this project, we study provenance models for updates and transactions and their implementation through reenactment, a declarative replay technique that utilizes audit logs and temporal database technologies.
Publications
-
Efficient Answering of Historical What-if Queries
Felix Campbell, Bahareh Arab and Boris Glavic
Proceedings of the 48th International Conference on Management of Data (SIGMOD) (2022), pp. 1556–1569.
@inproceedings{CA22, author = {Campbell, Felix and Arab, Bahareh and Glavic, Boris}, booktitle = {Proceedings of the 48th International Conference on Management of Data ({SIGMOD})}, keywords = {Reenactment; What-if; Uncertainty}, projects = {GProM; Reenactment}, title = {Efficient Answering of Historical What-if Queries}, pages = {1556--1569}, pdfurl = {https://dl.acm.org/doi/pdf/10.1145/3514221.3526138}, doi = {10.1145/3514221.3526138}, longversionurl = {https://arxiv.org/pdf/2203.12860}, video = {https://www.youtube.com/watch?v=6O0InOM-ZbI&t=2s}, venueshort = {SIGMOD}, year = {2022} }
-
Provenance For Transactional Updates
Bahareh Arab
Illinois Institute of Technology.
@phdthesis{A19, author = {Arab, Bahareh}, keywords = {Provenance; GProM; Reenactment}, pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/A19.pdf}, projects = {GProM}, school = {Illinois Institute of Technology}, title = {{Provenance For Transactional Updates}}, venueshort = {PhD Thesis}, year = {2019} }
Database provenance explains how results are derived by queries. However, many use cases such as auditing and debugging of transactions require understanding of how the current state of a database was derived by a transactional history. We introduce an approach for capturing the provenance of transactions. Our approach works not just for serializable transactions but also for non-serializable transactions such as those executed under read committed snapshot isolation (RC-SI). The main drivers of our approach are a provenance model for queries, updates, and transactions and reenactment, a novel technique for retroactively capturing the provenance of tuple versions. We introduce the MV-semirings provenance model for updates and transactions as an extension of the existing semiring provenance model for queries. Our reenactment technique exploits the time travel and audit logging capabilities of modern DBMS to replay parts of a transactional history using queries. Importantly, our technique requires no changes to the transactional workload or underlying DBMS and results in only moderate runtime overhead for transactions. Furthermore, we discuss how our MV-semirings model and reenactment approach can be used to serve a wide variety of applications and use cases, including answering historical what-if queries, which determine the effect of hypothetical changes to past operations of a business, post-mortem debugging of transactions, and Provenance-aware Versioned Dataworkspaces (PVDs). We have implemented our approach on top of a commercial DBMS and our experiments confirm that by applying novel optimizations we can efficiently capture provenance for complex transactions over large data sets.
-
Using Reenactment to Retroactively Capture Provenance for Transactions
Bahareh Arab, Dieter Gawlick, Vasudha Krishnaswamy, Venkatesh Radhakrishnan and Boris Glavic
IEEE Transactions on Knowledge and Data Engineering. 30, 3 (2018), 599–612.
@article{AG17c, author = {Arab, Bahareh and Gawlick, Dieter and Krishnaswamy, Vasudha and Radhakrishnan, Venkatesh and Glavic, Boris}, doi = {10.1109/TKDE.2017.2769056}, journal = {IEEE Transactions on Knowledge and Data Engineering}, keywords = {Provenance; GProM; Reenactment; Concurrency Control}, number = {3}, pages = {599--612}, pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/AG17c.pdf}, projects = {GProM; Reenactment}, title = {Using Reenactment to Retroactively Capture Provenance for Transactions}, venueshort = {TKDE}, volume = {30}, year = {2018} }
-
GProM - A Swiss Army Knife for Your Provenance Needs
Bahareh Arab, Su Feng, Boris Glavic, Seokki Lee, Xing Niu and Qitian Zeng
IEEE Data Engineering Bulletin. 41, 1 (2018), 51–62.
@article{AF18, author = {Arab, Bahareh and Feng, Su and Glavic, Boris and Lee, Seokki and Niu, Xing and Zeng, Qitian}, bibsource = {dblp computer science bibliography, https://dblp.org}, biburl = {https://dblp.org/rec/bib/journals/debu/ArabFGLNZ17}, journal = {{IEEE} Data Engineering Bulletin}, keywords = {GProM; Provenance; Annotations}, number = {1}, pages = {51--62}, pdfurl = {http://sites.computer.org/debull/A18mar/p51.pdf}, projects = {GProM; Reenactment}, timestamp = {Fri, 02 Mar 2018 18:50:49 +0100}, title = {{GProM} - {A} Swiss Army Knife for Your Provenance Needs}, venueshort = {Data Eng. Bull.}, volume = {41}, year = {2018}, bdsk-url-1 = {http://sites.computer.org/debull/A18mar/p51.pdf} }
-
Adaptive Schema Databases
William Spoth, Bahareh Arab, Eric S. Chan, Dieter Gawlick, Adel Ghoneimy, Boris Glavic, Beda Hammerschmidt, Oliver Kennedy, Seokki Lee, Zhen Hua Liu, Xing Niu and Ying Yang
Proceedings of the 8th Biennial Conference on Innovative Data Systems (2017).
@inproceedings{SA17, author = {Spoth, William and Arab, Bahareh and Chan, Eric S. and Gawlick, Dieter and Ghoneimy, Adel and Glavic, Boris and Hammerschmidt, Beda and Kennedy, Oliver and Lee, Seokki and Liu, Zhen Hua and Niu, Xing and Yang, Ying}, booktitle = {Proceedings of the 8th Biennial Conference on Innovative Data Systems}, keywords = {Schema Evolution; Data Integration}, pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/SA17.pdf}, projects = {Vizier}, title = {{Adaptive Schema Databases}}, venueshort = {CIDR}, year = {2017}, bdsk-url-1 = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/SA17.pdf} }
-
Answering Historical What-if Queries with Provenance, Reenactment, and Symbolic Execution
Bahareh Arab and Boris Glavic
Proceedings of the 9th USENIX Workshop on the Theory and Practice of Provenance (2017).
@inproceedings{AG17b, author = {Arab, Bahareh and Glavic, Boris}, booktitle = {Proceedings of the 9th USENIX Workshop on the Theory and Practice of Provenance}, isworkshop = {true}, keywords = {Provenance;Reenactment;What-if}, pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/AG17b.pdf}, projects = {GProM;Reenactment}, title = {Answering Historical What-if Queries with Provenance, Reenactment, and Symbolic Execution}, venueshort = {TaPP}, year = {2017}, bdsk-url-1 = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/AG17b.pdf} }
-
Debugging Transactions and Tracking their Provenance with Reenactment
Xing Niu, Boris Glavic, Seokki Lee, Bahareh Arab, Dieter Gawlick, Zhen Hua Liu, Vasudha Krishnaswamy, Su Feng and Xun Zou
Proceedings of the VLDB Endowment (Demonstration Track). 10, 12 (2017), 1857–1860.
@article{NG17, author = {Niu, Xing and Glavic, Boris and Lee, Seokki and Arab, Bahareh and Gawlick, Dieter and Liu, Zhen Hua and Krishnaswamy, Vasudha and Feng, Su and Zou, Xun}, journal = {Proceedings of the VLDB Endowment (Demonstration Track)}, keywords = {Provenance; GProM; Reenactment; Debugging; Concurrency Control}, number = {12}, pages = {1857--1860}, pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/XG17.pdf}, projects = {GProM; Reenactment}, title = {Debugging Transactions and Tracking their Provenance with Reenactment}, venueshort = {PVLDB}, volume = {10}, year = {2017}, bdsk-url-1 = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/XG17.pdf} }
-
Formal Foundations of Reenactment and Transaction Provenance
Bahareh Arab, Dieter Gawlick, Vasudha Krishnaswamy, Venkatesh Radhakrishnan and Boris Glavic
Technical Report #IIT/CS-DB-2016-01
Illinois Institute of Technology.
@techreport{AG16a, author = {Arab, Bahareh and Gawlick, Dieter and Krishnaswamy, Vasudha and Radhakrishnan, Venkatesh and Glavic, Boris}, date-added = {2014-09-17 20:07:29 +0000}, date-modified = {2014-09-17 20:09:08 +0000}, institution = {Illinois Institute of Technology}, keywords = {Provenance; Concurrency Control; Reenactment; GProM}, number = {IIT/CS-DB-2016-01}, pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/AG16.pdf}, projects = {GProM; Reenactment}, title = {Formal Foundations of Reenactment and Transaction Provenance}, venueshort = {Techreport}, year = {2016}, bdsk-url-1 = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/AG16.pdf} }
-
Reenactment for Read-Committed Snapshot Isolation
Bahareh Arab, Dieter Gawlick, Vasudha Krishnaswamy, Venkatesh Radhakrishnan and Boris Glavic
Proceedings of the 25th ACM International Conference on Information and Knowledge Management (2016), pp. 841–850.
@inproceedings{AG17, author = {Arab, Bahareh and Gawlick, Dieter and Krishnaswamy, Vasudha and Radhakrishnan, Venkatesh and Glavic, Boris}, booktitle = {Proceedings of the 25th ACM International Conference on Information and Knowledge Management}, keywords = {Provenance; Concurrency Control; Reenactment; GProM}, pages = {841--850}, pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/AG17.pdf}, longversionurl = {https://arxiv.org/pdf/1608.08258}, projects = {GProM; Reenactment}, title = {Reenactment for Read-Committed Snapshot Isolation}, venueshort = {CIKM}, year = {2016} }
-
Provenance-aware Versioned Dataworkspaces
Xing Niu, Bahareh Arab, Dieter Gawlick, Zhen Hua Liu, Vasudha Krishnaswamy, Oliver Kennedy and Boris Glavic
Proceedings of the 8th USENIX Workshop on the Theory and Practice of Provenance (2016).
@inproceedings{XN16, author = {Niu, Xing and Arab, Bahareh and Gawlick, Dieter and Liu, Zhen Hua and Krishnaswamy, Vasudha and Kennedy, Oliver and Glavic, Boris}, booktitle = {Proceedings of the 8th USENIX Workshop on the Theory and Practice of Provenance}, isworkshop = {true}, keywords = {Provenance; GProM; Data Cleaning}, pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/XN16.pdf}, projects = {GProM}, title = {Provenance-aware Versioned Dataworkspaces}, venueshort = {TaPP}, year = {2016} }
-
Reenactment for Read-Committed Snapshot Isolation (long version)
Bahareh Arab, Dieter Gawlick, Vasudha Krishnaswamy, Venkatesh Radhakrishnan and Boris Glavic
Illinois Institute of Technology.
@techreport{AG17a, author = {Arab, Bahareh and Gawlick, Dieter and Krishnaswamy, Vasudha and Radhakrishnan, Venkatesh and Glavic, Boris}, institution = {Illinois Institute of Technology}, keywords = {Provenance; Concurrency Control; Reenactment; GProM}, pdfurl = {http://cs.iit.edu/%7Edbgroup/assets/pdfpubls/AG16a.pdf}, projects = {GProM; Reenactment}, title = {Reenactment for Read-Committed Snapshot Isolation (long version)}, venueshort = {Techreport}, year = {2016} }
-
Reenacting Transactions to Compute their Provenance
Bahareh Arab, Dieter Gawlick, Vasudha Krishnaswamy, Venkatesh Radhakrishnan and Boris Glavic
Technical Report #IIT/CS-DB-2014-02
Illinois Institute of Technology.
@techreport{AG14a, author = {Arab, Bahareh and Gawlick, Dieter and Krishnaswamy, Vasudha and Radhakrishnan, Venkatesh and Glavic, Boris}, date-added = {2014-09-17 20:07:29 +0000}, date-modified = {2014-09-17 20:09:08 +0000}, institution = {Illinois Institute of Technology}, keywords = {Provenance; Concurrency Control; Reenactment; GProM}, number = {IIT/CS-DB-2014-02}, pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/AD14.pdf}, projects = {GProM; Reenactment}, title = {Reenacting Transactions to Compute their Provenance}, venueshort = {Techreport}, year = {2014}, bdsk-url-1 = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/AD14.pdf} }
-
A Generic Provenance Middleware for Database Queries, Updates, and Transactions
Bahareh Arab, Dieter Gawlick, Venkatesh Radhakrishnan, Hao Guo and Boris Glavic
Proceedings of the 6th USENIX Workshop on the Theory and Practice of Provenance (2014).
@inproceedings{AG14, author = {Arab, Bahareh and Gawlick, Dieter and Radhakrishnan, Venkatesh and Guo, Hao and Glavic, Boris}, booktitle = {Proceedings of the 6th USENIX Workshop on the Theory and Practice of Provenance}, isworkshop = {true}, keywords = {Reenactment; Provenance; Concurrency Control; GProM}, pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/AG14.pdf}, projects = {GProM}, slideurl = {http://www.slideshare.net/lordPretzel/tapp-2014-talk-boris}, title = {A Generic Provenance Middleware for Database Queries, Updates, and Transactions}, venueshort = {TaPP}, year = {2014}, bdsk-url-1 = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/AG14.pdf} }
We present an architecture and prototype implementation for a generic provenance database middleware (GProM) that is based on the concept of query rewrites, which are applied to an algebraic graph representation of database operations. The system supports a wide range of provenance types and representations for queries, updates, transactions, and operations spanning multiple transactions. GProM supports several strategies for provenance generation, e.g., on-demand, rule-based, and “always on”. To the best of our knowledge, we are the first to present a solution for computing the provenance of concurrent database transactions. Our solution can retroactively trace transaction provenance as long as an audit log and time travel functionality are available (both are supported by most DBMS). Other noteworthy features of GProM include: extensibility through a declarative rewrite rule specification language, support for multiple database backends, and an optimizer for rewritten queries.