Xing Niu, Ph.D. Student (Alumnus)
I received my B.Sc in Computer Science from Henan University and M.S. degree in Computer Science in from Henan University. I worked in Institute Of Software Chinese Academy Of Sciences for one year. Currently, I am a Ph.D. student at the IIT DBGroup.
Teaching
I have been TA for the following courses:- 2015 Fall: CS525 - Advanced Database Organization
- 2014 Fall: CS425 - Database Organization
- 2017 Spring: CS425 - Database Organization
- 2018 Spring: CS520 - Data Integration, Warehousing, and Provenance
Research Projects
I have been involved in the following research projects:- GProM - A database-independent middleware for computing the provenance of queries, updates, and transactions
- Provenace for Updates and Transactions - In this project, we study provenance models for update and transactions and their implementation through reenactment, a declarative replay technique which utilizes audit logs and temporal database technologies.
- Relevance-based Data Management - We use provenance to determine what data is relevant for which task and then exploit this information to improve a wide range of data management tasks.
Publications
-
Oracle PBDS Experiments
Boris Glavic, Xing Niu, Pengyuan Li and Ziyu Liu
Technical Report #IIT/Cs-db-2022-01
Illinois Institute of Technology.@techreport{GN22, author = {Glavic, Boris and Niu, Xing and Li, Pengyuan and Liu, Ziyu}, title = {Oracle PBDS Experiments}, institution = {Illinois Institute of Technology}, year = {2022}, number = {IIT/Cs-db-2022-01}, pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/GN22.pdf}, projects = {Relevance-based Data Management}, keywords = {Provenance, Relevance-based Data Management}, venueshort = {Techreport} }
-
Integrating Provenance Management and Query Optimization
Xing Niu
Illinois Institute of Technology.@phdthesis{N21, venueshort = {PhD Thesis}, author = {Niu, Xing}, keywords = {Provenance, Cost-based optimization, Data Skipping, Relevance-based Data Management}, month = dec, pdfurl = {https://media.proquest.com/media/hms/PFT/2/W8M8M?cit%3Aauth=Niu%2C+Xing&cit%3Atitle=Integrating+Provenance+Management+and+Query+Optimization&cit%3Apub=ProQuest+Dissertations+and+Theses&cit%3Avol=&cit%3Aiss=&cit%3Apg=&cit%3Adate=2021&ic=true&cit%3Aprod=ProQuest&_a=ChgyMDIyMDgzMDIwMjQxNzgzNDoxOTYzOTYSBTg0NTY5GgpPTkVfU0VBUkNIIg0yNC4xMzYuMjQuMjIyKgUxODc1MDIKMjYyMjkzMDc1MDoNRG9jdW1lbnRJbWFnZUIBMFIGT25saW5lWgJGVGIDUEZUagoyMDIxLzAxLzAxcgoyMDIxLzEyLzMxegCCASlQLTEwMDg3NTItMjgzNzctQ1VTVE9NRVItMTAwMDAyMDUtNDM1NTkyMpIBBk9ubGluZcoBVE1vemlsbGEvNS4wIChNYWNpbnRvc2g7IEludGVsIE1hYyBPUyBYIDEwLjE1OyBydjoxMDMuMCkgR2Vja28vMjAxMDAxMDEgRmlyZWZveC8xMDMuMNIBFkRpc3NlcnRhdGlvbnMgJiBUaGVzZXOaAgdQcmVQYWlkqgIrT1M6RU1TLU1lZGlhTGlua3NTZXJ2aWNlLWdldE1lZGlhVXJsRm9ySXRlbcoCE0Rpc3NlcnRhdGlvbi9UaGVzaXPSAgFZ8gIA%2BgIBToIDA1dlYooDHENJRDoyMDIyMDgzMDIwMjQxNzgzNDo3OTIwNDY%3D&_s=oFyeSaHKMuftg4s4wmEJRSOs%2B4U%3D}, projects = {GProM; Relevance-based Data Management}, school = {Illinois Institute of Technology}, title = {Integrating Provenance Management and Query Optimization}, year = {2021} }
-
Provenance-based Data Skipping
Xing Niu, Ziyu Liu, Pengyuan Li, Boris Glavic, Dieter Gawlick, Vasudha Krishnaswamy, Zhen Hua Liu and Danica Porobic
Proceedings of the VLDB Endowment. 15, 3 (2021) , 451–464.@article{NL21, author = {Niu, Xing and Liu, Ziyu and Li, Pengyuan and Glavic, Boris and Gawlick, Dieter and Krishnaswamy, Vasudha and Liu, Zhen Hua and Porobic, Danica}, keywords = {Provenance, Data Skipping, Relevance-based Data Management}, title = {Provenance-based Data Skipping}, journal = {Proceedings of the VLDB Endowment}, projects = {Relevance-based Data Management}, pages = {451 - 464}, volume = {15}, issue = {3}, year = {2021}, doi = {10.14778/3494124.3494130}, venueshort = {{PVLDB}}, pdfurl = {https://vldb.org/pvldb/vol15/p451-niu.pdf}, longversionurl = {https://arxiv.org/pdf/2104.12815} }
Database systems use static analysis to determine upfront which data is needed for answering a query and use indexes and other physical design techniques to speed-up access to that data. However, for important classes of queries, e.g., HAVING and top-k queries, it is impossible to determine up-front what data is relevant. To overcome this limitation, we develop provenance-based data skipping (PBDS), a novel approach that generates provenance sketches to concisely encode what data is relevant for a query. Once a provenance sketch has been captured it is used to speed up subsequent queries. PBDS can exploit physical design artifacts such as indexes and zone maps.
-
Snapshot Semantics for Temporal Multiset Relations
Anton Dignös, Boris Glavic, Xing Niu, Michael H. Böhlen and Johann Gamper
Proceedings of the VLDB Endowment. 12, 6 (2019) , 639–652.@article{DG19, author = {Dign\"{o}s, Anton and Glavic, Boris and Niu, Xing and B\"{o}hlen, Michael H. and Gamper, Johann}, journal = {Proceedings of the VLDB Endowment}, keywords = {Temporal Databases; Annotations}, longversionurl = {https://arxiv.org/pdf/1902.04938}, reproducibility = {https://github.com/IITDBGroup/2019-PVLDB-Reproducibility-Snapshot-Semantics-For-Temporal-Multiset-Relations}, projects = {Snapshot Semantics for Temporal Databases}, number = {6}, pages = {639--652}, pdfurl = {http://www.vldb.org/pvldb/vol12/p639-dignoes.pdf}, title = {{Snapshot Semantics for Temporal Multiset Relations}}, venueshort = {PVLDB}, volume = {12}, year = {2019} }
Snapshot semantics is widely used for evaluating queries over temporal data: temporal relations are seen as sequences of snapshot relations, and queries are evaluated at each snapshot. In this work, we demonstrate that current approaches for snapshot semantics over interval-timestamped multiset relations are subject to two bugs regarding snapshot aggregation and bag difference. We introduce a novel temporal data model based on K-relations that overcomes these bugs and prove it to correctly encode snapshot semantics. Furthermore, we present an efficient implementation of our model as a database middleware and demonstrate experimentally that our approach is competitive with native implementations.
-
Heuristic and Cost-based Optimization for Diverse Provenance Tasks
Xing Niu, Raghav Kapoor, Boris Glavic, Dieter Gawlick, Zhen Hua Liu, Vasudha Krishnaswamy and Venkatesh Radhakrishnan
IEEE Transactions on Knowledge and Data Engineering. 31, 7 (2019) , 1267–1280.@article{NK18, author = {Niu, Xing and Kapoor, Raghav and Glavic, Boris and Gawlick, Dieter and Liu, Zhen Hua and Krishnaswamy, Vasudha and Radhakrishnan, Venkatesh}, doi = {10.1109/TKDE.2018.2827074}, journal = {IEEE Transactions on Knowledge and Data Engineering}, keywords = {Provenance; Optimization; GProM}, longversionurl = {https://arxiv.org/pdf/1804.07156.pdf}, number = {7}, pages = {1267--1280}, pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/NX18.pdf}, projects = {GProM}, title = {Heuristic and Cost-based Optimization for Diverse Provenance Tasks}, venueshort = {TKDE}, volume = {31}, year = {2019} }
-
GProM - A Swiss Army Knife for Your Provenance Needs
Bahareh Arab, Su Feng, Boris Glavic, Seokki Lee, Xing Niu and Qitian Zeng
IEEE Data Engineering Bulletin. 41, 1 (2018) , 51–62.@article{AF18, author = {Arab, Bahareh and Feng, Su and Glavic, Boris and Lee, Seokki and Niu, Xing and Zeng, Qitian}, bibsource = {dblp computer science bibliography, https://dblp.org}, biburl = {https://dblp.org/rec/bib/journals/debu/ArabFGLNZ17}, journal = {{IEEE} Data Engineering Bulletin}, keywords = {GProM; Provenance; Annotations}, number = {1}, pages = {51--62}, pdfurl = {http://sites.computer.org/debull/A18mar/p51.pdf}, projects = {GProM; Reenactment}, timestamp = {Fri, 02 Mar 2018 18:50:49 +0100}, title = {{GProM} - {A} Swiss Army Knife for Your Provenance Needs}, venueshort = {Data Eng. Bull.}, volume = {41}, year = {2018}, bdsk-url-1 = {http://sites.computer.org/debull/A18mar/p51.pdf} }
-
Snapshot Semantics for Temporal Multiset Relations (extended version)
Anton Dignös, Boris Glavic, Xing Niu, Michael H. Böhlen and Johann Gamper
Technical Report #IIT/CS-DB-2018-03
Illinois Institute of Technology.@techreport{DG18, author = {Dign\"{o}s, Anton and Glavic, Boris and Niu, Xing and B\"{o}hlen, Michael H. and Gamper, Johann}, institution = {Illinois Institute of Technology}, keywords = {Temporal Databases}, number = {IIT/CS-DB-2018-03}, pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/DG18.pdf}, title = {Snapshot Semantics for Temporal Multiset Relations (extended version)}, venueshort = {Techreport}, projects = {Snapshot Semantics for Temporal Databases}, year = {2018} }
-
Debugging Transactions and Tracking their Provenance with Reenactment
Xing Niu, Boris Glavic, Seokki Lee, Bahareh Arab, Dieter Gawlick, Zhen Hua Liu, Vasudha Krishnaswamy, Su Feng and Xun Zou
Proceedings of the VLDB Endowment (Demonstration Track). 10, 12 (2017) , 1857–1860.@article{NG17, author = {Niu, Xing and Glavic, Boris and Lee, Seokki and Arab, Bahareh and Gawlick, Dieter and Liu, Zhen Hua and Krishnaswamy, Vasudha and Feng, Su and Zou, Xun}, journal = {Proceedings of the VLDB Endowment (Demonstration Track)}, keywords = {Provenance; GProM; Reenactment; Debugging; Concurrency Control; Reenactment}, number = {12}, pages = {1857--1860}, pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/XG17.pdf}, projects = {GProM; Reenactment}, title = {Debugging Transactions and Tracking their Provenance with Reenactment}, venueshort = {PVLDB}, volume = {10}, year = {2017}, bdsk-url-1 = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/XG17.pdf} }
-
Provenance-aware Query Optimization
Xing Niu, Raghav Kapoor, Boris Glavic, Dieter Gawlick, Zhen Hua Liu, Vasudha Krishnaswamy and Venkatesh Radhakrishnan
Proceedings of the 33rd IEEE International Conference on Data Engineering (2017), pp. 473–484.@inproceedings{XN17, author = {Niu, Xing and Kapoor, Raghav and Glavic, Boris and Gawlick, Dieter and Liu, Zhen Hua and Krishnaswamy, Vasudha and Radhakrishnan, Venkatesh}, booktitle = {Proceedings of the 33rd IEEE International Conference on Data Engineering}, keywords = {Provenance; Cost-based optimization; Query instrumentation; Annotation propagation; GProM}, pages = {473-484}, pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/XN17.pdf}, projects = {GProM}, title = {Provenance-aware Query Optimization}, venueshort = {ICDE}, year = {2017} }
-
Adaptive Schema Databases
William Spoth, Bahareh Arab, Eric S. Chan, Dieter Gawlick, Adel Ghoneimy, Boris Glavic, Beda Hammerschmidt, Oliver Kennedy, Seokki Lee, Zhen Hua Liu, Xing Niu and Ying Yang
Proceedings of the 8th Biennial Conference on Innovative Data Systems (2017).@inproceedings{SA17, author = {Spoth, William and Arab, Bahareh and Chan, Eric S. and Gawlick, Dieter and Ghoneimy, Adel and Glavic, Boris and Hammerschmidt, Beda and Kennedy, Oliver and Lee, Seokki and Liu, Zhen Hua and Niu, Xing and Yang, Ying}, booktitle = {Proceedings of the 8th Biennial Conference on Innovative Data Systems}, keywords = {Schema Evolution; Data Integration}, pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/SA17.pdf}, projects = {Vizier}, title = {{Adaptive Schema Databases}}, venueshort = {CIDR}, year = {2017}, bdsk-url-1 = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/SA17.pdf} }
-
Integrating Approximate Summarization with Provenance Capture
Seokki Lee, Xing Niu, Bertram Ludäscher and Boris Glavic
Proceedings of the 8th USENIX Workshop on the Theory and Practice of Provenance (2017).@inproceedings{SN17, author = {Lee, Seokki and Niu, Xing and Lud\"{a}scher, Bertram and Glavic, Boris}, booktitle = {Proceedings of the 8th USENIX Workshop on the Theory and Practice of Provenance}, isworkshop = {true}, keywords = {Provenance; Datalog; GProM; Missing Answers; Game Provenance; PUGS}, pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/SN17.pdf}, projects = {GProM; PUGS}, title = {Integrating Approximate Summarization with Provenance Capture}, venueshort = {TaPP}, year = {2017}, bdsk-url-1 = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/SN17.pdf} }
-
Optimizing Provenance Capture and Queries - Algebraic Transformations and Cost-based Optimization
Xing Niu and Boris Glavic
Technical Report #IIT/CS-DB-2016-02
Illinois Institute of Technology.@techreport{XN16a, author = {Niu, Xing and Glavic, Boris}, date-added = {2016-09-17 20:07:29 +0000}, date-modified = {2016-09-17 20:09:08 +0000}, institution = {Illinois Institute of Technology}, keywords = {Provenance; Query Optimization; GProM}, number = {IIT/CS-DB-2016-02}, pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/XN16a.pdf}, projects = {GProM}, title = {Optimizing Provenance Capture and Queries - Algebraic Transformations and Cost-based Optimization}, venueshort = {Techreport}, year = {2016}, bdsk-url-1 = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/XN16a.pdf} }
-
Provenance-aware Versioned Dataworkspaces
Xing Niu, Bahareh Arab, Dieter Gawlick, Zhen Hua Liu, Vasudha Krishnaswamy, Oliver Kennedy and Boris Glavic
Proceedings of the 8th USENIX Workshop on the Theory and Practice of Provenance (2016).@inproceedings{XN16, author = {Niu, Xing and Arab, Bahareh and Gawlick, Dieter and Liu, Zhen Hua and Krishnaswamy, Vasudha and Kennedy, Oliver and Glavic, Boris}, booktitle = {Proceedings of the 8th USENIX Workshop on the Theory and Practice of Provenance}, isworkshop = {true}, keywords = {Provenance; GProM; Data Cleaning}, pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/XN16.pdf}, projects = {GProM}, title = {Provenance-aware Versioned Dataworkspaces}, venueshort = {TaPP}, year = {2016} }
-
Interoperability for Provenance-aware Databases using PROV and JSON
Xing Niu, Raghav Kapoor, Dieter Gawlick, Zhen Hua Liu, Vasudha Krishnaswamy, Venkatesh Radhakrishnan and Boris Glavic
Proceedings of the 7th USENIX Workshop on the Theory and Practice of Provenance (2015).@inproceedings{PJ15, author = {Niu, Xing and Kapoor, Raghav and Gawlick, Dieter and Liu, Zhen Hua and Krishnaswamy, Vasudha and Radhakrishnan, Venkatesh and Glavic, Boris}, booktitle = {Proceedings of the 7th USENIX Workshop on the Theory and Practice of Provenance}, isworkshop = {true}, keywords = {Provenance;JSON;GProM;PROV}, pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/PJ15.pdf}, projects = {GProM}, slideurl = {http://www.slideshare.net/lordPretzel/2015-tapp}, title = {Interoperability for Provenance-aware Databases using PROV and JSON}, venueshort = {TaPP}, year = {2015}, bdsk-url-1 = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/PJ15.pdf} }
-
Heuristic and Cost-based Optimization for Provenance Computation
Xing Niu, Raghav Kapoor and Boris Glavic
Proceedings of the 7th USENIX Workshop on the Theory and Practice of Provenance (Poster) (2015).@inproceedings{NK15, author = {Niu, Xing and Kapoor, Raghav and Glavic, Boris}, booktitle = {Proceedings of the 7th USENIX Workshop on the Theory and Practice of Provenance (Poster)}, isworkshop = {true}, keywords = {Provenance; Query Optimization; GProM}, pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/NK15.pdf}, projects = {GProM}, title = {{Heuristic and Cost-based Optimization for Provenance Computation}}, venueshort = {TaPP}, year = {2015}, bdsk-url-1 = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/NK15.pdf} }