Pli

Pengyuan Li, Ph.D. Student

I received my B.Sc. in 2013 from Chongqing University and my M.S. degree in Computer Science in 2016 from Illinois Institute of Technology. I started my Ph.D. at the IIT DBGroup in 2018.

Teaching

I have been a TA for the following courses:

2018 Fall: -
2018 Fall: -

Research Projects

I am involved in the following research projects:

GProM - A database-independent middleware for computing the provenance of queries, updates, and transactions

Collaborators

I am collaborating with:

Dieter Gawlick - Oracle
Oliver Kennedy - SUNY Buffalo
Vasudha Krishnaswamy - Oracle
Venkatesh Radhakrishnan
Zhen Hua Liu - Oracle

Publications

Self-tuning Database Operations by Assessing the Importance of Data
Boris Glavic, Pengyuan Li and Ziyu Liu
Technical Report #IIT/CS-DB-2023-01
Illinois Institute of Technology.

- pdf
- details
```
@techreport{GL23,
  author = {Glavic, Boris and Li, Pengyuan and Liu, Ziyu},
  title = {Self-tuning Database Operations by Assessing the Importance of Data},
  institution = {Illinois Institute of Technology},
  year = {2023},
  number = {IIT/CS-DB-2023-01},
  pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/GL23.pdf},
  projects = {Relevance-based Data Management},
  keywords = {Provenance, Relevance-based Data Management},
  venueshort = {Techreport}
}
```

Oracle PBDS Experiments
Boris Glavic, Xing Niu, Pengyuan Li and Ziyu Liu
Technical Report #IIT/Cs-db-2022-01
Illinois Institute of Technology.

- pdf
- details
```
@techreport{GN22,
  author = {Glavic, Boris and Niu, Xing and Li, Pengyuan and Liu, Ziyu},
  title = {Oracle PBDS Experiments},
  institution = {Illinois Institute of Technology},
  year = {2022},
  number = {IIT/Cs-db-2022-01},
  pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/GN22.pdf},
  projects = {Relevance-based Data Management},
  keywords = {Provenance, Relevance-based Data Management},
  venueshort = {Techreport}
}
```

Provenance-based Data Skipping
Xing Niu, Ziyu Liu, Pengyuan Li, Boris Glavic, Dieter Gawlick, Vasudha Krishnaswamy, Zhen Hua Liu and Danica Porobic
Proceedings of the VLDB Endowment. 15, 3 (2021), 451–464.
- doi
- pdf
- extended version
- details
```
@article{NL21,
  author = {Niu, Xing and Liu, Ziyu and Li, Pengyuan and Glavic, Boris and Gawlick, Dieter and Krishnaswamy, Vasudha and Liu, Zhen Hua and Porobic, Danica},
  keywords = {Provenance, Data Skipping, Relevance-based Data Management},
  title = {Provenance-based Data Skipping},
  journal = {Proceedings of the VLDB Endowment},
  projects = {Relevance-based Data Management},
  pages = {451--464},
  volume = {15},
  issue = {3},
  year = {2021},
  doi = {10.14778/3494124.3494130},
  venueshort = {{PVLDB}},
  pdfurl = {https://vldb.org/pvldb/vol15/p451-niu.pdf},
  longversionurl = {https://arxiv.org/pdf/2104.12815}
}
```
Database systems use static analysis to determine up front which data is needed to answer a query, and use indexes and other physical design techniques to speed up access to that data. However, for important classes of queries, e.g., HAVING and top-k queries, it is impossible to determine up front what data is relevant. To overcome this limitation, we develop provenance-based data skipping (PBDS), a novel approach that generates provenance sketches to concisely encode what data is relevant for a query. Once a provenance sketch has been captured, it is used to speed up subsequent queries. PBDS can exploit physical design artifacts such as indexes and zone maps.
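
The idea in the abstract can be illustrated with a small, simplified sketch. This is not the paper's implementation: the range partitioning on `id`, the toy top-2 query, and all function names below are assumptions made for illustration. The example captures a provenance sketch as the set of range fragments that contain rows contributing to a top-k result, then reuses it to skip the other fragments when the same query runs again on unchanged data.

```python
from bisect import bisect_right

# Hypothetical range partitioning on "id": fragment 0 is id < 10,
# fragment 1 is 10 <= id < 20, fragment 2 is 20 <= id < 30,
# fragment 3 is id >= 30.
BOUNDARIES = [10, 20, 30]


def fragment_of(row):
    """Return the index of the range fragment that contains the row."""
    return bisect_right(BOUNDARIES, row["id"])


def top2_by_sales(rows):
    """Toy top-k query: the two rows with the highest sales."""
    return sorted(rows, key=lambda r: r["sales"], reverse=True)[:2]


def capture_sketch(rows, query):
    """Run the query once and record the fragments that hold its provenance.

    For this toy top-k query, the provenance of the result is simply
    the set of returned rows.
    """
    return {fragment_of(r) for r in query(rows)}


def apply_sketch(rows, sketch):
    """Keep only rows that fall into fragments listed in the sketch."""
    return [r for r in rows if fragment_of(r) in sketch]


rows = [
    {"id": 3, "sales": 50},
    {"id": 12, "sales": 900},
    {"id": 17, "sales": 20},
    {"id": 25, "sales": 700},
    {"id": 28, "sales": 10},
    {"id": 35, "sales": 40},
]

# First run: capture the sketch while answering the query.
sketch = capture_sketch(rows, top2_by_sales)

# Later run: skip fragments outside the sketch, then answer the query.
filtered = apply_sketch(rows, sketch)
result = top2_by_sales(filtered)
```

Here the filter is a Python list comprehension; in a database system the same restriction could presumably be expressed as a range predicate on the partitioning attribute, which is the kind of condition that indexes and zone maps can evaluate cheaply.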