IIT Database Group

Data Debugging and Exploration with Vizier

Abstract

We present Vizier, a multi-modal data exploration and debugging tool. The system supports a wide range of operations by seamlessly integrating Python, SQL, and automated data curation and debugging methods. Using Spark as an execution backend, Vizier handles large datasets in multiple formats. Ease of use is attained by integrating a notebook with a spreadsheet-style interface and with visualizations that guide and support the user in the loop. In addition, native support for provenance and versioning enables collaboration and uncertainty management. In this demonstration, we will illustrate the diverse features of the system using several realistic data science tasks based on real data.

BibTeX

@inproceedings{BB19,
  author = {Brachmann, Mike and Bautista, Carlos and Castelo, Sonia and Feng, Su and Freire, Juliana and Glavic, Boris and Kennedy, Oliver and M{\"u}ller, Heiko and Rampin, R{\'e}mi and Spoth, William and Yang, Ying},
  booktitle = {Proceedings of the 44th International Conference on Management of Data (Demonstration Track)},
  date-modified = {2019-04-04 12:25:42 -0500},
  keywords = {Vizier},
  pages = {1877-1880},
  pdfurl = {http://cs.iit.edu/%7edbgroup/assets/pdfpubls/BB19.pdf},
  projects = {Vizier},
  video = {https://www.youtube.com/watch?v=c3ICB-17kRY&t=4s},
  doi = {10.1145/3299869.3320246},
  title = {Data Debugging and Exploration with Vizier},
  venueshort = {SIGMOD},
  year = {2019}
}

Reference

Data Debugging and Exploration with Vizier. Mike Brachmann, Carlos Bautista, Sonia Castelo, Su Feng, Juliana Freire, Boris Glavic, Oliver Kennedy, Heiko Müller, Rémi Rampin, William Spoth and Ying Yang. Proceedings of the 44th International Conference on Management of Data (Demonstration Track) (2019), pp. 1877–1880.