PyData Tel Aviv 2022

Quibbler - an open source package for inherently interactive data exploration
12-13, 13:30–14:00 (Asia/Jerusalem), Track 1

Interactivity, traceability, transparency and efficiency are becoming increasingly important, yet remain challenging, in today’s data-rich analysis applications. Data analysis pipelines are often heavily parametrized, and we lack good ways to trace which specific parameters affect a given downstream result and to evaluate the effects of changing parameters in ways that are interactive, transparent and computationally efficient.

In the talk, we will introduce “Quibbler”, a new open-source, pure-Python package for building inherently interactive, yet traceable, transparent and efficient data analysis applications. Founded on a data-flow paradigm, Quibbler allows processing data through any series of analysis steps, while automatically tracking functional relationships between downstream results and upstream parameters. Quibbler facilitates and embraces human interventions as an inherent part of the analysis pipeline: input parameters, as well as algorithmic exceptions and overrides, can be specified interactively, and any such interventions are automatically recorded and documented. Changes to upstream parameters propagate downstream, pinpointing which specific data items, or even slices thereof, are affected, thereby avoiding unnecessary recalculations. Importantly, Quibbler does not require learning any new programming syntax; it integrates seamlessly into any standard Python analysis code.

We are just launching Quibbler as an open-source project, and are eager to see it used and integrated within a range of data science applications. We are also looking for feedback, suggestions and help.
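
To make the data-flow idea above concrete, here is a minimal, self-contained sketch in plain Python. It does not use Quibbler and does not reflect its actual API or implementation; the `Node` class and all names in it are hypothetical, chosen only for illustration. The sketch shows two of the ideas described in the abstract: tracing which upstream inputs a result depends on, and recomputing only the results affected by a parameter change. It works at whole-value granularity, whereas the abstract describes tracking down to individual data items and slices.

```python
class Node:
    """A value in a toy data-flow graph (hypothetical; not Quibbler's API)."""

    def __init__(self, name, func=None, parents=(), value=None):
        self.name = name
        self.func = func                  # None marks an input parameter
        self.parents = list(parents)
        self.children = []
        self._value = value
        self._valid = func is None        # inputs always hold a valid value
        self.compute_count = 0            # how many times func was evaluated
        for parent in self.parents:
            parent.children.append(self)

    def set(self, value):
        """Assign a new value to an input and invalidate dependents."""
        if self.func is not None:
            raise ValueError("only input nodes can be assigned")
        self._value = value
        for child in self.children:
            child._invalidate()

    def _invalidate(self):
        # An invalid node already has invalid descendants, so stop there.
        if self._valid:
            self._valid = False
            for child in self.children:
                child._invalidate()

    def get(self):
        """Return the cached value, recomputing only if invalidated."""
        if not self._valid:
            self._value = self.func(*(p.get() for p in self.parents))
            self._valid = True
            self.compute_count += 1
        return self._value

    def upstream_inputs(self):
        """Names of the input parameters this node depends on."""
        if self.func is None:
            return {self.name}
        names = set()
        for parent in self.parents:
            names |= parent.upstream_inputs()
        return names


# Inputs (upstream parameters)
data = Node("data", value=[1, 2, 3, 4, 5])
threshold = Node("threshold", value=3)
scale = Node("scale", value=10)

# Derived results (downstream)
filtered = Node("filtered", lambda d, t: [x for x in d if x > t], [data, threshold])
total = Node("total", lambda f: sum(f), [filtered])
scaled = Node("scaled", lambda d, s: [x * s for x in d], [data, scale])

first_total = total.get()
first_scaled = scaled.get()

# Changing `scale` invalidates `scaled` only; `filtered` and `total` stay cached.
scale.set(2)
second_total = total.get()
second_scaled = scaled.get()
total_inputs = total.upstream_inputs()
```

In this sketch, `total.upstream_inputs()` reports that `total` depends on `data` and `threshold` but not on `scale`, so changing `scale` leaves the cached `filtered` and `total` values untouched and only `scaled` is recomputed on the next request.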


The talk will follow this structure:

  1. Describe key common challenges in building data analysis pipelines: traceability, transparency, interactivity and efficiency. (5 minutes)

  2. Present the basic idea of Quibbler and show specific implementation examples. (10 minutes)
    In particular, we will show how several different analysis scripts essentially “come to life” with Quibbler, and how Quibbler brings interactivity and efficiency to otherwise completely standard Python analysis code.

  3. Demonstrate how Quibbler can reveal and graphically depict the network of functional relationships among data items produced in any standard Python analysis pipeline. (3 minutes)

  4. Discuss what happens “under the hood”. (5 minutes)

  5. Summary. (2 minutes)

  6. Q&A. (5 minutes)

Prerequisite knowledge: basic Python programming and some familiarity with data analysis needs and tools.

References:

Quibbler GitHub (currently private; will become public before the meeting).
Quibbler ReadTheDocs: https://kishony-lab-pyquibbler.readthedocs-hosted.com/en/latest/

Prof. Roy Kishony

Marilyn and Henry Taub Professor of Life Sciences

Faculty of Biology and Faculty of Computer Science

Technion - Israel Institute of Technology