WhyFlow is an interrogative debugging tool for taint analysis that enables developers to ask why, why-not, and what-if questions about dataflows. This artifact accompanies our ICSE 2026 paper: "WhyFlow: Interrogative Debugger for Sensemaking Taint Analysis".
WhyFlow addresses the challenge of making sense of taint analysis results by providing:
- Interrogative Debugging: Ask questions about the existence or absence of specific dataflows
- Speculative Analysis: Explore the impact of different third-party library models and configurations
- Visual Sensemaking: Graph-based visualization with color-coded annotations for global connectivity reasoning
- Interactive Q&A Interface: Template-based queries with contextualized selections for sources, sinks, and APIs
- Interactive question-answer debugging interface for taint analysis
- Support for why, why-not, and what-if queries about dataflows
- Integration with CodeQL and Souffle Datalog for static analysis
- Visual graph representation of taint flows with color-coded paths
- Efficient handling of large-scale analysis results using MongoDB
- User study data and statistical analysis scripts included
WhyFlow/
├── taint_debug_app/ # Main WhyFlow application
│ ├── taint_debug/ # Meteor web application
│ │ ├── client/ # Frontend UI components
│ │ ├── server/ # Backend API and data loading
│ │ └── imports/ # Shared code and collections
│ ├── analysis_files/ # Analysis data and fact files
│ ├── app_souffle_queries/ # Souffle Datalog query files
│ └── souffle_output/ # Generated query outputs
├── Subject_Prog_CodeQL_Taint/ # Subject program and CodeQL results
│ ├── src/ # Source code (Apache Dubbo)
│ ├── codeql-custom-queries-java/ # Custom CodeQL queries
│ └── *.json, *.csv # CodeQL analysis results
├── statistical_tests/ # User study statistical analysis
│ ├── statistical_tests.py # Python scripts for analysis
│ └── *.csv # User study data and results
├── data/ # User study materials
│ ├── data/ # Questionnaire responses
│ ├── extension_queries/ # Additional query examples
│ ├── tutorials/ # Tutorial materials
│ └── *.png, *.ipynb # Plots and analysis notebooks
└── souffle_output/ # Additional Souffle outputs
- Meteor (v2.13 or higher)
- Node.js (v14 or higher)
- MongoDB (installed with Meteor)
- Souffle (optional, for running custom Datalog queries)
- CodeQL (optional, for analyzing new programs)
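Before installing, it can help to confirm which of these tools are already on your `PATH`. The following is a minimal sketch, not part of the WhyFlow repository. The executable names (`meteor`, `node`, `souffle`, `codeql`) are assumptions about how the tools are installed on your system, and the script checks presence only, not the minimum versions listed above.

```python
import shutil

# Assumed executable names; adjust them if your installation differs.
REQUIRED = ["meteor", "node"]
OPTIONAL = ["souffle", "codeql"]


def find_tools(names, which=shutil.which):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: which(name) for name in names}


def missing(names, which=shutil.which):
    """Return the tool names that could not be found on PATH."""
    return [name for name, path in find_tools(names, which).items() if path is None]


if __name__ == "__main__":
    for name in missing(REQUIRED):
        print(f"missing required tool: {name}")
    for name in missing(OPTIONAL):
        print(f"missing optional tool: {name}")
```

MongoDB is not checked because Meteor bundles it.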
# macOS/Linux
curl https://install.meteor.com/ | sh
# Windows
# Download installer from https://www.meteor.com/install

git clone https://github.com/yourusername/WhyFlow.git
cd WhyFlow

# Install root-level dependencies
npm install
# Install WhyFlow app dependencies
cd taint_debug_app/taint_debug
meteor npm install
cd ../..

cd taint_debug_app/taint_debug
meteor run

The application will be available at http://localhost:3000
Set these variables for custom configurations:
export PWD=/path/to/WhyFlow
export SOURCE_CODE_ROOT_DIR=/path/to/subject/program

- Access the Interface: Open http://localhost:3000 in your browser
- Select Query Type: Choose from templated why, why-not, or what-if questions
- Contextualize Query: Select specific sources, sinks, and third-party APIs from dropdowns
- View Results: Explore results in the graph view with color-coded annotations
- Iterate: Refine queries based on initial results for deeper investigation
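To make the three question types concrete, the sketch below models them on a toy dataflow graph. It is an illustration only and is not how WhyFlow answers queries (WhyFlow builds on CodeQL and Souffle Datalog). The function names, the graph encoding, and the node names (`source`, `lib_arg`, `lib_ret`, `sink`) are all hypothetical.

```python
from collections import deque


def find_path(edges, source, sink):
    """Breadth-first search; return a shortest path as a list of nodes, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == sink:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in edges.get(node, ()):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


def reachable(edges, start):
    """Return the set of nodes reachable from start, including start itself."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def reverse(edges):
    """Flip every edge so reachability can be computed backward from a sink."""
    rev = {}
    for src, dsts in edges.items():
        for dst in dsts:
            rev.setdefault(dst, []).append(src)
    return rev


def why(edges, source, sink):
    """Why does a flow exist? Return one witness path, or None."""
    return find_path(edges, source, sink)


def why_not(edges, source, sink):
    """Why is there no flow? Return (nodes reached from source, nodes reaching sink)."""
    if find_path(edges, source, sink) is not None:
        return None  # a flow exists, so there is nothing to explain
    forward = reachable(edges, source)
    backward = reachable(reverse(edges), sink)
    return sorted(forward), sorted(backward)


def what_if(edges, extra_edges, source, sink):
    """What if extra edges (e.g. a library model) existed? Re-run the why query."""
    trial = {node: list(dsts) for node, dsts in edges.items()}
    for src, dst in extra_edges:
        trial.setdefault(src, []).append(dst)
    return find_path(trial, source, sink)


# Toy graph: taint reaches a library call's argument, but the call is unmodeled,
# so nothing connects the argument to the return value.
GRAPH = {"source": ["lib_arg"], "lib_ret": ["sink"]}

if __name__ == "__main__":
    print(why(GRAPH, "source", "sink"))
    print(why_not(GRAPH, "source", "sink"))
    print(what_if(GRAPH, [("lib_arg", "lib_ret")], "source", "sink"))
```

In this toy graph, the why-not answer exposes the gap between `lib_arg` and `lib_ret`, and the what-if query checks whether modeling the library call as propagating taint would close it.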
The statistical_tests/ directory contains all user study data and analysis scripts:
cd statistical_tests
python3 statistical_tests.py

This will regenerate the statistical test results reported in the paper.
Generate plots from the user study data:
cd data
jupyter notebook plots.ipynb

Place Souffle Datalog query files in taint_debug_app/app_souffle_queries/
- Run CodeQL analysis on your target program
- Export results in JSON/CSV format
- Place results in Subject_Prog_CodeQL_Taint/
- Update paths in the Meteor application configuration
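The "place results" step can be scripted. The sketch below is one possible helper, not a script shipped with WhyFlow. It assumes your exported CodeQL results sit in a single directory (the `codeql_export` name is hypothetical) and copies only `.json` and `.csv` files into the target directory. Updating the paths in the Meteor application configuration remains a manual step.

```python
import shutil
from pathlib import Path

# File types mentioned in the steps above.
RESULT_SUFFIXES = {".json", ".csv"}


def stage_results(export_dir, target_dir):
    """Copy .json/.csv files from export_dir into target_dir; return the copied names."""
    export_dir, target_dir = Path(export_dir), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(export_dir.iterdir()):
        if path.is_file() and path.suffix.lower() in RESULT_SUFFIXES:
            shutil.copy2(path, target_dir / path.name)
            copied.append(path.name)
    return copied


if __name__ == "__main__":
    # "codeql_export" is a hypothetical location for your exported results.
    if Path("codeql_export").is_dir():
        print(stage_results("codeql_export", "Subject_Prog_CodeQL_Taint"))
```

Files with the same name in the target directory are overwritten, so back up existing results first if you want to keep them.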
Modify the Meteor application in taint_debug_app/taint_debug/:
- client/ - Frontend React components
- server/ - Backend API methods
- imports/ - Shared collections and utilities
This repository includes:
- ✅ User study questionnaire responses
- ✅ Statistical analysis scripts and results
- ✅ Subject program (Apache Dubbo) with CodeQL results
- ✅ Tutorial materials and task descriptions
- ✅ NASA-TLX and accuracy data
If you use WhyFlow in your research, please cite our paper:
@inproceedings{yetistiren2026whyflow,
title={WhyFlow: Interrogative Debugger for Sensemaking Taint Analysis},
author={Yetiştiren, Burak and Kang, Hong Jin and Kim, Miryung},
booktitle={Proceedings of the 48th International Conference on Software Engineering},
year={2026},
organization={ACM}
}

This project is licensed under the MIT License - see the LICENSE file for details.
For questions or issues, please:
- Open an issue on GitHub
- Contact: burak@cs.ucla.edu
This work is supported by the National Science Foundation under grant numbers 2426162, 2106838, and 2106404, with additional support from Amazon and Samsung.