47 changes: 4 additions & 43 deletions README.md
@@ -1,46 +1,7 @@
# querychat: Chat with your data in any language
# querychat <a href="https://posit-dev.github.io/querychat/"><img src="pkg-r/man/figures/logo.png" align="right" height="138" alt="querychat website" /></a>

querychat is a multilingual package that allows you to chat with your data using natural language queries. It's available for:
QueryChat facilitates safe and reliable natural language exploration of tabular data, powered by SQL and large language models (LLMs).

- [R - Shiny](pkg-r/README.md)
- [Python - Shiny for Python](pkg-py/README.md)
To get started, see the [official website](https://posit-dev.github.io/querychat/).

## Overview

Imagine typing questions like these directly into your dashboard, and seeing the results in real time:

* "Show only penguins that are not species Gentoo and have a bill length greater than 50mm."
* "Show only blue states with an incidence rate greater than 100 per 100,000 people."
* "What is the average mpg of cars with 6 cylinders?"

querychat is a drop-in component for Shiny that allows users to query a data frame using natural language. The results are available as a reactive data frame, so they can be easily used from Shiny outputs, reactive expressions, downloads, etc.

| ![Animation of a dashboard being filtered by a chatbot in the sidebar](animation.gif) |
|-|

[Live demo](https://jcheng.shinyapps.io/sidebot/)

**This is not as terrible an idea as you might think!** We need to be very careful when bringing LLMs into data analysis, as we all know that they are prone to hallucinations and other classes of errors. querychat is designed to excel in reliability, transparency, and reproducibility by using this one technique: denying it raw access to the data, and forcing it to write SQL queries instead.

## How it works

### Powered by LLMs

querychat's natural language chat experience is powered by LLMs (like GPT-4o, Claude 3.5 Sonnet, etc.) that support function/tool calling capabilities.

### Powered by SQL

querychat doesn't send the raw data to the LLM, asking it to guess summary statistics. Instead, the LLM generates precise SQL queries to filter the data or directly calculate statistics. This is crucial for ensuring reliability, transparency, and reproducibility:

- **Reliability:** Today's LLMs are excellent at writing SQL, but bad at direct calculation.
- **Transparency:** querychat always displays the SQL to the user, so it can be vetted instead of blindly trusted.
- **Reproducibility:** The SQL query can be easily copied and reused.

Currently, querychat uses DuckDB for its SQL engine when working with data frames. For database sources, it uses the native SQL dialect of the connected database.
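
The idea behind "Powered by SQL" can be illustrated with a minimal sketch. This is not querychat's implementation: it uses Python's built-in `sqlite3` as a stand-in for DuckDB so that it runs without extra dependencies, the `cars` table and its rows are made up for illustration, and the SQL string is hard-coded where an LLM-generated query would normally go.

```python
import sqlite3

# Hypothetical stand-in data: a few rows shaped like a cars table.
rows = [
    ("Mazda RX4", 21.0, 6),
    ("Datsun 710", 22.8, 4),
    ("Hornet 4 Drive", 21.4, 6),
    ("Valiant", 18.1, 6),
    ("Duster 360", 14.3, 8),
]

# sqlite3 is used here only as a stand-in SQL engine (assumption, not DuckDB).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (model TEXT, mpg REAL, cyl INTEGER)")
conn.executemany("INSERT INTO cars VALUES (?, ?, ?)", rows)

# The kind of SQL a model might write for the question
# "What is the average mpg of cars with 6 cylinders?" (hard-coded here).
generated_sql = "SELECT AVG(mpg) AS avg_mpg FROM cars WHERE cyl = 6"

# The SQL engine, not the model, performs the calculation.
avg_mpg = conn.execute(generated_sql).fetchone()[0]

# Showing the query next to the result is what makes the answer verifiable.
print(generated_sql)
print(round(avg_mpg, 2))
```

Because the model only ever produces the query text, the number shown to the user comes from the SQL engine, and the same query can be copied and rerun to reproduce it.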

## Language-specific Documentation

For detailed information on how to use querychat in your preferred language, see the language-specific READMEs:

- [R Documentation](pkg-r/README.md)
- [Python Documentation](pkg-py/README.md)
Or, see the READMEs for [R](pkg-r/README.md) and [Python](pkg-py/README.md).
2 changes: 1 addition & 1 deletion docs/index.html
@@ -224,7 +224,7 @@
<h1 class="package-title">querychat</h1>
<p class="package-subtitle">Chat with your data in any language</p>
<p class="package-description">
A drop-in component for Shiny that allows you to chat with your data using natural language queries.
querychat facilitates safe and reliable natural language exploration of tabular data, powered by SQL and large language models (LLMs).
Available for both R and Python.
</p>
<img src="animation.gif"
46 changes: 38 additions & 8 deletions pkg-py/README.md
@@ -1,20 +1,50 @@
# querychat for Python
# querychat <a href="https://posit-dev.github.io/querychat/py/"><img src="https://posit-dev.github.io/querychat/images/querychat.png" align="right" height="138" alt="querychat website" /></a>

Please see [the package documentation site](https://posit-dev.github.io/querychat/py/index.html) for installation, setup, and usage.
<p>
<!-- badges start -->
<a href="https://pypi.org/project/querychat/"><img alt="PyPI" src="https://img.shields.io/pypi/v/querychat?logo=python&logoColor=white&color=orange"></a>
<a href="https://choosealicense.com/licenses/mit/"><img src="https://img.shields.io/badge/License-MIT-blue.svg" alt="MIT License"></a>
<a href="https://pypi.org/project/querychat"><img src="https://img.shields.io/pypi/pyversions/querychat.svg" alt="versions"></a>
<a href="https://github.com/posit-dev/querychat"><img src="https://github.com/posit-dev/querychat/actions/workflows/test.yml/badge.svg?branch=main" alt="Python Tests"></a>
<!-- badges end -->
</p>

If you are looking for querychat Python examples,
you can find them in the `examples/` directory.

QueryChat facilitates safe and reliable natural language exploration of tabular data, powered by SQL and large language models (LLMs). For analysts, it offers an intuitive web application where they can quickly ask questions of their data and receive verifiable data-driven answers. For software developers, QueryChat provides a comprehensive Python API to access core functionality -- including chat UI, generated SQL statements, resulting data, and more. This capability enables the seamless integration of natural language querying into bespoke data applications.

## Installation

You can install the package from PyPI using pip:
Install the latest stable release [from PyPI](https://pypi.org/project/querychat/):

```bash
pip install querychat
```

Or you can install querychat directly from GitHub:
## Quick start

```bash
pip install "querychat @ git+https://github.com/posit-dev/querychat"
The main entry point is the [`QueryChat` class](https://posit-dev.github.io/querychat/py/reference/QueryChat.html). It requires a [data source](https://posit-dev.github.io/querychat/py/data-sources.html) (e.g., a pandas or polars data frame) and a name for the data.

```python
from querychat import QueryChat
from querychat.data import titanic

qc = QueryChat(titanic(), "titanic")
app = qc.app()
# app.run()
```

<p align="center">
<img src="docs/images/quickstart.png" alt="QueryChat interface showing natural language queries" width="85%">
</p>

## Custom apps

Build your own custom web apps with natural language querying capabilities, such as [this one](https://github.com/posit-conf-2025/llm/blob/main/_solutions/25_querychat/25_querychat_02-end-app.R) which provides a bespoke interface for exploring Airbnb listings:

<p align="center">
<img src="docs/images/airbnb.png" alt="A custom app for exploring Airbnb listings, powered by QueryChat." width="85%">
</p>

## Learn more

See the [website](https://posit-dev.github.io/querychat/py) to learn more.
31 changes: 31 additions & 0 deletions pkg-py/docs/_examples/multiple-datasets.py
@@ -0,0 +1,31 @@
from querychat.data import titanic
from querychat.express import QueryChat
from seaborn import load_dataset
from shiny.express import render, ui

penguins = load_dataset("penguins")

qc_titanic = QueryChat(titanic(), "titanic")
qc_penguins = QueryChat(penguins, "penguins")

with ui.sidebar():
    with ui.panel_conditional("input.navbar == 'Titanic'"):
        qc_titanic.ui()
    with ui.panel_conditional("input.navbar == 'Penguins'"):
        qc_penguins.ui()

with ui.nav_panel("Titanic"):
    @render.data_frame
    def titanic_table():
        return qc_titanic.df()

with ui.nav_panel("Penguins"):
    @render.data_frame
    def penguins_table():
        return qc_penguins.df()

ui.page_opts(
    id="navbar",
    title="Multiple Datasets with querychat",
    fillable=True,
)
83 changes: 83 additions & 0 deletions pkg-py/docs/_examples/titanic-dashboard.py
@@ -0,0 +1,83 @@
import plotly.express as px
from faicons import icon_svg
from querychat.data import titanic
from querychat.express import QueryChat
from shiny.express import render, ui
from shinywidgets import render_plotly

qc = QueryChat(titanic(), "titanic")
qc.sidebar()

with ui.layout_column_wrap(fill=False):
    with ui.value_box(showcase=icon_svg("users")):
        "Passengers"

        @render.text
        def count():
            return str(len(qc.df()))

    with ui.value_box(showcase=icon_svg("heart")):
        "Survival Rate"

        @render.text
        def survival():
            rate = qc.df()["survived"].mean() * 100
            return f"{rate:.1f}%"

    with ui.value_box(showcase=icon_svg("coins")):
        "Avg Fare"

        @render.text
        def fare():
            avg = qc.df()["fare"].mean()
            return f"${avg:.2f}"

with ui.layout_columns():
    with ui.card():
        with ui.card_header():
            "Data Table"

            @render.text
            def table_title():
                return f" - {qc.title()}" if qc.title() else ""

        @render.data_frame
        def data_table():
            return qc.df()

    with ui.card():
        ui.card_header("Survival by Class")

        @render_plotly
        def survival_by_class():
            df = qc.df()
            summary = df.groupby("pclass")["survived"].mean().reset_index()
            return px.bar(
                summary,
                x="pclass",
                y="survived",
                labels={"pclass": "Class", "survived": "Survival Rate"},
            )

with ui.layout_columns():
    with ui.card():
        ui.card_header("Age Distribution")

        @render_plotly
        def age_dist():
            df = qc.df()
            return px.histogram(df, x="age", nbins=30)

    with ui.card():
        ui.card_header("Fare by Class")

        @render_plotly
        def fare_by_class():
            df = qc.df()
            return px.box(df, x="pclass", y="fare", color="survived")

ui.page_opts(
    title="Titanic Survival Analysis",
    fillable=True,
    class_="bslib-page-dashboard",
)
2 changes: 1 addition & 1 deletion pkg-py/docs/_quarto.yml
@@ -72,7 +72,7 @@ quartodoc:
  sidebar: reference/_sidebar.yml
  css: reference/_styles-quartodoc.css
  sections:
    - title: The Querychat class
    - title: The QueryChat class
      desc: The starting point for any QueryChat session
      contents:
        - name: QueryChat