Curious Frame is a Python application designed for the Gemma3n Impact Challenge. It is an interactive, offline-ready learning experience for curious kids, with multi-language support. The application uses a camera to identify objects and "action cards" presented within a physical frame, then uses the Gemma3n model to provide educational information about the objects through speech.
Note
The latest version can use ministral-3 as a single model for both snapshot analysis and object description. Moreover, it can reply directly in French, which reduces latency by keeping all inference on the GPU and avoiding an additional call to translate to French.
To use that model, you need to change the command line for curious_frame in docker-compose.yaml to
python3 -m curious_frame --capture-dir /app/snapshots --multilanguage --vlm-model "ministral-3:3b" --llm-model "ministral-3:3b"
And don't forget to pull the model with ollama prior to executing the code: ollama pull ministral-3:3b.
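For illustration, the override in docker-compose.yaml might look roughly like this (the service name and surrounding keys are assumptions based on a typical compose file; only the command itself comes from above):

```yaml
services:
  curious-frame:
    # ... image, devices, and volumes as in the original file ...
    command: >
      python3 -m curious_frame --capture-dir /app/snapshots
      --multilanguage --vlm-model "ministral-3:3b" --llm-model "ministral-3:3b"
```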
This was demonstrated on December 10th, 2025 at the Python Rennes Meetup.
To run this project, you'll need an NVIDIA® Jetson Orin Nano™ Developer Kit with JetPack 6.2.1 installed; see the Initial Setup Guide.
Then set up the NVMe SSD (strongly recommended) as explained in that Jetson AI Lab tutorial. Also optimize RAM usage by disabling the desktop GUI and moving swap to the NVMe SSD; see that tutorial.
With that setup, it is highly recommended to connect to the Jetson Nano over SSH rather than with a screen and keyboard.
You will need a cardboard frame. The one I made has the following dimensions:
The action cards can be found in ./assets/illustrations.odt; print them on a standard paper printer.
Get the code and set up the Nano by running the following commands:
git clone https://github.com/webscit/curious-frame.git
cd curious-frame
bash ./setup.sh
From now on, the application starts automatically when the Jetson Nano boots. The application will shut down after 10 minutes if no new objects are detected.
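The 10-minute idle shutdown can be sketched as follows (a minimal illustration, not the actual implementation; the class and method names are hypothetical):

```python
import time


class IdleWatchdog:
    """Tracks the last time the detected objects changed and reports
    when an idle timeout has elapsed (hypothetical sketch)."""

    def __init__(self, timeout_s=600, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_change = clock()
        self.last_objects = None

    def update(self, objects):
        # Reset the timer whenever the set of detected objects changes
        objects = frozenset(objects)
        if objects != self.last_objects:
            self.last_objects = objects
            self.last_change = self.clock()

    def should_shutdown(self):
        return self.clock() - self.last_change >= self.timeout_s
```

Injecting the clock makes the timeout logic easy to test without waiting ten real minutes.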
Here are some useful commands when you are logged in to the Jetson Nano:
# Stop the service
sudo systemctl stop curious_frame.service
# Start the service
sudo systemctl start curious_frame.service
# Display the service log
journalctl -u curious_frame.service -f
What got installed:
- ollama on the host machine, installed to start automatically with the Jetson Nano
- a custom service curious_frame to start the containers automatically with the Jetson Nano
That's it! Everything else is within Docker containers.
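As an illustration, the installed unit might look roughly like this (paths and options are assumptions; the actual file is written by setup.sh, and the ExecStart path matches the script removed during uninstall below):

```ini
# /etc/systemd/system/curious_frame.service (illustrative sketch)
[Unit]
Description=Curious Frame containers
After=docker.service
Requires=docker.service

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/local/bin/curious_frame.sh

[Install]
WantedBy=multi-user.target
```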
The containers use the host network. Be careful: this configuration is for development only, as it is missing security features.
- Uninstall the curious_frame service by executing the commands:
sudo systemctl stop curious_frame
sudo systemctl disable curious_frame
sudo rm /etc/systemd/system/curious_frame.service
sudo rm /usr/local/bin/curious_frame.sh
- To uninstall ollama, see the official documentation.
- My camera is not recognized.
This is one of the trickiest parts of this project. There is no easy answer; it depends on whether you are using a USB or a CSI camera.
In any case, you can look at the code in src/curious_frame/camera.py and try to adapt it for your case.
Note that you can change the camera id, resolution, and frame rate through the command line; see src/curious_frame/main.py.
The command line for curious-frame can then be updated in the docker-compose.yaml file. Then execute the following commands:
sudo systemctl stop curious_frame
docker compose build
sudo systemctl start curious_frame
- No sound is emitted.
Sound is played using the ALSA tool aplay, which defaults to the sysdefault device. You can change that with the command-line argument --audio-device <device>. The list of devices can be obtained by executing aplay -L (you may need to install it with sudo apt install alsa-utils).
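For example, a playback helper might assemble the aplay invocation like this (a sketch; the function names are hypothetical, but aplay's -D flag for selecting the output device is standard):

```python
import subprocess


def build_aplay_cmd(wav_path, device="sysdefault"):
    # aplay -D <device> <file>; list candidate device names with `aplay -L`
    return ["aplay", "-D", device, wav_path]


def play(wav_path, device="sysdefault"):
    # Blocks until playback finishes; raises on a non-zero exit code
    subprocess.run(build_aplay_cmd(wav_path, device), check=True)
```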
- The disk is full.
The Docker Compose configuration mounts the local folders snapshots and audio_cache from the cloned repository. The first stores the snapshots taken by the application together with a CSV file of the VLM and LLM responses. The second caches the WAV files generated by piper. If you are running out of space, you can safely delete their contents.
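Caching the WAV files lets repeated phrases skip re-synthesis. One plausible way to key such a cache (hypothetical; the actual scheme may differ):

```python
import hashlib


def cache_key(text, voice="fr_FR"):
    """Derive a stable filename for a cached WAV file from the spoken
    text and voice (hypothetical sketch, not the real implementation)."""
    digest = hashlib.sha256(f"{voice}:{text}".encode("utf-8")).hexdigest()
    return f"{digest}.wav"
```

Hashing both the voice and the text ensures that the same sentence synthesized in two languages never collides in the cache.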
The application is composed of three main services orchestrated by Docker Compose:
- curious-frame: The main application service, written in Python. It captures video from the camera, processes the images to detect objects and action cards, and interacts with the other services to generate a response.
- ollama: This service runs the Gemma3n large language model. The curious-frame service sends requests to this service to get information about the objects identified in the camera stream, and to translate the text into the appropriate language if needed.
- piper: This service is responsible for text-to-speech (TTS). The curious-frame service sends the text generated by the ollama service to piper to be converted into audio, which is then played back to the user.
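The curious-frame → ollama exchange goes over Ollama's HTTP API (/api/generate). A minimal sketch of such a request; the prompt wording, helper names, and model tag are illustrative, not taken from the actual code:

```python
import json
import urllib.request


def build_payload(objects, model="gemma3n:e2b"):
    # Ollama's /api/generate endpoint takes a model name and a prompt;
    # stream=False returns the full answer in a single JSON document.
    prompt = "Explain to a curious child what these are: " + ", ".join(objects)
    return {"model": model, "prompt": prompt, "stream": False}


def describe(objects, host="http://localhost:11434", model="gemma3n:e2b"):
    req = urllib.request.Request(
        host + "/api/generate",
        data=json.dumps(build_payload(objects, model)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # The generated text is in the "response" field
        return json.loads(resp.read())["response"]
```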
Here is a diagram of the architecture:
graph TD
subgraph Jetson Orin Nano
subgraph Docker Compose
A[curious-frame] -->|Sends text for TTS| C(piper)
C -->|Returns audio| A
end
A -->|Sends text prompt| B(ollama - gemma3n)
B -->|Returns text response| A
end
D(Camera) --> A
A --> E(Speakers)
Here is the sequence diagram:
sequenceDiagram
loop exit if objects do not change for 10 min
curious-frame->>+Camera: Request snapshot
Camera-->>-curious-frame:
curious-frame->>+curious-frame: Search for objects
curious-frame->>+curious-frame: Switch to French if flag found
curious-frame->>+ollama Gemma3n: Request a description of the objects
ollama Gemma3n-->>-curious-frame: Get description
opt If French is used
curious-frame->>+ollama Gemma3n: Request translation
ollama Gemma3n-->>-curious-frame:
end
curious-frame->>+piper: Generate phonemes
piper-->>-curious-frame:
curious-frame->>Speakers: Play sound
end
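The loop above can be sketched as follows (stubs stand in for the real camera, model, and audio calls; all names are illustrative, not the actual implementation):

```python
def run_loop(camera, detect, describe, translate, tts, play, watchdog):
    """One illustrative pass structure for the sequence diagram above."""
    language = "en"
    while not watchdog.expired():
        frame = camera()                   # Request snapshot
        objects, cards = detect(frame)     # Search for objects and action cards
        if "french_flag" in cards:
            language = "fr"                # Switch to French if flag found
        if not objects:
            continue
        watchdog.reset()                   # New objects: postpone shutdown
        text = describe(objects)           # ollama: describe the objects
        if language == "fr":
            text = translate(text)         # ollama: translate the description
        wav = tts(text)                    # piper: synthesize speech
        play(wav)                          # Speakers: play the sound
```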
