AudioMarkGenerator is an opinionated desktop Python tool that builds a per-book SQLite search database from an EPUB3 file with SMIL media overlays.
The generated database is designed specifically for use with: Audio Mark (Android Viewer)
Tested With Storyteller EPUBs
AudioMarkGenerator is developed and tested primarily against EPUB3 files with SMIL media overlays generated using the Storyteller application from GitLab.
If you are generating your EPUB + audio alignment using Storyteller, AudioMarkGenerator should work as expected.
Other EPUB3 + SMIL workflows may work, but Storyteller-generated EPUBs are the reference implementation used during development.
This repository contains only the database generator.
It does not include:
- Any Android application
- Any audio player
- Any EPUB editor
- Any alignment tool
To use the generated database, you must install: Audio Mark (Android Viewer)
Requirements:
- Python 3.11+
- A valid EPUB3 file with SMIL media overlays
git clone https://github.com/ujwalnk/AudioMarkGenerator.git
cd AudioMarkGenerator
python3 -m venv ./.virt
Activate it (macOS/Linux):
source ./.virt/bin/activate
Activate it (Windows):
.\.virt\Scripts\activate
pip install -r requirements.txt
python3 build_index.py path/to/book.epub
Example:
python3 build_index.py ./TheBook.epub
A SQLite database will be created next to the EPUB file:
<BookTitle>.db
If the EPUB contains a <dc:title>, that will be used as the filename.
Otherwise, the EPUB filename is used.
You can now import this .db file into Audio Mark.
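The naming rule above can be sketched roughly as follows. This is a minimal illustration, not the code in build_index.py: the function name `db_path_for` is hypothetical, and replacing filename-unsafe characters with `_` is an assumption that the README does not state.

```python
import re
import xml.etree.ElementTree as ET
from pathlib import Path

DC_NS = "http://purl.org/dc/elements/1.1/"


def db_path_for(epub_path: str, opf_xml: str) -> Path:
    """Return the .db path next to the EPUB, named after <dc:title> if present."""
    epub = Path(epub_path)
    root = ET.fromstring(opf_xml)
    title_el = root.find(f".//{{{DC_NS}}}title")
    title = (title_el.text or "").strip() if title_el is not None else ""
    # Assumption: characters that are unsafe in filenames become "_".
    safe = re.sub(r'[\\/:*?"<>|]', "_", title).strip()
    # Fall back to the EPUB filename (without extension) when there is no title.
    name = safe or epub.stem
    return epub.with_name(f"{name}.db")
```

Here `opf_xml` is the text of the OPF package document, which the caller is assumed to have already read from the EPUB archive.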
Given an EPUB3 file with SMIL media overlays, AudioMarkGenerator:
- Extracts metadata (title, author)
- Reads the spine (reading order)
- Parses the TOC (toc.ncx or nav.xhtml)
- Resolves every SMIL <par> entry
- Extracts:
  - 📖 Chapter title
  - 🎧 Audio file path
  - ⏱ Timestamp (HH:MM:SS)
  - 📝 Exact inner text of the referenced HTML fragment
- Writes everything into a single SQLite .db file
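To illustrate the "resolve every SMIL <par> entry" step, here is a small sketch using only the standard library. The function name `iter_par_entries` is hypothetical and this is not necessarily how AudioMarkGenerator does it; it only shows which pieces of a `<par>` feed the later steps (the HTML document and fragment ID, the audio file, and the raw `clipBegin` value).

```python
import xml.etree.ElementTree as ET

# Namespace used by EPUB3 media overlay (SMIL) documents.
SMIL_NS = "http://www.w3.org/ns/SMIL"


def iter_par_entries(smil_xml: str):
    """Yield (html_doc, fragment_id, audio_src, clip_begin) for each <par>."""
    root = ET.fromstring(smil_xml)
    for par in root.iter(f"{{{SMIL_NS}}}par"):
        text = par.find(f"{{{SMIL_NS}}}text")
        audio = par.find(f"{{{SMIL_NS}}}audio")
        if text is None or audio is None:
            raise ValueError("<par> is missing its <text> or <audio> child")
        # text/@src looks like "chapter1.xhtml#sentence12".
        html_doc, _, fragment_id = text.get("src", "").partition("#")
        if not fragment_id:
            raise ValueError("<text> src has no fragment ID")
        # Assumption: a missing clipBegin is treated as the start of the clip.
        yield html_doc, fragment_id, audio.get("src", ""), audio.get("clipBegin", "0")
```

The paths yielded here are relative to the SMIL file; resolving them against the OPF location is left out of this sketch.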
Each generated .db file represents exactly one book.
The EPUB must contain:
- META-INF/container.xml
- A valid content.opf
- SMIL media overlays declared in the OPF manifest
- HTML/XHTML files with proper fragment IDs referenced by SMIL
- A TOC (toc.ncx or nav.xhtml) for chapter title extraction
If these are missing or malformed, the generator will abort with a clear error.
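A pre-flight check along these lines might look like the sketch below. It covers only the first three requirements (container, OPF, SMIL overlays in the manifest). The function name `check_epub` and the error messages are hypothetical, not taken from the tool.

```python
import zipfile
import xml.etree.ElementTree as ET

CONTAINER_NS = "urn:oasis:names:tc:opendocument:xmlns:container"
OPF_NS = "http://www.idpf.org/2007/opf"


def check_epub(zf: zipfile.ZipFile) -> str:
    """Return the OPF path inside the archive, or raise ValueError."""
    names = set(zf.namelist())
    if "META-INF/container.xml" not in names:
        raise ValueError("META-INF/container.xml is missing")
    # container.xml points at the OPF package document via rootfile/@full-path.
    container = ET.fromstring(zf.read("META-INF/container.xml"))
    rootfile = container.find(f".//{{{CONTAINER_NS}}}rootfile")
    opf_path = rootfile.get("full-path") if rootfile is not None else None
    if not opf_path or opf_path not in names:
        raise ValueError("OPF package document not found")
    # Media overlays appear in the manifest as application/smil+xml items.
    opf = ET.fromstring(zf.read(opf_path))
    items = opf.findall(f".//{{{OPF_NS}}}manifest/{{{OPF_NS}}}item")
    if not any(i.get("media-type") == "application/smil+xml" for i in items):
        raise ValueError("no SMIL media overlays declared in the OPF manifest")
    return opf_path
```

Malformed XML surfaces as `xml.etree.ElementTree.ParseError` in this sketch; a real tool would likely wrap that in its own error message.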
Each row in the generated database corresponds to exactly one SMIL <par> entry.
Search in the Android app uses exact substring matching:
SELECT chapter_title, audio_file, timestamp
FROM paragraphs
WHERE text LIKE '%<selected_text>%'
No normalization. No lowercasing. No fuzzy matching.
The extracted HTML text is stored exactly as-is.
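For experimenting with a generated database from the desktop, the same lookup can be run with Python's built-in sqlite3 module. This is a sketch: `find_passages` is a hypothetical helper, and the Android app's actual query code may differ. The selected text is passed as a bound parameter rather than pasted into the SQL string.

```python
import sqlite3


def find_passages(conn: sqlite3.Connection, selected_text: str):
    """Return (chapter_title, audio_file, timestamp) rows containing the text."""
    # Note: "%" and "_" inside selected_text still act as LIKE wildcards here.
    query = (
        "SELECT chapter_title, audio_file, timestamp "
        "FROM paragraphs "
        "WHERE text LIKE '%' || ? || '%'"
    )
    return conn.execute(query, (selected_text,)).fetchall()
```

Usage: `find_passages(sqlite3.connect("TheBook.db"), "some selected sentence")`, where the filename is just an example.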
Each book uses one SQLite file containing:
CREATE TABLE metadata (
id INTEGER PRIMARY KEY,
title TEXT,
author TEXT
);
CREATE TABLE paragraphs (
id INTEGER PRIMARY KEY,
chapter_title TEXT NOT NULL,
audio_file TEXT NOT NULL,
timestamp TEXT NOT NULL,
text TEXT NOT NULL
);
An index on text supports search queries.
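As a rough illustration of how a database with this schema could be produced, the sketch below creates both tables, adds an index on `text`, and inserts rows in one transaction. The index name `idx_paragraphs_text` and the helper `create_book_db` are assumptions for illustration; the README does not name them.

```python
import sqlite3

SCHEMA = """
CREATE TABLE metadata (
    id INTEGER PRIMARY KEY,
    title TEXT,
    author TEXT
);
CREATE TABLE paragraphs (
    id INTEGER PRIMARY KEY,
    chapter_title TEXT NOT NULL,
    audio_file TEXT NOT NULL,
    timestamp TEXT NOT NULL,
    text TEXT NOT NULL
);
-- Hypothetical index name; the README only says an index on text exists.
CREATE INDEX idx_paragraphs_text ON paragraphs(text);
"""


def create_book_db(path, title, author, rows):
    """Create a one-book database; rows are (chapter, audio, timestamp, text)."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    with conn:  # commit on success, roll back on error
        conn.execute(
            "INSERT INTO metadata (title, author) VALUES (?, ?)", (title, author)
        )
        conn.executemany(
            "INSERT INTO paragraphs (chapter_title, audio_file, timestamp, text) "
            "VALUES (?, ?, ?, ?)",
            rows,
        )
    return conn
```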
The following SMIL clipBegin formats are supported:
- 12.34s
- 75s
- HH:MM:SS(.fff)
- MM:SS(.fff)
- Plain seconds
All timestamps are stored as:
HH:MM:SS
Fractional seconds are dropped.
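The conversion described above can be sketched as a single function. `clip_to_hms` is a hypothetical name, and this sketch assumes "dropped" means truncation rather than rounding.

```python
def clip_to_hms(value: str) -> str:
    """Convert a SMIL clipBegin value to HH:MM:SS, dropping fractions."""
    value = value.strip()
    if ":" in value:
        parts = value.split(":")
        if len(parts) == 3:      # HH:MM:SS(.fff)
            hours, minutes, seconds = parts
        elif len(parts) == 2:    # MM:SS(.fff)
            hours, (minutes, seconds) = "0", parts
        else:
            raise ValueError(f"unsupported clipBegin: {value!r}")
        total = int(hours) * 3600 + int(minutes) * 60 + int(float(seconds))
    else:
        # "12.34s", "75s", or plain seconds such as "12.34".
        number = value[:-1] if value.endswith("s") else value
        total = int(float(number))
    if total < 0:
        raise ValueError(f"negative clipBegin: {value!r}")
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"
```

For example, `clip_to_hms("75s")` gives `"00:01:15"` and `clip_to_hms("01:02:03.456")` gives `"01:02:03"`.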
- Strict requirement: SMIL-based EPUBs only.
- No partial builds.
- No silent fallbacks.
- No text normalization.
- No ranking.
- One SQLite database per book.
- Built specifically for integration with Audio Mark.
This tool is intentionally opinionated and optimized for self-hosted audiobook workflows.
- Align audiobook + EPUB using Storyteller.
- Export EPUB with SMIL overlays.
- Run AudioMarkGenerator on the EPUB.
- Import the generated .db into Audio Mark.
- Select text → jump to audiobook timestamp.
- ✅ macOS
- ✅ Linux
- ✅ Windows (Python 3.11+)
You are free to use, modify, and distribute this software under the GNU GPLv3 license.