UDAV lets different disciplines display the results of their automatic pre-processing in a schema-based, reproducible, dynamic, and interactive way, without having to hard-code manual, user-defined visualizations for each new project.
- Dynamic and interactive charts
- Visual editor
- Different export options
Tip
Please consult the documentation page for more detailed and customizable setup instructions.
- Java version 21 or higher
- Clone the repository:
    git clone https://github.com/texttechnologylab/Unified-Dynamic-Annotation-Visualizer.git
- In the root folder, create an .env file that holds the following environment variables:
    DB_URL=jdbc:postgresql://postgres:5432/udav
    DB_USER=postgres
    DB_PASS=postgres
    DB_SCHEMA=public
    DB_DIALECT=POSTGRES
    # Batch size for database inserts (default: 5000)
    # Higher = fewer DB roundtrips, more memory. Range: 1000-15000
    DB_BATCH_SIZE=5000
    # Max identifier length (PostgreSQL: 63, MySQL: 64, MSSQL: 128)
    DB_MAX_IDENT=255
    # Enable/disable DUUI importer
    DUUI_IMPORTER=false
    # Path to input files
    DUUI_IMPORTER_PATH=/app/data/input
    # File extension: .xmi (uncompressed) or .gz (gzip compressed)
    DUUI_IMPORTER_FILE_ENDING=.xmi
    # Number of parallel workers (default: 4, rule: 1 per CPU core)
    DUUI_IMPORTER_WORKERS=4
    # UIMA CAS pool size (default: 2×workers)
    DUUI_IMPORTER_CAS_POOL_SIZE=8
    # Optional: External TypeSystem XML file path (auto-detected from XMI if not set)
    DUUI_IMPORTER_TYPE_SYSTEM_PATH=
    PIPELINE_IMPORTER=true
    PIPELINE_IMPORTER_FOLDER=/app/data/pipelines
    PIPELINE_IMPORTER_REPLACE_IF_DIFFERENT=false
    SROUCE_BUILDER=false
    JAVA_OPTS=-Xmx2048m -Xms1024m
- Run the File Importer to import the annotation data
- Start the App.java file
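The comments next to the environment variables state a few constraints: a batch-size range of 1000-15000, per-database identifier limits, two supported file endings, and a CAS pool default of twice the worker count. As a minimal sketch, assuming these comments are the only rules that apply, a stand-alone Java class could check an .env file before the application is started. The class `EnvSanityCheck` and all of its methods are hypothetical and not part of UDAV; the fallback values used for missing keys are assumptions taken from the sample values, and how UDAV itself treats out-of-range settings is not stated here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of UDAV: sanity-checks values from an .env file. */
final class EnvSanityCheck {

    /** Parses KEY=VALUE lines; blank lines and lines starting with '#' are skipped. */
    static Map<String, String> parse(String text) {
        Map<String, String> env = new LinkedHashMap<>();
        for (String raw : text.split("\\R")) {
            String line = raw.strip();
            if (line.isEmpty() || line.startsWith("#")) continue;
            int eq = line.indexOf('=');
            if (eq <= 0) continue; // not a KEY=VALUE line
            env.put(line.substring(0, eq).strip(), line.substring(eq + 1).strip());
        }
        return env;
    }

    /** Returns the configured CAS pool size, or 2 x workers when it is unset or empty. */
    static int effectiveCasPoolSize(Map<String, String> env) {
        int workers = intOr(env, "DUUI_IMPORTER_WORKERS", 4);
        return intOr(env, "DUUI_IMPORTER_CAS_POOL_SIZE", 2 * workers);
    }

    /** Collects one warning per value that contradicts the comments in the listing. */
    static List<String> check(Map<String, String> env) {
        List<String> warnings = new ArrayList<>();

        int batch = intOr(env, "DB_BATCH_SIZE", 5000);
        if (batch < 1000 || batch > 15000) {
            warnings.add("DB_BATCH_SIZE=" + batch + " is outside the range 1000-15000");
        }

        // Only the dialect name POSTGRES appears in the listing; its listed limit is 63.
        // Falling back to 63 for a missing DB_MAX_IDENT is an assumption of this sketch.
        if ("POSTGRES".equals(env.get("DB_DIALECT"))) {
            int maxIdent = intOr(env, "DB_MAX_IDENT", 63);
            if (maxIdent > 63) {
                warnings.add("DB_MAX_IDENT=" + maxIdent + " exceeds the 63 listed for PostgreSQL");
            }
        }

        // Falling back to .xmi for a missing file ending is an assumption of this sketch.
        String ending = env.getOrDefault("DUUI_IMPORTER_FILE_ENDING", ".xmi");
        if (!ending.equals(".xmi") && !ending.equals(".gz")) {
            warnings.add("DUUI_IMPORTER_FILE_ENDING=" + ending + " is neither .xmi nor .gz");
        }
        return warnings;
    }

    /** Reads an integer; missing or empty values yield the fallback. Throws on non-numeric text. */
    private static int intOr(Map<String, String> env, String key, int fallback) {
        String value = env.get(key);
        return (value == null || value.isBlank()) ? fallback : Integer.parseInt(value);
    }

    public static void main(String[] args) throws IOException {
        Path file = Path.of(args.length > 0 ? args[0] : ".env");
        Map<String, String> env = parse(Files.readString(file));
        List<String> warnings = check(env);
        if (warnings.isEmpty()) {
            System.out.println("No issues found in " + file);
        } else {
            warnings.forEach(System.out::println);
        }
        System.out.println("Effective CAS pool size: " + effectiveCasPoolSize(env));
    }
}
```

Run against the sample values listed above, this sketch would flag `DB_MAX_IDENT=255` as larger than the 63 given for PostgreSQL. Whether that matters in practice depends on how UDAV applies the setting.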
Note
By default, the web page is reachable at http://localhost:8080. If you are looking for a small demo and do not want to set one up yourself, please check our open demo.
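For scripted setups, it can help to confirm that the application answers at the default address before opening a browser. The following is a minimal sketch using the JDK's built-in HTTP client. The class `ReachabilityCheck` is hypothetical and not part of UDAV, and treating any status below 500 as "reachable" is an assumption of this sketch.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/** Hypothetical helper, not part of UDAV: checks whether a URL answers over HTTP. */
final class ReachabilityCheck {

    /** Returns true if the URL answers with a status below 500 within the timeout. */
    static boolean isReachable(String url, Duration timeout) {
        HttpClient client = HttpClient.newBuilder().connectTimeout(timeout).build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .timeout(timeout)
                .GET()
                .build();
        try {
            HttpResponse<Void> response =
                    client.send(request, HttpResponse.BodyHandlers.discarding());
            return response.statusCode() < 500;
        } catch (IOException e) {
            return false; // connection refused, timeout, or another I/O failure
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }

    public static void main(String[] args) {
        // Default address from the note above; pass another URL as the first argument.
        String url = args.length > 0 ? args[0] : "http://localhost:8080";
        boolean up = isReachable(url, Duration.ofSeconds(3));
        System.out.println(url + (up ? " is reachable" : " is not reachable"));
    }
}
```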
This project is published under the AGPL-3.0 license.
If you use this project, please cite it as follows:
Thiemo Dahmann, Julian Schneider, Philipp Stephan, Giuseppe Abrami and Alexander Mehler. 2026. "Towards the Generation and Application of Dynamic Web-Based Visualization of UIMA-based Annotations for Big-Data Corpora with the Help of Unified Dynamic Annotation Visualizer". Proceedings of the 15th International Conference on Language Resources and Evaluation (LREC 2026). accepted.
@inproceedings{Dahmann:et:al:2026,
title = {Towards the Generation and Application of Dynamic Web-Based Visualization
of UIMA-based Annotations for Big-Data Corpora with the Help of
Unified Dynamic Annotation Visualizer},
booktitle = {Proceedings of the 15th International Conference on Language Resources
and Evaluation (LREC 2026)},
year = {2026},
author = {Dahmann, Thiemo and Schneider, Julian and Stephan, Philipp and Abrami, Giuseppe
and Mehler, Alexander},
keywords = {NLP, UIMA, Annotations, dynamic visualization, uce},
abstract = {The automatic and manual annotation of unstructured corpora is
a daily task in various scientific fields, which is supported
by a variety of existing software solutions. Despite this variety,
there are currently only limited solutions for visualizing annotations,
especially with regard to dynamic generation and interaction.
To bridge this gap and to visualize and provide annotated corpora
based on user-, project- or corpus-specific aspects, Unified Dynamic
Annotation Visualizer (UDAV) was developed. UDAV is designed as
a web-based solution that implements a number of essential features
which comparable tools do not support to enable a customizable
and extensible toolbox for interacting with annotations, allowing
the integration into existing big data frameworks.},
note = {accepted}
}