A comprehensive collection of Chinese video captions from Youku (优酷), featuring:
- Videos: 31,466 complete short videos
- Captions: 311,921 Chinese captions
- Language: Chinese
- Source: Youku Platform (优酷)
The dataset is available for download from ModelScope.
# Install Git LFS
git lfs install
# Clone the dataset
git lfs clone https://oauth2:your_git_token@www.modelscope.cn/datasets/os_ai/Youku_Dense_Caption.git
Get Token: Visit https://modelscope.cn/my/myaccesstoken
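As a minimal sketch, the clone URL above can be assembled from an environment variable so the token is not typed inline. The names `build_clone_url` and `MODELSCOPE_TOKEN` are hypothetical and chosen only for illustration; they are not part of ModelScope's tooling.

```shell
# Sketch: build the clone URL from a token passed as an argument.
# build_clone_url and MODELSCOPE_TOKEN are hypothetical names (assumptions).
build_clone_url() {
  token="$1"
  if [ -z "$token" ]; then
    echo "error: access token is empty" >&2
    return 1
  fi
  printf 'https://oauth2:%s@www.modelscope.cn/datasets/os_ai/Youku_Dense_Caption.git\n' "$token"
}

# Usage (commented out so that sourcing this file does not start a download):
# git lfs install
# git lfs clone "$(build_clone_url "$MODELSCOPE_TOKEN")"
```

Keeping the token in an environment variable avoids committing it to scripts, although it may still appear in the stored remote URL of the cloned repository.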
ROOT
├── benchmark_files/
│   ├── generation.json          # Test set for caption generation
│   └── grounding.json           # Test set for video moment retrieval
│
├── meta_files/
│   ├── Agriculture.csv          # Video file paths and complete captions in the Agriculture category
│   ├── Children.csv
│   └── [Other Categories].csv
│
└── data_files/
    ├── Agriculture/             # Agriculture videos
    │   ├── train/               # Training set (zipped)
    │   ├── val/                 # Validation set (zipped)
    │   └── test/                # Test set (preview ready)
    │
    ├── Children/                # Children videos
    │   ├── train/
    │   ├── val/
    │   └── test/
    │
    └── [Other Categories]/      # Other categories
        ├── train/
        ├── val/
        └── test/
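Given the layout above, a short sketch can enumerate the available categories and count the ready-to-use test files for one of them. The helper names `list_categories` and `count_test_files` are hypothetical, and the sketch assumes each category has one CSV in `meta_files/` and a matching folder in `data_files/`.

```shell
# Sketch: inspect a downloaded copy of the dataset.
# list_categories and count_test_files are hypothetical helpers (assumptions).

# Print one category name per line, derived from the CSV names in meta_files/.
list_categories() {
  root="${1:-.}"
  for csv in "$root"/meta_files/*.csv; do
    [ -e "$csv" ] || continue   # skip the literal pattern when nothing matches
    basename "$csv" .csv
  done
}

# Count regular files under data_files/<category>/test/.
count_test_files() {
  root="${1:-.}"
  category="$2"
  find "$root/data_files/$category/test" -type f | wc -l | tr -d ' '
}
```

For example, `list_categories .` run from the dataset root prints the category names, and `count_test_files . Agriculture` prints the number of files in the Agriculture test split.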
After Download:
- Navigate to the target category folder
- Example:
cd data_files/Agriculture

Data Preparation:
- Unzip the files in the train/ and val/ directories
- Files in the test/ directory are ready to use
Important Notes:
- train and val data are stored in compressed format and must be extracted before use
- test data is directly accessible for preview and testing
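The extraction step can be sketched as a small loop over the train/ and val/ folders of one category. This assumes the archives carry a `.zip` extension and should be extracted in place, neither of which is stated explicitly here; `extract_split_archives` is a hypothetical helper name.

```shell
# Sketch: extract every archive in train/ and val/ of one category folder.
# Assumptions: archives end in .zip and are extracted next to themselves.
# extract_split_archives is a hypothetical helper name.
extract_split_archives() {
  category_dir="$1"
  for split in train val; do
    for archive in "$category_dir/$split"/*.zip; do
      [ -e "$archive" ] || continue   # no archives in this split
      echo "Extracting $archive"
      # -q: quiet, -n: never overwrite files that already exist, -d: target dir
      unzip -q -n "$archive" -d "$category_dir/$split"
    done
  done
}
```

Calling `extract_split_archives data_files/Agriculture` from the dataset root would process the Agriculture category; the test/ folder is left untouched because its files are already usable.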
For questions, please refer to the project documentation or submit an Issue.
If you use this dataset in your research, please cite:
@inproceedings{xiong2025youku,
title={Youku Dense Caption: A Large-scale Chinese Video Dense Caption Dataset and Benchmarks},
author={Zixuan Xiong and Guangwei Xu and Wenkai Zhang and Yuan Miao and Xuan Wu and LinHai and Ruijie Guo and Hai-Tao Zheng},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=vvi5OjPhbu}
}
This dataset is released under the CC BY-NC-SA 4.0 license.
Star us on GitHub if you find this dataset useful!
