A Comprehensive Dataset for Video Comprehension and Multimodal AI Research
📰 Research Paper · 📓 Interactive Notebook
The CLIP-CC Dataset is a carefully curated collection of 200 YouTube video links with human-written summaries, specifically designed for research and experimentation in multimodal AI tasks. This dataset addresses the growing need for high-quality video comprehension benchmarks that can effectively evaluate the narrative understanding capabilities of Large Video Language Models (LVLMs).
📊 Dataset Statistics

- Size: 200 video clips with human-written summaries
- Source: YouTube's Rotten Tomatoes Movieclips channel
- Selection Criteria: 3-step curation process
- Output: ~90-second clips
Perfect for evaluating video captioning models, multimodal language models, and narrative understanding systems using advanced metrics like VCS (Video Comprehension Score).
📋 Example Entry (`metadata.jsonl`):

```json
{
  "id": "001",
  "file_link": "https://www.youtube.com/watch?v=abc123",
  "summary": "A man explains the basics of machine learning with real-world examples."
}
```
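Because each line of `metadata.jsonl` is a standalone JSON object, the file can be read with nothing but the standard library. A minimal sketch, assuming the file sits at the root of your checkout (the exact path may differ):

```python
import json

# Read metadata.jsonl line by line; each line is one JSON record.
entries = []
with open("metadata.jsonl", "r", encoding="utf-8") as f:
    for line in f:
        line = line.strip()
        if line:  # skip blank lines
            entries.append(json.loads(line))

print(f"Loaded {len(entries)} entries")
print(entries[0]["summary"])
```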
Choose the installation method that best fits your workflow. All methods provide access to the complete CLIP-CC dataset with 200 video entries and human-written summaries.
⚡ Quick Start Tip: Most research only needs the metadata; downloading the videos is completely optional!
Full control and development (🔧 full source access)

Terminal:

```bash
git clone https://github.com/hdubey-debug/CLIP-CC.git
cd CLIP-CC
pip install .
```

Colab/Jupyter:

```python
!git clone https://github.com/hdubey-debug/CLIP-CC.git
%cd CLIP-CC
!pip install .
```
Quick and simple (⚡ 30-second setup)

Terminal:

```bash
pip install git+https://github.com/hdubey-debug/CLIP-CC.git
```

Colab/Jupyter:

```python
!pip install git+https://github.com/hdubey-debug/CLIP-CC.git
```
Seamless dataset integration (🤗 native Hugging Face integration)

Python:

```python
from datasets import load_dataset

# Load the dataset
dataset = load_dataset("IVSL-SDSU/Clip-CC")

# Access a sample entry
print(dataset["train"][0])
```

Colab/Jupyter:

```python
# If path issues occur, upgrade packages first:
!pip install --upgrade datasets fsspec

from datasets import load_dataset

dataset = load_dataset("IVSL-SDSU/Clip-CC")
print(dataset["train"][0])
```
Once installed, start using CLIP-CC in seconds:
```python
# Import CLIP-CC (works for both manual and pip installation)
from clip_cc.loader import load_metadata

# Load dataset
data = load_metadata()

# Explore
print(f"Dataset size: {len(data)} videos")
print(f"Sample: {data[0]}")

# Access summaries
for entry in data[:3]:
    print(f"Video {entry['id']}: {entry['summary']}")
```
```python
# Load with Hugging Face
from datasets import load_dataset

dataset = load_dataset("IVSL-SDSU/Clip-CC")

# Explore
print(f"Dataset: {dataset}")
print(f"Sample: {dataset['train'][0]}")

# Access summaries
for i in range(3):
    entry = dataset["train"][i]
    print(f"Video {entry['id']}: {entry['summary']}")
```
1. Install Dependencies:

```bash
pip install yt-dlp ffmpeg-python
```

2. Check FFmpeg:

```bash
ffmpeg -version
```
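If you are scripting your setup, the same check can be done from Python with the standard library; a minimal sketch, assuming `ffmpeg` should be discoverable on PATH:

```python
import shutil
import subprocess

# Verify that the ffmpeg binary is discoverable on PATH.
ffmpeg_path = shutil.which("ffmpeg")
if ffmpeg_path is None:
    raise RuntimeError("ffmpeg not found on PATH; install it before downloading clips")

# Print the version banner, mirroring `ffmpeg -version`.
subprocess.run([ffmpeg_path, "-version"], check=True)
```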
3. Download & Clip Videos:

```python
from clip_cc.downloader import download_and_clip_dataset

# Download one specific video (ID: 001)
download_and_clip_dataset(
    output_dir="downloads/clips",
    target_ids={"001"},
    use_browser_cookies=True,  # Uses Chrome/Firefox cookies directly
    clip_duration=90,
)
print("✅ Videos downloaded and clipped!")
```

Parameters:
- `output_dir`: Where to save the clipped videos
- `target_ids`: Specific video ID(s) to download (use `None` to download all 200 videos)
- `use_browser_cookies`: Use browser cookies directly (recommended for age-restricted videos)
- `cookiefile_path`: Alternative cookie file path (if browser cookies don't work)
- `clip_duration`: Video length in seconds (default: 90)
What this does:

- Downloads videos using `yt-dlp` to a temporary folder
- Clips the first 90 seconds using `ffmpeg`
- Saves final clipped videos to `output_dir`
- Automatically cleans up intermediate files
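To fetch the entire corpus rather than a single clip, pass `target_ids=None` as documented above; a sketch (the output directory name here is just an example):

```python
from clip_cc.downloader import download_and_clip_dataset

# Download and clip all 200 videos; expect this to take a while.
download_and_clip_dataset(
    output_dir="downloads/all_clips",  # example path
    target_ids=None,                   # None = every video in the dataset
    use_browser_cookies=True,
    clip_duration=90,
)
```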
Age-Restricted Videos (Most Common Issue)

Problem: a "Sign in to confirm your age" error

Solution: Use Browser Cookies (Terminal Only)
Method 1: Browser cookies (Recommended)

```python
# Uses your browser's cookies directly (Chrome/Firefox).
# Make sure you're signed into YouTube in your browser first.
download_and_clip_dataset(
    output_dir="clips",
    target_ids={"002"},        # Or None for all videos
    use_browser_cookies=True,  # Uses Chrome or Firefox cookies directly
    clip_duration=90,
)
```

Method 2: Export cookies file (Alternative)
1. Export cookies from your browser:
   - Install the browser extension "Get cookies.txt LOCALLY" (Chrome/Firefox)
   - Visit YouTube and sign in to your account
   - Click the extension and download the `cookies.txt` file
   - Important: Make sure it's in Netscape format
2. Use the cookies in the download:

```python
download_and_clip_dataset(
    output_dir="clips",
    target_ids={"002"},  # Or None for all videos
    cookiefile_path="cookies.txt",
    clip_duration=90,
)
```
Note: Method 1 (browser cookies) works much more reliably than cookie files.
Other Common Issues

FFmpeg Issues:

- Ensure `ffmpeg` is installed and on your PATH
- Test with: `ffmpeg -version`

Network Issues:

- Check your internet connection for YouTube access
- Some videos may be region-restricted or removed

Video Not Found:

- The video may have been removed from YouTube
- The download will skip missing videos and continue with the others
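Since missing or restricted videos are skipped, it can be worth auditing which clips actually landed on disk. A sketch using the `load_metadata()` loader shown earlier; the output directory and the `<id>.mp4` naming are assumptions about your local setup:

```python
from pathlib import Path

from clip_cc.loader import load_metadata

clips_dir = Path("downloads/clips")  # wherever you set output_dir

# Compare expected IDs from the metadata against files on disk.
expected = {entry["id"] for entry in load_metadata()}
present = {p.stem for p in clips_dir.glob("*.mp4")}  # assumes <id>.mp4 naming

missing = sorted(expected - present)
print(f"{len(present)}/{len(expected)} clips downloaded; missing: {missing}")
```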
CLIP-CC is designed to work seamlessly with VCS (Video Comprehension Score) for comprehensive video understanding evaluation.

🔗 Perfect Integration: CLIP-CC + VCS
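A typical evaluation loop pairs each CLIP-CC reference summary with a model-generated description and scores the pair. The sketch below keeps the scoring step as a trivial word-overlap placeholder, since the actual VCS entry point should be taken from the VCS Metrics repository linked below; `generate_description` is likewise a stand-in for your own model:

```python
from clip_cc.loader import load_metadata

def generate_description(video_link: str) -> str:
    """Stand-in for your LVLM; replace with a real model call."""
    return "placeholder description"

scores = []
for entry in load_metadata():
    candidate = generate_description(entry["file_link"])
    reference = entry["summary"]
    # Placeholder word-overlap score, NOT VCS: swap in the actual
    # scoring function from the VCS Metrics package.
    ref_words = set(reference.split())
    score = len(set(candidate.split()) & ref_words) / max(len(ref_words), 1)
    scores.append(score)

print(f"Mean score over {len(scores)} videos: {sum(scores) / len(scores):.3f}")
```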
If you use the CLIP-CC Dataset in your research, please cite:

```bibtex
@software{vcs_metrics_2024,
  title = {VCS Metrics: Video Comprehension Score for Text Similarity Evaluation},
  author = {Dubey, Harsh and Ali, Mukhtiar and Mishra, Sugam and Pack, Chulwoo},
  year = {2024},
  institution = {South Dakota State University},
  url = {https://github.com/Multimodal-Intelligence-Lab/Video-Comprehension-Score},
  note = {Python package for narrative similarity evaluation}
}
```

This project is licensed under the MIT License - see the LICENSE file for details.
Authors: Harsh Dubey, Mukhtiar Ali, Sugam Mishra, and Chulwoo Pack
Institution: South Dakota State University
Year: 2024
⭐ Star this repo • 🐛 Report Bug • 💡 Request Feature • 💬 Community Q&A