This project is a Python pipeline that automatically detects faces in images with the YOLOv8-face model, crops them, and saves both:
- The cropped face image (if exactly one face is detected)
- The original image with the bounding box drawn on the detected face
The goal is to generate a dataset of clean face images from a batch of images.
face-dataset-generator/
├── input_images/          # Place your input images here (e.g., .jpg, .png)
├── croped_images/         # Auto-generated folder for cropped face images
├── bounding_box_image/    # Auto-generated folder for images with bounding boxes drawn
├── main.py                # Python script for face detection and cropping
├── requirements.txt       # Python dependencies
└── README.md              # This file
- The script loads all images from the images/ folder.
- It runs YOLOv8-face inference to detect faces.
- If exactly one face is detected in an image:
  - The face is cropped and saved in the cropped_faces/ folder.
- For all images (whether cropped or not):
  - It draws the detected bounding boxes on the original image and saves the result in the bbox_images/ folder.
- The script prints progress messages to the terminal (e.g., detections, saves, skips).
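The steps above can be sketched roughly as follows. This is a minimal illustration and not necessarily how main.py is written. The function names `clip_box` and `process_folder`, the weights filename `yolov8n-face.pt`, and the default folder names are assumptions made for this example. The box clamping step is also an addition, included so the crop never indexes outside the image.

```python
from pathlib import Path


def clip_box(box, width, height):
    """Clamp an (x1, y1, x2, y2) box to the image bounds and convert to ints."""
    x1, y1, x2, y2 = box
    x1 = max(0, min(int(x1), width))
    y1 = max(0, min(int(y1), height))
    x2 = max(0, min(int(x2), width))
    y2 = max(0, min(int(y2), height))
    return x1, y1, x2, y2


def process_folder(input_dir="images", crop_dir="cropped_faces",
                   bbox_dir="bbox_images", weights="yolov8n-face.pt",
                   conf_threshold=0.5):
    """Hypothetical end-to-end loop: detect, crop single faces, draw boxes."""
    # Heavy dependencies are imported here so clip_box works without them.
    import cv2
    from ultralytics import YOLO

    model = YOLO(weights)  # assumption: the weights file is available locally
    Path(crop_dir).mkdir(exist_ok=True)
    Path(bbox_dir).mkdir(exist_ok=True)

    for path in sorted(Path(input_dir).iterdir()):
        image = cv2.imread(str(path))
        if image is None:  # unreadable or corrupted file
            print(f"Skipped (unreadable): {path.name}")
            continue

        height, width = image.shape[:2]
        result = model(image, verbose=False)[0]
        coords = result.boxes.xyxy.tolist()
        scores = result.boxes.conf.tolist()
        boxes = [
            clip_box(box, width, height)
            for box, score in zip(coords, scores)
            if score > conf_threshold
        ]

        # Crop only when exactly one face passed the confidence filter.
        if len(boxes) == 1:
            x1, y1, x2, y2 = boxes[0]
            if x2 > x1 and y2 > y1:
                cv2.imwrite(str(Path(crop_dir) / path.name), image[y1:y2, x1:x2])
                print(f"Saved crop: {path.name}")
        else:
            print(f"No crop ({len(boxes)} faces): {path.name}")

        # Draw every valid box in green on a copy and always save it.
        annotated = image.copy()
        for x1, y1, x2, y2 in boxes:
            cv2.rectangle(annotated, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.imwrite(str(Path(bbox_dir) / path.name), annotated)
```

The crop is taken from the untouched image and the boxes are drawn on a copy, so the saved face never contains the green rectangle.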
1️⃣ Clone the repository
git clone https://github.com/your-username/face-dataset-generator.git
cd face-dataset-generator
2️⃣ Install dependencies
pip install -r requirements.txt
Note: This project uses the YOLOv8n-face model through the ultralytics package, along with OpenCV, Pillow, and tqdm.
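One way to confirm the installation worked is to check that each dependency can be found before running the script. The sketch below is only a convenience and is not part of the project. The helper names `missing_modules` and `report_missing` are hypothetical, and the list uses import names (OpenCV imports as `cv2`, Pillow as `PIL`) rather than pip package names.

```python
from importlib.util import find_spec

# Import names for the dependencies mentioned in the note above.
REQUIRED_MODULES = ["ultralytics", "cv2", "PIL", "tqdm"]


def missing_modules(names):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if find_spec(name) is None]


def report_missing(names=REQUIRED_MODULES):
    """Print a one-line summary and return the list of missing modules."""
    missing = missing_modules(names)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All dependencies are importable.")
    return missing
```

Calling `report_missing()` after `pip install -r requirements.txt` should report nothing missing if the installation succeeded.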
3️⃣ Add your images
Place all your input images inside the images/ folder.
Example:
images/
├── img1.jpg
├── img2.png
└── img3.jpg
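As an illustration of how a script might pick up the files in this folder, here is a small sketch using only the standard library. The function name `list_input_images` and the accepted extension set are assumptions for this example. The actual script may accept other formats.

```python
from pathlib import Path

# Assumption: only these extensions are treated as images.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_input_images(folder="images"):
    """Return the image files in `folder`, sorted, ignoring everything else."""
    folder = Path(folder)
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
```

The extension check is case-insensitive, so a file named `IMG3.JPG` is picked up, while something like `notes.txt` is ignored.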
4️⃣ Run the script
python main.py
📝 Output Example
After running the script:
Cropped faces (when exactly one face is detected) will be saved inside the cropped_faces/ folder.
The original images with green bounding boxes will be saved inside the bbox_images/ folder.
Example Output:
cropped_faces/
├── img1.jpg
└── img3.jpg
bbox_images/
├── img1.jpg # has bounding box drawn
├── img2.png # may have 0 or multiple boxes
└── img3.jpg # has bounding box drawn
⚠️ Edge Cases Handled:
If an image cannot be read (e.g., corrupted), it will be skipped.
If no face or more than one face is detected, cropping will be skipped, but the bounding box image will still be saved.
Only detections with a confidence score > 0.5 are considered valid.
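The confidence and single-face rules can be expressed as two small functions. This is a sketch of the stated rules only. The names `valid_detections` and `face_to_crop`, and the `(box, confidence)` tuple layout, are assumptions for illustration and may not match main.py.

```python
def valid_detections(detections, threshold=0.5):
    """Keep (box, confidence) pairs whose confidence is strictly above threshold."""
    return [(box, conf) for box, conf in detections if conf > threshold]


def face_to_crop(detections, threshold=0.5):
    """Return the single valid box, or None when zero or several faces remain."""
    valid = valid_detections(detections, threshold)
    return valid[0][0] if len(valid) == 1 else None


# Two raw detections, but only the first clears the 0.5 threshold.
detections = [((10, 10, 50, 50), 0.92), ((60, 20, 90, 70), 0.31)]
print(face_to_crop(detections))  # (10, 10, 50, 50)
```

Because the comparison is strict, a detection with a confidence of exactly 0.5 is discarded, which matches the "> 0.5" rule above.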