CrowdHuman Dataset

v5

2023-01-11 10:06am

Generated on Jan 11, 2023

Popular Download Formats

Pascal VOC XML
Common XML annotation format for local data munging (introduced by the PASCAL VOC challenge and also used by ImageNet).
PaliGemma
PaliGemma JSONL format used for fine-tuning PaliGemma, Google's open multimodal vision model.
CreateML JSON
CreateML JSON format is used with Apple's CreateML and Turi Create tools.
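
As a rough illustration of the Pascal VOC XML layout mentioned above, the sketch below reads bounding boxes from one annotation using only the Python standard library. The sample annotation (file name, `person` label, coordinates) and the helper name `read_voc_boxes` are assumptions made for the example, not taken from this dataset's export.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation for one image; the file name, class label and
# coordinates are made up for illustration.
SAMPLE_XML = """
<annotation>
    <filename>example.jpg</filename>
    <size><width>256</width><height>256</height><depth>3</depth></size>
    <object>
        <name>person</name>
        <bndbox><xmin>32</xmin><ymin>32</ymin><xmax>96</xmax><ymax>128</ymax></bndbox>
    </object>
</annotation>
"""


def read_voc_boxes(xml_text):
    """Return a list of (label, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.iter("object"):
        label = obj.findtext("name")
        bndbox = obj.find("bndbox")
        # Coordinates are stored as text; some exporters write floats.
        coords = tuple(
            int(float(bndbox.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((label, *coords))
    return boxes


print(read_voc_boxes(SAMPLE_XML))
```

For a file on disk, `ET.parse(path).getroot()` would replace the `ET.fromstring` call.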

Dataset Split

Train Set: 99 images (6%)
Valid Set: 22 images (1%)
Test Set: 1491 images (92%)
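
The listed percentages can be checked against the image counts with a few lines of arithmetic. This is only a sanity check on the numbers above; the dictionary keys are labels chosen for the example.

```python
# Image counts as listed in the Dataset Split section.
splits = {"train": 99, "valid": 22, "test": 1491}

total = sum(splits.values())
shares = {name: 100 * count / total for name, count in splits.items()}

print(f"total: {total} images")
for name, pct in shares.items():
    print(f"{name}: {splits[name]} images, {pct:.1f}%")
```

The whole-number percentages on the page add up to 99 rather than 100 because each share is shown without its fractional part.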

Preprocessing

Auto-Orient: Applied
Resize: Stretch to 256x256
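
Because a stretch resize ignores aspect ratio, box coordinates scale independently along each axis. The sketch below shows that mapping for a corner-format box; the function name `stretch_box` and the 800x400 source size are assumptions for illustration, and this is not necessarily how the export tool implements it.

```python
def stretch_box(box, src_w, src_h, dst_w=256, dst_h=256):
    """Map an (xmin, ymin, xmax, ymax) box from a src_w x src_h image
    onto the same image stretched to dst_w x dst_h."""
    xmin, ymin, xmax, ymax = box
    return (
        xmin * dst_w / src_w,
        ymin * dst_h / src_h,
        xmax * dst_w / src_w,
        ymax * dst_h / src_h,
    )


# Hypothetical box in an 800x400 source image.
print(stretch_box((100, 50, 300, 200), src_w=800, src_h=400))
```

Here the x coordinates shrink by 256/800 and the y coordinates by 256/400, so the box's aspect ratio changes along with the image's.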

Augmentations

Outputs per training example: 1
Flip: Horizontal
Rotation: Between -12° and +12°
Shear: ±10° Horizontal, ±10° Vertical
Grayscale: Apply to 14% of images
Brightness: Between -15% and +15%
Blur: Up to 1.25px
Noise: Up to 2% of pixels
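
One way to read these settings is as ranges from which a random configuration is drawn for each training image. The sketch below samples such a configuration and shows how a horizontal flip moves a bounding box. The function names, the uniform sampling and the 50% flip probability are assumptions; the page only states the ranges, not how the tool draws from them.

```python
import random


def sample_augmentation(rng):
    """Draw one hypothetical augmentation configuration from the listed ranges."""
    return {
        "flip_horizontal": rng.random() < 0.5,    # assumed 50% chance
        "rotation_deg": rng.uniform(-12.0, 12.0),
        "shear_x_deg": rng.uniform(-10.0, 10.0),
        "shear_y_deg": rng.uniform(-10.0, 10.0),
        "grayscale": rng.random() < 0.14,         # 14% of images
        "brightness": rng.uniform(-0.15, 0.15),   # relative change
        "blur_px": rng.uniform(0.0, 1.25),
        "noise_fraction": rng.uniform(0.0, 0.02), # share of pixels affected
    }


def flip_box_horizontal(box, image_width):
    """Mirror an (xmin, ymin, xmax, ymax) box across the vertical centre line."""
    xmin, ymin, xmax, ymax = box
    return (image_width - xmax, ymin, image_width - xmin, ymax)


rng = random.Random(0)  # fixed seed so the draw is repeatable
params = sample_augmentation(rng)
print(params)
print(flip_box_horizontal((32, 32, 96, 128), image_width=256))
```

Geometric transforms (flip, rotation, shear) require the annotations to be transformed together with the pixels, while grayscale, brightness, blur and noise leave box coordinates unchanged.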