Dataset Versions

v15

2024-05-23 1:46pm

Generated on May 23, 2024

Popular Download Formats

Pascal VOC XML
Common XML annotation format for local data munging (originated with the PASCAL VOC challenge and also used by ImageNet).
PaliGemma
PaliGemma JSONL format used for fine-tuning PaliGemma, Google's open multimodal vision model.
CreateML JSON
CreateML JSON format is used with Apple's CreateML and Turi Create tools.
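
For readers who pick the Pascal VOC XML export, the following is a minimal sketch of reading one annotation file with Python's standard library. The sample XML is hypothetical: the file name, class label, and box coordinates are made up for illustration, and only the commonly used `filename`, `object/name`, and `object/bndbox` fields are assumed to be present.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the Pascal VOC layout; the file name, class
# label, and coordinates below are illustrative, not taken from this dataset.
SAMPLE_XML = """
<annotation>
    <filename>example.jpg</filename>
    <size><width>640</width><height>480</height></size>
    <object>
        <name>widget</name>
        <bndbox>
            <xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax>
        </bndbox>
    </object>
</annotation>
"""


def parse_voc(xml_text):
    """Return (filename, [(label, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        # Coordinates are parsed via float() in case a tool wrote "10.0".
        coords = tuple(
            int(float(box.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((obj.findtext("name"), *coords))
    return root.findtext("filename"), boxes


filename, boxes = parse_voc(SAMPLE_XML)
print(filename, boxes)
```

In practice the XML text would be read from each annotation file in the export rather than from an inline string.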

Dataset Split

Train Set 88%
819 Images
Valid Set 7%
69 Images
Test Set 5%
44 Images
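
Each split's share of the dataset can be derived from the image counts listed above. A small sketch, with the counts copied from this section and the function name `split_shares` chosen for illustration:

```python
# Image counts as listed in the Dataset Split section.
counts = {"train": 819, "valid": 69, "test": 44}


def split_shares(counts):
    """Return each split's share of the total as a whole-number percentage."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


shares = split_shares(counts)
print(sum(counts.values()), shares)
# 932 {'train': 88, 'valid': 7, 'test': 5}
```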

Preprocessing

Auto-Orient: Applied
Resize: Fit within 640x640
Grayscale: Applied
Filter Null: Require at least 20% of images to contain annotations.
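
To illustrate the "Fit within 640x640" resize setting, here is a minimal sketch of the output-size calculation. It assumes the image is scaled uniformly, with its aspect ratio preserved and no padding, so that neither side exceeds the bound. Whether images smaller than the bound are enlarged is also an assumption of this sketch, and `fit_within` is a hypothetical helper name.

```python
def fit_within(width, height, max_width=640, max_height=640):
    """Scale (width, height) uniformly so it fits inside the bounding box.

    Assumption: the same scale factor is applied to both sides and is the
    largest factor that keeps both sides within the bounds.
    """
    scale = min(max_width / width, max_height / height)
    # Guard against a side rounding down to zero for extreme aspect ratios.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(1920, 1080))  # (640, 360)
print(fit_within(480, 960))    # (320, 640)
```

The longer side lands on 640 and the shorter side shrinks proportionally, which is why exported images may have different dimensions from one another.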

Augmentations

Outputs per training example: 3
Rotation: Between -15° and +15°
Grayscale: Apply to 23% of images
Saturation: Between -25% and +25%
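
The augmentation settings above can be read as a sampling recipe. The sketch below draws parameters for the three outputs generated per training example. It is an illustration under assumptions, not the exporting tool's implementation: rotation and saturation are assumed to be drawn uniformly from their ranges, "23% of images" is treated as an independent per-output probability, and all names are hypothetical.

```python
import random

OUTPUTS_PER_EXAMPLE = 3
ROTATION_DEG = (-15.0, 15.0)       # degrees
GRAYSCALE_PROB = 0.23              # assumed per-output probability
SATURATION_DELTA = (-0.25, 0.25)   # relative change in saturation


def sample_augmentation(rng):
    """Draw one set of augmentation parameters (uniform sampling assumed)."""
    return {
        "rotation_deg": rng.uniform(*ROTATION_DEG),
        "grayscale": rng.random() < GRAYSCALE_PROB,
        "saturation_scale": 1.0 + rng.uniform(*SATURATION_DELTA),
    }


def augmentation_plan(rng, outputs=OUTPUTS_PER_EXAMPLE):
    """Return the parameter sets for every output of one training example."""
    return [sample_augmentation(rng) for _ in range(outputs)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
plan = augmentation_plan(rng)
for params in plan:
    print(params)
```

The sampled parameters would then be handed to an image library to perform the actual rotation and color changes.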