Dataset Versions

v3

2023-09-05 3:51am

Generated on Sep 4, 2023

Popular Download Formats

Pascal VOC XML
Common XML annotation format for local data munging (introduced by the PASCAL VOC challenge and also used by ImageNet).
PaliGemma
PaliGemma JSONL format used for fine-tuning PaliGemma, Google's open multimodal vision model.
CreateML JSON
CreateML JSON format is used with Apple's CreateML and Turi Create tools.
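As a rough illustration of what a Pascal VOC XML export contains, the sketch below parses one annotation with Python's standard library. The sample XML is hypothetical: the file name, class name, and box coordinates are made up for illustration and are not taken from this dataset. `read_voc_boxes` is likewise an illustrative helper name, not part of any export tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation; the file name, class name, and box are invented for this example.
SAMPLE_VOC = """<annotation>
  <filename>example.jpg</filename>
  <size><width>640</width><height>640</height><depth>3</depth></size>
  <object>
    <name>example_class</name>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>300</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>"""


def read_voc_boxes(xml_text):
    """Return (filename, [(label, xmin, ymin, xmax, ymax), ...]) from VOC XML text."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        # Coordinates are pixel values; go through float() in case they are written as decimals.
        coords = tuple(
            int(float(bndbox.findtext(key)))
            for key in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((obj.findtext("name"),) + coords)
    return filename, boxes


if __name__ == "__main__":
    print(read_voc_boxes(SAMPLE_VOC))
```

Each `object` element carries one class label and one axis-aligned box, so a downloaded export can be checked the same way by reading the XML file's text and passing it to the helper.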

Preprocessing

Auto-Orient: Applied
Isolate Objects: Applied
Static Crop: 25-75% Horizontal Region, 25-75% Vertical Region
Resize: Stretch to 640x640
Auto-Adjust Contrast: Using Contrast Stretching
Grayscale: Applied
Tile: 1 row x 1 column
Modify Classes: 11 remapped, 2 dropped
Filter Null: Require all images to contain annotations.
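The geometry of two of the steps above, the 25-75% static crop and the stretch to 640x640, can be sketched as follows. This is a minimal sketch under stated assumptions: it assumes the crop is applied before the resize, it ignores every other step (including Isolate Objects), and the function names and the 1600x1200 input size are hypothetical.

```python
def static_crop_window(img_w, img_h, lo=0.25, hi=0.75):
    """Pixel rectangle (x0, y0, x1, y1) kept by a lo-hi crop on both axes."""
    return (img_w * lo, img_h * lo, img_w * hi, img_h * hi)


def stretch_scales(crop, out_w=640, out_h=640):
    """Per-axis scale factors when the cropped region is stretched to out_w x out_h."""
    x0, y0, x1, y1 = crop
    return out_w / (x1 - x0), out_h / (y1 - y0)


def map_point(x, y, img_w, img_h):
    """Map a pixel in the original image to its position in the 640x640 output."""
    crop = static_crop_window(img_w, img_h)
    sx, sy = stretch_scales(crop)
    return (x - crop[0]) * sx, (y - crop[1]) * sy


if __name__ == "__main__":
    # Hypothetical 1600x1200 source image.
    crop = static_crop_window(1600, 1200)
    print(crop)                  # the central half of each axis
    print(stretch_scales(crop))  # the two factors differ, so aspect ratio is not preserved
    print(map_point(800, 600, 1600, 1200))
```

Because the resize mode is a stretch, the horizontal and vertical scale factors differ whenever the cropped region is not square, so shapes in non-square source images are distorted in the output.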

Augmentations

Outputs per training example: 3
Crop: 0% Minimum Zoom, 15% Maximum Zoom
Grayscale: Apply to 25% of images
Hue: Between -47° and +47°
Saturation: Between -57% and +57%
Brightness: Between -24% and +24%
Exposure: Between -8% and +8%
Blur: Up to 1.5px
Noise: Up to 2% of pixels
Cutout: 2 boxes with 11% size each
Mosaic: Applied
Bounding Box: Flip: Horizontal
Bounding Box: 90° Rotate: Clockwise, Counter-Clockwise, Upside Down
Bounding Box: Crop: 0% Minimum Zoom, 50% Maximum Zoom
Bounding Box: Rotation: Between -31° and +31°
Bounding Box: Shear: ±0° Horizontal, ±0° Vertical
Bounding Box: Exposure: Between -38% and +38%
Bounding Box: Blur: Up to 0.75px
Bounding Box: Noise: Up to 4% of pixels
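To make the image-level ranges above concrete, here is a minimal sketch of drawing one set of augmentation parameters per output. It is not the generator's actual implementation: uniform sampling within each range is an assumption, the grayscale setting is modelled as an independent 25% chance per output, and the dictionary keys and function names are illustrative. Bounding-box-level settings, Cutout, and Mosaic are left out.

```python
import random

# Ranges copied from the image-level settings listed above; key names are illustrative.
IMAGE_RANGES = {
    "crop_zoom_pct": (0, 15),
    "hue_deg": (-47, 47),
    "saturation_pct": (-57, 57),
    "brightness_pct": (-24, 24),
    "exposure_pct": (-8, 8),
    "blur_px": (0, 1.5),
    "noise_pct": (0, 2),
}
GRAYSCALE_PROBABILITY = 0.25  # "Apply to 25% of images"
OUTPUTS_PER_EXAMPLE = 3       # "Outputs per training example: 3"


def sample_augmentation(rng):
    """Draw one parameter set; uniform sampling is an assumption, not a documented behaviour."""
    params = {name: rng.uniform(lo, hi) for name, (lo, hi) in IMAGE_RANGES.items()}
    params["grayscale"] = rng.random() < GRAYSCALE_PROBABILITY
    return params


def sample_variants(rng, n=OUTPUTS_PER_EXAMPLE):
    """Draw n independent parameter sets, one per augmented output of a source image."""
    return [sample_augmentation(rng) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs draw the same values
    for variant in sample_variants(rng):
        print(variant)
```

With three outputs per training example, each source image in the training split would yield up to three differently perturbed copies.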
