Top Object Detection Benchmark Datasets
Roboflow hosts the most popular computer and machine vision benchmarking and transfer learning datasets. Datasets in this category include Microsoft COCO, Pascal VOC (object detection), and more.
About this Dataset
The Oxford Pets dataset (also known as the "dogs vs cats" dataset) is a collection of images and annotations labeling various breeds of dogs and cats. There are approximately 100 examples of each of the 37 breeds. This dataset contains the object detection portion of the original dataset with bounding boxes around the animals' heads.
Origin
This dataset was collected by the Visual Geometry Group (VGG) at the University of Oxford.
This dataset is a copy of a subset of the full Stanford Cars dataset.
The original dataset contained 16,185 images of 196 classes of cars.
The classes are typically at the level of Make, Model, Year, e.g. 2012 Tesla Model S or 2012 BMW M3 coupe, both in the original dataset and in this subset of the full dataset (v3, TestData and v4, original_raw-images).
v4 (original_raw-images) contains a generated version of the original, raw images, without any modified classes.
v8 (classes-Modified_raw-images) contains a generated version of the raw images, with the Modify Classes preprocessing feature used to remap or omit the following classes:
- bike, moped --remapped to--> motorbike
- cng, leguna, easybike, smart fortwo Convertible 2012, and all other specific car makes with named classes (such as Acura TL Type-S 2008) --remapped to--> vehicle
- rickshaw, boat, bicycle --> omitted
v9 (FAST-model_mergedAllClasses-augmented_by3x) contains a generated version of the raw images, with the Modify Classes preprocessing feature used to remap or omit the following classes:
- bike, moped --remapped to--> motorbike
- cng, leguna, easybike, smart fortwo Convertible 2012, and all other specific car makes with named classes (such as Acura TL Type-S 2008) --remapped to--> vehicle
- rickshaw, boat, bicycle --> omitted
v10 (ACCURATE-model_mergedAllClasses-augmented_by3x) contains a generated version of the raw images, with the Modify Classes preprocessing feature used to remap or omit the following classes:
- bike, moped --remapped to--> motorbike
- cng, leguna, easybike, smart fortwo Convertible 2012, and all other specific car makes with named classes (such as Acura TL Type-S 2008) --remapped to--> vehicle
- rickshaw, boat, bicycle --> omitted
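The remap and omit rules listed above can be sketched in a few lines of Python. This is only an illustration of the idea, not Roboflow's Modify Classes implementation: the function names, the `"class"` annotation key, and the `KEEP` set of classes left untouched are assumptions made for the example.

```python
from typing import Optional

# Explicit remaps taken from the rules listed above.
REMAP = {
    "bike": "motorbike",
    "moped": "motorbike",
    "cng": "vehicle",
    "leguna": "vehicle",
    "easybike": "vehicle",
    "smart fortwo Convertible 2012": "vehicle",
}
# Classes dropped entirely.
OMIT = {"rickshaw", "boat", "bicycle"}
# Assumption: target classes that already exist keep their name.
KEEP = {"motorbike", "vehicle"}


def modify_class(name: str) -> Optional[str]:
    """Return the new class name, or None if the class is omitted."""
    if name in OMIT:
        return None
    if name in REMAP:
        return REMAP[name]
    if name in KEEP:
        return name
    # "All other specific car makes" collapse into the generic class.
    return "vehicle"


def modify_annotations(annotations: list) -> list:
    """Apply modify_class to each annotation dict, dropping omitted ones."""
    result = []
    for ann in annotations:
        new_name = modify_class(ann["class"])
        if new_name is not None:
            result.append({**ann, "class": new_name})
    return result
```

Treating every unlisted class as a car make is a simplification; a real pipeline would more likely check names against an explicit list of makes.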
Citation:
3D Object Representations for Fine-Grained Categorization. Jonathan Krause, Michael Stark, Jia Deng, Li Fei-Fei. 4th IEEE Workshop on 3D Representation and Recognition, at ICCV 2013 (3dRR-13). Sydney, Australia. Dec. 8, 2013.
This is the full 2017 COCO object detection dataset (train and valid), which is a subset of the most recent 2020 COCO object detection dataset.
COCO is a large-scale object detection, segmentation, and captioning dataset of many object types easily recognizable by a 4-year-old. The data was initially collected and published by Microsoft. The original source of the data is here and the paper introducing the COCO dataset is here.
Overview
This dataset contains 581 images of various shellfish classes for object detection. These images are derived from the Open Images open source computer vision datasets.
This dataset only scratches the surface of the Open Images dataset for shellfish!
Use Cases
- Train an object detector to differentiate between a lobster, shrimp, and crab.
- Train an object detector to differentiate between shellfish
- Object detection dataset across different sub-species
- Object detection among related species
- Test an object detector on highly related objects
- Train a shellfish detector
- Explore the quality and range of the Open Images dataset
Tools Used to Derive Dataset
These images were gathered via the OIDv4 Toolkit. This toolkit allows you to pick an object class and retrieve a set number of images from that class with bounding box labels.
We provide this dataset as an example of the ability to query the OID for a given subdomain. This dataset can easily be scaled up - please reach out to us if that interests you.
Overview
This dataset contains 2986 images and 3448 labels across a single annotation class: pistols. Images are wide-ranging: pistols in-hand, cartoons, and staged studio quality images of guns.
The dataset was originally released by the University of Granada, then had duplicates removed and was rehosted by a Roboflow user.
Use Cases
One can create a gun object detection model to monitor security camera footage for the presence of guns, perhaps in places where they should not be. Alaa Senjab built on Roboflow to achieve this goal. He's also open sourced much of his work in this tutorial.
Overview
The PlantDoc dataset was originally published by researchers at the Indian Institute of Technology, and described in depth in their paper. One of the paper’s authors, Pratik Kayal, shared the object detection dataset available on GitHub.
PlantDoc is a dataset of 2,569 images across 13 plant species and 30 classes (diseased and healthy) for image classification and object detection. There are 8,851 labels. Read more about how the version available on Roboflow improves on the original version here.
And here's an example image:
Fork this dataset (upper right-hand corner) to receive the raw images, or (to save space) grab the 416x416 export.
Use Cases
As the researchers from IIT stated in their paper, “plant diseases alone cost the global economy around US$220 billion annually.” Training models to recognize plant diseases earlier dramatically increases yield potential.
The dataset also serves as a useful open dataset for benchmarks. The researchers trained both object detection models like MobileNet and Faster-RCNN and image classification models like VGG16, InceptionV3, and InceptionResnet V2.
The dataset is useful for advancing general agriculture computer vision tasks, whether that be healthy crop classification, plant disease classification, or plant disease object detection.
Using this Dataset
This dataset is released under a Creative Commons 4.0 license. You may use it commercially; the license provides no liability, trademark use, patent use, or warranty.
Provide the following citation for the original authors:
@misc{singh2019plantdoc,
title={PlantDoc: A Dataset for Visual Plant Disease Detection},
author={Davinder Singh and Naman Jain and Pranjali Jain and Pratik Kayal and Sudhakar Kumawat and Nipun Batra},
year={2019},
eprint={1911.10317},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
About Roboflow
Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.
Developers reduce 50% of their code when using Roboflow's workflow, automate annotation quality assurance, save training time, and increase model reproducibility.
Overview
The original Udacity Self Driving Car Dataset is missing labels for thousands of pedestrians, bikers, cars, and traffic lights. This will result in poor model performance. When used in the context of self driving cars, this could even lead to human fatalities.
We re-labeled the dataset to correct errors and omissions. We have provided convenient downloads in many formats including VOC XML, COCO JSON, TensorFlow Object Detection TFRecords, and more.
Some examples of labels missing from the original dataset:
Stats
The dataset contains 97,942 labels across 11 classes and 15,000 images. There are 1,720 null examples (images with no labels).
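Figures like these can be recomputed from a COCO JSON export. The sketch below is a minimal example that assumes the standard COCO layout (`images`, `annotations`, and `categories` keys); the function name `summarize_coco` is made up for illustration.

```python
from collections import Counter


def summarize_coco(coco: dict) -> dict:
    """Count images, labels, classes, and null examples in a COCO-style dict."""
    id_to_name = {cat["id"]: cat["name"] for cat in coco["categories"]}
    per_class = Counter()
    labeled_image_ids = set()
    for ann in coco["annotations"]:
        per_class[id_to_name[ann["category_id"]]] += 1
        labeled_image_ids.add(ann["image_id"])
    image_ids = {img["id"] for img in coco["images"]}
    return {
        "images": len(image_ids),
        "labels": sum(per_class.values()),
        "classes": len(id_to_name),
        # Null examples are images with no annotation attached.
        "null_examples": len(image_ids - labeled_image_ids),
        "per_class": dict(per_class),
    }
```

To use it, load the annotation file with `json.load` and pass the resulting dict to `summarize_coco`.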
All images are 1920x1200 (download size ~3.1 GB). We have also provided a version downsampled to 512x512 (download size ~580 MB) that is suitable for most common machine learning models (including YOLO v3, Mask R-CNN, SSD, and MobileNet).
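When images are downsampled, the bounding boxes have to be scaled with them. The sketch below assumes a plain stretch resize (each axis scaled independently, no letterboxing) and boxes stored as `(xmin, ymin, xmax, ymax)` in pixels; the page does not say how the 512x512 export handles aspect ratio, so treat both as assumptions.

```python
def scale_box(box, src_size, dst_size):
    """Scale an (xmin, ymin, xmax, ymax) box from src_size to dst_size.

    Sizes are (width, height). Each axis is scaled independently,
    which matches a stretch resize.
    """
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    sx = dst_w / src_w
    sy = dst_h / src_h
    xmin, ymin, xmax, ymax = box
    return (xmin * sx, ymin * sy, xmax * sx, ymax * sy)
```

For example, `scale_box((960, 600, 1920, 1200), (1920, 1200), (512, 512))` maps the bottom-right quadrant of the original frame to `(256.0, 256.0, 512.0, 512.0)`.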
Annotations have been hand-checked for accuracy by Roboflow.
Annotation Distribution:
Use Cases
Udacity is building an open source self driving car! You might also try using this dataset to do person-detection and tracking.
Using this Dataset
Our updates to the dataset are released under the MIT License (the same license as the original annotations and images).
Note: the dataset contains many duplicated bounding boxes for the same subject, which we have not corrected. You will probably want to filter them by computing the IoU between boxes of the same class and dropping those that overlap 100%, or they could affect your model performance (especially in stoplight detection, which seems to suffer from an especially severe case of duplicated bounding boxes).
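One way to do that filtering is sketched below. It assumes each annotation is a dict with a `"class"` name and a `"box"` of `(xmin, ymin, xmax, ymax)`, which is an illustrative layout rather than any particular export format. The default threshold of 0.99 is also an assumption; use 1.0 to drop only exact duplicates.

```python
def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def drop_duplicate_boxes(annotations, threshold=0.99):
    """Keep the first of any same-class boxes whose IoU reaches the threshold."""
    kept = []
    for ann in annotations:
        is_duplicate = any(
            k["class"] == ann["class"] and iou(k["box"], ann["box"]) >= threshold
            for k in kept
        )
        if not is_duplicate:
            kept.append(ann)
    return kept
```

Run this per image. The comparison is quadratic in the number of boxes, which is fine for the handful of objects in a typical driving frame.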
Pascal VOC 2012 is a common benchmark for object detection. It contains common objects that one might find in images on the web.
Note: the test set is withheld, as is common with benchmark datasets.
You can think of it sort of like a baby COCO.
This is the labeled dataset for the 5 person, 5 ball video provided by the LPCV challenge.
- Training Set has 131 unique images
- Validation Set has 37 unique images
- Test Set has 19 unique images
There are 10 classes as follows:
- Person 1 (Brown shirt)
- Person 2 (Dark blue shirt)
- Person 3 (Light blue shirt)
- Person 4 (Pink shirt)
- Person 5 (White shirt)
- Ball 1 (Blue)
- Ball 2 (Orange)
- Ball 3 (Purple)
- Ball 4 (Red)
- Ball 5 (Yellow)
- Data_V1: First instance of the dataset, not completely labeled
- Data_V2: Completely labeled with labels such as (ball 1, person 1, etc)
- Data_V3: Completely labeled with labels such as (ball_1, person_1, etc)
- Data_V4: Completely labeled with only ball and person
- Data_V5: Same as 4 but less augmentation
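The label styles of the different versions are related by simple string transforms. The sketch below shows one possible mapping; the helper names `to_v3_label` and `to_v4_label` are hypothetical, and lowercasing is an assumption based on the example labels given for each version.

```python
def to_v3_label(label: str) -> str:
    """Convert a Data_V2-style label ("ball 1") to Data_V3 style ("ball_1")."""
    return label.strip().lower().replace(" ", "_")


def to_v4_label(label: str) -> str:
    """Collapse an instance label ("ball_1" or "ball 1") to its class ("ball")."""
    return to_v3_label(label).rsplit("_", 1)[0]
```

Collapsing to two classes, as in Data_V4, turns the task from identifying specific people and balls into plain person and ball detection.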