Browse » Object Detection

Object Detection

Open source object detection computer vision datasets, pre-trained models, and APIs.

Mountain Dew Commercial

Overview

Mountain Dew is running a $1,000,000 counting contest. Computer Vision can help you win.


Watch our video explaining how to use this dataset.

Mountain Dew

During Super Bowl LV, Mountain Dew sponsored an ad that challenges viewers to count all unique occurrences of Mountain Dew bottles. You can watch the full ad here. The first person to tweet the exact correct count at Mountain Dew is eligible to win $1 million (see rules here).

Counting things is a perfect task for computer vision.

We uploaded the Mountain Dew video to Roboflow, extracted three frames per second of the commercial (91 images from ~30 seconds of footage), and annotated every bottle we could see. This dataset is the result.

We trained a model to recognize the Mountain Dew bottles, and then ran the original commercial back through this model. This helps identify Mountain Dew bottles that the human eye may have missed when completing counts.
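
Sampling a video at a fixed rate like this reduces to picking evenly spaced frame indices. A minimal sketch (pure arithmetic; the function name is ours, not Roboflow's):

```python
def sample_indices(total_frames: int, src_fps: float, target_fps: float) -> list[int]:
    """Indices of frames to keep when downsampling a video to target_fps."""
    step = src_fps / target_fps  # e.g. 30 fps -> 3 fps keeps every 10th frame
    return [round(i * step) for i in range(int(total_frames * target_fps / src_fps))]

# A ~30 s commercial at 30 fps sampled at 3 images per second yields 91 frames:
print(len(sample_indices(total_frames=910, src_fps=30, target_fps=3)))  # → 91
```

The selected indices can then be fed to any frame grabber (e.g. OpenCV's `VideoCapture`) to write out the images.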

Image example

Getting Started

Click "Fork" in the upper right hand corner or download the raw annotations in your desired format.

Note that while the images are property of PepsiCo, we are using them here as fair-use for educational purposes and have released the annotations under a Creative Commons license.

About Roboflow

Roboflow enables teams to use computer vision.
Our end-to-end platform enables developers to collect, organize, annotate, train, deploy, and improve their computer vision models -- all without needing to hire a new ML engineering team.

Roboflow Wordmark

American Sign Language Letters

Overview

The American Sign Language Letters dataset is an object detection dataset of each ASL letter with a bounding box. David Lee, a data scientist focused on accessibility, curated and released the dataset for public use.

Example Image

Use Cases

One could build a model that reads letters in sign language. For example, Roboflow user David Lee wrote about how he made the model demonstrated above in this blog post.

Using this Dataset

Use the fork button to copy this dataset to your own Roboflow account and export it with new preprocessing settings, or additional augmentations to make your model generalize better.

About Roboflow

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

Developers build computer vision models faster and more accurately with Roboflow.

Roboflow Wordmark

Digits

Project Overview:

The original goal was to use this model to monitor my rowing workouts and learn more about computer vision. To monitor the workouts, I needed the ability to identify the individual digits on the rowing machine. With the help of Roboflow's computer vision tools, such as assisted labeling, I was able to more quickly prepare, test, deploy and improve my YOLOv5 model.
Example Annotated Image from the Dataset

Inference on a Test Image using the rfWidget

Roboflow's Upload API, which supports uploading images, video, and annotations, worked great with a custom app I developed to modify the predictions from the deployed model and export them in a format that could be uploaded to my workspace on Roboflow.
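
For reference, an upload call boils down to a POST against a per-project endpoint. The sketch below only builds the URL; the endpoint shape and parameter names are assumptions based on Roboflow's hosted Upload API docs, so check the current docs before relying on them:

```python
import urllib.parse

def upload_url(project: str, api_key: str, split: str = "train") -> str:
    """Build a Roboflow Upload API endpoint for posting one image.
    NOTE: the endpoint shape is an assumption; consult the current docs."""
    base = f"https://api.roboflow.com/dataset/{project}/upload"
    query = urllib.parse.urlencode({"api_key": api_key, "split": split})
    return f"{base}?{query}"

print(upload_url("digits", "YOUR_KEY"))
```

The image itself would be sent as multipart form data (or a base64 body) to this URL.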

What took me weeks to develop can now be done with a single click using Roboflow Train, plus the Upload API for active learning (dataset and model improvement).
Training Results - Roboflow FAST Model

Dataset Classes:

  • 1, 2, 3, 4, 5, 6, 7, 8, 9, 90 (class "90" is a stand-in for the digit zero)

This dataset consists of 841 images. It includes images from a different rowing machine and also from this repo. Some scenes are illuminated with sunlight; others have been cropped to include only the LCD. Digits like 7, 8, and 9 are underrepresented.
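
Once per-digit boxes come back from the model, turning them into a meter reading is just a left-to-right sort plus the "90"-means-zero remap described above. A minimal sketch (the detection tuple format is hypothetical):

```python
def read_number(detections):
    """Assemble a multi-digit reading from per-digit detections.

    Each detection is (class_name, x_center); class "90" stands in for zero.
    """
    digits = sorted(detections, key=lambda d: d[1])  # sort boxes left-to-right
    return "".join("0" if name == "90" else name for name, _ in digits)

# e.g. three boxes detected on the monitor, left to right: 2, 90 (zero), 5
print(read_number([("90", 120.0), ("2", 40.0), ("5", 200.0)]))  # prints 205
```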

For more information:

Weeds

Overview

The Weeds dataset is a collection of garden weeds that can easily confuse object detection models due to the similarity of the weeds to their surroundings. This dataset was used with YOLOR for object detection to detect weeds in complex backgrounds.

Example Footage!

Weeds Detection

Training and Deployment

The weeds model has been trained in Roboflow, available for inference on the Dataset tab.

One could also build a Weeds Detector using YOLOR. This is achieved using the Roboflow platform, through which you can deploy the model for robust, real-time detections. You can learn more here: https://augmentedstartups.info/YOLOR-Get-Started

About Augmented Startups

We are at the forefront of Artificial Intelligence in computer vision. With over 92k subscribers on YouTube, we embark on fun and innovative projects in this field and create videos and courses so that everyone can be an expert. Our vision is to create a world full of inventors who can turn their dreams into reality.

EgoHands Public

EgoHands Dataset

About this dataset

The EgoHands dataset is a collection of 4800 annotated images of human hands from a first-person view originally collected and labeled by Sven Bambach, Stefan Lee, David Crandall, and Chen Yu of Indiana University.

The dataset was captured via frames extracted from video recorded through head-mounted cameras on a Google Glass headset while performing four activities: building a puzzle, playing chess, playing Jenga, and playing cards. There are 100 labeled frames for each of 48 video clips.

Our modifications

The original EgoHands dataset was labeled with polygons for segmentation and released in a Matlab binary format. We converted it to an object detection dataset using a modified version of this script from @molyswu and have archived it in many popular formats for use with your computer vision models.

After converting to bounding boxes for object detection, we noticed that there were several dozen unlabeled hands. We added these by hand and improved several hundred of the other labels that did not fully encompass the hands (usually to include omitted fingertips, knuckles, or thumbs). In total, 344 images' annotations were edited manually.

We chose a new random train/test split of 80% training, 10% validation, and 10% testing. Notably, this is not the same split as in the original EgoHands paper.

There are two versions of the converted dataset available:

  • specific is labeled with four classes: myleft, myright, yourleft, yourright representing which hand of which person (the viewer or the opponent across the table) is contained in the bounding box.
  • generic contains the same boxes but with a single hand class.
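
The generic version can be derived from the specific one by collapsing the four classes into one. A minimal sketch (the annotation format is hypothetical, not the dataset's on-disk format):

```python
# Collapse the four "specific" classes into the single "generic" hand class.
SPECIFIC_TO_GENERIC = {"myleft": "hand", "myright": "hand",
                       "yourleft": "hand", "yourright": "hand"}

def to_generic(annotations):
    """Rewrite class labels on (class_name, bbox) annotations; boxes unchanged."""
    return [(SPECIFIC_TO_GENERIC.get(cls, cls), box) for cls, box in annotations]

print(to_generic([("myleft", (10, 20, 50, 60)), ("yourright", (80, 30, 120, 70))]))
```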

Using this dataset

The authors have graciously allowed Roboflow to re-host this derivative dataset. It is released under a Creative Commons by Attribution 4.0 license. You may use it for academic or commercial purposes but must cite the original paper.

Please use the following BibTeX:

@inproceedings{egohands2015iccv,
  title = {Lending A Hand: Detecting Hands and Recognizing Activities in Complex Egocentric Interactions},
  author = {Sven Bambach and Stefan Lee and David Crandall and Chen Yu},
  booktitle = {IEEE International Conference on Computer Vision (ICCV)},
  year = {2015}
}

Brackish Underwater

Example image from the dataset

Dataset Information

This dataset contains 14,674 images (12,444 of which contain objects of interest with bounding box annotations) of fish, crabs, and other marine animals. It was collected with a camera mounted 9 meters below the surface on the Limfjords bridge in northern Denmark by Aalborg University.

Composition

Roboflow has extracted and processed the frames from the source videos and converted the annotations for use with many popular computer vision models. We have maintained the same 80/10/10 train/valid/test split as the original dataset.

The class balance in the annotations is as follows:
Class Balance

Most of the identified objects are congregated towards the bottom of the frames.

Annotation Heatmap

More Information

For more information, see the Detection of Marine Animals in a New Underwater Dataset with Varying Visibility paper.

If you find the dataset useful, the authors request that you please cite their paper:

@InProceedings{pedersen2019brackish,
  title = {Detection of Marine Animals in a New Underwater Dataset with Varying Visibility},
  author = {Pedersen, Malte and Haurum, Joakim Bruslund and Gade, Rikke and Moeslund, Thomas B. and Madsen, Niels},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
  month = {June},
  year = {2019}
}

Blood Cell Detection

Overview

This is a dataset of blood cells photos, originally open sourced by cosmicad and akshaylambda.

There are 364 images across three classes: WBC (white blood cells), RBC (red blood cells), and Platelets. There are 4888 labels across 3 classes (and 0 null examples).

Here's a class count from Roboflow's Dataset Health Check:

BCCD health

And here's an example image:

Blood Cell Example

Fork this dataset (upper right hand corner) to receive the raw images, or (to save space) grab the 500x500 export.

Use Cases

This is a small scale object detection dataset, commonly used to assess model performance. It's a first example of medical imaging capabilities.

Using this Dataset

We're releasing the data as public domain. Feel free to use it for any purpose.

It's not required to provide attribution, but it'd be nice! :)

About Roboflow

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

Developers reduce 50% of their boilerplate code when using Roboflow's workflow, automate annotation quality assurance, save training time, and increase model reproducibility.

Roboflow Wordmark

GTSDB German Traffic Sign Detection Benchmark

This project was created by downloading the GTSDB German Traffic Sign Detection Benchmark dataset from Kaggle and importing the annotated training set files (images and annotation files) to Roboflow.

https://www.kaggle.com/datasets/safabouguezzi/german-traffic-sign-detection-benchmark-gtsdb

The annotation files were adjusted to conform to the YOLO Keras TXT format prior to upload, as the original format did not include a label map file.

v1 contains the original imported images, without augmentations. This is the version to download and import to your own project if you'd like to add your own augmentations.

v2 contains an augmented version of the dataset, with annotations. This version of the project was trained with Roboflow's "FAST" model.

v3 contains an augmented version of the dataset, with annotations. This version of the project was trained with Roboflow's "ACCURATE" model.

Draughts Board

This dataset was created by Harry Field and contains the labelled images for capturing the game state of a draughts/checkers 8x8 board.

This was a fun project to develop a mobile draughts application enabling users to interact with draughts-based software via their mobile device's camera.

The data captured consists of:

  • White Pieces
  • White Kings
  • Black Pieces
  • Black Kings
  • Bottom left corner square
  • Top left corner square
  • Top right corner square
  • Bottom right corner square

Corner squares are captured so the board locations of the detected pieces can be estimated.

Results of Yolov5 model after training with this dataset

From this data, the locations of other squares can be estimated and game state can be captured. The image below shows the data of a different board configuration being captured. Blue circles refer to squares, numbers refer to square index and the coloured circles refer to pieces.
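
Estimating square positions from the four detected corner squares can be done with bilinear interpolation. A minimal sketch (coordinate conventions are ours, not the author's):

```python
def square_centers(bl, tl, tr, br):
    """Estimate centers of all 64 squares of an 8x8 board by bilinear
    interpolation between the four detected corner-square centers (x, y)."""
    centers = {}
    for row in range(8):
        for col in range(8):
            u, v = col / 7.0, row / 7.0  # fractional position across the board
            top = (tl[0] + u * (tr[0] - tl[0]), tl[1] + u * (tr[1] - tl[1]))
            bot = (bl[0] + u * (br[0] - bl[0]), bl[1] + u * (br[1] - bl[1]))
            centers[(row, col)] = (bot[0] + v * (top[0] - bot[0]),
                                   bot[1] + v * (top[1] - bot[1]))
    return centers

# Axis-aligned example board, 700 px across:
c = square_centers(bl=(0, 700), tl=(0, 0), tr=(700, 0), br=(700, 700))
print(c[(0, 0)], c[(7, 7)])
```

Each detected piece can then be snapped to the nearest estimated center to recover its board location.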

Once game state is captured, integration with other software becomes possible. In this example, I created a simple move suggestion mobile application, seen working here.

The developed application is a proof of concept and is not available to the public. Further development is required to train the model across multiple draughts boards and to implement features that add value to the physical draughts game.

The dataset consists of 759 images and was trained using YOLOv5 with a 70/20/10 split.

The output of YOLOv5 was parsed and filtered to correct for duplicated/overlapping detections before game state could be determined.
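
Such duplicate filtering is typically a greedy non-maximum suppression pass over IoU overlaps. A minimal sketch (not the author's actual post-processing code):

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1]) +
             (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def dedupe(detections, thresh=0.5):
    """Keep only the highest-confidence box from each overlapping cluster."""
    kept = []
    for conf, box in sorted(detections, reverse=True):  # best confidence first
        if all(iou(box, k) < thresh for _, k in kept):
            kept.append((conf, box))
    return kept

dets = [(0.9, (10, 10, 50, 50)), (0.6, (12, 12, 52, 52)), (0.8, (100, 100, 140, 140))]
print(dedupe(dets))  # the 0.6 box overlaps the 0.9 box and is dropped
```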

I hope you find this dataset useful and if you have any questions feel free to drop me a message on LinkedIn as per the link above.

Aquarium Combined

CreateML Output

Dataset Details

This dataset consists of 638 images collected by Roboflow from two aquariums in the United States: The Henry Doorly Zoo in Omaha (October 16, 2020) and the National Aquarium in Baltimore (November 14, 2020). The images were labeled for object detection by the Roboflow team (with some help from SageMaker Ground Truth). Images and annotations are released under a Creative Commons By-Attribution license. You are free to use them for any purposes personal, commercial, or academic provided you give acknowledgement of their source.

Projects Using this Dataset:

No-Code Object Detection Tutorial

Class Breakdown

The following classes are labeled: fish, jellyfish, penguins, sharks, puffins, stingrays, and starfish. Most images contain multiple bounding boxes.

Class Balance

Usage

The dataset is provided in many popular formats for easily training machine learning models. We have trained a model with CreateML (see gif above).

This dataset could be used for coral reef conservation, environmental health monitoring, swimmer safety, pet analytics, automated feeding, and much more. We're excited to see what you build!

Excavators

This project aims to create an efficient computer vision model to detect different kinds of construction equipment on construction sites. We are starting with three classes: excavators, trucks, and wheel loaders.

The dataset is provided by Mohamed Sabek, a Spring 2022 Master of Science graduate from Arizona State University in Construction Management and Technology.

The raw images (v1) contain:

  1. 1,532 annotated examples of "excavators"
  2. 1,269 annotated examples of "dump truck"
  3. 1,080 annotated examples of "wheel loader"

Note: versions 2 and 3 (v2 and v3) contain the raw images resized at 416 by 416 (stretch to) and 640 by 640 (stretch to) without any augmentations.

Apple Sorting

This project was created by Arfiani Nur Sayidah and is for sorting "apples" from "damaged apples."

The classes are "apple" and "damaged_apples"
Original Class Balance:

  1. apple: 2,152
  2. damaged_apple: 708

Chess Pieces

Overview

This is a dataset of Chess board photos and various pieces. All photos were captured from a constant angle, a tripod to the left of the board. The bounding boxes of all pieces are annotated as follows: white-king, white-queen, white-bishop, white-knight, white-rook, white-pawn, black-king, black-queen, black-bishop, black-knight, black-rook, black-pawn. There are 2894 labels across 292 images.

Chess Example

Follow this tutorial to see an example of training an object detection model using this dataset or jump straight to the Colab notebook.

Use Cases

At Roboflow, we built a chess piece object detection model using this dataset.

ChessBoss

You can see a video demo of that here. (We did struggle with pieces that were occluded, i.e. the state of the board at the very beginning of a game has many pieces obscured - let us know how your results fare!)

Using this Dataset

We're releasing the data under a public license.

About Roboflow

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

Developers reduce 50% of their boilerplate code when using Roboflow's workflow, save training time, and increase model reproducibility.

Roboflow Wordmark

Kernels counter

  • Public
  • Soybeans-kernels Dataset
  • 840 images

Soybeans kernels counter

Table Extraction PDF

The dataset comes from Devashish Prasad, Ayan Gadpal, Kshitij Kapadni, Manish Visave, and Kavita Sultanpure - creators of CascadeTabNet.

Depending on the dataset version downloaded, the images will include annotations for 'borderless' tables, 'bordered' tables, and 'cells'. Borderless tables are those in which the cells do not have borders. Bordered tables are those in which every cell in the table has a border, and the table is bordered. Cells are the individual data points within the table.

A subset of the full dataset, the ICDAR Table Cells Dataset, was extracted and imported to Roboflow to create this hosted version of the Cascade TabNet project. All the additional dataset components used in the full project are available here: All Files.

Versions:

  1. Version 1, raw-images : 342 raw images of tables. No augmentations; a preprocessing step of auto-orient was all that was added.
  2. Version 2, tableBordersOnly-rawImages : 342 raw images of tables. This dataset version contains the same images as version 1, but with Modify Classes applied to omit the 'cell' class from all images (rendering these images apt for creating a model to detect 'borderless' and 'bordered' tables).

For the versions below: a preprocessing step of Resize (416x416, Fit within, white edges) was added, along with more augmentations to increase the size of the training set and to make our images more uniform. Preprocessing applies to all images, whereas augmentations apply only to training set images.

  3. Version 3, augmented-FAST-model : 818 images of tables. Trained from scratch (no transfer learning) with the "Fast" model from Roboflow Train. 3x augmentation (generated images).
  4. Version 4, augmented-ACCURATE-model : 818 images of tables. Trained from scratch with the "Accurate" model from Roboflow Train. 3x augmentation.
  5. Version 5, tableBordersOnly-augmented-FAST-model : 818 images of tables. 'Cell' class omitted with Modify Classes. Trained from scratch with the "Fast" model from Roboflow Train. 3x augmentation.
  6. Version 6, tableBordersOnly-augmented-ACCURATE-model : 818 images of tables. 'Cell' class omitted with Modify Classes. Trained from scratch with the "Accurate" model from Roboflow Train. 3x augmentation.

Example Image from the Dataset

Cascade TabNet in Action

CascadeTabNet is an automatic table recognition method for interpretation of tabular data in document images. We present an improved deep learning-based end to end approach for solving both problems of table detection and structure recognition using a single Convolution Neural Network (CNN) model. CascadeTabNet is a Cascade mask Region-based CNN High-Resolution Network (Cascade mask R-CNN HRNet) based model that detects the regions of tables and recognizes the structural body cells from the detected tables at the same time. We evaluate our results on ICDAR 2013, ICDAR 2019 and TableBank public datasets. We achieved 3rd rank in ICDAR 2019 post-competition results for table detection while attaining the best accuracy results for the ICDAR 2013 and TableBank dataset. We also attain the highest accuracy results on the ICDAR 2019 table structure recognition dataset.

From the Original Authors:

If you find this work useful for your research, please cite our paper:
@misc{cascadetabnet2020,
  title = {CascadeTabNet: An approach for end to end table detection and structure recognition from image-based documents},
  author = {Devashish Prasad and Ayan Gadpal and Kshitij Kapadni and Manish Visave and Kavita Sultanpure},
  year = {2020},
  eprint = {2004.12629},
  archivePrefix = {arXiv},
  primaryClass = {cs.CV}
}

Cottontail-Rabbits

Streamlit-Logos

About Streamlit

Streamlit is the fastest way to build and share data apps natively in Python.

About This Dataset

This dataset contains images of Streamlit logos that were contributed by the Streamlit team and Creators 🎈.

Albeit small, this dataset trains a model that does a great job of detecting Streamlit logos in real life.

Streamlit Logos

To take training even further, this dataset was mixed with the Roboflow logos dataset, which includes null examples of other logos, preventing the model from over-predicting 🚀

Projects Using This Dataset

This dataset is a partner project between Roboflow and Streamlit to show how to automatically detect Streamlit logos in a browser application. The application uses Streamlit's python based front end and makes post requests to this model's Roboflow Inference API.

The full open sourced code for this project will be released at the upcoming Streamlit + Roboflow webinar! Stay tuned!

Bike Helmet Detection

Background Information

This dataset was curated and annotated by Syed Salman Reza. It is a custom dataset composed of two classes (With Helmet, Without Helmet). The main objective is to identify whether a biker is wearing a helmet or not.

The original custom dataset (v1) is composed of 1,371 images of people with and without bike helmets.

The dataset is available under the Public License.

Example of an Annotated Image from the Dataset

Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

Dataset Versions

Version 1 (v1) - 1,371 images

  • Preprocessing: Auto-Orient and Resize (Stretch to 416x416)
  • Augmentations: Augmentations applied prior to import - Bounding Box Blur (up to 10px)
  • Training Metrics: Trained from the COCO Checkpoint in Public Models ("transfer learning") on Roboflow
    • mAP = 74.4%, precision = 54.0%, recall = 77.0%

Version 2 (v2) - 3,735 images

  • Preprocessing: Auto-Orient and Resize (Stretch to 416x416)
  • Augmentations: Augmentations applied prior to import - Bounding Box Blur.
    • New augmentations:
      Outputs per training example: 3
      Rotation: Between -30° and +30°
      Shear: ±15° Horizontal, ±15° Vertical
      Blur: Up to 1.5px
      Mosaic: Applied
  • Training Metrics: Trained from the COCO Checkpoint in Public Models ("transfer learning") on Roboflow
    • mAP = 91.5%, precision = 65.1%, recall = 92.8%

Syed Salman Reza - Github

vectorCompleteDataset

Background

The Anki Vector robot (assets currently owned by Digital Dream Labs LLC, which bought Anki's assets in 2019) was first introduced in 2018. In my opinion, the Vector robot has been the cheapest fully functional autonomous robot ever built. The Vector robot can be trained to recognize people; however, Vector does not have the ability to recognize another Vector. This dataset has been designed to allow one to train a model which can detect a Vector robot in the camera feed of another Vector robot.

Details
Pictures were taken with Vector’s camera while another Vector faced it and was free to move, which allowed pictures to be captured from different angles. These pictures were then labeled by marking rectangular regions around Vector in all the images with the help of labelImg, a free Linux utility. Different backgrounds and lighting conditions were used to take the pictures. There is also a collection of pictures without Vector.

Example
An example use case is available in my Google Colab notebook, a version of which can be found in my Git.

More
More details are available in this article on my blog.
If you are new to Computer Vision, Deep Learning, or AI, you can consider my course, 'Learn AI with a Robot', which attempts to teach AI based on the AI4K12.org curriculum. There are more details available in this post.

Face Detection

Background Information

This dataset was curated and annotated by Mohamed Traore and Justin Brady after forking the raw images from the Roboflow Universe Mask Wearing dataset and remapping the mask and no-mask classes to face.

Example Image from the Dataset

The main objective is to identify human faces in images or video. However, this model could also be used for privacy purposes by changing the output of the bounding boxes to blur the detected face or fill it with a black box.
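
Filling a detected box with black is a few lines once you have pixel access. A toy sketch on a nested-list "image" (a real pipeline would use OpenCV/NumPy arrays instead):

```python
def black_out(image, boxes):
    """Fill each detected face box (x1, y1, x2, y2) with black pixels.
    `image` is a row-major list of rows of grayscale pixel values."""
    for x1, y1, x2, y2 in boxes:
        for y in range(y1, y2):
            for x in range(x1, x2):
                image[y][x] = 0
    return image

img = [[255] * 6 for _ in range(4)]   # a tiny all-white "image"
black_out(img, [(1, 1, 4, 3)])        # one detected face box
print(img[1])  # → [255, 0, 0, 0, 255, 255]
```

For blurring instead of blacking out, the same box slice would be passed through a blur filter (e.g. `cv2.GaussianBlur`) and written back.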

The original custom dataset (v1) is composed of 867 unaugmented (raw) images of people in various environments. 55 of the images are marked as Null to help with feature extraction and reducing false detections.

Version 2 (v2) includes the augmented and trained version of the model. This version is trained from the COCO model checkpoint to take advantage of transfer learning and improve initial model training results.

Model Updates:

After a few trainings, and after running tests with Roboflow's webcam model and Roboflow's video inference repo, it was clear that edge cases, like hands sometimes being recognized as faces, were an issue. I grabbed 96 images from Alex Wong's Hand Signs dataset and added them to the project. I uploaded the images without the annotation files, labeled all the faces, and retrained the model (version 5).

The dataset is available under the CC BY 4.0 license.

Stanford_Car

  • Openglpro
  • Labeled-all-the-cars Dataset
  • 12654 images

This dataset is a copy of a subset of the full Stanford Cars dataset.

The original dataset contained 16,185 images of 196 classes of cars.

The classes are typically at the level of Make, Model, Year, e.g. 2012 Tesla Model S or 2012 BMW M3 coupe in the original dataset, and in this subset of the full dataset (v3, TestData and v4, original_raw-images).

v4 (original_raw-images) contains a generated version of the original, raw images, without any modified classes

v8 (classes-Modified_raw-images) contains a generated version of the raw images, with the Modify Classes preprocessing feature used to remap or omit the following classes:

  1. bike, moped --remapped to--> motorbike
  2. cng, leguna, easybike, smart fortwo Convertible 2012, and all other specific car makes with named classes (such as Acura TL Type-S 2008) --remapped to--> vehicle
  3. rickshaw, boat, bicycle --> omitted

v9 (FAST-model_mergedAllClasses-augmented_by3x) and v10 (ACCURATE-model_mergedAllClasses-augmented_by3x) contain generated versions of the raw images, with the same Modify Classes remapping and omissions as v8 applied.

Citation:

3D Object Representations for Fine-Grained Categorization
Jonathan Krause, Michael Stark, Jia Deng, Li Fei-Fei
4th IEEE Workshop on 3D Representation and Recognition, at ICCV 2013 (3dRR-13). Sydney, Australia. Dec. 8, 2013.

Cash Counter

This project combines the Dollar Bill Detection project from Alex Hyams (v13 of that project was exported in COCO JSON format for import to this project) and the Final Counter, or Coin Counter, project from Dawson Mcgee (v6 of that project was exported in COCO JSON format for import to this project).

v1 contains the original imported images, without augmentations. This is the version to download and import to your own project if you'd like to add your own augmentations.

Chicken Detection and Tracking

Background Information

This dataset was curated and annotated by Mohamed Traore from the Roboflow Team. A custom dataset composed of one class (chicken). The main objective is to identify chicken(s) and perform object-tracking on chicken(s) using Roboflow's "zero shot object tracking."

The original video is from Wendy Thomas (Description: "Definitive proof that the chicken crossed the road to get to the other side.")

The original custom dataset (v1) is composed of 106 images of chickens and their surrounding environment.

The dataset is available under the Public License.

Zero Shot Object Tracking

Example - Zero Shot Object Tracking

Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

Dataset Versions

Version 1 (v1) - 106 images

  • Preprocessing: Auto-Orient
  • Augmentations: No augmentations applied
  • Training Metrics: This version of the dataset was not trained

Version 2 (v2) - 106 images

  • Preprocessing: Auto-Orient and Resize (Stretch to 416x416)
  • Augmentations: No augmentations applied
  • Training Metrics: This version of the dataset was not trained

Version 3 (v3), "v1-augmented-COCO-transferLearning" - 254 images

Trained from the COCO Checkpoint in Public Models ("transfer learning") on Roboflow

  • 3x image generation

Version 11 (v11), "v1-augmented-trainFromScratch" - 463 images

Trained from the Version 3 training checkpoint.

  • Modify Classes was applied to remap the "chickens" class to "rooster" (meaning "rooster" will show up for the bounding boxes when running inference).
  • 3x image generation

Version 12 (v12) - 185 images

  • Preprocessing: Auto-Orient, Modify Classes (remap the "chickens" class to "rooster")
  • Augmentations: No augmentations applied
  • Training Metrics: This version of the dataset was not trained

Mohamed Traore - LinkedIn

Boxpunch Detector

Onboarding project for Roboflow

This project captures punch types thrown during boxing training.

Apple Vision

The Apple Vision annotated dataset contains over 350 images of naturally growing apples on an apple tree. Unlike other existing sets, this set attempts to capture apples growing on trees under different exposures of natural light during the daytime.

The training data comprised 77 photos taken of Peter Bloch’s home apple tree. These images were shot between July and September of 2021 on an iPhone 11 camera. After the photos were taken, they were sliced into multiple smaller images with a resolution of 360 × 640 pixels each. This number was selected as the lowest natural resolution for a CV camera later used in this project.
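
Slicing a large photo into fixed-size tiles comes down to computing non-overlapping crop boxes. A minimal sketch (our reconstruction, not the author's script; an iPhone 11 photo is 3024x4032):

```python
def tile_boxes(width, height, tile_w=360, tile_h=640):
    """Crop boxes (left, upper, right, lower) that slice an image into
    non-overlapping tile_w x tile_h tiles; edge remainders are discarded."""
    return [(x, y, x + tile_w, y + tile_h)
            for y in range(0, height - tile_h + 1, tile_h)
            for x in range(0, width - tile_w + 1, tile_w)]

# A 3024x4032 photo yields an 8x6 grid of 360x640 tiles:
print(len(tile_boxes(3024, 4032)))  # → 48
```

Each box can be handed to an image library's crop call (e.g. Pillow's `Image.crop`) to write the tiles out.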

This set was originally created for the ECE 31 Capstone project at Oregon State University.

Dollar Bill Detection

v12 contains the original, raw images, with annotations. It includes the following classes:

  • one-front, one-back, five-front, five-back, ten-front, ten-back, twenty-front, twenty-back, fifty-front, fifty-back

v13 contains the original, raw images, with annotations and Modified Classes. It includes the following classes:

  • one, five, ten, twenty, fifty

The Dreidel Project

  • Salo Levy
  • Hebrew-Letters Dataset
  • 575 images

When learning to play Dreidel, I would sometimes forget what the names of each character are and what action they correspond to in the game. I thought it’d be fun to create a computer vision model that could understand what each symbol on a Dreidel is, making it easier to learn to play the game.

This model tracks the dreidel as it spins and detects the letters that are on the four sided dreidel.

How to Play Dreidel

Rules:

  1. The players are dealt gelt (chocolate wrapped in gold paper, made to look like coins).
  2. Each player takes a turn spinning the Dreidel.
  3. The Dreidel has four sides, each prompting an action for the spinner to take:
    • If נ‎ (nun) is facing up, the player does nothing.
    • If ג‎ (gimel) is facing up, the player gets everything in the pot.
    • If ה‎ (hay) is facing up, the player gets half of the pieces in the pool.
    • If ש‎ (shin) is facing up, the player adds one of their gelt to the pot.
  4. The winner, of course, gets to eat all the gelt.

Hopefully, this model can serve as the basis for an application that teaches someone how to play dreidel.

Hard Hat Universe

Overview

The Hard Hat dataset is an object detection dataset of workers in workplace settings that require a hard hat. Annotations also include examples of just "person" and "head," for when an individual may be present without a hard hat.

Example Image:
Example Image

Use Cases

One could use this dataset to, for example, build a classifier of workers who are abiding by safety code within a workplace versus those who may not be. It is also a good general dataset for practice.

Using this Dataset

Use the fork button to copy this dataset to your own Roboflow account and export it with new preprocessing settings (perhaps resized for your model's desired format or converted to grayscale), or additional augmentations to make your model generalize better. This particular dataset would be very well suited for Roboflow's new advanced Bounding Box Only Augmentations.

About Roboflow

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

Developers reduce 50% of their code when using Roboflow's workflow, automate annotation quality assurance, save training time, and increase model reproducibility.

Roboflow Wordmark

Rapid React Balls

This dataset was prepared for First Robotics by the 2914 Robotics Team of Wilson High School.

The dataset contains labeled blue and red balls. The original dataset contains 987 blue annotated examples, and 731 red annotated examples of the balls.

The raw images are contained in version 10 (raw-images).

Honey_bees_dataset

Background Information

This dataset was curated and annotated by Ahmed Elmogtaba Abdelaziz.

The original dataset (v6) is composed of 204 images of honeybees present in a wide variety of scenes.
Example of an Annotated Image from the Dataset

The dataset is available under a Public License.

Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

Dataset Versions

Version 5 - 490 images

  • Preprocessing: Resize, 416 by 416
  • Augmentations:
    • 90° Rotate: Clockwise, Counter-Clockwise
    • Rotation: Between -15° and +15°
    • Saturation: Between -10% and +10%
    • Brightness: Between -10% and +10%
    • Blur: Up to 0.25px
    • Mosaic: Applied
  • Output: 3x image generation

Car_Dent_Scratch_Detection(1)

  • Sindhu
  • Damage-Detection Dataset
  • 3072 images

Packages

About This Dataset

The Roboflow Packages dataset is a collection of packages located at the doors of various apartments and homes. Packages are flat envelopes, small boxes, and large boxes. Some images contain multiple annotated packages.

Usage

This dataset may be used as a good starter dataset to track and identify when a package has been delivered to a home. Perhaps you want to know when a package arrives to claim it quickly or prevent package theft.

If you plan to use this dataset and adapt it to your own front door, it is recommended that you capture and add images from the context of your specific camera position. You can easily add images to this dataset via the web UI or via the Roboflow Upload API.

About Roboflow

Roboflow enables teams to build better computer vision models faster. We provide tools for image collection, organization, labeling, preprocessing, augmentation, training and deployment.
:fa-spacer:
Developers reduce boilerplate code when using Roboflow's workflow, save training time, and increase model reproducibility.
:fa-spacer:

Roboflow Wordmark

Aerial Maritime

Overview

Drone Example

This dataset contains 74 images of aerial maritime photographs taken via a Mavic Air 2 drone, with 1,151 bounding boxes across five classes: docks, boats, lifts, jetskis, and cars. It is a multi-class aerial and maritime object detection dataset.

The drone was flown at 400 ft. No drones were harmed in the making of this dataset.

This dataset was collected and annotated by the Roboflow team, released with MIT license.

Image example

Use Cases

  • Identify the number of boats on the water over a lake via quadcopter.
  • Boat object detection dataset
  • Aerial object detection proof of concept
  • Identify if boat lifts have been taken out via a drone
  • Identify cars with a UAV drone
  • Find which lakes are inhabited, and to which degree.
  • Identify if visitors are visiting the lake house via quadcopter.
  • Proof of concept for a UAV imagery project
  • Proof of concept for a maritime project
  • Etc.

This dataset is a great starter dataset for building an aerial object detection model with your drone.

Getting Started

Fork or download this dataset and follow our tutorial, How to Train a State of the Art Object Detector with YOLOv4, for more. Stay tuned for tutorials on how to teach your UAV drone to see, and for comparable airplane imagery and footage.

Annotation Guide

See here for how to use the CVAT annotation tool that was used to create this dataset.

About Roboflow

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.
:fa-spacer:
Developers reduce 50% of their boilerplate code when using Roboflow's workflow, save training time, and increase model reproducibility.
:fa-spacer:

Roboflow Wordmark

Aerial Docks and Boats

Overview

Drone Example

This dataset contains 74 images of aerial maritime photographs taken via a Mavic Air 2 drone, with 1,151 bounding boxes across five classes: docks, boats, lifts, jetskis, and cars. It is a multi-class aerial and maritime object detection dataset.

The drone was flown at 400 ft. No drones were harmed in the making of this dataset.

This dataset was collected and annotated by the Roboflow team, released with MIT license.

Image example

Use Cases

  • Identify the number of boats on the water over a lake via quadcopter.
  • Boat object detection dataset
  • Aerial object detection proof of concept
  • Identify if boat lifts have been taken out via a drone
  • Identify cars with a UAV drone
  • Find which lakes are inhabited, and to which degree.
  • Identify if visitors are visiting the lake house via quadcopter.
  • Proof of concept for a UAV imagery project
  • Proof of concept for a maritime project
  • Etc.

This dataset is a great starter dataset for building an aerial object detection model with your drone.

Getting Started

Fork or download this dataset and follow our tutorial, How to Train a State of the Art Object Detector with YOLOv4, for more. Stay tuned for tutorials on how to teach your UAV drone to see, and for comparable airplane imagery and footage.

Annotation Guide

See here for how to use the CVAT annotation tool that was used to create this dataset.

About Roboflow

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.
:fa-spacer:
Developers reduce 50% of their boilerplate code when using Roboflow's workflow, save training time, and increase model reproducibility.
:fa-spacer:

Roboflow Wordmark

American Mushrooms

This dataset contains images of two popular North American mushrooms, Chicken of the Woods and Chanterelle, and differentiates between the two species.

This dataset is an example of an object detection task that is possible via custom training with Roboflow.

Two versions are listed. "416x416" is a 416 resolution version that contains the base images in the dataset. "416x416augmented" contains the same images with various image augmentations applied to build a more robust model.

North American Mushrooms

This dataset contains images of two popular North American mushrooms, Chicken of the Woods and Chanterelle, and differentiates between the two species.

This dataset is an example of an object detection task that is possible via custom training with Roboflow.

Two versions are listed. "416x416" is a 416 resolution version that contains the base images in the dataset. "416x416augmented" contains the same images with various image augmentations applied to build a more robust model.

Ocean Dataset

About Scubotics

Scubotics created https://www.namethatfish.com/. We are a startup dedicated to helping people better understand the ocean, one fish at a time.

About this dataset

The Ocean dataset contains images of ocean imagery depicting a few different species of fish.

Example Footage

Models trained on images like this dataset empower fish identification like the following:

Scubotics

Clash of Clans

Background Information

This dataset was curated and annotated by Find This Base. It is a custom dataset composed of 16 classes from the popular mobile game, Clash of Clans.

  • Classes: Canon, WizzTower, Xbow, AD, Mortar, Inferno, Scattershot, AirSweeper, BombTower, ClanCastle, Eagle, KingPad, QueenPad, RcPad, TH13 and WardenPad.

Find This Base

How to Use Find This Base

The original custom dataset (v1) is composed of 125 annotated images.

The dataset is available under the CC BY 4.0 license.

Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

Dataset Versions

Version 1 (v1) - 125 images

  • Preprocessing - Auto-Orient and Resize: Fit (black edges) to 640x640
  • Augmentations - No augmentations applied
  • Training Metrics - Trained from Scratch (no checkpoint used) on Roboflow
    • mAP = 83.1%, precision = 43.0%, recall = 99.1%

Version 4 (v4) - 301 images

  • Preprocessing - Auto-Orient and Resize: Fit (black edges) to 640x640
  • Augmentations - Mosaic
  • Generated Images - Outputs per training example: 3
  • Training Metrics - Trained from Scratch (no checkpoint used) on Roboflow
    • mAP = %, precision = %, recall = %

Find This Base: Official Website | How to Use Find This Base | Discord | Patreon

Thermal Cheetah

Thermal Cheetahs

About this Dataset

This is a collection of images and video frames of cheetahs at the Omaha Henry Doorly Zoo taken in October 2020. The capture device was a SEEK Thermal Compact XR connected to an iPhone 11 Pro. Video frames were sampled and labeled by hand with bounding boxes for object detection using Roboflow.

Using this Dataset

We have provided the dataset for download under a creative commons by-attribution license. You may use this dataset in any project (including for commercial use) but must cite Roboflow as the source.

Example Use Cases

This dataset could be used for conservation of endangered species, cataloging animals with a trail camera, gathering statistics on wildlife behavior, or experimenting with other thermal and infrared imagery.

About Roboflow

Roboflow creates tools that make computer vision easy to use for any developer, even if you're not a machine learning expert. You can use it to organize, label, inspect, convert, and export your image datasets. And even to train and deploy computer vision models with no code required.

futbol players

Background Information

This dataset was curated and annotated by Ilyes Talbi, Head of La revue IA, a French publication focused on stories of machine learning applications.

The main objective is to identify soccer (futbol) players, the referee, and the soccer ball (futbol).

The original custom dataset (v1) is composed of 163 images.

  • Class 0 = players
  • Class 1 = referree
  • Class 2 = soccer ball (or futbol)

The dataset is available under the Public License.

Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

Dataset Versions

Version 7 (v7) - 163 images (raw images)

  • Preprocessing: Auto-Orient, Modify Classes: 3 remapped, 0 dropped
    • Modified Classes: Class 0 = players, Class 1 = referree, Class 2 = futbol
  • Augmentations: No augmentations applied
  • Training Metrics: This version of the dataset was not trained

Version 2 (v2) - 163 images

  • Preprocessing: Auto-Orient and Resize (Stretch to 416x416)
  • Augmentations: No augmentations applied
  • Training Metrics: This version of the dataset was not trained

Version 3 (v3) - 391 images

  • Preprocessing: Auto-Orient and Resize (Stretch to 416x416), Modify Classes: 3 remapped, 0 dropped
    • Modified Classes: Class 0 = players, Class 1 = referree, Class 2 = futbol
  • Augmentations:
    • Outputs per training example: 3
    • Rotation: Between -25° and +25°
    • Shear: ±15° Horizontal, ±15° Vertical
    • Brightness: Between -25% and +25%
    • Blur: Up to 0.75px
    • Noise: Up to 1% of pixels
    • Bounding Box: Blur: Up to 0.5px
  • Training Metrics: 86.4% mAP, 51.8% precision, 90.4% recall

Version 4 (v4) - 391 images

  • Preprocessing: Auto-Orient and Resize (Stretch to 416x416), Modify Classes: 3 remapped, 0 dropped
    • Modified Classes: Class 0 = players, Class 1 = referree, Class 2 = futbol
  • Augmentations:
    • Outputs per training example: 3
    • Rotation: Between -25° and +25°
    • Shear: ±15° Horizontal, ±15° Vertical
    • Brightness: Between -25% and +25%
    • Blur: Up to 0.75px
    • Noise: Up to 1% of pixels
    • Bounding Box: Blur: Up to 0.5px
  • Training Metrics: 84.6% mAP, 52.3% precision, 85.3% recall

Version 5 (v5) - 391 images

  • Preprocessing: Auto-Orient and Resize (Stretch to 416x416), Modify Classes: 3 remapped, 2 dropped
    • Modified Classes: Class 0 = players, Class 1 = referree, Class 2 = futbol
      • Only Class 0, which was remapped to players, was included in this version
  • Augmentations:
    • Outputs per training example: 3
    • Rotation: Between -25° and +25°
    • Shear: ±15° Horizontal, ±15° Vertical
    • Brightness: Between -25% and +25%
    • Blur: Up to 0.75px
    • Noise: Up to 1% of pixels
    • Bounding Box: Blur: Up to 0.5px
  • Training Metrics: Trained from the COCO Checkpoint in Public Models ("transfer learning") on Roboflow
    • 98.8% mAP, 76.3% precision, 99.2% recall

Version 6 (v6) - 391 images

  • Preprocessing: Auto-Orient and Resize (Stretch to 416x416), Modify Classes: 3 remapped, 2 dropped
    • Modified Classes: Class 0 = players, Class 1 = referree, Class 2 = futbol
      • Only Class 0, which was remapped to players, was included in this version
  • Augmentations:
    • Outputs per training example: 3
    • Rotation: Between -25° and +25°
    • Shear: ±15° Horizontal, ±15° Vertical
    • Brightness: Between -25% and +25%
    • Blur: Up to 0.75px
    • Noise: Up to 1% of pixels
    • Bounding Box: Blur: Up to 0.5px
  • Training Metrics: Trained from Scratch (no transfer learning employed)
    • 95.5% mAP, 67.8% precision, 95.5% recall

Ilyes Talbi - LinkedIn | La revue IA

Raccoon

Overview

This dataset contains 196 images of raccoons and 213 bounding boxes (some images have two raccoons). This is a single class problem, and images vary in dimensions. It's a great first dataset for getting started with object detection.

This dataset was originally collected by Dat Tran, released with MIT license, and posted here with his permission.

Raccoon Example

Per Roboflow's Dataset Health Check, here's how images vary in size:

Raccoon Aspect Ratio

Use Cases

Find raccoons!

This dataset is a great starter dataset for building an object detection model. Dat has written a comprehensive tutorial here.

Getting Started

Fork or download this dataset and follow Dat's tutorial for more.

Thermal Dogs and People

About This Dataset

The Roboflow Thermal Dogs and People dataset is a collection of 203 thermal infrared images captured at various distances from people and dogs in a park and near a home. Some images are deliberately unannotated as they do not contain a person or dog (see the Dataset Health Check for more). Images were captured in both portrait and landscape orientations. (Roboflow auto-orient assures the annotations align regardless of the image orientation.)

Thermal images were captured using the Seek Compact XR Extra Range Thermal Imaging Camera for iPhone. The selected color palette is Spectra.

Example

This is an example image and annotation from the dataset:
Man and Dog

Usage

Thermal images have a wide array of applications: monitoring machine performance, seeing in low light conditions, and adding another dimension to standard RGB scenarios. Infrared imaging is useful in security, wildlife detection, and hunting/outdoors recreation.

This dataset serves as a way to experiment with infrared images in Roboflow. (Or, you could build your own night time pet finder!)

Collecting Custom Data

Roboflow is happy to improve your operations with infrared imaging and computer vision. Services range from data collection to building automated monitoring systems leveraging computer vision. Reach out for more.

About Roboflow

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.
:fa-spacer:
Developers reduce 50% of their boilerplate code when using Roboflow's workflow, save training time, and increase model reproducibility.
:fa-spacer:

Roboflow Wordmark

Aerial Airport

  • GDIT
  • planes Dataset
  • 338 images

Overview

The GDIT Aerial Airport dataset consists of aerial images containing instances of parked airplanes. All plane types have been grouped into a single classification named "airplane".

Example Image

Aerial View

Forklift

About this Dataset

This dataset was created by exporting images from images.cv and labeling them as an object detection dataset. The dataset contains 421 raw images (v1 - raw-images) and labeled classes include:

  • forklift
  • person

Example annotated image from the dataset

Pill Detection

Background Information

This dataset was curated and annotated by Mohamed Attia.

The original dataset (v1) is composed of 451 images of various pills that are present on a large variety of surfaces and objects.
Example of an Annotated Image from the Dataset

The dataset is available under the Public License.

Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

Dataset Versions

Version 1 (v1) - 451 images

  • Preprocessing: Auto-Orient and Resize (Stretch to 416x416)
  • Augmentations: No augmentations applied
  • Training Metrics: This version of the dataset was not trained

Version 2 (v2) - 1,083 images

  • Preprocessing: Auto-Orient, Resize (Stretch to 416x416), all classes remapped (Modify Classes) to "pill"
  • Augmentations:
    • 90° Rotate: Clockwise, Counter-Clockwise, Upside Down
    • Crop: 0% Minimum Zoom, 77% Maximum Zoom
    • Rotation: Between -45° and +45°
    • Shear: ±15° Horizontal, ±15° Vertical
    • Hue: Between -22° and +22°
    • Saturation: Between -27% and +27%
    • Brightness: Between -33% and +33%
    • Exposure: Between -25% and +25%
    • Blur: Up to 3px
    • Noise: Up to 5% of pixels
    • Cutout: 3 boxes with 10% size each
    • Mosaic: Applied
    • Bounding Box: Brightness: Between -25% and +25%
  • Training Metrics: Trained from the COCO Checkpoint in Public Models ("transfer learning") on Roboflow
    • mAP = 91.4%, precision = 61.1%, recall = 93.9%

Version 3 (v3) - 1,083 images

  • Preprocessing: Auto-Orient, Resize (Stretch to 416x416), all classes remapped (Modify Classes) to "pill"
  • Augmentations:
    • 90° Rotate: Clockwise, Counter-Clockwise, Upside Down
    • Crop: 0% Minimum Zoom, 77% Maximum Zoom
    • Rotation: Between -45° and +45°
    • Shear: ±15° Horizontal, ±15° Vertical
    • Hue: Between -22° and +22°
    • Saturation: Between -27% and +27%
    • Brightness: Between -33% and +33%
    • Exposure: Between -25% and +25%
    • Blur: Up to 3px
    • Noise: Up to 5% of pixels
    • Cutout: 3 boxes with 10% size each
    • Mosaic: Applied
    • Bounding Box: Brightness: Between -25% and +25%
  • Training Metrics: Trained from "scratch" (no transfer learning employed) on Roboflow
    • mAP = 84.3%, precision = 53.2%, recall = 86.7%

Version 4 (v4) - 451 images

  • Preprocessing: Auto-Orient, Resize (Stretch to 416x416), all classes remapped (Modify Classes) to "pill"
  • Augmentations: No augmentations applied
  • Training Metrics: This version of the dataset was not trained

Version 5 (v5) - 496 images

Mohamed Attia - LinkedIn

aicook

Background Information

This dataset was curated and annotated by - Karel Cornelis.

The original dataset (v1) is composed of 516 images of various ingredients inside a fridge. The project was created as part of a groupwork for a postgraduate applied AI at Erasmus Brussels - we made an object detection model to identify ingredients in a fridge.

From the recipe dataset we used (a subset of the Recipe1M dataset), we distilled the top 50 ingredients and used 30 of them to randomly fill our fridge.

Read this blog post to learn more about the model production process: How I Used Computer Vision to Make Sense of My Fridge

Watch this video to see the model in action: AICook

The dataset is available under the MIT License.

Getting Started

You can download this dataset for use within your own project, fork it into a workspace on Roboflow to create your own model, or test one of the trained versions within the app.

Dataset Versions

Version 1 (v1) - 516 images (original-images)

  • Preprocessing: Auto-Orient
  • Augmentations: No augmentations applied
  • Training Metrics: This version of the dataset was not trained

Version 2 (v2) - 3,050 images (aicook-augmented-trainFromCOCO)

  • Preprocessing: Auto-Orient, Resize (Stretch to 416x416)
  • Augmentations:
    • Outputs per training example: 8
    • Rotation: Between -3° and +3°
    • Exposure: Between -20% and +20%
    • Blur: Up to 3px
    • Noise: Up to 5% of pixels
    • Cutout: 12 boxes with 10% size each
  • Training Metrics: Trained from the COCO Checkpoint in Public Models ("transfer learning") on Roboflow
    • mAP = 97.6%, precision = 86.9%, recall = 98.5%

Version 3 (v3) - 3,050 images (aicook-augmented-trainFromScratch)

  • Preprocessing: Auto-Orient, Resize (Stretch to 416x416)
  • Augmentations:
    • Outputs per training example: 8
    • Rotation: Between -3° and +3°
    • Exposure: Between -20% and +20%
    • Blur: Up to 3px
    • Noise: Up to 5% of pixels
    • Cutout: 12 boxes with 10% size each
  • Training Metrics: Trained from "scratch" (no transfer learning employed) on Roboflow
    • mAP = 97.9%, precision = 79.6%, recall = 98.6%

Version 4 (v4) - 3,050 images (aicook-augmented)

  • Preprocessing: Auto-Orient, Resize (Stretch to 416x416)
  • Augmentations:
    • Outputs per training example: 8
    • Rotation: Between -3° and +3°
    • Exposure: Between -20% and +20%
    • Blur: Up to 3px
    • Noise: Up to 5% of pixels
    • Cutout: 12 boxes with 10% size each
  • Training Metrics: This version of the dataset was not trained

Karel Cornelis - LinkedIn

Drowsiness

Overview

The Drowsiness dataset is a collection of images of a person in a vehicle (Ritesh Kanjee, of Augmented Startups) simulating "drowsy" and "awake" facial postures. This dataset can easily be used as a benchmark for a "driver alertness" or "driver safety" computer vision model.

Example Footage!

Distracted Driver Model - Example Footage

Training and Deployment

The Drowsiness model has been trained with Roboflow Train and is available for inference on the Dataset tab. We have also trained a YOLOR model for robust detection and tracking of a fatigued driver. You can learn more here: https://augmentedstartups.info/YOLOR-Get-Started

About Augmented Startups

We are at the forefront of Artificial Intelligence in computer vision. With over 94k subscribers on YouTube, we embark on fun and innovative projects in this field and create videos and courses so that everyone can be an expert in this field. Our vision is to create a world full of inventors that can turn their dreams into reality.

Shellfish-OpenImages

Image example

Overview

This dataset contains 581 images of various shellfish classes for object detection. These images are derived from the Open Images open source computer vision dataset.

This dataset only scratches the surface of the Open Images dataset for shellfish!

Image example

Use Cases

  • Train an object detector to differentiate between a lobster, shrimp, and crab.
  • Train an object detector to differentiate between shellfish
  • Object detection dataset across different sub-species
  • Object detection among related species
  • Test an object detector on highly related objects
  • Train a shellfish detector
  • Explore the quality and range of the Open Images dataset

Tools Used to Derive Dataset

Image example

These images were gathered via the OIDv4 Toolkit. This toolkit allows you to pick an object class and retrieve a set number of images from that class with bounding box labels.

We provide this dataset as an example of the ability to query the OID for a given subdomain. This dataset can easily be scaled up - please reach out to us if that interests you.

Pothole

Pothole Dataset

Example Image

This is a collection of 665 images of roads with the potholes labeled. The dataset was created and shared by Atikur Rahman Chitholian as part of his undergraduate thesis and was originally shared on Kaggle.

Note: The original dataset did not contain a validation set; we have re-shuffled the images into a 70/20/10 train-valid-test split.
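A re-shuffle like the one described can be sketched in a few lines of pure Python; the seed and the original shuffle procedure are unknown, so this is illustrative only:

```python
import random

def split_dataset(filenames, seed=0):
    """Shuffle and split file names into a 70/20/10 train/valid/test split."""
    files = list(filenames)
    random.Random(seed).shuffle(files)  # deterministic shuffle for reproducibility
    n_train = int(0.7 * len(files))
    n_valid = int(0.2 * len(files))
    return files[:n_train], files[n_train:n_train + n_valid], files[n_train + n_valid:]
```

For the 665 pothole images this yields 465/133/67 train/valid/test files.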

Usage

This dataset could be used for automatically finding and categorizing potholes in city streets so the worst ones can be fixed faster.

The dataset is provided in a wide variety of formats for various common machine learning models.

Wildfire Smoke

Detecting Wildfire Smoke with Computer Vision

This dataset is released by AI for Mankind in collaboration with HPWREN under a Creative Commons by Attribution Non-Commercial Share Alike license. The original dataset (and additional images without bounding boxes) can be found in their GitHub repo.

We have mirrored the dataset here for ease of download in a variety of common computer vision formats.

To learn more about this dataset and its possible applications in fighting wildfires, see this case study of Abhishek Ghosh's wildfire detection model.

About Roboflow

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

Developers reduce 50% of their boilerplate code when using Roboflow's workflow, save training time, and increase model reproducibility.
:fa-spacer:

Roboflow Wordmark

Rock Paper Scissors

Kaylee from Team Roboflow demos how to train a rock paper scissors object detector with this dataset.

Try it out on your Webcam! And remember, if it doesn't work so well, you can make it better: upload and annotate new images of yourself doing rock paper scissors to further educate the model.

Mask Wearing

Overview

The Mask Wearing dataset is an object detection dataset of individuals wearing various types of masks and those without masks. The images were originally collected by Cheng Hsun Teng from Eden Social Welfare Foundation, Taiwan, and relabeled by the Roboflow team.

Example image (some with masks, some without):
Example Image

Use Cases

One could use this dataset to build a system for detecting if an individual is wearing a mask in a given photo.

Using this Dataset

Use the fork button to copy this dataset to your own Roboflow account and export it with new preprocessing settings (perhaps resized for your model's desired format or converted to grayscale), or additional augmentations to make your model generalize better.

About Roboflow

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

Developers reduce 50% of their code when using Roboflow's workflow, automate annotation quality assurance, save training time, and increase model reproducibility.

Roboflow Wordmark

Cats

About this Dataset

This dataset was created by exporting the Oxford Pets dataset from Roboflow Universe, generating a version with Modify Classes to drop all of the classes for the labeled dog breeds and consolidating all cat breeds under the label, "cat." The bounding boxes were also modified to include the entirety of the cats within the images, rather than only their faces/heads.

Annotated image of a cat from the dataset

Oxford Pets

  • The Oxford Pets dataset (also known as the "dogs vs cats" dataset) is a collection of images and annotations labeling various breeds of dogs and cats. There are approximately 100 examples of each of the 37 breeds. This dataset contains the object detection portion of the original dataset with bounding boxes around the animals' heads.

  • Origin: This dataset was collected by the Visual Geometry Group (VGG) at the University of Oxford.

WeedCrop

This dataset is derived from the following publication:

Kaspars Sudars, Janis Jasko, Ivars Namatevs, Liva Ozola, Niks Badaukis, "Dataset of annotated food crops and weed images for robotic computer vision control," Data in Brief, Volume 31, 2020, 105833, ISSN 2352-3409, https://doi.org/10.1016/j.dib.2020.105833 (https://www.sciencedirect.com/science/article/pii/S2352340920307277)

Abstract: Weed management technologies that can identify weeds and distinguish them from crops are in need of artificial intelligence solutions based on a computer vision approach, to enable the development of precisely targeted and autonomous robotic weed management systems. A prerequisite of such systems is to create robust and reliable object detection that can unambiguously distinguish weed from food crops. One of the essential steps towards precision agriculture is using annotated images to train convolutional neural networks to distinguish weed from food crops, which can be later followed using mechanical weed removal or selected spraying of herbicides. In this data paper, we propose an open-access dataset with manually annotated images for weed detection. The dataset is composed of 1118 images in which 6 food crops and 8 weed species are identified, altogether 7853 annotations were made in total. Three RGB digital cameras were used for image capturing: Intel RealSense D435, Canon EOS 800D, and Sony W800. The images were taken on food crops and weeds grown in controlled environment and field conditions at different growth stages.

Keywords: Computer vision; Object detection; Image annotation; Precision agriculture; Crop growth and development

Website Screenshots

About This Dataset

The Roboflow Website Screenshots dataset is a synthetically generated dataset composed of screenshots from over 1000 of the world's top websites. They have been automatically annotated to label the following classes:
:fa-spacer:

  • button - navigation links, tabs, etc.
  • heading - text that was enclosed in <h1> to <h6> tags.
  • link - inline, textual <a> tags.
  • label - text labeling form fields.
  • text - all other text.
  • image - <img>, <svg>, or <video> tags, and icons.
  • iframe - ads and 3rd party content.

Example

This is an example image and annotation from the dataset:
WIkipedia Screenshot

Usage

Annotated screenshots are very useful in Robotic Process Automation. But they can be expensive to label. This dataset would cost over $4000 for humans to label on popular labeling services. We hope this dataset provides a good starting point for your project. Try it with a model from our model library.

Collecting Custom Data

Roboflow is happy to provide a custom screenshots dataset to meet your particular needs. We can crawl public or internal web applications. Just reach out and we'll be happy to provide a quote!

About Roboflow

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.
:fa-spacer:
Developers reduce 50% of their boilerplate code when using Roboflow's workflow, save training time, and increase model reproducibility.
:fa-spacer:

Roboflow Wordmark

Aerial Sheep

  • Riis
  • sheep Dataset
  • 1727 images

Overview

The Aerial Sheep dataset contains images taken from a birds-eye view with instances of sheep in them. Images do not differentiate between gender or breed of sheep, instead grouping them into a single class named "sheep".

Example Footage

Aerial Sheep

See RIIS's sheep counter application for additional use case examples: https://riis.com/blog/counting-sheep-using-drones-and-ai/

About RIIS

https://riis.com/about/

Open Poetry Vision

Overview

The Open Poetry Vision dataset is a synthetic dataset created by Roboflow for OCR tasks.

It combines a random image from the Open Images Dataset with text primarily sampled from Gwern's GPT-2 Poetry project. Each image in the dataset contains between 1 and 5 strings in a variety of fonts and colors randomly positioned in the 512x512 canvas. The classes correspond to the font of the text.
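As a rough illustration of the generation recipe above (not the actual generation code), here is a sketch that places up to five strings at random positions so each stays inside the 512x512 canvas; the per-string pixel sizes are assumed placeholder values, and actual font rendering is omitted:

```javascript
// Illustrative placement step for a synthetic OCR sample: each string
// gets a random top-left position whose box stays on the canvas.
function placeStrings(strings, canvas = 512, rng = Math.random) {
  return strings.slice(0, 5).map((text) => {
    const w = Math.min(canvas, text.length * 12); // assumed glyph width
    const h = 24;                                 // assumed line height
    return {
      text,
      x: Math.floor(rng() * (canvas - w)),
      y: Math.floor(rng() * (canvas - h)),
      w,
      h,
    };
  });
}
```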

Example Image:
Example Image

Use Cases

A common OCR workflow is to use a neural network to isolate text for input into traditional optical character recognition software. This dataset could make a good starting point for an OCR project like business card parsing or automated paper form-processing.

Alternatively, you could try your hand at using this as a neural font identification dataset. Nvidia, amongst others, has had success with this task.

Using this Dataset

Use the fork button to copy this dataset to your own Roboflow account and export it with new preprocessing settings (perhaps resized for your model's desired format or converted to grayscale), or additional augmentations to make your model generalize better. This particular dataset would be very well suited for Roboflow's new advanced Bounding Box Only Augmentations.

Version 5 of this dataset (classes_all_text-raw-images) has all classes remapped to be labeled as "text." This was accomplished by using Modify Classes as a preprocessing step.

Version 6 of this dataset (classes_all_text-augmented-FAST) has all classes remapped to be labeled as "text." and was trained with Roboflow's Fast Model.

Version 7 of this dataset (classes_all_text-augmented-ACCURATE) has all classes remapped to be labeled as "text." and was trained with Roboflow's Accurate Model.

About Roboflow

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

Developers reduce 50% of their code when using Roboflow's workflow, automate annotation quality assurance, save training time, and increase model reproducibility.

Roboflow Wordmark

OnePetri

Background Information

This dataset was created by Michael Shamash and contains the images used to train the OnePetri plaque detection model (plaque detection model v1.0).

In microbiology, a plaque is defined as a “clear area on an otherwise opaque field of bacteria that indicates the inhibition or dissolution of the bacterial cells by some agent, either a virus or an antibiotic. Plaques are a sensitive laboratory indicator of the presence of some anti-bacterial factor.”
When working with bacteriophages (phages), viruses which can only infect and kill bacteria, scientists often need to perform the time-intensive, monotonous task of counting plaques on Petri dishes. To help solve this problem, I developed OnePetri, a set of machine learning models and a mobile phone application (currently iOS-only) that accelerates common microbiological Petri dish assays using AI.

A task that once took microbiologists several minutes per Petri dish (and it adds up quickly, considering there are often tens of Petri dishes to analyze at a time!) can now be mostly automated thanks to computer vision, and completed in a matter of seconds.

App in Action

Video Clip

Petri Dish

Example Image

Plaque Detection

A total of 43 source images were used in this dataset with the following split: 29 training, 9 validation, 5 testing (2505 images after preprocessing and augmentations are applied).

OnePetri is a mobile phone application (currently iOS-only) which accelerates common microbiological Petri dish assays using AI. OnePetri's YOLOv5s plaque detection model was trained on a diverse set of images from the HHMI's SEA-PHAGES program, many of which are included in this dataset. This project wouldn't be possible without their support!

The following pre-processing options were applied:

  1. Auto-orient
  2. Tile image into 5 rows x 5 columns
  3. Resize tiles to 416px x 416px
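The tiling step above can be sketched as follows; this is an illustrative helper (not OnePetri's actual preprocessing code) that splits an image of a given size into the 5x5 grid of tile rectangles, each of which would then be resized to 416x416:

```javascript
// Compute the pixel rectangles for a rows x cols tiling of an image.
// Floor-based edges guarantee the tiles cover the image without overlap.
function tileRects(width, height, rows = 5, cols = 5) {
  const rects = [];
  for (let r = 0; r < rows; r++) {
    for (let c = 0; c < cols; c++) {
      const x0 = Math.floor((c * width) / cols);
      const y0 = Math.floor((r * height) / rows);
      const x1 = Math.floor(((c + 1) * width) / cols);
      const y1 = Math.floor(((r + 1) * height) / rows);
      rects.push({ x: x0, y: y0, w: x1 - x0, h: y1 - y0 });
    }
  }
  return rects;
}
```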

The following augmentation options were applied:

  1. Grayscale (35% of images)
  2. Hue shift (-45deg to +45deg)
  3. Blur up to 2px
  4. Mosaic

OnePetri App In Action

For more information and to download OnePetri please visit: https://onepetri.ai/.

smdComponents

  • Dainius
  • electronic-components Dataset
  • 2997 images

This project was created for research work by Dainius Varna and Vytautas Abromavičius of Vilnius Gediminas Technical University in Lithuania.

The dataset consists of images of SMD-type electronic components, which are moving on a conveyor belt. There are four types of components in the collected dataset:

  1. Capacitors
  2. Resistors
  3. Diodes
  4. Transistors

This is the initial dataset that was augmented to create the final model. Download the raw-images dataset version (v2) of this project to start your own custom project.

Department of Electronic Systems, Vilnius Gediminas Technical University (VILNIUS TECH), 03227 Vilnius, Lithuania; vgtu@vgtu.lt

Dataset Collection

The dataset was collected using Nvidia Data Capture Control.

Figure 3 from the Paper - Example Image of each Component Type
Figure 3. Four types of electronic components used for the dataset. (a) capacitor, (b) resistor, (c) diode, (d) transistor.

Abstract:

The presented research addresses the real-time object detection problem with small and moving objects, specifically the surface-mount component on a conveyor.

Detecting and counting small moving objects on the assembly line is a challenge. In order to meet the requirements of real-time applications, state-of-the-art electronic component detection and classification algorithms are implemented into powerful hardware systems.

This work proposes a low-cost system with an embedded microcomputer to detect surface-mount components on a conveyor belt in real time. The system detects moving, packed, and unpacked surface-mount components.

The system’s performance was experimentally investigated by implementing several object-detection algorithms. The system’s performance with different algorithm implementations was compared using mean average precision and inference time. The results of four different surface-mount components showed average precision scores of 97.3% and 97.7% for capacitor and resistor detection.

The findings suggest that the system with the implemented YOLOv4-tiny algorithm on the Jetson Nano 4 GB microcomputer achieves a mean average precision score of 88.03% with an inference time of 56.4 ms and 87.98% mean average precision with 11.2 ms inference time on the Tesla P100 16 GB platform.

Oxford Pets

Example Annotations

About this Dataset

The Oxford Pets dataset (also known as the "dogs vs cats" dataset) is a collection of images and annotations labeling various breeds of dogs and cats. There are approximately 100 examples of each of the 37 breeds. This dataset contains the object detection portion of the original dataset with bounding boxes around the animals' heads.

Origin

This dataset was collected by the Visual Geometry Group (VGG) at the University of Oxford.

Surfer Spotting

Overview

The Surfline Surfer Spotting dataset contains images with surfers floating on the coast. Each image contains one classification called "surfer" but may contain multiple surfers.

Example Footage

Surfers

Using this Dataset

There are several deployment options available, including inferring via API, webcam, and curl command.

Here is a Node.js snippet you can use to hit the hosted inference API; code snippets for more languages are also available.

const axios = require("axios");
const fs = require("fs");

const image = fs.readFileSync("YOUR_IMAGE.jpg", {
    encoding: "base64"
});

axios({
    method: "POST",
    url: "https://detect.roboflow.com/surfer-spotting/2",
    params: {
        api_key: "YOUR_KEY"
    },
    data: image,
    headers: {
        "Content-Type": "application/x-www-form-urlencoded"
    }
})
.then(function(response) {
    console.log(response.data);
})
.catch(function(error) {
    console.log(error.message);
});

Download Dataset

On the Versions tab you can select the version you'd like and download it in any of 26 annotation formats.

Synthetic Fruit

About this dataset

This dataset contains 6,000 example images generated with the process described in Roboflow's How to Create a Synthetic Dataset tutorial.

The images are composed of a background (randomly selected from Google's Open Images dataset) and a number of fruits (from Horea94's Fruit Classification Dataset) superimposed on top with a random orientation, scale, and color transformation. All images are 416x550 to simulate a smartphone aspect ratio.

To generate your own images, follow our tutorial or download the code.

Example:
Example Image

Overview

The Numberplate Dataset is a collection of Licence Plates that can easily be used for Automatic Number Plate Detection.

Example Footage!

Licence Plate Detection

Training and Deployment

The Number Plate model has been trained in Roboflow and is available for inference on the Dataset tab.
One could also build an Automatic Number Plate Recognition (ANPR) app using YOLOR and EasyOCR. This is achieved using the Roboflow platform, from which you can deploy the model for robust, real-time ANPR.

About Augmented Startups

We are at the forefront of Artificial Intelligence in computer vision. With over 92k subscribers on YouTube, we embark on fun and innovative projects in this field and create videos and courses so that everyone can be an expert in this field. Our vision is to create a world full of inventors that can turn their dreams into reality.

Uno Cards

Overview

This dataset contains 8,992 images of Uno cards and 26,976 labeled examples on various textured backgrounds.

This dataset was collected, processed, and released by Roboflow user Adam Crawshaw, released with a modified MIT license: https://firstdonoharm.dev/

Image example

Use Cases

Adam used this dataset to create an auto-scoring Uno application:

Getting Started

Fork or download this dataset and follow our How to train state of the art object detector YOLOv4 for more.

Annotation Guide

See here for how to use the CVAT annotation tool.

About Roboflow

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.
:fa-spacer:
Developers reduce 50% of their boilerplate code when using Roboflow's workflow, save training time, and increase model reproducibility.
:fa-spacer:

Roboflow Wordmark

Playing Cards

Overview

The Playing Cards dataset is a collection of synthetically generated cards blended into various types of backgrounds. You will be able to perform object detection to detect both the number and suit of the cards.

Example Footage

Training and Deployment

The playing cards model has been trained in Roboflow, available for inference on the Dataset tab.

One could also build a Card Counting model for either Blackjack or Poker using YOLOR. This is achieved using the Roboflow platform, from which you can deploy the model for robust, real-time detections. You can learn more here: https://augmentedstartups.info/YOLOR-Get-Started

Video Demo using YOLOR for training- https://youtu.be/2lGTZuaH4ec

About Augmented Startups

We are at the forefront of Artificial Intelligence in computer vision. With over 90k subscribers on YouTube, we embark on fun and innovative projects in this field and create videos and courses so that everyone can be an expert in this field. Our vision is to create a world full of inventors that can turn their dreams into reality.

PKLot

Parking Lot Dataset

The PKLot dataset contains 12,416 images of parking lots extracted from surveillance camera frames. There are images on sunny, cloudy, and rainy days, and the parking spaces are labeled as occupied or empty. We have converted the original annotations to a variety of standard object detection formats by enclosing a bounding box around the original dataset's rotated rectangle annotations.
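The conversion can be sketched as follows, assuming a rotated rectangle given as center, size, and angle in degrees; this is an illustrative helper, not the converter actually used for the dataset:

```javascript
// Enclose a rotated rectangle in an axis-aligned bounding box:
// rotate the four corners about the center, then take min/max extents.
function enclosingBox(cx, cy, w, h, angleDeg) {
  const a = (angleDeg * Math.PI) / 180;
  const cos = Math.cos(a), sin = Math.sin(a);
  const corners = [
    [-w / 2, -h / 2], [w / 2, -h / 2], [w / 2, h / 2], [-w / 2, h / 2],
  ].map(([dx, dy]) => [cx + dx * cos - dy * sin, cy + dx * sin + dy * cos]);
  const xs = corners.map((p) => p[0]);
  const ys = corners.map((p) => p[1]);
  return {
    xMin: Math.min(...xs), yMin: Math.min(...ys),
    xMax: Math.max(...xs), yMax: Math.max(...ys),
  };
}
```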

Using this Dataset

The PKLot database is licensed under a Creative Commons Attribution 4.0 License and may be used provided you acknowledge the source by citing the PKLot paper in publications about your research:

Almeida, P., Oliveira, L. S., Silva Jr, E., Britto Jr, A., Koerich, A., PKLot – A robust dataset for parking lot classification, Expert Systems with Applications, 42(11):4937-4949, 2015.
                            

Drone Control

Overview

The Drone Gesture Control Dataset is an object detection dataset that mimics DJI's air-gesture capability. It consists of hand and body gesture commands that you can use to command your drone to 'take off', 'land', and 'follow'.

Example Footage

Drone Control

Model Training and Inference

The model for this dataset has been trained on Roboflow (see the Dataset tab), with exports to the OpenCV AI Kit, which is running on the drone in this example.

One could also build a model using MobileNet SSD on the Roboflow platform and deploy it to the OpenCV AI Kit. Watch the full tutorial here: https://augmentedstartups.info/AI-Drone-Tutorial

Using this Dataset

Use the fork button to copy this dataset to your own Roboflow account and export it with new preprocessing settings, or additional augmentations to make your model generalize better.

About Augmented Startups

We are at the forefront of Artificial Intelligence in computer vision. We embark on fun and innovative projects in this field and create videos and courses so that everyone can be an expert in this field. Our vision is to create a world full of inventors that can turn their dreams into reality.

Garage

Object tracking for cars in my garage for use in home automation.

https://github.com/brianegge/garbage_bin

face-features-test

A simple dataset for benchmarking CreateML object detection models. The images are sampled from COCO dataset with eyes and nose bounding boxes added. It’s not meant to be serious or useful in a real application. The purpose is to look at how long it takes to train CreateML models with varying dataset and batch sizes.

Training performance is affected by model configuration, dataset size and batch configuration. Larger models and batches require more memory. I used CreateML object detection project to compare the performance.

Hardware

M1 Macbook Air

  • 8 GPU
  • 4/4 CPU
  • 16G memory
  • 512G SSD

M1 Max Macbook Pro

  • 24 GPU
  • 2/8 CPU
  • 32G memory
  • 2T SSD

Small Dataset
Train: 144
Valid: 16
Test: 8

Results

batch   M1 ET   M1 Max ET   peak mem (G)
16      16      11          1.5
32      29      17          2.8
64      56      30          5.4
128     170     57          12

Larger Dataset
Train: 301
Valid: 29
Test: 18

Results

batch   M1 ET   M1 Max ET   peak mem (G)
16      21      10          1.5
32      42      17          3.5
64      85      30          8.4
128     281     54          16.5

CreateML Settings

For all tests, training was set to Full Network. I closed CreateML between each run to make sure memory issues didn't cause a slowdown. There is a bug in Monterey as of 11/2021 that leads to a memory leak, so I kept an eye on memory usage; if it looked like there was a leak, I restarted macOS.

Observations

In general, the MacBook Pro's additional GPU cores and memory reduce training time, and having more memory lets you train with larger datasets. On the M1 MacBook Air, the practical limit is 12G before memory pressure impacts performance; on the M1 Max MacBook Pro, the practical limit is 26G. To work around memory pressure, use smaller batch sizes.

On the larger dataset with batch size 128, the M1 Max is 5x faster than the MacBook Air. Keep in mind a real dataset should have thousands of samples, like COCO or Pascal VOC. Ideally, you want a dataset with 100K images for experimentation and millions for the real training. The new M1 Max MacBook Pro is a cost-effective alternative to building a Windows/Linux workstation with an RTX 3090 24G. For most of 2021, the price of an RTX 3090 24G was around $3,000.00, which means an equivalent Windows workstation would cost about the same as the M1 Max MacBook Pro I used to run the benchmarks.
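As a quick arithmetic check of the ~5x figure, using the batch-128 row of the larger-dataset results (M1: 281, M1 Max: 54 elapsed-time units from the table):

```javascript
// Speedup ratio from the benchmark table's batch-128 row.
const m1Elapsed = 281;
const m1MaxElapsed = 54;
const speedup = m1Elapsed / m1MaxElapsed; // roughly 5.2x
```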

Full Network vs Transfer Learning

As of CreateML 3, training with Full Network doesn't fully utilize the GPU; I don't know why it works that way. You have to select Transfer Learning to fully use the GPU. Below are the results of transfer learning with the larger dataset. In general, training time is faster and loss is better.

batch   ET (min)   Train Acc   Val Acc   Test Acc   Top IU Train   Top IU Valid   Top IU Test   Peak mem (G)   loss
16      4          75          19        12         78             23             13            1.5            0.41
32      8          75          21        10         78             26             11            2.76           0.02
64      13         75          23        8          78             24             9             5.3            0.017
128     25         75          22        13         78             25             14            8.4            0.012

Github Project

The source code and full results are up on Github https://github.com/woolfel/createmlbench

Sub-transmission Asset

  • TIPQC
  • STA Dataset
  • 775 images

The dataset includes actual pictures of Transmission and Sub-transmission Electrical Structures captured during the official inspection of assets being amortized by 17 Electric Cooperatives of the Philippines to the National Transmission Corporation (TransCo). The pictures were captured in the years 2018, 2019, and 2020. The Electric Cooperatives are:

  1. Bukidnon Sub-transmission Corporation (BSTC)
  2. Northern Negros Electric Cooperative, Inc. (NONECO)
  3. South Cotabato 1 Electric Cooperative, Inc. (SOCOTECO 1)
  4. Cebu 2 Electric Cooperative, Inc. (CEBECO 2)
  5. Peninsula Electric Cooperative, Inc. (PENELCO)
  6. Misamis Oriental 2 Electric Cooperative, Inc. (MORESCO 2)
  7. Davao del Sur Electric Cooperative, Inc. (DASURECO)
  8. Camiguin Electric Cooperative, Inc. (CAMELCO)
  9. Iloilo 2 Electric Cooperative, Inc. (ILECO 2)
  10. Misamis Oriental 1 Electric Cooperative, Inc. (MORESCO 1)
  11. Davao Oriental Electric Cooperative, Inc. (DORECO)
  12. Isabela 1 Electric Cooperative, Inc. (ISELCO 1)
  13. Aklan Electric Cooperative, Inc. (AKELCO)
  14. Sultan Kudarat Electric Cooperative, Inc. (SUKELCO)
  15. Zamboanga City Electric Cooperative, Inc. (ZAMCELCO)
  16. South Cotabato 2 Electric Cooperative, Inc. (SOCOTECO 2)
  17. Camarines Norte Electric Cooperative, Inc. (CANORECO)

Boggle Boards

Overview

We have captured and annotated photos of the popular board game, Boggle. Images are predominantly from 4x4 Boggle with about 30 images from Big Boggle (5x5).

  • 357 images
  • 7110 annotated letter cubes

These images are released for you to use in training your machine learning models.

Use Cases

We used this dataset to create BoardBoss, an augmented reality board game helper app. You can download BoardBoss in the App Store for free to see the end result!
:fa-spacer:
BoardBoss
:fa-spacer:
The model trained from this dataset was paired with some heuristics to recreate the board state and overlay it with an AR representation. We then used a traditional recursive backtracking algorithm to find and show the best words on the board.
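The word search can be sketched as a recursive backtracking routine like the following (an illustrative version, not BoardBoss's actual code); a production solver would prune with a trie rather than checking full words only:

```javascript
// Depth-first backtracking search over a letter grid (8-way adjacency),
// never reusing a cube within one word; collects dictionary hits.
function findWords(grid, dictionary) {
  const rows = grid.length, cols = grid[0].length;
  const words = new Set(dictionary);
  const found = new Set();
  const visited = Array.from({ length: rows }, () => Array(cols).fill(false));

  function dfs(r, c, prefix) {
    if (r < 0 || r >= rows || c < 0 || c >= cols || visited[r][c]) return;
    const next = prefix + grid[r][c];
    visited[r][c] = true;
    if (words.has(next)) found.add(next);
    for (let dr = -1; dr <= 1; dr++)
      for (let dc = -1; dc <= 1; dc++)
        if (dr !== 0 || dc !== 0) dfs(r + dr, c + dc, next);
    visited[r][c] = false; // backtrack: free the cube for other paths
  }

  for (let r = 0; r < rows; r++)
    for (let c = 0; c < cols; c++) dfs(r, c, "");
  return [...found];
}
```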

Using this Dataset

We're releasing the data as public domain. Feel free to use it for any purpose.
It's not required to provide attribution, but it'd be nice! :fal-smile-wink:

About Roboflow

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

Developers reduce 50% of their boilerplate code when using Roboflow's workflow, save training time, and increase model reproducibility.
:fa-spacer:

Roboflow Wordmark

FinalGolf

Dice

Overview

We have captured and annotated photos of six-sided dice. There are 359 total images from a few sets:

  • 154 single dice of various styles on a white table
  • 388 Catan Dice (Red and Yellow, some rolled on a white table, 160 on top of or near the Catan board)
  • 13 mass groupings of dice in various styles

These images are released for you to use in training your machine learning models.
:fa-spacer:
Example Image
:fa-spacer:
Classes are generally balanced. Here's the output of Roboflow's Dataset Health check:
Class Balance

Use Cases

This would be a great dataset to test out different object detection models like YOLO v3, MaskRCNN, mobilenet, or others.

You could use it to create dice game helper apps (like a dice counter) or independent games.

Using this Dataset

We're releasing the data as public domain. Feel free to use it for any purpose.
It's not required to provide attribution, but it'd be nice! :fal-smile-wink:

About Roboflow

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

Developers reduce 50% of their boilerplate code when using Roboflow's workflow, save training time, and increase model reproducibility.
:fa-spacer:

Roboflow Wordmark

BCCD

Overview

This is a dataset of blood cell photos, originally open sourced by cosmicad and akshaylambda.

There are 364 images across three classes: WBC (white blood cells), RBC (red blood cells), and Platelets. There are 4888 labels across 3 classes (and 0 null examples).

Here's a class count from Roboflow's Dataset Health Check:

BCCD health

And here's an example image:

Blood Cell Example

Fork this dataset (upper right hand corner) to receive the raw images, or (to save space) grab the 500x500 export.

Use Cases

This is a small scale object detection dataset, commonly used to assess model performance. It's a first example of medical imaging capabilities.

Using this Dataset

We're releasing the data as public domain. Feel free to use it for any purpose.

It's not required to provide attribution, but it'd be nice! :)

About Roboflow

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

Developers reduce 50% of their boilerplate code when using Roboflow's workflow, automate annotation quality assurance, save training time, and increase model reproducibility.

Roboflow Wordmark

COCO 128

COCO 128 is a subset of 128 images of the larger COCO dataset. It reuses the training set for both validation and testing, with the purpose of proving that your training pipeline is working properly and can overfit this small dataset.

COCO 128 is a great dataset to use the first time you are testing out a new model.


Vehicles-OpenImages

Image example

Overview

This dataset contains 627 images of various vehicle classes for object detection. These images are derived from the Open Images open source computer vision datasets.

This dataset only scratches the surface of the Open Images dataset for vehicles!

Image example

Use Cases

  • Train object detector to differentiate between a car, bus, motorcycle, ambulance, and truck.
  • Checkpoint object detector for autonomous vehicle detector
  • Test object detector on high density of ambulances in vehicles
  • Train ambulance detector
  • Explore the quality and range of Open Image dataset

Tools Used to Derive Dataset

Image example

These images were gathered via the OIDv4 Toolkit. This toolkit allows you to pick an object class and retrieve a set number of images from that class with bounding box labels.

We provide this dataset as an example of the ability to query the OID for a given subdomain. This dataset can easily be scaled up - please reach out to us if that interests you.

PlantDoc

Overview

The PlantDoc dataset was originally published by researchers at the Indian Institute of Technology, and described in depth in their paper. One of the paper’s authors, Pratik Kayal, shared the object detection dataset available on GitHub.

PlantDoc is a dataset of 2,569 images across 13 plant species and 30 classes (diseased and healthy) for image classification and object detection. There are 8,851 labels. Read more about how the version available on Roboflow improves on the original version here.

And here's an example image:

Tomato Blight

Fork this dataset (upper right hand corner) to receive the raw images, or (to save space) grab the 416x416 export.

Use Cases

As the researchers from IIT stated in their paper, “plant diseases alone cost the global economy around US$220 billion annually.” Training models to recognize plant diseases earlier dramatically increases yield potential.

The dataset also serves as a useful open dataset for benchmarks. The researchers trained both object detection models like MobileNet and Faster-RCNN and image classification models like VGG16, InceptionV3, and InceptionResnet V2.

The dataset is useful for advancing general agriculture computer vision tasks, whether that be healthy crop classification, plant disease classification, or plant disease object detection.

Using this Dataset

This dataset follows the Creative Commons 4.0 protocol. You may use it commercially; the license grants no trademark or patent rights and provides no liability protection or warranty.

Provide the following citation for the original authors:

@misc{singh2019plantdoc,
    title={PlantDoc: A Dataset for Visual Plant Disease Detection},
    author={Davinder Singh and Naman Jain and Pranjali Jain and Pratik Kayal and Sudhakar Kumawat and Nipun Batra},
    year={2019},
    eprint={1911.10317},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
                            

About Roboflow

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

Developers reduce 50% of their code when using Roboflow's workflow, automate annotation quality assurance, save training time, and increase model reproducibility.

Roboflow Wordmark

Pistols

Overview

This dataset contains 2986 images and 3448 labels across a single annotation class: pistols. Images are wide-ranging: pistols in-hand, cartoons, and staged studio quality images of guns.

The dataset was originally released by the University of Granada, then had duplicates removed and was rehosted by a Roboflow user.
Example Image

Use Cases

One can create a gun object detection model to monitor security camera footage for the presence of guns, perhaps in places where they should not be. Alaa Senjab built on Roboflow to achieve this goal. He's also open sourced much of his work in this tutorial.

Realtime gun detection

Hard Hat Workers

Overview

The Hard Hat dataset is an object detection dataset of workers in workplace settings that require a hard hat. Annotations also include examples of just "person" and "head," for when an individual may be present without a hard hat.

Example Image:
Example Image

Use Cases

One could use this dataset to, for example, build a classifier distinguishing workers who are abiding by safety codes within a workplace from those who may not be. It is also a good general dataset for practice.

Using this Dataset

Use the fork button to copy this dataset to your own Roboflow account and export it with new preprocessing settings (perhaps resized for your model's desired format or converted to grayscale), or additional augmentations to make your model generalize better. This particular dataset would be very well suited for Roboflow's new advanced Bounding Box Only Augmentations.

About Roboflow

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

Developers reduce 50% of their code when using Roboflow's workflow, automate annotation quality assurance, save training time, and increase model reproducibility.

Roboflow Wordmark

PCBsmdComponents

This project was created for research work by Dainius Varna and Vytautas Abromavičius of Vilnius Gediminas Technical University in Lithuania.

This dataset is an augmented version of the smdComponents dataset.

Department of Electronic Systems, Vilnius Gediminas Technical University (VILNIUS TECH), 03227 Vilnius, Lithuania; vgtu@vgtu.lt

Dataset Collection

The dataset was collected using Nvidia Data Capture Control.
The dataset consists of images of SMD-type electronic components, which are moving on a conveyor belt. There are four types of components in the collected dataset:

  1. Capacitors
  2. Resistors
  3. Diodes
  4. Transistors

Figure 3 from the Paper - Example Image of each Component Type
Figure 3. Four types of electronic components used for the dataset. (a) capacitor, (b) resistor, (c) diode, (d) transistor.

Abstract:

The presented research addresses the real-time object detection problem with small and moving objects, specifically the surface-mount component on a conveyor.
Detecting and counting small moving objects on the assembly line is a challenge. In order to meet the requirements of real-time applications, state-of-the-art electronic component detection and classification algorithms are implemented into powerful hardware systems.
This work proposes a low-cost system with an embedded microcomputer to detect surface-mount components on a conveyor belt in real time. The system detects moving, packed, and unpacked surface-mount components.
The system’s performance was experimentally investigated by implementing several object-detection algorithms. The system’s performance with different algorithm implementations was compared using mean average precision and inference time. The results of four different surface-mount components showed average precision scores of 97.3% and 97.7% for capacitor and resistor detection.
The findings suggest that the system with the implemented YOLOv4-tiny algorithm on the Jetson Nano 4 GB microcomputer achieves a mean average precision score of 88.03% with an inference time of 56.4 ms and 87.98% mean average precision with 11.2 ms inference time on the Tesla P100 16 GB platform.

Self Driving Car

  • Roboflow
  • obstacles Dataset
  • 13280 images

Overview

The original Udacity Self Driving Car Dataset is missing labels for thousands of pedestrians, bikers, cars, and traffic lights. This will result in poor model performance. When used in the context of self driving cars, this could even lead to human fatalities.

We re-labeled the dataset to correct errors and omissions. We have provided convenient downloads in many formats including VOC XML, COCO JSON, Tensorflow Object Detection TFRecords, and more.

Some examples of labels missing from the original dataset:
Examples of Missing Labels

Stats

The dataset contains 97,942 labels across 11 classes and 15,000 images. There are 1,720 null examples (images with no labels).

All images are 1920x1200 (download size ~3.1 GB). We have also provided a version downsampled to 512x512 (download size ~580 MB) that is suitable for most common machine learning models (including YOLO v3, Mask R-CNN, SSD, and mobilenet).

Annotations have been hand-checked for accuracy by Roboflow.

Class Balance

Annotation Distribution:
Annotation Heatmap

Use Cases

Udacity is building an open source self driving car! You might also try using this dataset to do person-detection and tracking.

Using this Dataset

Our updates to the dataset are released under the MIT License (the same license as the original annotations and images).

Note: the dataset contains many duplicated bounding boxes for the same subject, which we have not corrected. You will probably want to filter these by computing the IOU between same-class boxes and discarding boxes that overlap 100%; otherwise they could affect your model performance (especially in stoplight detection, which seems to suffer from an especially severe case of duplicated bounding boxes).
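A minimal sketch of that filtering step (the function names and the 0.95 threshold are illustrative choices, not part of the dataset tooling):

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def dedupe(labeled_boxes, threshold=0.95):
    """Keep only boxes that don't near-duplicate an earlier box of the same class.

    labeled_boxes: list of (class_name, (x1, y1, x2, y2)) tuples.
    """
    kept = []
    for cls, box in labeled_boxes:
        if all(c != cls or iou(box, b) < threshold for c, b in kept):
            kept.append((cls, box))
    return kept
```

A threshold just below 1.0 catches boxes that are duplicated but differ by a pixel or two of annotation jitter, rather than only exact 100% overlaps.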

About Roboflow

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

Developers cut their boilerplate code by 50% when using Roboflow's workflow, save training time, and increase model reproducibility.
:fa-spacer:

Roboflow Wordmark

Pascal VOC 2012

Pascal VOC 2012 is a common benchmark for object detection. It contains common objects that one might find in images on the web.

Image example

Note: the test set is withheld, as is common with benchmark datasets.

You can think of it sort of like a baby COCO.

VOT2015

VOT2015 Dataset

The dataset comprises 60 short sequences showing various objects in challenging backgrounds. The sequences were chosen from a large pool of sequences including the ALOV dataset, OTB2 dataset, non-tracking datasets, Computer Vision Online, Professor Bob Fisher’s Image Database, Videezy, Center for Research in Computer Vision, University of Central Florida, USA, NYU Center for Genomics and Systems Biology, Data Wrangling, Open Access Directory and Learning and Recognition in Vision Group, INRIA, France. The VOT sequence selection protocol was applied to obtain a representative set of challenging sequences. The dataset is automatically downloaded by the evaluation kit when needed; there is no need to separately download the sequences for the challenge.

Annotations
The sequences were annotated by the VOT committee using rotated bounding boxes in order to provide highly accurate ground truth values for comparing results. The annotations are stored in a text file with the format:

frameN: X1, Y1, X2, Y2, X3, Y3, X4, Y4
where Xi and Yi are the coordinates of corner i of the bounding box in frame N, the N-th row in the text file.
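A small sketch of reading one such row, assuming each row is simply the eight comma-separated corner coordinates for its frame (the function name is illustrative):

```python
def parse_vot_row(row):
    """Parse one annotation row 'X1, Y1, X2, Y2, X3, Y3, X4, Y4'
    into a list of four (x, y) corner tuples of the rotated box."""
    vals = [float(v) for v in row.strip().split(",")]
    if len(vals) != 8:
        raise ValueError("expected 8 values: X1, Y1, ..., X4, Y4")
    # Pair up even-indexed Xs with odd-indexed Ys.
    return list(zip(vals[0::2], vals[1::2]))
```

The frame index is then just the row's position in the file (row N describes frame N).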

The bounding box was placed on the target such that at most ~30% of pixels within the bounding box corresponded to background pixels, while containing most of the target. For example, in annotating a person with extended arms, the bounding box was placed such that the arms were not included. Note that in some sequences parts of objects rather than entire objects have been annotated. A rotated bounding box was used to address non-axis alignment of the target. The annotation guidelines have been applied at the judgement of the annotators.

Some targets were partially occluded or were partially out of the image frame. In these cases the bounding box was “inferred” by the annotator to fully contain the object, including the occluded part. For example, if a person’s legs were occluded, the bounding box should also include the non-visible legs.

The annotations have been conducted by three groups of annotators. Each annotator group annotated one third of the dataset and these annotations have been cross-checked by two other groups. The final annotations were checked by the coordinator of the annotation process. The final bounding box annotations have been automatically rectified by replacing a rotated bounding box with an axis-aligned one if the ratio of the shortest and longest bounding-box sides exceeded 0.95.
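That rectification rule can be sketched as follows, assuming corners are given in order around the rectangle (this is an illustrative reconstruction, not the committee's actual script):

```python
import math

def rectify(corners, ratio_threshold=0.95):
    """Replace a near-square rotated box with its axis-aligned bounding box.

    corners: four (x, y) tuples in order around the rectangle.
    Returns the (possibly replaced) box in the same 4-corner format.
    """
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    side_a = dist(corners[0], corners[1])
    side_b = dist(corners[1], corners[2])
    # If the shortest/longest side ratio exceeds the threshold,
    # rotation carries little information: snap to axis-aligned.
    if min(side_a, side_b) / max(side_a, side_b) > ratio_threshold:
        xs = [x for x, _ in corners]
        ys = [y for _, y in corners]
        return [(min(xs), min(ys)), (max(xs), min(ys)),
                (max(xs), max(ys)), (min(xs), max(ys))]
    return corners
```

Elongated boxes, where orientation matters most, are left rotated; only boxes that are nearly square get snapped to their axis-aligned hull.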

Annotators:

Gustavo Fernandez (coordinator)
Jingjing Xiao
Georg Nebehay
Roman Pflugfelder
Koray Aytac

https://www.votchallenge.net/vot2015/dataset.html

Drone Control

Overview

The Drone Gesture Control Dataset is an object detection dataset that mimics DJI's air gesture capability. It consists of hand and body gestures for commanding a drone to 'take-off', 'land', or 'follow'.

Example Image

Drone Example

Use Cases

One could build a model using MobileNet SSD on the Roboflow Platform and deploy it to the OpenCV AI Kit. Watch the full tutorial here: https://augmentedstartups.info/AI-Drone-Tutorial

Using this Dataset

Use the fork button to copy this dataset to your own Roboflow account and export it with new preprocessing settings, or additional augmentations to make your model generalize better.

About Augmented Startups

We are at the forefront of Artificial Intelligence in computer vision. We embark on fun and innovative projects in this field and create videos and courses so that everyone can be an expert in this field. Our vision is to create a world full of inventors that can turn their dreams into reality.

Microsoft COCO

This is the full 2017 COCO object detection dataset (train and valid), which is a subset of the most recent 2020 COCO object detection dataset.

COCO is a large-scale object detection, segmentation, and captioning dataset of many object types easily recognizable by a 4-year-old. The data was initially collected and published by Microsoft. The original source of the data is here and the paper introducing the COCO dataset is here.