
Top Cars Datasets

Roboflow hosts the world's biggest set of open-source car datasets and pre-trained computer vision models. The category includes images of cars from around the world, curated and annotated by the Roboflow Community. These projects can help you get started with things like object speed calculation, object tracking, autonomous vehicles, and smart-city transportation innovations.

This dataset is a copy of a subset of the full Stanford Cars dataset.

The original dataset contained 16,185 images of 196 classes of cars.

The classes are typically at the level of Make, Model, Year, e.g. 2012 Tesla Model S or 2012 BMW M3 coupe in the original dataset, and in this subset of the full dataset (v3, TestData and v4, original_raw-images).

v4 (original_raw-images) contains a generated version of the original, raw images, without any modified classes.

v8 (classes-Modified_raw-images) contains a generated version of the raw images, with the Modify Classes preprocessing feature used to remap or omit the following classes:

  1. bike, moped --remapped to--> motorbike
  2. cng, leguna, easybike, smart fortwo Convertible 2012, and all other specific car makes with named classes (such as Acura TL Type-S 2008) --remapped to--> vehicle
  3. rickshaw, boat, bicycle --> omitted

v9 (FAST-model_mergedAllClasses-augmented_by3x) contains a generated version of the raw images, with the Modify Classes preprocessing feature used to remap or omit the following classes:

  1. bike, moped --remapped to--> motorbike
  2. cng, leguna, easybike, smart fortwo Convertible 2012, and all other specific car makes with named classes (such as Acura TL Type-S 2008) --remapped to--> vehicle
  3. rickshaw, boat, bicycle --> omitted

v10 (ACCURATE-model_mergedAllClasses-augmented_by3x) contains a generated version of the raw images, with the Modify Classes preprocessing feature used to remap or omit the following classes:

  1. bike, moped --remapped to--> motorbike
  2. cng, leguna, easybike, smart fortwo Convertible 2012, and all other specific car makes with named classes (such as Acura TL Type-S 2008) --remapped to--> vehicle
  3. rickshaw, boat, bicycle --> omitted
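
The remap and omit rules listed above can be sketched in a few lines of Python. This is only an illustration of the idea, not Roboflow's Modify Classes implementation: the function names, the annotation dictionary layout, and the choice to send every unlisted class to vehicle are assumptions made for this example.

```python
# Hypothetical remap table mirroring the rules listed above (keys lowercased).
REMAP = {
    "bike": "motorbike",
    "moped": "motorbike",
    "cng": "vehicle",
    "leguna": "vehicle",
    "easybike": "vehicle",
    "smart fortwo convertible 2012": "vehicle",
}
OMIT = {"rickshaw", "boat", "bicycle"}
TARGETS = {"motorbike", "vehicle"}  # classes that are already final


def modify_class(label):
    """Return the remapped class name, or None if the class is omitted."""
    key = label.strip().lower()
    if key in OMIT:
        return None
    if key in REMAP:
        return REMAP[key]
    if key in TARGETS:
        return key
    # Assumption: any other named make/model class is merged into "vehicle".
    return "vehicle"


def modify_annotations(annotations):
    """Apply modify_class to a list of {"class": ..., ...} dicts."""
    result = []
    for ann in annotations:
        new_label = modify_class(ann["class"])
        if new_label is not None:  # omitted classes are dropped entirely
            result.append({**ann, "class": new_label})
    return result
```

For example, an annotation labelled moped comes back as motorbike, one labelled Acura TL Type-S 2008 comes back as vehicle, and one labelled boat is dropped.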
Citation:

3D Object Representations for Fine-Grained Categorization
Jonathan Krause, Michael Stark, Jia Deng, Li Fei-Fei
4th IEEE Workshop on 3D Representation and Recognition, at ICCV 2013 (3dRR-13). Sydney, Australia. Dec. 8, 2013.

Background Information

This dataset was curated and annotated by Ishaan Singh, a high school student from India.

The original dataset (v1) is composed of 166 images of various cars present in a junkyard. Training Set: 116 images, Validation Set: 33 images, Testing Set: 17 images.

The dataset is available under the Public License.

Ishaan ultimately used this dataset to create a "Drone Surveillance" system to count the cars using YOLOv5 & Deep SORT (Simple Online and Realtime Tracking with a Deep Association Metric) for a contest organized by ComputerVisionZone.

Here is a video of his final submission for the contest:
Video of Ishaan's Final Model

Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

Dataset Versions

Version 1 (v1) - 166 images

  • Preprocessing: Auto-Orient and Resize (Stretch to 416x416)
  • Augmentations: No augmentations applied
  • Training Metrics: This version of the dataset was not trained
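
A stretch resize scales the horizontal and vertical axes independently, so bounding boxes have to be scaled by the same two factors. The sketch below shows that arithmetic under assumed conventions (pixel-coordinate boxes in (x_min, y_min, x_max, y_max) order, a hypothetical stretch_box helper); it is not taken from Roboflow's preprocessing code.

```python
def stretch_box(box, src_size, dst_size=(416, 416)):
    """Scale an (x_min, y_min, x_max, y_max) box for a stretch resize.

    src_size and dst_size are (width, height) tuples. Each axis gets its
    own scale factor because "stretch" ignores the aspect ratio.
    """
    x_min, y_min, x_max, y_max = box
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    scale_x = dst_w / src_w
    scale_y = dst_h / src_h
    return (x_min * scale_x, y_min * scale_y, x_max * scale_x, y_max * scale_y)
```

A box covering the whole source image always maps to the whole 416x416 target, whatever the original aspect ratio was.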

The KAIST dataset is originally based on pairs of thermal infrared images and corresponding RGB images. I built a model that converts night-time infrared images to day-time RGB using TIC-cGAN (https://arxiv.org/pdf/1810.05399). I trained a YOLOv5 model on day-time RGB images and tested it on the generated (fake) day-time images produced from night-time infrared input to measure the mAP of the proposed approach.

This car dataset can be used to detect non-commercial four-wheeled vehicles of different makes and models from multiple angles.

Use this cars dataset and detection API to create computer vision applications for car counting, traffic density, parking monitoring, and more!

Use your home security camera to detect when parking spots are available using code from this object detection project:
https://blog.roboflow.com/object-tracking-how-to/

Self-Driving Thermal Object-Detection

Overview

This model detects potentially moving objects (cars, bicycles, people, and dogs), to aid in self-driving and autonomous vehicles.

Dataset

The dataset comprises over twelve thousand thermal images, most of which annotate cars.

Parking Lot Dataset

The PKLot dataset contains 12,416 images of parking lots extracted from surveillance camera frames. There are images from sunny, cloudy, and rainy days, and the parking spaces are labeled as occupied or empty. We have converted the original annotations to a variety of standard object detection formats by enclosing each of the original dataset's rotated rectangle annotations in a bounding box.
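
Enclosing a rotated rectangle in an axis-aligned bounding box comes down to rotating its four corners and taking the minimum and maximum of each coordinate. The sketch below assumes the rectangle is given as a center, a size, and an angle in degrees; that parameterization and the function names are assumptions for illustration, not the conversion code actually used for this dataset.

```python
import math


def rotated_rect_corners(cx, cy, width, height, angle_deg):
    """Return the four corners of a rectangle rotated about its center."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half_w, half_h = width / 2, height / 2
    corners = []
    for dx, dy in ((-half_w, -half_h), (half_w, -half_h),
                   (half_w, half_h), (-half_w, half_h)):
        # Standard 2D rotation of the offset, then shift back to the center.
        corners.append((cx + dx * cos_t - dy * sin_t,
                        cy + dx * sin_t + dy * cos_t))
    return corners


def enclosing_box(corners):
    """Smallest axis-aligned (x_min, y_min, x_max, y_max) around the corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return (min(xs), min(ys), max(xs), max(ys))
```

Note that the enclosing box is always at least as large as the rotated rectangle, so boxes for tilted parking spaces include some pixels from neighboring spaces.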

Using this Dataset

The PKLot database is licensed under a Creative Commons Attribution 4.0 License and may be used provided you acknowledge the source by citing the PKLot paper in publications about your research:

Almeida, P., Oliveira, L. S., Silva Jr, E., Britto Jr, A., Koerich, A., PKLot – A robust dataset for parking lot classification, Expert Systems with Applications, 42(11):4937-4949, 2015.

This dataset was originally created by Justin Henke and Reginald Viray. To see the current project, which may have been updated since this version, please go here: https://universe.roboflow.com/psi-dhxqe/psi-rossville-pano.

This dataset is part of RF100, an Intel-sponsored initiative to create a new object detection benchmark for model generalizability.

Access the RF100 GitHub repo: https://github.com/roboflow-ai/roboflow-100-benchmark

This dataset was originally created by Yudha Bhakti Nugraha and Kris. To see the current project, which may have been updated since this version, please go here: https://universe.roboflow.com/7-class/11-11-2021-09.41.

This dataset is part of RF100, an Intel-sponsored initiative to create a new object detection benchmark for model generalizability.

Access the RF100 GitHub repo: https://github.com/roboflow-ai/roboflow-100-benchmark

Occluded Object Dataset

Data Collection:

We took 3 different cars: a minivan, a wagon, and an SUV. Then, we recorded road scenes of Dhaka at the same time from 3 different perspectives. Each of the 3 cars kept to a different lane and had 3 camera angles, but all of them cruised along the same roads at the same time.
We carried this on for 1 hour and 15 minutes.


Overview

This dataset contains 627 images of various vehicle classes for object detection. These images are derived from the Open Images open source computer vision datasets.

This dataset only scratches the surface of the Open Images dataset for vehicles!


Use Cases

  • Train an object detector to differentiate between a car, bus, motorcycle, ambulance, and truck
  • Use as a checkpoint for an autonomous vehicle object detector
  • Test an object detector on vehicle images with a high density of ambulances
  • Train an ambulance detector
  • Explore the quality and range of the Open Images dataset

Tools Used to Derive Dataset


These images were gathered via the OIDv4 Toolkit. This toolkit allows you to pick an object class and retrieve a set number of images from that class with bounding box labels.

We provide this dataset as an example of the ability to query the OID for a given subdomain. This dataset can easily be scaled up - please reach out to us if that interests you.

Overview

The original Udacity Self Driving Car Dataset is missing labels for thousands of pedestrians, bikers, cars, and traffic lights. This will result in poor model performance. When used in the context of self-driving cars, this could even lead to human fatalities.

We re-labeled the dataset to correct errors and omissions. We have provided convenient downloads in many formats including VOC XML, COCO JSON, TensorFlow Object Detection TFRecords, and more.

Some examples of labels missing from the original dataset:
Examples of Missing Labels

Stats

The dataset contains 97,942 labels across 11 classes and 15,000 images. There are 1,720 null examples (images with no labels).

All images are 1920x1200 (download size ~3.1 GB). We have also provided a version downsampled to 512x512 (download size ~580 MB) that is suitable for most common machine learning models (including YOLO v3, Mask R-CNN, SSD, and MobileNet).

Annotations have been hand-checked for accuracy by Roboflow.

Class Balance

Annotation Distribution:
Annotation Heatmap

Use Cases

Udacity is building an open source self driving car! You might also try using this dataset to do person-detection and tracking.

Using this Dataset

Our updates to the dataset are released under the MIT License (the same license as the original annotations and images).

Note: the dataset contains many duplicated bounding boxes for the same subject, which we have not corrected. You will probably want to filter them by computing the IoU between boxes of the same class and dropping those that overlap 100%; otherwise they could affect your model performance (especially in stoplight detection, which seems to suffer from an especially severe case of duplicated bounding boxes).
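
One way to do this filtering is sketched below: compute the intersection over union (IoU) for pairs of boxes with the same class and keep only the first box of any group that overlaps almost completely. The annotation layout (a dict with "class" and "box" keys), the function names, and the 0.99 threshold are assumptions for this example; adjust them to the export format you download.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def drop_duplicates(annotations, threshold=0.99):
    """Keep the first of any same-class boxes whose IoU reaches the threshold.

    annotations is a list of {"class": str, "box": (x_min, y_min, x_max, y_max)}
    dicts for a single image.
    """
    kept = []
    for ann in annotations:
        is_duplicate = any(
            k["class"] == ann["class"] and iou(k["box"], ann["box"]) >= threshold
            for k in kept
        )
        if not is_duplicate:
            kept.append(ann)
    return kept
```

The comparison is quadratic in the number of boxes per image, which is fine for a few dozen labels per frame. Boxes of different classes are never merged, even when they coincide exactly.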

About Roboflow

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

Developers reduce 50% of their boilerplate code when using Roboflow's workflow, save training time, and increase model reproducibility.