Top Agriculture Datasets

The use cases for computer vision in agriculture are wide-ranging: weed detection, crop disease treatment, automated spraying via drones, autonomous tractors, color sorting, and livestock monitoring. These datasets and pre-trained models can help farmers optimize productivity, boost yield, decrease costs, and increase profits.

For more information: https://roboflow.com/industries/agriculture

Top 6 agriculture datasets: https://blog.roboflow.com/top-agriculture-datasets-computer-vision/

PlantDoc dataset overview: https://blog.roboflow.com/introducing-an-improved-plantdoc-dataset-for-plant-disease-object-detection/

Overview

The Weeds dataset is a collection of garden weeds that can easily confuse object detection models because the weeds look so similar to their surroundings. This dataset was used with YOLOR to detect weeds against complex backgrounds.

Example Footage!

Weeds Detection

Training and Deployment

The Weeds model was trained in Roboflow and is available for inference on the Dataset tab.

You can also build a Weeds Detector using YOLOR and deploy it through the Roboflow platform for robust, real-time detection. You can learn more here: https://augmentedstartups.info/YOLOR-Get-Started

About Augmented Startups

We are at the forefront of Artificial Intelligence in computer vision. With over 92k subscribers on YouTube, we embark on fun and innovative projects in this field and create videos and courses so that everyone can become an expert in it. Our vision is to create a world full of inventors who can turn their dreams into reality.

This project was created by Arfiani Nur Sayidah and is for sorting "apples" from "damaged apples."

The classes are "apple" and "damaged_apples"
Original Class Balance:

  1. apple: 2,152
  2. damaged_apple: 708
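
The class balance above is fairly skewed, and a short calculation makes the skew concrete. The sketch below uses only the two counts listed; the inverse-frequency weighting is a common heuristic shown for illustration, not something the project states it used.

```python
# Class counts copied from the "Original Class Balance" list above.
class_counts = {"apple": 2152, "damaged_apple": 708}


def class_shares(counts):
    """Return each class's fraction of all annotations."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def inverse_frequency_weights(counts):
    """Weight each class by total / (num_classes * count).

    This is a generic heuristic (an assumption here, not part of the dataset).
    """
    total = sum(counts.values())
    num_classes = len(counts)
    return {name: total / (num_classes * n) for name, n in counts.items()}


shares = class_shares(class_counts)
weights = inverse_frequency_weights(class_counts)

for name in class_counts:
    print(f"{name}: share={shares[name]:.3f}, weight={weights[name]:.2f}")
```

Roughly three of every four annotations are "apple", so a model that ignores the damaged class can still look accurate on raw counts.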

Soybeans kernels counter

Background Information

This dataset was curated and annotated by Mohamed Traore from the Roboflow Team. It is a custom dataset composed of one class (chicken). The main objective is to identify chickens and perform object tracking on them using Roboflow's "zero shot object tracking."

The original video is from Wendy Thomas (Description: "Definitive proof that the chicken crossed the road to get to the other side.")

The original custom dataset (v1) is composed of 106 images of chickens and their surrounding environment.

The dataset is available under the Public License.

Zero Shot Object Tracking

Example - Zero Shot Object Tracking

Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

Dataset Versions

Version 1 (v1) - 106 images

  • Preprocessing: Auto-Orient
  • Augmentations: No augmentations applied
  • Training Metrics: This version of the dataset was not trained

Version 2 (v2) - 106 images

  • Preprocessing: Auto-Orient and Resize (Stretch to 416x416)
  • Augmentations: No augmentations applied
  • Training Metrics: This version of the dataset was not trained

Version 3 (v3), "v1-augmented-COCO-transferLearning" - 254 images

Trained from the COCO Checkpoint in Public Models ("transfer learning") on Roboflow

  • 3x image generation
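
The jump from 106 source images to 254 images in Version 3 is consistent with 3x generation being applied to a training split only, with the held-out images left untouched. That split is an inference from the numbers, not something stated here, so treat the sketch below as one plausible reading; the function name is hypothetical.

```python
def implied_train_count(source_images, version_images, outputs_per_example):
    """Solve k*t + (n - t) = total for the training-split size t.

    Assumes (not stated in the source) that only training images are
    multiplied and every other image is kept exactly once.
    Returns None when no whole-number split fits.
    """
    extra = version_images - source_images
    per_image_gain = outputs_per_example - 1
    if per_image_gain <= 0 or extra < 0 or extra % per_image_gain != 0:
        return None
    train = extra // per_image_gain
    return train if train <= source_images else None


# Figures from Version 1 (106 images) and Version 3 (254 images, 3x generation).
train = implied_train_count(106, 254, 3)
held_out = 106 - train
print(train, held_out)
```

Under this assumption the training split would hold 74 images and the remaining 32 would be validation and test images.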

Version 11 (v11), "v1-augmented-trainFromScratch" - 463 images

Trained from the Version 3 training checkpoint.

  • Modify Classes was applied to remap the "chickens" class to "rooster" (meaning "rooster" will show up for the bounding boxes when running inference).
  • 3x image generation

Version 12 (v12) - 185 images

  • Preprocessing: Auto-Orient, Modify Classes (remap the "chickens" class to "rooster")
  • Augmentations: No augmentations applied
  • Training Metrics: This version of the dataset was not trained
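
The class remap described for Versions 11 and 12 amounts to renaming a label on every annotation. The sketch below illustrates that idea on a hypothetical list of annotation dictionaries; it is not Roboflow's implementation, and the field names and box values are invented for the example.

```python
# Remap taken from the version notes: "chickens" becomes "rooster".
CLASS_REMAP = {"chickens": "rooster"}


def remap_labels(annotations, remap):
    """Return new annotations with class names remapped.

    Classes missing from the remap are left unchanged; the input is not mutated.
    """
    return [
        {**ann, "class": remap.get(ann["class"], ann["class"])}
        for ann in annotations
    ]


# Hypothetical annotations, for illustration only.
annotations = [
    {"class": "chickens", "x": 120, "y": 88, "width": 40, "height": 52},
    {"class": "chickens", "x": 301, "y": 140, "width": 36, "height": 47},
]

remapped = remap_labels(annotations, CLASS_REMAP)
print([ann["class"] for ann in remapped])
```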

Mohamed Traore - LinkedIn

The Apple Vision annotated data set contains over 350 images of naturally growing apples on an apple tree. Unlike other existing sets, this set attempted to capture apples growing on trees with different exposures of natural light during the daytime.

The training data comprised 77 photos taken of Peter Bloch’s home apple tree. These images were shot between July and September of 2021 on an iPhone 11 camera. After the photos were taken, they were sliced into multiple smaller images with a resolution of 360 × 640 pixels each. This resolution was selected because it is the lowest native resolution of the CV camera used later in this project.
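
The text does not say how the photos were sliced, so the following is only a sketch of one simple approach: a non-overlapping grid of 360 × 640 tiles, with partial tiles at the edges dropped. The tile orientation (360 wide, 640 tall), the grid strategy, and the example image size are all assumptions.

```python
def tile_boxes(image_w, image_h, tile_w=360, tile_h=640):
    """Yield (left, top, right, bottom) for each full tile in a grid.

    Tiles do not overlap, and partial tiles at the right and bottom
    edges are skipped.
    """
    for top in range(0, image_h - tile_h + 1, tile_h):
        for left in range(0, image_w - tile_w + 1, tile_w):
            yield (left, top, left + tile_w, top + tile_h)


# Hypothetical 1080 x 1920 image, chosen so the tiles divide it evenly.
tiles = list(tile_boxes(1080, 1920))
print(len(tiles), tiles[0], tiles[-1])
```

Each tuple can be passed to an image library's crop function; annotations would need the same offset subtracted to stay aligned with their tile.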

This set was originally created for the ECE 31 Capstone project at Oregon State University.

Beans, Strawberry and Tomato diseases - v1 2022-09-01

This dataset was exported via roboflow.com on Sep 2, 2022

Roboflow is an end-to-end computer vision platform that helps you

  • collaborate with your team on computer vision projects
  • collect & organize images
  • understand unstructured image data
  • annotate, and create datasets
  • export, train, and deploy computer vision models
  • use active learning to improve your dataset over time

It includes 5494 images.
Diseases are annotated in YOLO v7 PyTorch format.

The following pre-processing was applied to each image:

  • Auto-orientation of pixel data (with EXIF-orientation stripping)
  • Resize to 416x416 (Stretch)

No image augmentation techniques were applied.
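
For readers new to YOLO-style labels, here is a minimal sketch of reading one label line, assuming the usual YOLO text layout of `class x_center y_center width height` with coordinates normalized to the range 0 to 1. The example line is made up, and the 416x416 default matches the resize step listed above.

```python
def yolo_to_pixel_box(line, image_w=416, image_h=416):
    """Convert one YOLO label line to (class_id, left, top, right, bottom) in pixels."""
    parts = line.split()
    class_id = int(parts[0])
    x_center, y_center, width, height = (float(v) for v in parts[1:5])
    left = (x_center - width / 2) * image_w
    top = (y_center - height / 2) * image_h
    right = (x_center + width / 2) * image_w
    bottom = (y_center + height / 2) * image_h
    return class_id, left, top, right, bottom


# Hypothetical label line: class 3, centered, a quarter of the width, half the height.
box = yolo_to_pixel_box("3 0.5 0.5 0.25 0.5")
print(box)
```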

Classes:

  • Strawberry
    • 'Angular Leafspot'
    • 'Anthracnose Fruit Rot'
    • 'Blossom Blight'
    • 'Gray Mold'
    • 'Leaf Spot'
    • 'Powdery Mildew Fruit'
    • 'Powdery Mildew Leaf'
  • Tomato
    • 'disease'
    • 'leaf mold'
    • 'spider mites'
  • Bean
    • 'ALS'
    • 'Bean Rust'

Purpose of the Project

This project started as a way to add real-time counts of bees with/without pollen entering my backyard beehive to append some additional information to a livestream of the hive, and to correlate behavior at the hive entrance to weather, temperature, etc. Since then, I've added additional training data not specific to my hive which accounts for classification of drones and queens in addition to bees/pollen bees. Currently the model generalizes reasonably well, but more training data is required.

Assessed Classes & Labeling Guidelines

  • bees (either workers or foragers)
  • bees carrying pollen
  • drones
  • queens

Labeling should cover the entire body of the bee, excluding the wings as per the following example:

For the class of bees carrying pollen, it is acceptable to extend the box to include the visible pollen packs to distinguish this from the bee class:

Sample Results

Sample video of backyard hive entrance with low to moderate level of activity:
https://www.youtube.com/watch?v=qZW5eYd0Yw8&t=2266s

Generic sample video of single bee:
https://www.youtube.com/watch?v=A1x6VA8TWCg

Latest YOLOv5 Weights Files

The latest weights files for use by others are posted on my GitHub here: https://github.com/mattnudi/bee-detection

These files will be updated as more images are added to the dataset.

Background Information

This dataset was curated and annotated by Ahmed Elmogtaba Abdelaziz.

The original dataset (v6) is composed of 204 images of honeybees present in a wide variety of scenes.
Example of an Annotated Image from the Dataset

The dataset is available under a Public License.

Getting Started

You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

Dataset Versions

Version 5 - 490 images

  • Preprocessing: Resize (416x416)
  • Augmentations:
    • 90° Rotate: Clockwise, Counter-Clockwise
      Rotation: Between -15° and +15°
      Saturation: Between -10% and +10%
      Brightness: Between -10% and +10%
      Blur: Up to 0.25px
      Mosaic: Applied
  • Output: 3x image generation

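2022-10-19: Found a bug where images exported from Roboflow are rotated but their annotations are not! Auto-orient does not help either.

Tremella (Snow Fungus) Defect Dataset

Defects can be any combination of the following conditions, which makes labeling tricky.

  • m: mold
  • l: rot
  • b: black
  • q: cut damage
  • g: root
  • n: adhesion
    Boxes with w, h below 0.04% count as small objects

e1 (must be detected functionally; hard to label, so only a few were labeled)

  • Defect regions that are very easy to distinguish

e-m (AP .95)

  • White patches (leaves turning into white patches, on one or several leaves; accompanied by collapse)
  • Hairy growth (white or green fuzz)
  • Mold: rotted-away mold, rotten green
  • Mold: cap turning moldy, yellow-green

e-b

  • Pure "whiskers": a small tuft or a patch (usually distributed in a ring) (acceptable if the area is small)

e-b-l

  • Whiskers plus abnormal surrounding color: a slightly rotten color (acceptable if the area is small)

e-l (needs further subdivision, AP .27)

  • Small area of mild surface rot, oily and tending toward brown
  • Large area of mild surface rot, with a large oily area

e-l-q

  • Parts cut away because of surface rot

e-l-n

  • Mild surface rot plus adhered dirt

e-l-bb

  • Severe surface rot (brown, dark brown)

e-l-g

  • e-l around the root

e-n (too little data)

  • Dust stuck on after washing
  • Marks from touching the oven wall

err

  • Unsightly pieces: poor color
  • Items from other categories are sometimes placed in this class too

e-s: string (very few samples)

e-f (too little data)

  • Flattened pieces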

This Dataset contains images of popular North American mushrooms, Chicken of the Woods and Chanterelle, differentiating between the two species.

This dataset is an example of an object detection task that is possible via custom training with Roboflow.

Two versions are listed. "416x416" is a 416 resolution version that contains the base images in the dataset. "416x416augmented" contains the same images with various image augmentations applied to build a more robust model.

Background Information

This dataset was curated and annotated by Karel Cornelis.

The original dataset (v1) is composed of 516 images of various ingredients inside a fridge. The project was created as part of group work for a postgraduate program in applied AI at Erasmus Brussels, where we built an object detection model to identify ingredients in a fridge.

From the recipe dataset we used (a subset of the Recipe1M dataset), we distilled the top 50 ingredients and used 30 of those to randomly fill our fridge.

Read this blog post to learn more about the model production process: How I Used Computer Vision to Make Sense of My Fridge

Watch this video to see the model in action: AICook

The dataset is available under the MIT License.

Getting Started

You can download this dataset for use within your own project, fork it into a workspace on Roboflow to create your own model, or test one of the trained versions within the app.

Dataset Versions

Version 1 (v1) - 516 images (original-images)

  • Preprocessing: Auto-Orient
  • Augmentations: No augmentations applied
  • Training Metrics: This version of the dataset was not trained

Version 2 (v2) - 3,050 images (aicook-augmented-trainFromCOCO)

  • Preprocessing: Auto-Orient, Resize (Stretch to 416x416)
  • Augmentations:
    • Outputs per training example: 8
      Rotation: Between -3° and +3°
      Exposure: Between -20% and +20%
      Blur: Up to 3px
      Noise: Up to 5% of pixels
      Cutout: 12 boxes with 10% size each
  • Training Metrics: Trained from the COCO Checkpoint in Public Models ("transfer learning") on Roboflow
    • mAP = 97.6%, precision = 86.9%, recall = 98.5%

Version 3 (v3) - 3,050 images (aicook-augmented-trainFromScratch)

  • Preprocessing: Auto-Orient, Resize (Stretch to 416x416)
  • Augmentations:
    • Outputs per training example: 8
      Rotation: Between -3° and +3°
      Exposure: Between -20% and +20%
      Blur: Up to 3px
      Noise: Up to 5% of pixels
      Cutout: 12 boxes with 10% size each
  • Training Metrics: Trained from "scratch" (no transfer learning employed) on Roboflow
    • mAP = 97.9%, precision = 79.6%, recall = 98.6%

Version 4 (v4) - 3,050 images (aicook-augmented)

  • Preprocessing: Auto-Orient, Resize (Stretch to 416x416)
  • Augmentations:
    • Outputs per training example: 8
      Rotation: Between -3° and +3°
      Exposure: Between -20% and +20%
      Blur: Up to 3px
      Noise: Up to 5% of pixels
      Cutout: 12 boxes with 10% size each
  • Training Metrics: This version of the dataset was not trained
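
Versions 2 and 3 report very similar mAP and recall but noticeably different precision. One way to fold precision and recall into a single number is the F1 score (their harmonic mean). F1 is not reported in the metrics above; the sketch below derives it from the listed precision and recall only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) as fractions, copied from the Training Metrics lines above.
reported = {
    "v2 (COCO checkpoint)": (0.869, 0.985),
    "v3 (from scratch)": (0.796, 0.986),
}

f1 = {name: f1_score(p, r) for name, (p, r) in reported.items()}
for name, value in f1.items():
    print(f"{name}: F1 = {value:.3f}")
```

On these figures the transfer-learning version comes out at about 0.923 and the from-scratch version at about 0.881, so the gap between them is mostly a precision gap.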

Karel Cornelis - LinkedIn

Image example

Overview

This dataset contains 581 images of various shellfish classes for object detection. These images are derived from the Open Images open source computer vision datasets.

This dataset only scratches the surface of the Open Images dataset for shellfish!

Image example

Use Cases

  • Train an object detector to differentiate between a lobster, shrimp, and crab
  • Train an object detector to differentiate between shellfish
  • Object detection dataset across different sub-species
  • Object detection among related species
  • Test an object detector on highly related objects
  • Train a shellfish detector
  • Explore the quality and range of the Open Images dataset

Tools Used to Derive Dataset

Image example

These images were gathered via the OIDv4 Toolkit. This toolkit allows you to pick an object class and retrieve a set number of images from that class with bounding box labels.

We provide this dataset as an example of the ability to query the OID for a given subdomain. This dataset can easily be scaled up - please reach out to us if that interests you.

This dataset is derived from the following publication:

Kaspars Sudars, Janis Jasko, Ivars Namatevs, Liva Ozola, Niks Badaukis,
Dataset of annotated food crops and weed images for robotic computer vision control,
Data in Brief,
Volume 31,
2020,
105833,
ISSN 2352-3409,
https://doi.org/10.1016/j.dib.2020.105833.
(https://www.sciencedirect.com/science/article/pii/S2352340920307277)
Abstract: Weed management technologies that can identify weeds and distinguish them from crops are in need of artificial intelligence solutions based on a computer vision approach, to enable the development of precisely targeted and autonomous robotic weed management systems. A prerequisite of such systems is to create robust and reliable object detection that can unambiguously distinguish weed from food crops. One of the essential steps towards precision agriculture is using annotated images to train convolutional neural networks to distinguish weed from food crops, which can be later followed using mechanical weed removal or selected spraying of herbicides. In this data paper, we propose an open-access dataset with manually annotated images for weed detection. The dataset is composed of 1118 images in which 6 food crops and 8 weed species are identified, altogether 7853 annotations were made in total. Three RGB digital cameras were used for image capturing: Intel RealSense D435, Canon EOS 800D, and Sony W800. The images were taken on food crops and weeds grown in controlled environment and field conditions at different growth stages
Keywords: Computer vision; Object detection; Image annotation; Precision agriculture; Crop growth and development

Overview

The Aerial Sheep dataset contains images taken from a bird's-eye view with instances of sheep in them. Images do not differentiate between gender or breed of sheep, instead grouping them into a single class named "sheep".

Example Footage

Aerial Sheep

See RIIS's sheep counter application for additional use case examples.
Link - https://riis.com/blog/counting-sheep-using-drones-and-ai/

About RIIS

https://riis.com/about/

Banana Ripening Process Dataset and Model

This dataset contains images of the classes below:

  • freshripe
  • freshunripe
  • overripe
  • ripe
  • rotten
  • unripe

Usage

This object detection model can potentially be used to identify where fruit in stores is in the ripening process, and when to take it off the shelves and send it to composting.

This dataset was originally created by Melanie S. Capalungan, "B-Jay" Daguio, Isaac Balbuena, Reanne Joy Rafael. To see the current project, which may have been updated since this version, please go here: https://universe.roboflow.com/molds-onbk3/peanuts-mckge/.

This dataset is part of RF100, an Intel-sponsored initiative to create a new object detection benchmark for model generalizability.

Access the RF100 Github repo: https://github.com/roboflow-ai/roboflow-100-benchmark

This dataset was originally created by Roopa Shree, Shriya J. To see the current project, which may have been updated since this version, please go here: https://universe.roboflow.com/cotton-nqp2x/bt-cotton.

This dataset is part of RF100, an Intel-sponsored initiative to create a new object detection benchmark for model generalizability.

Access the RF100 Github repo: https://github.com/roboflow-ai/roboflow-100-benchmark

This dataset was originally created by Arfiani Nur Sayidah. To see the current project, which may have been updated since this version, please go here: https://universe.roboflow.com/arfiani-nur-sayidah-9lizr/apple-sorting-2bfhk.

This dataset is part of RF100, an Intel-sponsored initiative to create a new object detection benchmark for model generalizability.

Access the RF100 Github repo: https://github.com/roboflow-ai/roboflow-100-benchmark

This dataset was originally created by Nirmani. To see the current project, which may have been updated since this version, please go here: https://universe.roboflow.com/nirmani/yolo-custome-925.

This dataset is part of RF100, an Intel-sponsored initiative to create a new object detection benchmark for model generalizability.

Access the RF100 Github repo: https://github.com/roboflow-ai/roboflow-100-benchmark

This dataset was originally created by Quandong Qian. To see the current project, which may have been updated since this version, please go here: https://universe.roboflow.com/quandong-qian/desease-cotton-plant.

This dataset is part of RF100, an Intel-sponsored initiative to create a new object detection benchmark for model generalizability.

Access the RF100 Github repo: https://github.com/roboflow-ai/roboflow-100-benchmark

This dataset was originally created by Rinat Landman. To see the current project, which may have been updated since this version, please go here: https://universe.roboflow.com/lettucedetector/complete_dataset_0910.

This dataset is part of RF100, an Intel-sponsored initiative to create a new object detection benchmark for model generalizability.

Access the RF100 Github repo: https://github.com/roboflow-ai/roboflow-100-benchmark

This dataset was originally created by Jan Douwe. To see the current project, which may have been updated since this version, please go here: https://universe.roboflow.com/jan-douwe/testbl.

This dataset is part of RF100, an Intel-sponsored initiative to create a new object detection benchmark for model generalizability.

Access the RF100 Github repo: https://github.com/roboflow-ai/roboflow-100-benchmark

This dataset was originally created by 윤태원 (yuntaewon), 황혜윤 (hwanghyeyun), 김민서 (gimminseo), 김노현 (gimnohyeon) , 신다홍 (sindahong), 김성수 (gimseongsu). To see the current project, which may have been updated since this version, please go here: https://universe.roboflow.com/puri/puri4-ygapu.

This dataset is part of RF100, an Intel-sponsored initiative to create a new object detection benchmark for model generalizability.

Access the RF100 Github repo: https://github.com/roboflow-ai/roboflow-100-benchmark

We will try to retain the wide, short shape of these images. This is how the Oak-D camera will send video to the NN model, so that the weed-spraying wand is directed left or right only, because it doesn't have fore-and-aft movement yet. The machine moves forward, of course, and we plan for a speed that allows good timing when spraying the weeds. For this reason, only the strongest detection will be sent to the Raspberry Pi and servo to move the wand; there won't be time to spray two spots in one image. We will train with YOLOv5 from Ultralytics, export to OpenVINO format, then convert that into the "blob" that the Oak requires. The Raspberry Pi will also run the main Python script and control the relay and pump for herbicide application.
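
The "strongest detection only" rule can be sketched in a few lines. This is not the project's actual script: the detection dictionary fields, the `deadband` parameter, and the frame width are hypothetical, chosen only to show picking the highest-confidence detection and turning its horizontal position into a left or right command.

```python
def wand_command(detections, frame_width, deadband=0.05):
    """Return 'left', 'right', or 'center' for the strongest detection.

    Returns None when there are no detections. `deadband` is the fraction of
    the frame width around the center that is treated as already on target.
    """
    if not detections:
        return None
    # Keep only the single highest-confidence detection.
    best = max(detections, key=lambda d: d["confidence"])
    offset = (best["x_center"] - frame_width / 2) / frame_width
    if offset < -deadband:
        return "left"
    if offset > deadband:
        return "right"
    return "center"


# Hypothetical detections in a 640-pixel-wide frame.
detections = [
    {"x_center": 100, "confidence": 0.40},
    {"x_center": 500, "confidence": 0.90},
]
command = wand_command(detections, frame_width=640)
print(command)
```

Here the weaker detection on the left is ignored, and the wand is sent toward the stronger one on the right.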

deepNIR: Dataset for generating synthetic NIR images and improved fruit detection system using deep learning techniques

This page introduces our Blueberry dataset (bounding box), imported from the deepFruits dataset and used for producing results. If this dataset is helpful for your research or application, please consider citing our paper as follows:

@Article{s22134721,
  AUTHOR = {Sa, Inkyu and Lim, Jong Yoon and Ahn, Ho Seok and MacDonald, Bruce},
  TITLE = {deepNIR: Datasets for Generating Synthetic NIR Images and Improved Fruit Detection System Using Deep Learning Techniques},
  JOURNAL = {Sensors},
  VOLUME = {22},
  YEAR = {2022},
  NUMBER = {13},
  ARTICLE-NUMBER = {4721},
  URL = {https://www.mdpi.com/1424-8220/22/13/4721},
  PubMedID = {35808218},
  ISSN = {1424-8220},
  DOI = {10.3390/s22134721}
}

You can download our published paper from:

https://www.mdpi.com/1424-8220/22/13/4721

If you have questions, suggestions, or concerns, please contact us at enddl22@gmail.com

deepNIR: Dataset for generating synthetic NIR images and improved fruit detection system using deep learning techniques

This page introduces our Avocado dataset (bounding box), imported from the deepFruits dataset and used for producing results. If this dataset is helpful for your research or application, please consider citing our paper as follows:

@Article{s22134721,
  AUTHOR = {Sa, Inkyu and Lim, Jong Yoon and Ahn, Ho Seok and MacDonald, Bruce},
  TITLE = {deepNIR: Datasets for Generating Synthetic NIR Images and Improved Fruit Detection System Using Deep Learning Techniques},
  JOURNAL = {Sensors},
  VOLUME = {22},
  YEAR = {2022},
  NUMBER = {13},
  ARTICLE-NUMBER = {4721},
  URL = {https://www.mdpi.com/1424-8220/22/13/4721},
  PubMedID = {35808218},
  ISSN = {1424-8220},
  DOI = {10.3390/s22134721}
}

You can download our published paper from:

https://www.mdpi.com/1424-8220/22/13/4721

If you have questions, suggestions, or concerns, please contact us at enddl22@gmail.com

Allergen30


About Allergen30

Allergen30 was created by Mayank Mishra, Nikunj Bansal, Tanmay Sarkar and Tanupriya Choudhury with the goal of building a robust detection model that can help people avoid possible allergic reactions.

It contains more than 6,000 images of 30 commonly used food items which can cause an adverse reaction within a human body. This dataset is one of the first research attempts in training a deep learning based computer vision model to detect the presence of such food items from images. It also serves as a benchmark for evaluating the efficacy of object detection methods in learning the otherwise difficult visual cues related to food items.

Description of class labels

Multiple food items are associated with specific food intolerances that can trigger an allergic reaction. These intolerances primarily include Lactose, Histamine, Gluten, Salicylate, Caffeine and Ovomucoid intolerance.
Food intolerance

The following table contains the description relating to the 30 class labels in our dataset.

S. No. | Allergen | Food label | Description
1 | Ovomucoid | egg | Images of egg with yolk (e.g. sunny side up eggs)
2 | Ovomucoid | whole_egg_boiled | Images of soft and hard boiled eggs
3 | Lactose/Histamine | milk | Images of milk in a glass
4 | Lactose | icecream | Images of ice cream scoops
5 | Lactose | cheese | Images of Swiss cheese
6 | Lactose/Caffeine | milk_based_beverage | Images of tea/coffee with milk in a cup/glass
7 | Lactose/Caffeine | chocolate | Images of chocolate bars
8 | Caffeine | non_milk_based_beverage | Images of soft drinks and tea/coffee without milk in a cup/glass
9 | Histamine | cooked_meat | Images of cooked meat
10 | Histamine | raw_meat | Images of raw meat
11 | Histamine | alcohol | Images of alcohol bottles
12 | Histamine | alcohol_glass | Images of wine glasses with alcohol
13 | Histamine | spinach | Images of spinach bundle
14 | Histamine | avocado | Images of avocado sliced in half
15 | Histamine | eggplant | Images of eggplant
16 | Salicylate | blueberry | Images of blueberry
17 | Salicylate | blackberry | Images of blackberry
18 | Salicylate | strawberry | Images of strawberry
19 | Salicylate | pineapple | Images of pineapple
20 | Salicylate | capsicum | Images of bell pepper
21 | Salicylate | mushroom | Images of mushrooms
22 | Salicylate | dates | Images of dates
23 | Salicylate | almonds | Images of almonds
24 | Salicylate | pistachios | Images of pistachios
25 | Salicylate | tomato | Images of tomato and tomato slices
26 | Gluten | roti | Images of roti
27 | Gluten | pasta | Images of one serving of penne pasta
28 | Gluten | bread | Images of bread slices
29 | Gluten | bread_loaf | Images of bread loaf
30 | Gluten | pizza | Images of pizza and pizza slices

Data collection

We used search engines (Google and Bing) to crawl and look for suitable images using JavaScript queries for each food item from the list created. The images with incomplete RGB channels were removed, and the images collected from different search engines were compiled. When downloading images from search engines, many images were irrelevant to the purpose, especially the ones with a lot of text in them. We deployed the EAST text detector to segregate such images. Finally, a comprehensive manual inspection was conducted to ensure the relevancy of images in the dataset.

Fair use

This dataset contains some copyrighted material whose use has not been specifically authorized by the copyright owners. In an effort to advance scientific research, we make this material available for academic research. If you wish to use copyrighted material in our dataset for purposes of your own that go beyond non-commercial research and academic purposes, you must obtain permission directly from the copyright owner. We believe this constitutes a 'fair use' of any such copyrighted material as provided for in section 107 of the US Copyright Law. In accordance with Title 17 U.S.C. Section 107, the material on this site is distributed without profit to those who have expressed a prior interest in receiving the included information for non-commercial research and educational purposes. (Adapted from Christopher Thomas.)

Citation

If you find our dataset useful, please cite us as:

@article{mishra2022allergen30,
  title={Allergen30: Detecting Food Items with Possible Allergens Using Deep Learning-Based Computer Vision},
  author={Mishra, Mayank and Sarkar, Tanmay and Choudhury, Tanupriya and Bansal, Nikunj and Smaoui, Slim and Rebezov, Maksim and Shariati, Mohammad Ali and Lorenzo, Jose Manuel},
  journal={Food Analytical Methods},
  pages={1--34},
  year={2022},
  publisher={Springer}
}

Overview

The PlantDoc dataset was originally published by researchers at the Indian Institute of Technology, and described in depth in their paper. One of the paper’s authors, Pratik Kayal, shared the object detection dataset available on GitHub.

PlantDoc is a dataset of 2,569 images across 13 plant species and 30 classes (diseased and healthy) for image classification and object detection. There are 8,851 labels. Read more about how the version available on Roboflow improves on the original version here.

And here's an example image:

Tomato Blight

Fork this dataset (upper right hand corner) to receive the raw images, or (to save space) grab the 416x416 export.

Use Cases

As the researchers from IIT stated in their paper, “plant diseases alone cost the global economy around US$220 billion annually.” Training models to recognize plant diseases earlier dramatically increases yield potential.

The dataset also serves as a useful open dataset for benchmarks. The researchers trained both object detection models like MobileNet and Faster-RCNN and image classification models like VGG16, InceptionV3, and InceptionResnet V2.

The dataset is useful for advancing general agriculture computer vision tasks, whether that be healthy crop classification, plant disease classification, or plant disease object detection.

Using this Dataset

This dataset follows Creative Commons 4.0 protocol. You may use it commercially without Liability, Trademark use, Patent use, or Warranty.

Provide the following citation for the original authors:

@misc{singh2019plantdoc,
  title={PlantDoc: A Dataset for Visual Plant Disease Detection},
  author={Davinder Singh and Naman Jain and Pranjali Jain and Pratik Kayal and Sudhakar Kumawat and Nipun Batra},
  year={2019},
  eprint={1911.10317},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

About Roboflow

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

Developers reduce 50% of their code when using Roboflow's workflow, automate annotation quality assurance, save training time, and increase model reproducibility.


Overview

The Fruits dataset is an image classification dataset of various fruits against white backgrounds from various angles, originally open sourced by GitHub user horea. This is a subset of that full dataset.

Example Image:
Example Image

Use Cases

Build a fruit classifier! This could be a just-for-fun project, or it could be the start of a color sorter for agricultural use cases that checks fruits before they make their way to market.

Using this Dataset

Use the fork button to copy this dataset to your own Roboflow account and export it with new preprocessing settings (perhaps resized for your model's desired format or converted to grayscale), or additional augmentations to make your model generalize better. This particular dataset would be very well suited for Roboflow's new advanced Bounding Box Only Augmentations.
