ImageClassification

  • Number of tasks: 22
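
Most tasks listed below report accuracy as their score. As a rough illustration of what that number means, here is a minimal sketch in plain Python. It is not the benchmark's own evaluation code; the function name `accuracy` and the sample labels are hypothetical and chosen only for this example.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that exactly match the reference label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count positions where the predicted class equals the true class.
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels for five images from a 10-class task such as CIFAR10.
labels = [3, 8, 8, 0, 6]
predictions = [3, 8, 1, 0, 6]
print(accuracy(labels, predictions))  # 4 of 5 predictions are correct
```

Accuracy treats every image equally, so on a task with unbalanced class frequencies (GTSRB is described that way below) frequent classes weigh more heavily in the score.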

Birdsnap

Classifying bird images from 500 species.

Dataset: isaacchung/birdsnap • License: not specified

Task category | Score | Languages | Domains | Annotations Creators | Sample Creation
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created
Citation
@inproceedings{Berg_2014_CVPR,
  author = {Berg, Thomas and Liu, Jiongxin and Woo Lee, Seung and Alexander, Michelle L. and Jacobs, David W. and Belhumeur, Peter N.},
  booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  title = {Birdsnap: Large-scale Fine-grained Visual Categorization of Birds},
  year = {2014},
}

CIFAR10

Classifying images from 10 classes.

Dataset: uoft-cs/cifar10 • License: not specified

Task category | Score | Languages | Domains | Annotations Creators | Sample Creation
image to category (i2c) | accuracy | eng | Web | derived | created
Citation
@techreport{Krizhevsky09learningmultiple,
  author = {Alex Krizhevsky},
  institution = {},
  title = {Learning multiple layers of features from tiny images},
  year = {2009},
}

CIFAR100

Classifying images from 100 classes.

Dataset: uoft-cs/cifar100 • License: not specified

Task category | Score | Languages | Domains | Annotations Creators | Sample Creation
image to category (i2c) | accuracy | eng | Web | derived | created
Citation
@techreport{Krizhevsky09learningmultiple,
  author = {Alex Krizhevsky},
  institution = {},
  title = {Learning multiple layers of features from tiny images},
  year = {2009},
}

Caltech101

Classifying images of 101 widely varied objects.

Dataset: mteb/Caltech101 • License: not specified

Task category | Score | Languages | Domains | Annotations Creators | Sample Creation
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created
Citation
@inproceedings{1384978,
  author = {Li Fei-Fei and Fergus, R. and Perona, P.},
  booktitle = {2004 Conference on Computer Vision and Pattern Recognition Workshop},
  doi = {10.1109/CVPR.2004.383},
  keywords = {Bayesian methods;Testing;Humans;Maximum likelihood estimation;Assembly;Shape;Machine vision;Image recognition;Parameter estimation;Image databases},
  number = {},
  pages = {178-178},
  title = {Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories},
  volume = {},
  year = {2004},
}

Country211

Classifying images of 211 countries.

Dataset: clip-benchmark/wds_country211 • License: cc-by-sa-4.0

Task category | Score | Languages | Domains | Annotations Creators | Sample Creation
image to category (i2c) | accuracy | eng | Scene | derived | created
Citation
@article{radford2021learning,
  author = {Radford, Alec and Kim, Jong Wook and Hallacy, Chris and Ramesh, Aditya and Goh, Gabriel and Agarwal, Sandhini and Sastry, Girish and Askell, Amanda and Mishkin, Pamela and Clark, Jack and others},
  journal = {arXiv preprint arXiv:2103.00020},
  title = {Learning Transferable Visual Models From Natural Language Supervision},
  year = {2021},
}

DTD

Describable Textures Dataset in 47 categories.

Dataset: tanganke/dtd • License: not specified

Task category | Score | Languages | Domains | Annotations Creators | Sample Creation
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created
Citation
@inproceedings{cimpoi14describing,
  author = {M. Cimpoi and S. Maji and I. Kokkinos and S. Mohamed and A. Vedaldi},
  booktitle = {Proceedings of the {IEEE} Conf. on Computer Vision and Pattern Recognition ({CVPR})},
  title = {Describing Textures in the Wild},
  year = {2014},
}

EuroSAT

Classifying satellite images.

Dataset: timm/eurosat-rgb • License: not specified

Task category | Score | Languages | Domains | Annotations Creators | Sample Creation
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created
Citation
@article{8736785,
  author = {Helber, Patrick and Bischke, Benjamin and Dengel, Andreas and Borth, Damian},
  doi = {10.1109/JSTARS.2019.2918242},
  journal = {IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing},
  keywords = {Satellites;Earth;Remote sensing;Machine learning;Spatial resolution;Feature extraction;Benchmark testing;Dataset;deep convolutional neural network;deep learning;earth observation;land cover classification;land use classification;machine learning;remote sensing;satellite image classification;satellite images},
  number = {7},
  pages = {2217-2226},
  title = {EuroSAT: A Novel Dataset and Deep Learning Benchmark for Land Use and Land Cover Classification},
  volume = {12},
  year = {2019},
}

FER2013

Classifying facial emotions.

Dataset: clip-benchmark/wds_fer2013 • License: not specified

Task category | Score | Languages | Domains | Annotations Creators | Sample Creation
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created
Citation
@misc{goodfellow2015explainingharnessingadversarialexamples,
  archiveprefix = {arXiv},
  author = {Ian J. Goodfellow and Jonathon Shlens and Christian Szegedy},
  eprint = {1412.6572},
  primaryclass = {stat.ML},
  title = {Explaining and Harnessing Adversarial Examples},
  url = {https://arxiv.org/abs/1412.6572},
  year = {2015},
}

FGVCAircraft

Classifying aircraft images from 41 manufacturers and 102 variants.

Dataset: HuggingFaceM4/FGVC-Aircraft • License: not specified

Task category | Score | Languages | Domains | Annotations Creators | Sample Creation
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created
Citation
@misc{maji2013finegrainedvisualclassificationaircraft,
  archiveprefix = {arXiv},
  author = {Subhransu Maji and Esa Rahtu and Juho Kannala and Matthew Blaschko and Andrea Vedaldi},
  eprint = {1306.5151},
  primaryclass = {cs.CV},
  title = {Fine-Grained Visual Classification of Aircraft},
  url = {https://arxiv.org/abs/1306.5151},
  year = {2013},
}

Food101Classification

Classifying food.

Dataset: ethz/food101 • License: not specified

Task category | Score | Languages | Domains | Annotations Creators | Sample Creation
image to category (i2c) | accuracy | eng | Web | derived | created
Citation
@inproceedings{bossard14,
  author = {Bossard, Lukas and Guillaumin, Matthieu and Van Gool, Luc},
  booktitle = {European Conference on Computer Vision},
  title = {Food-101 -- Mining Discriminative Components with Random Forests},
  year = {2014},
}

GTSRB

The German Traffic Sign Recognition Benchmark (GTSRB) is a multi-class classification dataset of more than 50,000 traffic sign images. It comprises 43 classes with unbalanced class frequencies.

Dataset: clip-benchmark/wds_gtsrb • License: not specified

Task category | Score | Languages | Domains | Annotations Creators | Sample Creation
image to category (i2c) | accuracy | eng | Scene | derived | created
Citation
@inproceedings{6033395,
  author = {Stallkamp, Johannes and Schlipsing, Marc and Salmen, Jan and Igel, Christian},
  booktitle = {The 2011 International Joint Conference on Neural Networks},
  doi = {10.1109/IJCNN.2011.6033395},
  keywords = {Humans;Training;Image color analysis;Benchmark testing;Lead;Histograms;Image resolution},
  number = {},
  pages = {1453-1460},
  title = {The German Traffic Sign Recognition Benchmark: A multi-class classification competition},
  volume = {},
  year = {2011},
}

Imagenet1k

ImageNet, a large-scale ontology of images built upon the backbone of the WordNet structure.

Dataset: clip-benchmark/wds_imagenet1k • License: not specified

Task category | Score | Languages | Domains | Annotations Creators | Sample Creation
image to category (i2c) | accuracy | eng | Scene | human-annotated | created
Citation
@inproceedings{deng2009imagenet,
  author = {Deng, Jia and Dong, Wei and Socher, Richard and Li, Li-Jia and Li, Kai and Fei-Fei, Li},
  booktitle = {2009 IEEE Conference on Computer Vision and Pattern Recognition},
  organization = {IEEE},
  pages = {248--255},
  title = {ImageNet: A large-scale hierarchical image database},
  year = {2009},
}

MNIST

Classifying handwritten digits.

Dataset: ylecun/mnist • License: not specified

Task category | Score | Languages | Domains | Annotations Creators | Sample Creation
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created
Citation
@article{lecun2010mnist,
  author = {LeCun, Yann and Cortes, Corinna and Burges, CJ},
  journal = {ATT Labs [Online]. Available: http://yann.lecun.com/exdb/mnist},
  title = {MNIST handwritten digit database},
  volume = {2},
  year = {2010},
}

OxfordFlowersClassification

Classifying flowers.

Dataset: nelorth/oxford-flowers • License: not specified

Task category | Score | Languages | Domains | Annotations Creators | Sample Creation
image to category (i2c) | accuracy | eng | Reviews | derived | found
Citation
@inproceedings{4756141,
  author = {Nilsback, Maria-Elena and Zisserman, Andrew},
  booktitle = {2008 Sixth Indian Conference on Computer Vision, Graphics \& Image Processing},
  doi = {10.1109/ICVGIP.2008.47},
  keywords = {Shape;Kernel;Distributed computing;Support vector machines;Support vector machine classification;object classification;segmentation},
  number = {},
  pages = {722-729},
  title = {Automated Flower Classification over a Large Number of Classes},
  volume = {},
  year = {2008},
}

OxfordPets

Classifying animal images.

Dataset: isaacchung/OxfordPets • License: not specified

Task category | Score | Languages | Domains | Annotations Creators | Sample Creation
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created
Citation
@inproceedings{6248092,
  author = {Parkhi, Omkar M and Vedaldi, Andrea and Zisserman, Andrew and Jawahar, C. V.},
  booktitle = {2012 IEEE Conference on Computer Vision and Pattern Recognition},
  doi = {10.1109/CVPR.2012.6248092},
  keywords = {Positron emission tomography;Image segmentation;Cats;Dogs;Layout;Deformable models;Head},
  number = {},
  pages = {3498-3505},
  title = {Cats and dogs},
  volume = {},
  year = {2012},
}

PatchCamelyon

Histopathology diagnosis classification dataset.

Dataset: clip-benchmark/wds_vtab-pcam • License: not specified

Task category | Score | Languages | Domains | Annotations Creators | Sample Creation
image to category (i2c) | accuracy | eng | Medical | derived | created
Citation
@inproceedings{10.1007/978-3-030-00934-2_24,
  abstract = {We propose a new model for digital pathology segmentation, based on the observation that histopathology images are inherently symmetric under rotation and reflection. Utilizing recent findings on rotation equivariant CNNs, the proposed model leverages these symmetries in a principled manner. We present a visual analysis showing improved stability on predictions, and demonstrate that exploiting rotation equivariance significantly improves tumor detection performance on a challenging lymph node metastases dataset. We further present a novel derived dataset to enable principled comparison of machine learning models, in combination with an initial benchmark. Through this dataset, the task of histopathology diagnosis becomes accessible as a challenging benchmark for fundamental machine learning research.},
  address = {Cham},
  author = {Veeling, Bastiaan S. and Linmans, Jasper and Winkens, Jim and Cohen, Taco and Welling, Max},
  booktitle = {Medical Image Computing and Computer Assisted Intervention -- MICCAI 2018},
  editor = {Frangi, Alejandro F. and Schnabel, Julia A. and Davatzikos, Christos and Alberola-L{\'o}pez, Carlos and Fichtinger, Gabor},
  isbn = {978-3-030-00934-2},
  pages = {210--218},
  publisher = {Springer International Publishing},
  title = {Rotation Equivariant CNNs for Digital Pathology},
  year = {2018},
}

RESISC45

Remote Sensing Image Scene Classification by Northwestern Polytechnical University (NWPU).

Dataset: timm/resisc45 • License: not specified

Task category | Score | Languages | Domains | Annotations Creators | Sample Creation
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created
Citation
@article{7891544,
  author = {Cheng, Gong and Han, Junwei and Lu, Xiaoqiang},
  doi = {10.1109/JPROC.2017.2675998},
  journal = {Proceedings of the IEEE},
  keywords = {Remote sensing;Benchmark testing;Spatial resolution;Social network services;Satellites;Image analysis;Machine learning;Unsupervised learning;Classification;Benchmark data set;deep learning;handcrafted features;remote sensing image;scene classification;unsupervised feature learning},
  number = {10},
  pages = {1865-1883},
  title = {Remote Sensing Image Scene Classification: Benchmark and State of the Art},
  volume = {105},
  year = {2017},
}

STL10

Classifying 96x96 images from 10 classes.

Dataset: tanganke/stl10 • License: not specified

Task category | Score | Languages | Domains | Annotations Creators | Sample Creation
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created
Citation
@inproceedings{pmlr-v15-coates11a,
  address = {Fort Lauderdale, FL, USA},
  author = {Coates, Adam and Ng, Andrew and Lee, Honglak},
  booktitle = {Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics},
  editor = {Gordon, Geoffrey and Dunson, David and Dudík, Miroslav},
  month = {11--13 Apr},
  pages = {215--223},
  pdf = {http://proceedings.mlr.press/v15/coates11a/coates11a.pdf},
  publisher = {PMLR},
  series = {Proceedings of Machine Learning Research},
  title = {An Analysis of Single-Layer Networks in Unsupervised Feature Learning},
  url = {https://proceedings.mlr.press/v15/coates11a.html},
  volume = {15},
  year = {2011},
}

SUN397

Large scale scene recognition in 397 categories.

Dataset: dpdl-benchmark/sun397 • License: not specified

Task category | Score | Languages | Domains | Annotations Creators | Sample Creation
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created
Citation
@inproceedings{5539970,
  author = {Xiao, Jianxiong and Hays, James and Ehinger, Krista A. and Oliva, Aude and Torralba, Antonio},
  booktitle = {2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition},
  doi = {10.1109/CVPR.2010.5539970},
  number = {},
  pages = {3485-3492},
  title = {SUN database: Large-scale scene recognition from abbey to zoo},
  volume = {},
  year = {2010},
}

StanfordCars

Classifying car images into 196 classes, each defined by make, model, and year.

Dataset: isaacchung/StanfordCars • License: not specified

Task category | Score | Languages | Domains | Annotations Creators | Sample Creation
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created
Citation
@inproceedings{Krause2013CollectingAL,
  author = {Jonathan Krause and Jia Deng and Michael Stark and Li Fei-Fei},
  title = {Collecting a Large-scale Dataset of Fine-grained Cars},
  url = {https://api.semanticscholar.org/CorpusID:16632981},
  year = {2013},
}

UCF101

UCF101 is an action recognition dataset of realistic action videos collected from YouTube, covering 101 action categories. This version of the dataset does not contain videos; instead, it contains images saved frame by frame. Train and test splits are generated from the authors' first version of the train/test list.

Dataset: flwrlabs/ucf101 • License: not specified

Task category | Score | Languages | Domains | Annotations Creators | Sample Creation
image to category (i2c) | accuracy | eng | Scene | derived | created
Citation
@misc{soomro2012ucf101dataset101human,
  archiveprefix = {arXiv},
  author = {Khurram Soomro and Amir Roshan Zamir and Mubarak Shah},
  eprint = {1212.0402},
  primaryclass = {cs.CV},
  title = {UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild},
  url = {https://arxiv.org/abs/1212.0402},
  year = {2012},
}

VOC2007

Multi-label classification of images across the 20 Pascal VOC object classes.

Dataset: HuggingFaceM4/pascal_voc • License: not specified

Task category | Score | Languages | Domains | Annotations Creators | Sample Creation
image to category (i2c) | lrap | eng | Encyclopaedic | derived | created
Citation
@article{Everingham10,
  author = {Everingham, M. and Van~Gool, L. and Williams, C. K. I. and Winn, J. and Zisserman, A.},
  journal = {International Journal of Computer Vision},
  month = jun,
  number = {2},
  pages = {303--338},
  title = {The Pascal Visual Object Classes (VOC) Challenge},
  volume = {88},
  year = {2010},
}
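
VOC2007 is the only task in this list scored with lrap rather than accuracy. Assuming lrap stands for label ranking average precision (the usual multi-label ranking metric), the sketch below shows one way to compute it in plain Python. The function name `lrap`, the handling of edge cases, and the toy inputs are assumptions for illustration, not the benchmark's own implementation.

```python
from typing import Sequence


def lrap(y_true: Sequence[Sequence[int]], y_score: Sequence[Sequence[float]]) -> float:
    """Label ranking average precision for multi-label predictions.

    y_true holds 0/1 indicators per label; y_score holds one score per label.
    """
    if len(y_true) == 0 or len(y_true) != len(y_score):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for truth, scores in zip(y_true, y_score):
        relevant = [j for j, t in enumerate(truth) if t]
        # Assumed convention: a sample with no true labels, or with every
        # label true, cannot be ranked wrongly and contributes 1.0.
        if not relevant or len(relevant) == len(truth):
            total += 1.0
            continue
        sample = 0.0
        for j in relevant:
            # Rank of label j: how many labels score at least as high.
            rank = sum(1 for s in scores if s >= scores[j])
            # How many of those are themselves true labels.
            hits = sum(1 for k in relevant if scores[k] >= scores[j])
            sample += hits / rank
        total += sample / len(relevant)
    return total / len(y_true)


# Toy example: two images, three candidate labels each.
truth = [[1, 0, 0], [0, 0, 1]]
scores = [[0.75, 0.5, 1.0], [1.0, 0.2, 0.1]]
print(lrap(truth, scores))  # (1/2 + 1/3) / 2
```

A score of 1.0 means every true label is ranked above every false label for each image; lower values indicate that false labels outrank true ones.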