ImageClassification¶
- Number of tasks: 22
Birdsnap¶
Classifying bird images from 500 species.
Dataset: isaacchung/birdsnap
• License: not specified • Learn more →
Task category | Score | Languages | Domains | Annotations Creators | Sample Creation |
---|---|---|---|---|---|
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created |
Citation
@inproceedings{Berg_2014_CVPR,
author = {Berg, Thomas and Liu, Jiongxin and Woo Lee, Seung and Alexander, Michelle L. and Jacobs, David W. and Belhumeur, Peter N.},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
title = {Birdsnap: Large-scale Fine-grained Visual Categorization of Birds},
year = {2014},
}
CIFAR10¶
Classifying images from 10 classes.
Dataset: uoft-cs/cifar10
• License: not specified • Learn more →
Task category | Score | Languages | Domains | Annotations Creators | Sample Creation |
---|---|---|---|---|---|
image to category (i2c) | accuracy | eng | Web | derived | created |
Citation
@techreport{Krizhevsky09learningmultiple,
author = {Alex Krizhevsky},
institution = {},
title = {Learning multiple layers of features from tiny images},
year = {2009},
}
CIFAR100¶
Classifying images from 100 classes.
Dataset: uoft-cs/cifar100
• License: not specified • Learn more →
Task category | Score | Languages | Domains | Annotations Creators | Sample Creation |
---|---|---|---|---|---|
image to category (i2c) | accuracy | eng | Web | derived | created |
Citation
@techreport{Krizhevsky09learningmultiple,
author = {Alex Krizhevsky},
institution = {},
title = {Learning multiple layers of features from tiny images},
year = {2009},
}
Caltech101¶
Classifying images of 101 widely varied objects.
Dataset: mteb/Caltech101
• License: not specified • Learn more →
Task category | Score | Languages | Domains | Annotations Creators | Sample Creation |
---|---|---|---|---|---|
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created |
Citation
@inproceedings{1384978,
author = {Li Fei-Fei and Fergus, R. and Perona, P.},
booktitle = {2004 Conference on Computer Vision and Pattern Recognition Workshop},
doi = {10.1109/CVPR.2004.383},
keywords = {Bayesian methods;Testing;Humans;Maximum likelihood estimation;Assembly;Shape;Machine vision;Image recognition;Parameter estimation;Image databases},
number = {},
pages = {178-178},
title = {Learning Generative Visual Models from Few Training Examples: An Incremental Bayesian Approach Tested on 101 Object Categories},
volume = {},
year = {2004},
}
Country211¶
Classifying images of 211 countries.
Dataset: clip-benchmark/wds_country211
• License: cc-by-sa-4.0 • Learn more →
Task category | Score | Languages | Domains | Annotations Creators | Sample Creation |
---|---|---|---|---|---|
image to category (i2c) | accuracy | eng | Scene | derived | created |
Citation
@article{radford2021learning,
author = {Radford, Alec and Kim, Jong Wook and Hallacy, Chris and Ramesh, Aditya and Goh, Gabriel and Agarwal, Sandhini and Sastry, Girish and Askell, Amanda and Mishkin, Pamela and Clark, Jack and others},
journal = {arXiv preprint arXiv:2103.00020},
title = {Learning Transferable Visual Models From Natural Language Supervision},
year = {2021},
}
DTD¶
Describable Textures Dataset in 47 categories.
Dataset: tanganke/dtd
• License: not specified • Learn more →
Task category | Score | Languages | Domains | Annotations Creators | Sample Creation |
---|---|---|---|---|---|
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created |
Citation
@inproceedings{cimpoi14describing,
author = {M. Cimpoi and S. Maji and I. Kokkinos and S. Mohamed and A. Vedaldi},
booktitle = {Proceedings of the {IEEE} Conf. on Computer Vision and Pattern Recognition ({CVPR})},
title = {Describing Textures in the Wild},
year = {2014},
}
EuroSAT¶
Classifying satellite images.
Dataset: timm/eurosat-rgb
• License: not specified • Learn more →
Task category | Score | Languages | Domains | Annotations Creators | Sample Creation |
---|---|---|---|---|---|
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created |
Citation
@article{8736785,
author = {Helber, Patrick and Bischke, Benjamin and Dengel, Andreas and Borth, Damian},
doi = {10.1109/JSTARS.2019.2918242},
journal = {IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing},
keywords = {Satellites;Earth;Remote sensing;Machine learning;Spatial resolution;Feature extraction;Benchmark testing;Dataset;deep convolutional neural network;deep learning;earth observation;land cover classification;land use classification;machine learning;remote sensing;satellite image classification;satellite images},
number = {7},
pages = {2217-2226},
title = {EuroSAT: A Novel Dataset and Deep Learning Benchmark for Land Use and Land Cover Classification},
volume = {12},
year = {2019},
}
FER2013¶
Classifying facial emotions.
Dataset: clip-benchmark/wds_fer2013
• License: not specified • Learn more →
Task category | Score | Languages | Domains | Annotations Creators | Sample Creation |
---|---|---|---|---|---|
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created |
Citation
@misc{goodfellow2015explainingharnessingadversarialexamples,
archiveprefix = {arXiv},
author = {Ian J. Goodfellow and Jonathon Shlens and Christian Szegedy},
eprint = {1412.6572},
primaryclass = {stat.ML},
title = {Explaining and Harnessing Adversarial Examples},
url = {https://arxiv.org/abs/1412.6572},
year = {2015},
}
FGVCAircraft¶
Classifying aircraft images from 41 manufacturers and 102 variants.
Dataset: HuggingFaceM4/FGVC-Aircraft
• License: not specified • Learn more →
Task category | Score | Languages | Domains | Annotations Creators | Sample Creation |
---|---|---|---|---|---|
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created |
Citation
@misc{maji2013finegrainedvisualclassificationaircraft,
archiveprefix = {arXiv},
author = {Subhransu Maji and Esa Rahtu and Juho Kannala and Matthew Blaschko and Andrea Vedaldi},
eprint = {1306.5151},
primaryclass = {cs.CV},
title = {Fine-Grained Visual Classification of Aircraft},
url = {https://arxiv.org/abs/1306.5151},
year = {2013},
}
Food101Classification¶
Classifying food.
Dataset: ethz/food101
• License: not specified • Learn more →
Task category | Score | Languages | Domains | Annotations Creators | Sample Creation |
---|---|---|---|---|---|
image to category (i2c) | accuracy | eng | Web | derived | created |
Citation
@inproceedings{bossard14,
author = {Bossard, Lukas and Guillaumin, Matthieu and Van Gool, Luc},
booktitle = {European Conference on Computer Vision},
title = {Food-101 -- Mining Discriminative Components with Random Forests},
year = {2014},
}
GTSRB¶
The German Traffic Sign Recognition Benchmark (GTSRB) is a multi-class classification dataset for traffic signs. It consists of more than 50,000 traffic sign images in 43 classes with unbalanced class frequencies.
Dataset: clip-benchmark/wds_gtsrb
• License: not specified • Learn more →
Task category | Score | Languages | Domains | Annotations Creators | Sample Creation |
---|---|---|---|---|---|
image to category (i2c) | accuracy | eng | Scene | derived | created |
Citation
@inproceedings{6033395,
author = {Stallkamp, Johannes and Schlipsing, Marc and Salmen, Jan and Igel, Christian},
booktitle = {The 2011 International Joint Conference on Neural Networks},
doi = {10.1109/IJCNN.2011.6033395},
keywords = {Humans;Training;Image color analysis;Benchmark testing;Lead;Histograms;Image resolution},
number = {},
pages = {1453-1460},
title = {The German Traffic Sign Recognition Benchmark: A multi-class classification competition},
volume = {},
year = {2011},
}
Imagenet1k¶
ImageNet, a large-scale ontology of images built upon the backbone of the WordNet structure.
Dataset: clip-benchmark/wds_imagenet1k
• License: not specified • Learn more →
Task category | Score | Languages | Domains | Annotations Creators | Sample Creation |
---|---|---|---|---|---|
image to category (i2c) | accuracy | eng | Scene | human-annotated | created |
Citation
@article{deng2009imagenet,
author = {Deng, Jia and Dong, Wei and Socher, Richard and Li, Li-Jia and Li, Kai and Fei-Fei, Li},
journal = {2009 IEEE Conference on Computer Vision and Pattern Recognition},
organization = {IEEE},
pages = {248--255},
title = {ImageNet: A large-scale hierarchical image database},
year = {2009},
}
MNIST¶
Classifying handwritten digits.
Dataset: ylecun/mnist
• License: not specified • Learn more →
Task category | Score | Languages | Domains | Annotations Creators | Sample Creation |
---|---|---|---|---|---|
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created |
Citation
@article{lecun2010mnist,
author = {LeCun, Yann and Cortes, Corinna and Burges, CJ},
journal = {ATT Labs [Online]. Available: http://yann.lecun.com/exdb/mnist},
title = {MNIST handwritten digit database},
volume = {2},
year = {2010},
}
OxfordFlowersClassification¶
Classifying flowers.
Dataset: nelorth/oxford-flowers
• License: not specified • Learn more →
Task category | Score | Languages | Domains | Annotations Creators | Sample Creation |
---|---|---|---|---|---|
image to category (i2c) | accuracy | eng | Reviews | derived | found |
Citation
@inproceedings{4756141,
author = {Nilsback, Maria-Elena and Zisserman, Andrew},
booktitle = {2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing},
doi = {10.1109/ICVGIP.2008.47},
keywords = {Shape;Kernel;Distributed computing;Support vector machines;Support vector machine classification;object classification;segmentation},
number = {},
pages = {722-729},
title = {Automated Flower Classification over a Large Number of Classes},
volume = {},
year = {2008},
}
OxfordPets¶
Classifying animal images.
Dataset: isaacchung/OxfordPets
• License: not specified • Learn more →
Task category | Score | Languages | Domains | Annotations Creators | Sample Creation |
---|---|---|---|---|---|
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created |
Citation
@inproceedings{6248092,
author = {Parkhi, Omkar M and Vedaldi, Andrea and Zisserman, Andrew and Jawahar, C. V.},
booktitle = {2012 IEEE Conference on Computer Vision and Pattern Recognition},
doi = {10.1109/CVPR.2012.6248092},
keywords = {Positron emission tomography;Image segmentation;Cats;Dogs;Layout;Deformable models;Head},
number = {},
pages = {3498-3505},
title = {Cats and dogs},
volume = {},
year = {2012},
}
PatchCamelyon¶
Histopathology diagnosis classification dataset.
Dataset: clip-benchmark/wds_vtab-pcam
• License: not specified • Learn more →
Task category | Score | Languages | Domains | Annotations Creators | Sample Creation |
---|---|---|---|---|---|
image to category (i2c) | accuracy | eng | Medical | derived | created |
Citation
@inproceedings{10.1007/978-3-030-00934-2_24,
abstract = {We propose a new model for digital pathology segmentation, based on the observation that histopathology images are inherently symmetric under rotation and reflection. Utilizing recent findings on rotation equivariant CNNs, the proposed model leverages these symmetries in a principled manner. We present a visual analysis showing improved stability on predictions, and demonstrate that exploiting rotation equivariance significantly improves tumor detection performance on a challenging lymph node metastases dataset. We further present a novel derived dataset to enable principled comparison of machine learning models, in combination with an initial benchmark. Through this dataset, the task of histopathology diagnosis becomes accessible as a challenging benchmark for fundamental machine learning research.},
address = {Cham},
author = {Veeling, Bastiaan S.
and Linmans, Jasper
and Winkens, Jim
and Cohen, Taco
and Welling, Max},
booktitle = {Medical Image Computing and Computer Assisted Intervention -- MICCAI 2018},
editor = {Frangi, Alejandro F.
and Schnabel, Julia A.
and Davatzikos, Christos
and Alberola-L{\'o}pez, Carlos
and Fichtinger, Gabor},
isbn = {978-3-030-00934-2},
pages = {210--218},
publisher = {Springer International Publishing},
title = {Rotation Equivariant CNNs for Digital Pathology},
year = {2018},
}
RESISC45¶
Remote Sensing Image Scene Classification by Northwestern Polytechnical University (NWPU).
Dataset: timm/resisc45
• License: not specified • Learn more →
Task category | Score | Languages | Domains | Annotations Creators | Sample Creation |
---|---|---|---|---|---|
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created |
Citation
@article{7891544,
author = {Cheng, Gong and Han, Junwei and Lu, Xiaoqiang},
doi = {10.1109/JPROC.2017.2675998},
journal = {Proceedings of the IEEE},
keywords = {Remote sensing;Benchmark testing;Spatial resolution;Social network services;Satellites;Image analysis;Machine learning;Unsupervised learning;Classification;Benchmark data set;deep learning;handcrafted features;remote sensing image;scene classification;unsupervised feature learning},
number = {10},
pages = {1865-1883},
title = {Remote Sensing Image Scene Classification: Benchmark and State of the Art},
volume = {105},
year = {2017},
}
STL10¶
Classifying 96x96 images from 10 classes.
Dataset: tanganke/stl10
• License: not specified • Learn more →
Task category | Score | Languages | Domains | Annotations Creators | Sample Creation |
---|---|---|---|---|---|
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created |
Citation
@inproceedings{pmlr-v15-coates11a,
address = {Fort Lauderdale, FL, USA},
author = {Coates, Adam and Ng, Andrew and Lee, Honglak},
booktitle = {Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics},
editor = {Gordon, Geoffrey and Dunson, David and Dudík, Miroslav},
month = {11--13 Apr},
pages = {215--223},
pdf = {http://proceedings.mlr.press/v15/coates11a/coates11a.pdf},
publisher = {PMLR},
series = {Proceedings of Machine Learning Research},
title = {An Analysis of Single-Layer Networks in Unsupervised Feature Learning},
url = {https://proceedings.mlr.press/v15/coates11a.html},
volume = {15},
year = {2011},
}
SUN397¶
Large scale scene recognition in 397 categories.
Dataset: dpdl-benchmark/sun397
• License: not specified • Learn more →
Task category | Score | Languages | Domains | Annotations Creators | Sample Creation |
---|---|---|---|---|---|
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created |
Citation
@inproceedings{5539970,
author = {Xiao, Jianxiong and Hays, James and Ehinger, Krista A. and Oliva, Aude and Torralba, Antonio},
booktitle = {2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition},
doi = {10.1109/CVPR.2010.5539970},
number = {},
pages = {3485-3492},
title = {SUN database: Large-scale scene recognition from abbey to zoo},
volume = {},
year = {2010},
}
StanfordCars¶
Classifying car images from 196 classes (make, model, and year).
Dataset: isaacchung/StanfordCars
• License: not specified • Learn more →
Task category | Score | Languages | Domains | Annotations Creators | Sample Creation |
---|---|---|---|---|---|
image to category (i2c) | accuracy | eng | Encyclopaedic | derived | created |
Citation
@inproceedings{Krause2013CollectingAL,
author = {Jonathan Krause and Jia Deng and Michael Stark and Li Fei-Fei},
title = {Collecting a Large-scale Dataset of Fine-grained Cars},
url = {https://api.semanticscholar.org/CorpusID:16632981},
year = {2013},
}
UCF101¶
UCF101 is an action recognition dataset of realistic action videos collected from YouTube, with 101 action categories. This version of the dataset does not contain videos, only images saved frame by frame. Train and test splits are generated from the authors' first version of the train/test list.
Dataset: flwrlabs/ucf101
• License: not specified • Learn more →
Task category | Score | Languages | Domains | Annotations Creators | Sample Creation |
---|---|---|---|---|---|
image to category (i2c) | accuracy | eng | Scene | derived | created |
Citation
@misc{soomro2012ucf101dataset101human,
archiveprefix = {arXiv},
author = {Khurram Soomro and Amir Roshan Zamir and Mubarak Shah},
eprint = {1212.0402},
primaryclass = {cs.CV},
title = {UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild},
url = {https://arxiv.org/abs/1212.0402},
year = {2012},
}
VOC2007¶
Multi-label classification of images containing objects from the 20 Pascal VOC 2007 classes.
Dataset: HuggingFaceM4/pascal_voc
• License: not specified • Learn more →
Task category | Score | Languages | Domains | Annotations Creators | Sample Creation |
---|---|---|---|---|---|
image to category (i2c) | lrap | eng | Encyclopaedic | derived | created |
Citation
@article{Everingham10,
author = {Everingham, M. and Van~Gool, L. and Williams, C. K. I. and Winn, J. and Zisserman, A.},
journal = {International Journal of Computer Vision},
month = jun,
number = {2},
pages = {303--338},
title = {The Pascal Visual Object Classes (VOC) Challenge},
volume = {88},
year = {2010},
}
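VOC2007 is the only task in this list scored with `lrap` instead of accuracy. Assuming `lrap` stands for label ranking average precision, the usual ranking metric for multi-label predictions, the sketch below shows how such a score can be computed from per-label scores. The function name `lrap`, the tie handling, and the treatment of samples with no true labels are choices made for this illustration, not taken from the benchmark's code.

```python
from typing import Sequence


def lrap(y_true: Sequence[Sequence[int]], y_score: Sequence[Sequence[float]]) -> float:
    """Label ranking average precision over a batch of multi-label samples.

    y_true holds 0/1 indicators per label; y_score holds one score per label.
    """
    total = 0.0
    for truth, scores in zip(y_true, y_score):
        relevant = [j for j, t in enumerate(truth) if t]
        if not relevant:
            # Choice made for this sketch: a sample with no true labels counts as 1.0.
            total += 1.0
            continue
        sample = 0.0
        for j in relevant:
            # Rank of label j: number of labels scored at least as high (ties included).
            rank = sum(1 for s in scores if s >= scores[j])
            # Number of true labels among those ranked at or above label j.
            hits = sum(1 for k in relevant if scores[k] >= scores[j])
            sample += hits / rank
        total += sample / len(relevant)
    return total / len(y_true)


# Two samples, three labels each; the true label is ranked 2nd and 3rd respectively.
y_true = [[1, 0, 0], [0, 0, 1]]
y_score = [[0.75, 0.5, 1.0], [1.0, 0.2, 0.1]]
print(round(lrap(y_true, y_score), 4))  # (1/2 + 1/3) / 2 = 0.4167
```

Unlike accuracy, this score rewards a model for ranking every true label above the false ones, so it stays meaningful when an image carries several labels at once.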