A structured learning reference: image fundamentals → classical methods → deep learning → generative artificial intelligence in CV.
There are dozens of "awesome computer vision" repositories on GitHub. Most are encyclopedic with thousands of links arranged by topic, with no guidance on where to start, what order to read things in, or why one resource matters more than another. They are useful as archives. They are less useful as learning tools.
This list is built around a different idea: curation over comprehensiveness.
Every entry is here because it genuinely helps someone understand computer vision more deeply — not simply because it exists. Resources are organised to reflect how the field is actually learned: from image fundamentals and classical methods, through deep learning, to the transformer-era models that define current research.
| | This list | Most other CV lists |
|---|---|---|
| Paper context | ✅ Why each paper matters, in sequence | ❌ Flat citation lists |
| Evaluation metrics | ✅ Full breakdown per task | ❌ Rarely covered |
| Actively maintained | ✅ Updated with recent work | ❌ Often stale |
| Conference & journal tiers | ✅ CORE-ranked, explained | ❌ Usually just a list |
| Multi-language libraries | ✅ Python, Rust, MATLAB | ❌ Python only |
- Students starting a CV module or thesis who want a clear first step
- Engineers moving into CV who need to fill gaps systematically
- Researchers wanting a compact reference for venues, metrics, and landmark papers
- Educators looking for a syllabus scaffold they can point students to
💡 New to the field? Start at Courses or Reference Books.
🔬 Already in research? Jump to Popular Articles or Repos.
Status: ✅ active (updated within 2 years) · ⚠️ legacy (unmaintained but historically useful) · 🗄️ archived (officially abandoned)
- OpenCV: Open Source Computer Vision Library · ✅ active
- Pillow: The friendly PIL fork (Python Imaging Library) · ✅ active
- scikit-image: Collection of algorithms for image processing · ✅ active
- SciPy: Open-source software for mathematics, science, and engineering · ✅ active
- mmcv: OpenMMLab foundational library for computer vision research · ✅ active
- imutils: Convenience functions for basic image processing operations · ✅ active
- kornia: Open source differentiable computer vision library for PyTorch · ✅ active
- pgmagick: Python wrapper for GraphicsMagick/ImageMagick · ⚠️ legacy
- Mahotas: Fast computer vision algorithms in Python · ⚠️ legacy
- SimpleCV: Open Source Framework for Machine Vision · 🗄️ archived
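For a quick feel of how these libraries complement each other, here is a minimal sketch (not taken from any project above) that reads an image with OpenCV, applies Canny edge detection and Otsu thresholding via scikit-image, and writes the results with Pillow; the filename and threshold values are illustrative assumptions.

```python
# Minimal sketch combining OpenCV, scikit-image, and Pillow.
# Assumes `pip install opencv-python scikit-image pillow numpy` and a local photo.jpg.
import cv2
import numpy as np
from PIL import Image
from skimage import filters

img_bgr = cv2.imread("photo.jpg")                        # OpenCV loads images as BGR, HxWx3 uint8
gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)         # single-channel grayscale

edges = cv2.Canny(gray, threshold1=100, threshold2=200)  # classical Canny edge detection
binary = gray > filters.threshold_otsu(gray)             # Otsu thresholding via scikit-image

Image.fromarray(edges).save("edges.png")                 # Pillow for saving results
Image.fromarray((binary * 255).astype(np.uint8)).save("otsu.png")
```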
Status: ✅ active (updated within 2 years) · ⚠️ legacy (unmaintained but historically useful) · 🗄️ archived (officially abandoned)
- OpenCV-Rust: Rust bindings for OpenCV 3.4, 4.x, and 5.x · ✅ active
- Image: Encoding and decoding images in Rust · ✅ active
- ImageProc: Image processing operations built on the image crate · ✅ active
- Photon: Rust/WebAssembly image processing library · ⚠️ legacy
Status: ✅ active (updated within 2 years) · ⚠️ legacy (unmaintained but historically useful) · 🗄️ archived (officially abandoned)
- MLV: Mid-level Vision Toolbox, BWLab, University of Toronto · ✅ active
- PMT: Piotr's Computer Vision MATLAB Toolbox, P. Dollar · ⚠️ legacy
- matlabfns: MATLAB and Octave functions for computer vision and image processing, P. Kovesi, University of Western Australia · ⚠️ legacy
- VLFeat: Open source library of popular CV algorithms (SIFT, VLAD, Fisher Vectors, SLIC), A. Vedaldi and B. Fulkerson · ⚠️ legacy
- ElencoCode: Loris Nanni's CV functions, University of Padova · ⚠️ legacy
- Antonio Torralba, Phillip Isola, William T. Freeman. "Foundations of Computer Vision" MIT Press, (2024). · [Goodreads]
- Nixon, Mark, and Alberto Aguado. "Feature extraction and image processing for computer vision" Academic press, (2019). · [Goodreads]
- González, Rafael Corsino and Richard E. Woods. "Digital image processing, 4th Edition" (2018). · [Goodreads]
- E.R. Davies. "Computer Vision: Principles, Algorithms, Applications, Learning" Academic press, (2017). · [Goodreads]
- Prince, Simon. "Computer Vision: Models, Learning, and Inference" (2012). · [Goodreads]
- Forsyth, David Alexander and Jean Ponce. "Computer Vision - A Modern Approach, Second Edition" (2011). · [Goodreads]
- Szeliski, Richard. "Computer Vision - Algorithms and Applications" Texts in Computer Science (2010). · [Goodreads]
- Bishop, Christopher M. "Pattern recognition and machine learning" Information Science and Statistics (2007). · [Goodreads]
- Hartley, Richard and Andrew Zisserman. "Multiple view geometry in computer vision (2. ed.)" (2003). · [Goodreads]
- Stockman, George C. and Linda G. Shapiro. "Computer Vision" (2001). · [Goodreads]
- Introduction to Computer Vision · 2026 · James Tompkin · Brown
- Deep Learning for Computer Vision · 2025 · Fei-Fei Li · Stanford
- Advances in Computer Vision · 2023 · William T. Freeman · MIT
- OpenCV for Python Developers · 2023 · Patrick Crawford · LinkedIn Learning
- Computer Vision · 2021 · Andreas Geiger · University of Tübingen
- Computer Vision · 2021 · Yogesh S Rawat / Mubarak Shah · University of Central Florida
- Advanced Computer Vision · 2021 · Mubarak Shah · University of Central Florida
- Deep Learning for Computer Vision · 2020 · Justin Johnson · University of Michigan
- Advanced Deep Learning for Computer Vision · 2020 · Laura Leal-Taixé / Matthias Niessner · Technical University of Munich
- Introduction to Digital Image Processing · 2020 · Ahmadreza Baghaie · New York Institute of Technology
- Quantitative Imaging · 2019 · Kevin Mader · ETH Zurich
- Convolutional Neural Networks for Visual Recognition · 2017 · Fei-Fei Li · Stanford University
- Introduction to Digital Image Processing · 2015 · Rich Radke · Rensselaer Polytechnic Institute
- Machine Learning for Robotics and Computer Vision · 2014 · Rudolph Triebel · Technical University of Munich
- Multiple View Geometry · 2013 · Daniel Cremers · Technical University of Munich
- Variational Methods for Computer Vision · 2013 · Daniel Cremers · Technical University of Munich
- Computer Vision · 2012 · Mubarak Shah · University of Central Florida
- Image and video processing · Guillermo Sapiro · Duke University
- Introduction to Computer Vision · Aaron Bobick / Irfan Essa · Udacity
Ranks follow CORE Conference Ranking. Acceptance rates are approximate, based on recent editions. Note: in CV and ML, conference prestige often exceeds journal prestige, unlike in most other fields.
- CORE Rank A*
- CVPR: Conference on Computer Vision and Pattern Recognition (IEEE) · ~22% acceptance · the highest-volume top-tier CV venue [dblp]
- ICCV: International Conference on Computer Vision (IEEE) · ~26% acceptance · held in odd years only [dblp]
- NeurIPS: Conference on Neural Information Processing Systems · ~26% acceptance · primary venue for ML theory and deep learning [dblp]
- ICML: International Conference on Machine Learning · ~28% acceptance · top ML venue with growing CV presence [dblp]
- ICLR: International Conference on Learning Representations · ~32% acceptance · open-review format; major venue for deep learning and VLMs [dblp]
- ECCV: European Conference on Computer Vision (Springer) · ~28% acceptance · held in even years only [dblp]
- AAAI: AAAI Conference on Artificial Intelligence · ~20% acceptance · broad AI scope with strong CV track [dblp]
- ACMMM: ACM International Conference on Multimedia (ACM) [dblp]
- ICRA: International Conference on Robotics and Automation (IEEE) [dblp]
- CORE Rank A
- MICCAI: Conference on Medical Image Computing and Computer Assisted Intervention (Springer) · ~30% acceptance · premier venue for medical imaging [dblp]
- WACV: Winter Conference on Applications of Computer Vision (IEEE) · ~29% acceptance · practical and applied CV; growing rapidly [dblp]
- IROS: International Conference on Intelligent Robots and Systems (IEEE) · covers CV for robotics and perception [dblp]
- ISBI: IEEE International Symposium on Biomedical Imaging (IEEE) [dblp]
- BMVC: British Machine Vision Conference (BMVA) [dblp]
- CORE Rank B
- ICPR: International Conference on Pattern Recognition (IEEE) [dblp]
- ACCV: Asian Conference on Computer Vision (Springer) [dblp]
- ICASSP: International Conference on Acoustics, Speech, and Signal Processing (IEEE) [dblp]
- ICIP: International Conference on Image Processing (IEEE) [dblp]
- VISAPP: International Conference on Vision Theory and Applications (SCITEPRESS) [dblp]
- ACIVS: Conference on Advanced Concepts for Intelligent Vision Systems (Springer) [dblp]
- EUSIPCO: European Signal Processing Conference (EURASIP/IEEE) [dblp]
- CORE Rank C
- VCIP: International Conference on Visual Communications and Image Processing (IEEE) [dblp]
- CAIP: International Conference on Computer Analysis of Images and Patterns (Springer) [dblp]
- ICISP: International Conference on Image and Signal Processing (Springer) [dblp]
- ICIAR: International Conference on Image Analysis and Recognition (Springer) [dblp]
- ICVS: International Conference on Computer Vision Systems (Springer) [dblp]
- Unranked but notable
- MIUA: Medical Image Understanding and Analysis (BMVA) · UK-focused medical imaging [dblp]
- EUVIP: European Workshop on Visual Information Processing (IEEE/EURASIP) [dblp]
- CIC: Color and Imaging Conference (IS&T) [dblp]
- CVCS: Colour and Visual Computing Symposium [dblp]
- DSP: International Conference on Digital Signal Processing (IEEE) [dblp]
Rankings use the SCImago Journal Rank (SJR) indicator. SJR is a size-independent prestige metric: it weights citations by the influence of the citing journal, not just their count. Quartiles (Q1 to Q4) place each journal within its subject category; Q1 is the top 25%. In computer vision and machine learning, top conferences (CVPR, ICCV, ECCV) often carry more prestige than journals; many researchers publish conference papers first and submit extended versions to journals later.
- Core CV and ML Journals
- IEEE TPAMI: Transactions on Pattern Analysis and Machine Intelligence · Q1 · the highest-prestige journal in CV/ML; publishes foundational and survey work [dblp] [scimago]
- Elsevier MedIA: Medical Image Analysis · Q1 · leading venue in medical imaging [dblp] [scimago]
- IEEE TIP: Transactions on Image Processing · Q1 · image processing, analysis, and low-level vision [dblp] [scimago]
- IEEE TMI: Transactions on Medical Imaging · Q1 · premier journal for medical image analysis [dblp] [scimago]
- Elsevier PR: Pattern Recognition · Q1 · broad scope; high volume [dblp] [scimago]
- IJCV: International Journal of Computer Vision (Springer) · Q1 · primary venue for long-form CV research [dblp] [scimago]
- IEEE TCSVT: Transactions on Circuits and Systems for Video Technology · Q1 · video understanding, compression, and streaming [dblp] [scimago]
- IEEE TVCG: Transactions on Visualization and Computer Graphics · Q1 · covers rendering, visual analytics, and 3D vision [dblp] [scimago]
- Elsevier CVIU: Computer Vision and Image Understanding · Q1 [dblp] [scimago]
- Robotics and Automation
- Applied and Interdisciplinary
- Elsevier ESWA: Expert Systems with Applications · Q1 · broad applied scope; high volume [dblp] [scimago]
- Elsevier Neurocomputing · Q1 [dblp] [scimago]
- Springer NCA: Neural Computing and Applications · Q1 [dblp] [scimago]
- Elsevier CMIG: Computerized Medical Imaging and Graphics · Q1 [dblp] [scimago]
- Elsevier CMPB: Computer Methods and Programs in Biomedicine · Q1 [dblp] [scimago]
- Elsevier CBM: Computers in Biology and Medicine · Q1 [dblp] [scimago]
- Specialist and Lower-Tier
- Elsevier PRL: Pattern Recognition Letters · Q1 · shorter-format work [dblp] [scimago]
- Elsevier IVC: Image and Vision Computing · Q1 [dblp] [scimago]
- Elsevier JVCIR: Journal of Visual Communication and Image Representation · Q2 [dblp] [scimago]
- Springer JMIV: Journal of Mathematical Imaging and Vision · Q2 · mathematical foundations of imaging [dblp] [scimago]
- SPIE JEI: Journal of Electronic Imaging · Q3 [dblp] [scimago]
- IET Image Processing · Q2 [dblp] [scimago]
- Springer PAA: Pattern Analysis and Applications · Q2 [dblp] [scimago]
- Springer MVA: Machine Vision and Applications · Q2 [dblp] [scimago]
- IET Computer Vision · Q2 [dblp] [scimago]
- Open Access
- IEEE Access · Q1 · broad scope; fast publication; lower selectivity than the IEEE transactions [dblp] [scimago]
- MDPI Journal of Imaging · Q2 · fully open access; no subscription required [dblp] [scimago]
Summer schools are one of the best ways to get intensive, structured exposure to current CV research. Most run annually and accept applications from MSc students, PhD students, postdocs, and industry researchers.
Status: ✅ active (running regularly) · 🗄️ concluded (no longer running)
- ICVSS: International Computer Vision Summer School [2007-Present], Sicily, Italy · competitive application · winner of the IEEE PAMI Mark Everingham Prize (2017) · ✅ active
- BMVA CVSS: British Computer Vision Summer School [2013-Present], UK · Organized by BMVA · ✅ active
- VISUM: Machine Intelligence and Visual Computing Summer School [2013-2020], Porto, Portugal · 🗄️ concluded
- Foundational Must-Reads
Twelve papers every computer vision researcher should know. These defined the field's trajectory and are cited in virtually every modern CV paper.
- [Backprop, 1986] Rumelhart, David E., Geoffrey E. Hinton, and Ronald J. Williams. "Learning representations by back-propagating errors." Nature 323 (1986): 533-536. [paper]
- [LeNet-5, 1998] LeCun, Yann, et al. "Gradient-based learning applied to document recognition." Proceedings of the IEEE 86.11 (1998). [paper] — the first CNN deployed at scale, for handwritten digit recognition; the blueprint for modern convolutional architectures
- [SIFT, 2004] Lowe, David G. "Distinctive image features from scale-invariant keypoints." IJCV 60.2 (2004): 91-110. [paper] — the dominant feature descriptor for a decade
- [BoVW, 2003/2004] Sivic, and Zisserman. "Video Google: A text retrieval approach to object matching in videos." Proceedings ninth IEEE international conference on computer vision. IEEE, 2003. Csurka, Gabriella, et al. "Visual categorization with bags of keypoints." Workshop on statistical learning in computer vision, ECCV. Vol. 1. No. 1-22. 2004. [paper] — introduced the bag-of-visual-words framework using visual vocabularies for image classification
- [HOG, 2005] Dalal, Navneet, and Bill Triggs. "Histograms of oriented gradients for human detection." CVPR (2005). [paper] — foundation of pedestrian and object detection
- [ImageNet, 2009] Deng, Jia, et al. "ImageNet: A large-scale hierarchical image database." CVPR (2009). [paper] — the benchmark that enabled the deep learning era
- [AlexNet, 2012] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks." NeurIPS (2012). [paper] — the paper that started the deep learning era in CV
- [GAN, 2014] Goodfellow, Ian, et al. "Generative adversarial nets." NeurIPS (2014). [paper] — introduced the GAN framework that underpins generative CV
- [U-Net, 2015] Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. "U-Net: Convolutional networks for biomedical image segmentation." MICCAI (2015). [paper] — the default architecture for segmentation tasks
- [ResNet, 2016] He, Kaiming, et al. "Deep residual learning for image recognition." CVPR (2016). [paper] — residual connections solved the vanishing gradient problem; still the most-used backbone
- [Attention, 2017] Vaswani, Ashish, et al. "Attention is all you need." NeurIPS (2017). [paper] — the transformer architecture that ViT and every modern foundation model is built on
- [ViT, 2020] Dosovitskiy, Alexey, et al. "An image is worth 16x16 words." ICLR (2021). [paper] — brought transformers to vision and reshaped every sub-field
- Object Classification
- [LeNet-5, 1998] LeCun, Yann, et al. "Gradient-based learning applied to document recognition." Proceedings of the IEEE 86.11 (1998): 2278-2324.
- [AlexNet, 2012] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems 25 (2012).
- [ZFNet, 2014] Zeiler, Matthew D., and Rob Fergus. "Visualizing and understanding convolutional networks." Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part I 13. Springer International Publishing, 2014.
- [VGG, 2014] Simonyan, Karen, and Andrew Zisserman. "Very Deep Convolutional Networks for Large-Scale Image Recognition." CoRR abs/1409.1556 (2014).
- [GoogLeNet, 2015] Szegedy, Christian, et al. "Going deeper with convolutions." Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
- [ResNet, 2016] He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
- [InceptionV3, 2016] Szegedy, Christian, et al. "Rethinking the inception architecture for computer vision." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
- [Xception, 2017] Chollet, François. "Xception: Deep learning with depthwise separable convolutions." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
- [EfficientNet, 2019] Tan, Mingxing, and Quoc Le. "Efficientnet: Rethinking model scaling for convolutional neural networks." International conference on machine learning. PMLR, 2019.
- [ViT, 2020] Dosovitskiy, Alexey, et al. "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale." International Conference on Learning Representations. 2021.
- [ConvNeXt, 2022] Liu, Zhuang et al. “A ConvNet for the 2020s.” 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2022): 11966-11976.
- Object Classification - Lightweight
- [SqueezeNet, 2016] Iandola, Forrest N., et al. "SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size." arXiv preprint arXiv:1602.07360 (2016).
- [MobileNetV2, 2018] Sandler, Mark, et al. "Mobilenetv2: Inverted residuals and linear bottlenecks." Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.
- [ShuffleNetV2, 2018] Ma, Ningning, et al. "Shufflenet v2: Practical guidelines for efficient cnn architecture design." Proceedings of the European conference on computer vision (ECCV). 2018.
- [MobileNetV3, 2019] Howard, Andrew, et al. "Searching for mobilenetv3." Proceedings of the IEEE/CVF international conference on computer vision. 2019.
- [GhostNetV1, 2020] Han, Kai, et al. "Ghostnet: More features from cheap operations." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020.
- [MobileViT, 2021] Mehta, Sachin, and Mohammad Rastegari. "Mobilevit: light-weight, general-purpose, and mobile-friendly vision transformer." arXiv preprint arXiv:2110.02178 (2021).
- [GhostNetV2, 2022] Tang, Yehui, et al. "GhostNetv2: enhance cheap operation with long-range attention." Advances in Neural Information Processing Systems 35 (2022): 9969-9982.
- [ConvNeXt-Tiny, 2022] Liu, Zhuang et al. “A ConvNet for the 2020s.” 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2022): 11966-11976.
- [MaxViT-Tiny, 2022] Tu, Zhengzhong, et al. "Maxvit: Multi-axis vision transformer." European conference on computer vision. Cham: Springer Nature Switzerland, 2022.
- [MobileFormer, 2022] Chen, Yinpeng, et al. "Mobile-former: Bridging mobilenet and transformer." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.
- [ConvNeXtV2-Tiny, 2023] Woo, Sanghyun, et al. "Convnext v2: Co-designing and scaling convnets with masked autoencoders." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023.
- [Mobileone, 2023] Vasu, Pavan Kumar Anasosalu, et al. "Mobileone: An improved one millisecond mobile backbone." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2023.
- [TinyViM, 2025] Ma, Xiaowen, Zhenliang Ni, and Xinghao Chen. "Tinyvim: Frequency decoupling for tiny hybrid vision mamba." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2025.
- [SeaFormer++, 2025] Wan, Qiang, et al. "SeaFormer++: Squeeze-enhanced axial transformer for mobile visual recognition." International Journal of Computer Vision 133.6 (2025): 3645-3666.
- Object Detection
- [Faster R-CNN, 2015] Ren, Shaoqing, et al. "Faster r-cnn: Towards real-time object detection with region proposal networks." Advances in neural information processing systems 28 (2015).
- [SSD, 2016] Liu, Wei, et al. "Ssd: Single shot multibox detector." Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part I 14. Springer International Publishing, 2016.
- [RetinaNet, 2017] Lin, Tsung-Yi, et al. "Focal loss for dense object detection." Proceedings of the IEEE international conference on computer vision. 2017.
- [YOLOV3, 2018] Redmon, Joseph, and Ali Farhadi. "Yolov3: An incremental improvement." arXiv preprint arXiv:1804.02767 (2018).
- [YOLOX, 2021] Ge, Zheng, et al. "Yolox: Exceeding yolo series in 2021." arXiv preprint arXiv:2107.08430 (2021).
- [YOLOR, 2021] Wang, Chien-Yao, I-Hau Yeh, and Hong-Yuan Mark Liao. "You only learn one representation: Unified network for multiple tasks." arXiv preprint arXiv:2105.04206 (2021).
- [YOLOV7, 2023] Wang, Chien-Yao, Alexey Bochkovskiy, and Hong-Yuan Mark Liao. "YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023.
- Object Segmentation - Semantic / Instance / Panoptic
- Classical: Graph Cut / Normalized Cut, Fuzzy Clustering, Mean-shift / Quick-shift, SLIC, Active Contours (Snakes), Region Growing, K-means Clustering, Watershed, Level Set Methods, Markov Random Fields (MRF), Edge (1st / 2nd derivatives) + filling.
- [U-Net, 2015] Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. "U-net: Convolutional networks for biomedical image segmentation." Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III 18. Springer International Publishing, 2015.
- [DeepLabV3, 2017] Chen, Liang-Chieh, et al. "Rethinking atrous convolution for semantic image segmentation." arXiv preprint arXiv:1706.05587 (2017).
- [PSPNet, 2017] Zhao, Hengshuang, et al. "Pyramid scene parsing network." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
- [Mask R-CNN, 2017] He, Kaiming, et al. "Mask r-cnn." Proceedings of the IEEE international conference on computer vision. 2017.
- [U-Net++, 2018] Zhou, Zongwei, et al. "UNet++: A Nested U-Net Architecture for Medical Image Segmentation." Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support (DLMIA/ML-CDS 2018, held in conjunction with MICCAI 2018, Granada, Spain), LNCS 11045 (2018): 3-11.
- [DeepLabV3+, 2018] Chen, Liang-Chieh et al. “Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation.” European Conference on Computer Vision (2018).
- [MaskFormer, 2021] Cheng, Bowen, Alex Schwing, and Alexander Kirillov. "Per-pixel classification is not all you need for semantic segmentation." Advances in Neural Information Processing Systems 34 (2021): 17864-17875.
- [SegFormer, 2021] E. Xie, W. Wang, Z. Yu, A. Anandkumar, J. M. Alvarez, and P. Luo, "SegFormer: Simple and efficient design for semantic segmentation with transformers," Advances in Neural Information Processing Systems, vol. 34, pp. 12077-12090, 2021.
- [SAM, 2023] A. Kirillov, E. Mintun, N. Ravi, et al., “Segment anything,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023, pp. 4015–4026.
- [SEEM, 2023] Zou, Xueyan, et al. "Segment everything everywhere all at once." Advances in neural information processing systems 36 (2023): 19769-19782.
- Feature Matching
- {Local Features} [Superpoint, 2018] DeTone, Daniel, Tomasz Malisiewicz, and Andrew Rabinovich. "Superpoint: Self-supervised interest point detection and description." Proceedings of the IEEE conference on computer vision and pattern recognition workshops. 2018.
- {Local Features} [D2-Net, 2019] Dusmanu, Mihai, et al. "D2-net: A trainable cnn for joint detection and description of local features." arXiv preprint arXiv:1905.03561 (2019).
- [R2D2, 2019] Revaud, Jerome, et al. "R2D2: repeatable and reliable detector and descriptor." arXiv preprint arXiv:1906.06195 (2019).
- {Detector-Based Matcher} [SuperGlue, 2020] Sarlin, Paul-Edouard, et al. "Superglue: Learning feature matching with graph neural networks." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2020.
- {Detector-Free Matcher} [DRC-Net, 2020] Li, Xinghui, et al. "Dual-resolution correspondence networks." Advances in Neural Information Processing Systems 33 (2020): 17346-17357.
- {Local Features} [DISK, 2020] Tyszkiewicz, Michał, Pascal Fua, and Eduard Trulls. "DISK: Learning local features with policy gradient." Advances in Neural Information Processing Systems 33 (2020): 14254-14265.
- {Detector-Free Matcher} [LoFTR, 2021] Sun, Jiaming, et al. "LoFTR: Detector-free local feature matching with transformers." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2021.
- {Detector-Free Matcher} [MatchFormer, 2022] Wang, Qing, et al. "Matchformer: Interleaving attention in transformers for feature matching." Proceedings of the Asian Conference on Computer Vision. 2022.
- {Detector-Based Matcher} [LightGlue, 2023] Lindenberger, Philipp, Paul-Edouard Sarlin, and Marc Pollefeys. "LightGlue: Local Feature Matching at Light Speed." arXiv preprint arXiv:2306.13643 (2023).
- {Detector-Based Matcher} [GlueStick, 2023] Pautrat, Rémi, et al. "Gluestick: Robust image matching by sticking points and lines together." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023.
- {Detector-Free Matcher} [OAMatcher, 2023] Dai, Kun, et al. "OAMatcher: An Overlapping Areas-based Network for Accurate Local Feature Matching." arXiv preprint arXiv:2302.05846 (2023).
- {Detector-Free Matcher} [RoMa, 2023] Edstedt, Johan, et al. "RoMa: Revisiting Robust Losses for Dense Feature Matching." arXiv preprint arXiv:2305.15404 (2023).
- [GIM, 2024] Shen, Xuelun, et al. "GIM: Learning Generalizable Image Matcher From Internet Videos." The Twelfth International Conference on Learning Representations. 2024.
- {Detector-Free Matcher} [DeepMatcher, 2024] Xie, Tao, et al. "Deepmatcher: a deep transformer-based network for robust and accurate local feature matching." Expert Systems with Applications 237 (2024): 121361.
- {Detector-Free Matcher} [XFeat, 2024] Potje, Guilherme, et al. "XFeat: Accelerated Features for Lightweight Image Matching." IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 2024.
- Object Tracking
- [DeepSORT, 2017] Wojke, Nicolai, Alex Bewley, and Dietrich Paulus. "Simple online and realtime tracking with a deep association metric." 2017 IEEE International Conference on Image Processing (ICIP). IEEE, 2017.
- [Tracktor, 2019] Bergmann, Philipp, Tim Meinhardt, and Laura Leal-Taixe. "Tracking without bells and whistles." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019.
- [FairMOT, 2021] Zhang, Yifu, et al. "Fairmot: On the fairness of detection and re-identification in multiple object tracking." International Journal of Computer Vision 129 (2021): 3069-3087.
- [STARK, 2021] Yan, Bin, et al. "Learning spatio-temporal transformer for visual tracking." Proceedings of the IEEE/CVF international conference on computer vision. 2021.
- [MixFormer, 2022] Cui, Yutao, et al. "Mixformer: End-to-end tracking with iterative mixed attention." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.
- [ByteTrack, 2022] Zhang, Yifu, et al. "Bytetrack: Multi-object tracking by associating every detection box." European Conference on Computer Vision. Cham: Springer Nature Switzerland, 2022.
- Pose Estimation
- Classical: Active Shape Models (ASM), Active Appearance Models (AAM), Pictorial Structures, Deformable Part Models (DPM).
- [DeepPose, 2014] Toshev, Alexander, and Christian Szegedy. "DeepPose: Human pose estimation via deep neural networks." CVPR (2014).
- [Stacked Hourglass, 2016] Newell, Alejandro, Kaiyu Yang, and Jia Deng. "Stacked hourglass networks for human pose estimation." ECCV (2016).
- [OpenPose, 2019] Cao, Zhe, et al. "OpenPose: Realtime multi-person 2D pose estimation using part affinity fields." IEEE TPAMI (2019).
- [HRNet, 2019] Wang, Jingdong, et al. "Deep high-resolution representation learning for visual recognition." IEEE TPAMI (2019).
- [ViTPose, 2022] Xu, Yufei, et al. "ViTPose: Simple vision transformer baselines for human pose estimation." NeurIPS (2022).
- [DWPose, 2023] Yang, Zhendong, et al. "Effective whole-body pose estimation with two-stages distillation." ICCV Workshop (2023).
- [RTMPose, 2023] Jiang, Tao, et al. "RTMPose: Real-time multi-person pose estimation based on MMPose." arXiv (2023).
- [UniPose, 2024] Yang, Junjie, et al. "UniPose: Detecting any keypoints." CVPR (2024).
- Depth Estimation
- Classical: stereo matching, structured light, time-of-flight (ToF), SfM (Structure from Motion).
- [Make3D, 2009] Saxena, Ashutosh, Min Sun, and Andrew Y. Ng. "Make3D: Learning 3D scene structure from a single still image." IEEE TPAMI (2009).
- [Eigen et al., 2014] Eigen, David, Christian Puhrsch, and Rob Fergus. "Depth map prediction from a single image using a multi-scale deep network." NeurIPS (2014).
- [DenseDepth, 2018] Alhashim, Ibraheem, and Peter Wonka. "High quality monocular depth estimation via transfer learning." arXiv (2018).
- [MiDaS, 2020] Ranftl, René, et al. "Towards robust monocular depth estimation: Mixing datasets for zero-shot cross-dataset transfer." IEEE TPAMI (2020).
- [AdaBins, 2021] Bhat, Shariq Farooq, et al. "AdaBins: Depth estimation using adaptive bins." CVPR (2021).
- [DPT, 2021] Ranftl, René, et al. "Vision transformers for dense prediction." ICCV (2021).
- [ZoeDepth, 2023] Bhat, Shariq Farooq, et al. "ZoeDepth: Zero-shot transfer by combining relative and metric depth." arXiv (2023).
- [Depth Anything, 2024] Yang, Lihe, et al. "Depth anything: Unleashing the power of large-scale unlabeled data." CVPR (2024).
- [Depth Anything V2, 2024] Yang, Lihe, et al. "Depth Anything V2." NeurIPS (2024).
- [Marigold, 2024] Ke, Bingxin, et al. "Repurposing diffusion-based image generators for monocular depth estimation." CVPR (2024).
- Media Generation
- [DCGAN, 2015] Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv preprint arXiv:1511.06434 (2015).
- [BigGAN, 2018] Brock, Andrew, Jeff Donahue, and Karen Simonyan. "Large scale GAN training for high fidelity natural image synthesis." arXiv preprint arXiv:1809.11096 (2018).
- [StyleGANv3, 2021] Karras, Tero, et al. "Alias-free generative adversarial networks." Advances in Neural Information Processing Systems 34 (2021): 852-863.
- [DALL-E, 2021] Ramesh, Aditya, et al. "Zero-shot text-to-image generation." International conference on machine learning. Pmlr, 2021.
- [LAFITE, 2021] Zhou, Y., et al. "LAFITE: Towards language-free training for text-to-image generation." arXiv preprint arXiv:2111.13792 (2021).
- [CLIP, 2021] Radford, Alec, et al. "Learning transferable visual models from natural language supervision." International conference on machine learning. PMLR, 2021.
- [Imagen, 2022] Saharia, Chitwan, et al. "Photorealistic text-to-image diffusion models with deep language understanding." Advances in neural information processing systems 35 (2022): 36479-36494.
- [GLIDE, 2022] Nichol, Alexander Quinn, et al. "GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models." International Conference on Machine Learning. PMLR, 2022.
- [unCLIP / DALL-E 2, 2022] Ramesh, Aditya, et al. "Hierarchical text-conditional image generation with CLIP latents." arXiv preprint arXiv:2204.06125 (2022).
- [LDM / Stable Diffusion (SD), 2022] Rombach, Robin, et al. "High-resolution image synthesis with latent diffusion models." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2022.
- [DALL-E 3, 2023] Betker, James, et al. "Improving image generation with better captions." OpenAI technical report (2023). https://cdn.openai.com/papers/dall-e-3.pdf
- [SDXL, 2023] Podell, Dustin, et al. "SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis." The Twelfth International Conference on Learning Representations. 2023.
- Vision-Language Models (VLMs)
- [CLIP, 2021] Radford, Alec, et al. "Learning transferable visual models from natural language supervision." ICML (2021).
- [ALIGN, 2021] Jia, Chao, et al. "Scaling up visual and vision-language representation learning with noisy text supervision." ICML (2021).
- [BLIP, 2022] Li, Junnan, et al. "BLIP: Bootstrapping language-image pre-training for unified vision-language understanding and generation." ICML (2022).
- [Flamingo, 2022] Alayrac, Jean-Baptiste, et al. "Flamingo: a visual language model for few-shot learning." NeurIPS (2022).
- [BLIP-2, 2023] Li, Junnan, et al. "BLIP-2: Bootstrapping language-image pre-training with frozen image encoders and large language models." ICML (2023).
- [LLaVA, 2023] Liu, Haotian, et al. "Visual instruction tuning." NeurIPS (2023).
- [InstructBLIP, 2023] Dai, Wenliang, et al. "InstructBLIP: Towards general-purpose vision-language models with instruction tuning." NeurIPS (2023).
- [GPT-4V, 2023] OpenAI. "GPT-4 technical report." arXiv (2023).
- [LLaVA-1.5, 2023] Liu, Haotian, et al. "Improved baselines with visual instruction tuning." CVPR (2024).
- [Qwen-VL, 2023] Bai, Jinze, et al. "Qwen-VL: A versatile vision-language model for understanding, localization, text reading, and beyond." arXiv (2023).
- Image Retrieval
- [LSMH, 2016] Lu, Xiaoqiang, Xiangtao Zheng, and Xuelong Li. "Latent semantic minimal hashing for image retrieval." IEEE Transactions on Image Processing 26.1 (2016): 355-368.
- [R–GeM, 2018] Radenović, Filip, Giorgos Tolias, and Ondřej Chum. "Fine-tuning CNN image retrieval with no human annotation." IEEE transactions on pattern analysis and machine intelligence 41.7 (2018): 1655-1668.
- [HOW, 2020] Tolias, Giorgos, Tomas Jenicek, and Ondřej Chum. "Learning and aggregating deep local descriptors for instance-level recognition." Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part I 16. Springer International Publishing, 2020.
- [DELG, 2020] Cao, Bingyi, Andre Araujo, and Jack Sim. "Unifying deep local and global features for image search." Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XX 16. Springer International Publishing, 2020.
- [SOLAR, 2020] Ng, Tony, et al. "SOLAR: second-order loss and attention for image retrieval." Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXV 16. Springer International Publishing, 2020.
- [FIRe, 2021] Weinzaepfel, Philippe, et al. "Learning Super-Features for Image Retrieval." International Conference on Learning Representations. 2021.
- [DOLG, 2021] Yang, Min, et al. "Dolg: Single-stage image retrieval with deep orthogonal fusion of local and global features." Proceedings of the IEEE/CVF International conference on Computer Vision. 2021.
- [Token, 2022] Wu, Hui, et al. "Learning token-based representation for image retrieval." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 36. No. 3. 2022.
- [CVNet, 2022] Lee, Seongwon, et al. "Correlation verification for image retrieval." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2022.
- [GLAM, 2022] Song, Chull Hwan, Hye Joo Han, and Yannis Avrithis. "All the attention you need: Global-local, spatial-channel attention for image retrieval." Proceedings of the IEEE/CVF winter conference on applications of computer vision. 2022.
- [SuperGlobal, 2023] Shao, Shihao, et al. "Global features are all you need for image retrieval and reranking." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023.
- [CFCD, 2023] Zhu, Yunquan, et al. "Coarse-to-fine: Learning compact discriminative representation for single-stage image retrieval." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2023.
- [SENet, 2023] Lee, Seongwon, et al. "Revisiting self-similarity: Structural embedding for image retrieval." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023.
- [CiDeR, 2024] Song, Chull Hwan, et al. "On train-test class overlap and detection for image retrieval." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2024.
- Work in progress (planned sections):
- Image Super-Resolution / Image Restoration
- Saliency Detection
- Vanishing Point Detection
- Image Colorization
- Image Captioning
- Video Summarization and Captioning
- Explainable AI (XAI)
- Text Recognition
- Data Compression
- Affective Computing
- Virtual reality (VR)
- Augmented reality (AR)
- Visual Question Answering (VQA)
- DeepFake Detection
- 3D Reconstruction
- Biometric Analysis
- Meta Learning
- Semi-Supervised Learning - Zero/One/Few shot
- Performance - Classification
- Confusion Matrix: TP, FP, TN, and FN for each class
- For class-balanced datasets:
- Accuracy: (TP+TN) / (TP+FP+TN+FN)
- ROC curve: TPR vs FPR · summarised by AUROC (higher is better)
- For class-imbalanced datasets:
- Precision (P): TP / (TP+FP)
- Recall (R): TP / (TP+FN)
- F1-Score: 2·P·R / (P+R)
- Balanced Accuracy: (TPR+TNR) / 2
- Weighted-Averaged Precision, Recall, and F1-Score
- PR curve: Precision vs Recall · summarised by AUPRC (higher is better, more informative than AUROC on imbalanced data)
- For multi-label classification:
- Macro / Micro / Weighted averaging of above metrics
- Hamming Loss: fraction of labels incorrectly predicted
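As a worked example of the definitions above, the sketch below computes the main binary-classification metrics directly from confusion-matrix counts; the counts themselves are made up for illustration.

```python
# Hypothetical binary confusion-matrix counts for an imbalanced dataset.
tp, fp, tn, fn = 90, 30, 850, 30

accuracy  = (tp + tn) / (tp + fp + tn + fn)   # misleading when classes are imbalanced
precision = tp / (tp + fp)                    # P = TP / (TP + FP)
recall    = tp / (tp + fn)                    # R = TP / (TP + FN), a.k.a. TPR
tnr       = tn / (tn + fp)                    # specificity (TNR)
f1        = 2 * precision * recall / (precision + recall)
balanced_accuracy = (recall + tnr) / 2

print(f"acc={accuracy:.3f}  P={precision:.3f}  R={recall:.3f}  "
      f"F1={f1:.3f}  balanced_acc={balanced_accuracy:.3f}")
```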
- Performance - Detection
- Intersection over Union (IoU): area of overlap / area of union between predicted and ground-truth box
- Average Precision (AP): area under the Precision-Recall curve for a single class
- mAP: mean AP averaged over all classes
- mAP@0.5: IoU threshold of 0.5 (PASCAL VOC standard)
- mAP@0.5:0.95: mean over IoU thresholds 0.5 to 0.95 in steps of 0.05 (COCO standard, harder and preferred)
- AR@k: Average Recall at k proposals per image
- False Positives Per Image (FPPI): used in pedestrian detection benchmarks (e.g. Caltech)
- Log-Average Miss Rate (LAMR): standard metric for pedestrian detection, computed on FPPI vs Miss Rate curve
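IoU is the building block behind AP and mAP: a predicted box counts as a true positive only if its IoU with a ground-truth box exceeds the chosen threshold. The sketch below computes box IoU with NumPy; the box coordinates are made up for illustration.

```python
import numpy as np

def box_iou(a: np.ndarray, b: np.ndarray) -> float:
    """IoU of two axis-aligned boxes given as [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

pred = np.array([50, 50, 150, 150], dtype=float)   # hypothetical predicted box
gt   = np.array([60, 60, 160, 160], dtype=float)   # hypothetical ground-truth box
iou = box_iou(pred, gt)
# COCO-style mAP repeats the TP/FP decision for thresholds 0.50, 0.55, ..., 0.95 and averages.
print(f"IoU = {iou:.3f}, TP@0.5: {iou >= 0.5}, TP@0.75: {iou >= 0.75}")
```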
- Performance - Segmentation
- Intersection over Union (IoU) / Jaccard Index: TP / (TP+FP+FN) per class
- mean IoU (mIoU): IoU averaged over all classes · primary metric for semantic segmentation benchmarks (Cityscapes, ADE20K)
- Dice Coefficient / F1-Score: 2·TP / (2·TP+FP+FN) · standard for medical image segmentation
- Mean Pixel Accuracy (mPA): fraction of pixels correctly classified per class, then averaged
- Panoptic Quality (PQ): PQ = SQ · RQ · unified metric for panoptic segmentation (COCO Panoptic)
- Boundary IoU (BIoU): IoU computed only near object boundaries · penalises coarse masks
- Hausdorff Distance (HD): maximum surface distance between predicted and ground-truth masks · common in medical imaging
- HD95: 95th-percentile Hausdorff Distance · more robust to outliers than HD
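A minimal sketch of per-class IoU and Dice computed from boolean masks, using toy 4x4 arrays purely for illustration; mIoU is simply this IoU averaged over all classes.

```python
import numpy as np

def iou_and_dice(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-7):
    """Per-class IoU and Dice for boolean masks of identical shape."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    iou  = inter / (union + eps)                       # TP / (TP + FP + FN)
    dice = 2 * inter / (pred.sum() + gt.sum() + eps)   # 2·TP / (2·TP + FP + FN)
    return iou, dice

# Toy masks purely for illustration.
pred = np.array([[1, 1, 0, 0],
                 [1, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]], dtype=bool)
gt   = np.array([[1, 1, 1, 0],
                 [1, 1, 1, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]], dtype=bool)

iou, dice = iou_and_dice(pred, gt)
print(f"IoU={iou:.3f}  Dice={dice:.3f}")
```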
- Performance - Tracking
- Multiple Object Tracking Accuracy (MOTA): combines false positives, false negatives, and identity switches
- Multiple Object Tracking Precision (MOTP): average localisation precision of matched detections
- ID F1-Score (IDF1): ratio of correctly identified detections over average of ground-truth and computed detections · better reflects long-term identity consistency than MOTA
- HOTA (Higher Order Tracking Accuracy): geometric mean of detection and association accuracy · increasingly preferred over MOTA/MOTP as a single summary metric
- Identity Switches (IDSW): number of times a tracked object changes its assigned ID
- Mostly Tracked (MT) / Mostly Lost (ML): fraction of ground-truth trajectories tracked for more than 80% / less than 20% of their lifespan
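MOTA reduces to a single formula over error counts accumulated across all frames; the sketch below evaluates it with made-up totals.

```python
# Minimal MOTA sketch from accumulated counts (numbers are made up).
# MOTA = 1 - (FN + FP + IDSW) / total ground-truth objects, summed over all frames.
false_negatives = 120
false_positives = 80
id_switches = 15
num_gt = 2000   # total ground-truth object instances across all frames

mota = 1.0 - (false_negatives + false_positives + id_switches) / num_gt
print(f"MOTA = {mota:.3f}")   # can be negative if errors exceed the number of GT objects
```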
- Performance - Perceptual Quality (Super-resolution, Denoising, Enhancement)
- Reference-based (require a clean ground-truth image):
- Peak Signal-to-Noise Ratio (PSNR): 10·log10(MAX² / MSE) · in dB, higher is better · fast to compute but weakly correlated with human perception
- Structural Similarity Index (SSIM): measures luminance, contrast, and structure jointly · range [0,1], higher is better
- Multi-Scale SSIM (MS-SSIM): SSIM computed at multiple resolutions · more robust to viewing distance
- Learned Perceptual Image Patch Similarity (LPIPS): deep feature distance · strongly correlated with human judgement · lower is better
- Visual Information Fidelity (VIF): mutual information between reference and distorted image features
- No-reference (blind, no ground-truth required):
- Natural Image Quality Evaluator (NIQE): lower is better · measures deviation from natural scene statistics
- BRISQUE: lower is better · spatial natural scene statistics
- Gradient Magnitude Similarity Deviation (GMSD): fast, gradient-based · lower is better
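For the reference-based metrics above, scikit-image ships ready-made implementations of PSNR and SSIM; the sketch below applies them to a synthetic image pair that stands in for a ground-truth/restored pair (the noise level is an arbitrary choice).

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

rng = np.random.default_rng(0)
clean = rng.integers(0, 256, size=(128, 128), dtype=np.uint8)      # stand-in ground truth
noisy = np.clip(clean.astype(np.int16) + rng.normal(0, 10, clean.shape),
                0, 255).astype(np.uint8)                           # stand-in restored/degraded image

psnr = peak_signal_noise_ratio(clean, noisy, data_range=255)   # 10·log10(MAX² / MSE), in dB
ssim = structural_similarity(clean, noisy, data_range=255)     # higher is better, up to 1.0
print(f"PSNR = {psnr:.2f} dB   SSIM = {ssim:.3f}")
```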
- Performance - Generation (GANs, Diffusion Models)
- Fréchet Inception Distance (FID): distance between Inception feature distributions of real and generated images · lower is better · primary benchmark metric
- Inception Score (IS): measures quality and diversity jointly using classifier confidence and entropy · higher is better · less reliable than FID on its own
- Kernel Inception Distance (KID): like FID but uses MMD instead of Gaussian assumption · unbiased with small sample sizes · lower is better
- Perceptual Path Length (PPL): smoothness of the latent space · used for GANs · lower is better
- CLIP Score: cosine similarity between CLIP embeddings of generated image and text prompt · used for text-to-image evaluation · higher is better
- Human Evaluation: side-by-side preference studies remain the gold standard for generative quality
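FID boils down to a Fréchet distance between two Gaussians fitted to feature sets. The sketch below implements that distance with NumPy and SciPy, using random vectors as stand-ins for the InceptionV3 pool features a real FID computation would extract from real and generated images.

```python
import numpy as np
from scipy import linalg

def frechet_distance(feats_real: np.ndarray, feats_fake: np.ndarray) -> float:
    """Fréchet distance between Gaussians fitted to two (N, D) feature sets."""
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)
    covmean = linalg.sqrtm(cov_r @ cov_f)          # matrix square root of the covariance product
    if np.iscomplexobj(covmean):
        covmean = covmean.real                     # drop tiny imaginary numerical noise
    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(cov_r + cov_f - 2.0 * covmean))

# Random vectors as stand-ins for InceptionV3 pool features (shapes are illustrative).
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(500, 64))
fake = rng.normal(0.3, 1.1, size=(500, 64))
print(f"FID-style distance: {frechet_distance(real, fake):.3f}")
```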
- Performance - Depth Estimation
- Absolute Relative Error (AbsRel): mean( |d - d*| / d* ) · lower is better
- Squared Relative Error (SqRel): mean( |d - d*|² / d* )
- Root Mean Squared Error (RMSE) and RMSE log
- Threshold Accuracy (δ < 1.25, 1.25², 1.25³): fraction of pixels where max(d/d*, d*/d) < threshold · higher is better
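A minimal sketch of the depth metrics above over a synthetic prediction/ground-truth pair; the depth range and noise level are illustrative assumptions.

```python
import numpy as np

def depth_metrics(pred: np.ndarray, gt: np.ndarray):
    """Standard monocular depth metrics over valid (gt > 0) pixels."""
    mask = gt > 0
    d, g = pred[mask], gt[mask]
    abs_rel = np.mean(np.abs(d - g) / g)          # AbsRel
    sq_rel  = np.mean((d - g) ** 2 / g)           # SqRel
    rmse    = np.sqrt(np.mean((d - g) ** 2))      # RMSE
    ratio   = np.maximum(d / g, g / d)
    delta1  = np.mean(ratio < 1.25)               # threshold accuracy, δ < 1.25
    return abs_rel, sq_rel, rmse, delta1

rng = np.random.default_rng(0)
gt   = rng.uniform(1.0, 10.0, size=(240, 320))            # synthetic ground-truth depth (metres)
pred = gt * rng.normal(1.0, 0.05, size=gt.shape)          # prediction with ~5% multiplicative noise
print("AbsRel={:.3f}  SqRel={:.3f}  RMSE={:.3f}  d1={:.3f}".format(*depth_metrics(pred, gt)))
```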
- Performance - Pose Estimation
- Percentage of Correct Keypoints (PCK): keypoint within α · torso diameter of ground truth · PCK@0.2 is standard
- Object Keypoint Similarity (OKS): analogous to IoU for keypoints · accounts for keypoint visibility and scale · used by COCO
- Mean Per Joint Position Error (MPJPE): average Euclidean distance between predicted and ground-truth 3D joints · in mm
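A minimal sketch of MPJPE and PCK on synthetic keypoints; the joint count, noise levels, and reference length (e.g. torso diameter) are illustrative assumptions.

```python
import numpy as np

def mpjpe(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean per-joint position error: mean Euclidean distance, pred/gt shaped (J, 3)."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def pck(pred: np.ndarray, gt: np.ndarray, ref_length: float, alpha: float = 0.2) -> float:
    """Fraction of 2D keypoints within alpha * reference length (e.g. torso diameter)."""
    dists = np.linalg.norm(pred - gt, axis=-1)
    return float((dists < alpha * ref_length).mean())

rng = np.random.default_rng(0)
gt_3d   = rng.uniform(-500, 500, size=(17, 3))             # 17 joints, millimetres
pred_3d = gt_3d + rng.normal(0, 20, size=gt_3d.shape)      # ~20 mm localisation noise
print(f"MPJPE = {mpjpe(pred_3d, gt_3d):.1f} mm")

gt_2d   = rng.uniform(0, 256, size=(17, 2))                # 2D keypoints in pixels
pred_2d = gt_2d + rng.normal(0, 5, size=gt_2d.shape)
print(f"PCK@0.2 = {pck(pred_2d, gt_2d, ref_length=100.0):.3f}")
```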
- Computation
- Latency: end-to-end inference time per image (ms) · report hardware, batch size, and input resolution
- Throughput: Frames Per Second (FPS) · report the same context as latency
- Parameters (M): total trainable parameter count · proxy for memory footprint
- FLOPs / MACs: floating-point operations or multiply-accumulate operations per forward pass · hardware-independent complexity measure
- Model Size (MB): weight file size on disk
- GPU Memory (VRAM, GB): peak memory during inference · critical for deployment constraints
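A common way to report these numbers in PyTorch is sketched below, assuming a recent torchvision is installed; ResNet-50, the batch size, input resolution, and iteration counts are arbitrary choices for illustration, not a recommended benchmarking protocol.

```python
import time
import torch
import torchvision

# Hypothetical setup: ResNet-50, batch size 1, 224x224 input, CPU or single GPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = torchvision.models.resnet50(weights=None).eval().to(device)
x = torch.randn(1, 3, 224, 224, device=device)

n_params = sum(p.numel() for p in model.parameters())     # Parameters (M)
print(f"params: {n_params / 1e6:.1f} M")

with torch.no_grad():
    for _ in range(10):                                    # warm-up iterations
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()                           # wait for queued GPU work
    t0 = time.perf_counter()
    for _ in range(50):
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()
    latency_ms = (time.perf_counter() - t0) / 50 * 1000

print(f"latency: {latency_ms:.1f} ms/image   throughput: {1000 / latency_ms:.1f} FPS")
```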
Tags: Object Classification [ObjCls], Object Detection [ObjDet], Object Segmentation [ObjSeg], General Library [GenLib], Text Reading / Optical Character Recognition [OCR], Action Recognition [ActRec], Object Tracking [ObjTrk], Data Augmentation [DatAug], Simultaneous Localization and Mapping [SLAM], Outlier/Anomaly/Novelty Detection [NvlDet], Content-based Image Retrieval [CBIR], Image Enhancement [ImgEnh], Aesthetic Assessment [AesAss], Explainable Artificial Intelligence [XAI], Text-to-Image Generation [TexImg], Pose Estimation [PosEst], Video Matting [VidMat], Eye Tracking [EyeTrk]
- computervision-recipes [GenLib]: Microsoft's best practices, code samples, and documentation for computer vision.
- FastAI [GenLib]: Library over PyTorch used for learning and practicing machine learning and deep learning.
- pytorch-lightning [GenLib]: Lightweight PyTorch wrapper for high-performance AI research.
- ignite [GenLib]: PyTorch's high-level library to help with training and evaluating neural networks flexibly and transparently.
- pytorch_geometric [GenLib]: Graph neural network library for PyTorch.
- kornia [GenLib]: Open source differentiable computer vision library.
- ncnn [GenLib]: Tencent's high-performance neural network inference framework optimized for mobile platforms.
- ITK [GenLib]: Open-source, cross-platform toolkit for N-dimensional scientific image processing, segmentation, and registration.
- VTK [GenLib]: Open-source software system for image processing, 3D graphics, volume rendering and visualization.
- MONAI [GenLib]: PyTorch-based, open-source framework for deep learning in healthcare imaging.
- keras-cv [GenLib]: Library of modular computer vision oriented Keras components.
- MediaPipe [ObjDet][ObjSeg][ObjTrk][GenLib]: Google's cross-platform framework supporting face detection, hand/pose tracking, object detection, hair segmentation, and more.
- PyTorch image models [ObjCls]: A wide collection of PyTorch image classification models, scripts, and pretrained weights.
- mmclassification [ObjCls]: OpenMMLab's image classification toolbox and benchmark.
- vit-pytorch [ObjCls]: SOTA implementations of vision transformers in PyTorch.
- face_classification [ObjCls][ObjDet]: Real-time face detection and emotion/gender classification.
- mmdetection [ObjDet]: OpenMMLab's image detection toolbox and benchmark.
- detectron2 [ObjDet][ObjSeg]: Facebook FAIR's next-generation platform for object detection, segmentation, and other visual recognition tasks.
- detr [ObjDet]: Facebook's end-to-end object detection with transformers.
- libfacedetection [ObjDet]: Open source library for face detection in images, achieving ~1000 FPS.
- FaceDetection-DSFD [ObjDet]: Tencent's state-of-the-art face detector.
- Object-Detection-Metrics [ObjDet]: The most popular metrics used to evaluate object detection algorithms.
- SAHI [ObjDet][ObjSeg]: Lightweight vision library for large-scale object detection and instance segmentation.
- yolov5 [ObjDet]: Ultralytics' YOLOv5 object detection framework.
- darknet [ObjDet]: YOLOv4 / Scaled-YOLOv4 / YOLOv3 / YOLOv2 implementations.
- U-2-Net [ObjDet]: U²-Net, a nested U-structure architecture for salient object detection.
- segmentation_models.pytorch [ObjSeg]: PyTorch segmentation models with pretrained backbones.
- mmsegmentation [ObjSeg]: OpenMMLab's semantic segmentation toolbox and benchmark.
- PaddleSeg [ObjSeg]: Easy-to-use image segmentation library supporting semantic, interactive, panoptic, and 3D segmentation among others.
- mmocr [OCR]: OpenMMLab's text detection, recognition and understanding toolbox.
- pytesseract [OCR]: A Python wrapper for Google's Tesseract OCR engine.
- EasyOCR [OCR]: Ready-to-use OCR supporting 80+ languages and all popular writing scripts.
- PaddleOCR [OCR]: Practical ultra-lightweight OCR system supporting 80+ languages with tools for training and deployment across server, mobile, and IoT devices.
- mmtracking [ObjTrk]: OpenMMLab's video perception toolbox for object detection and tracking.
- mmaction [ActRec]: OpenMMLab's open-source toolbox for action understanding based on PyTorch.
- albumentations [DatAug]: Fast image augmentation library with an easy-to-use wrapper around other libraries.
- Random-Erasing [DatAug]: Random erasing data augmentation implemented in PyTorch.
- CutMix-PyTorch [DatAug]: Official PyTorch implementation of the CutMix regularizer.
- ORB_SLAM2 [SLAM]: Real-time SLAM for monocular, stereo and RGB-D cameras with loop detection and relocalization.
- pyod [NvlDet]: Python toolbox for scalable outlier and anomaly detection.
- alibi-detect [NvlDet]: Algorithms for outlier, adversarial, and drift detection.
- fastdup [NvlDet][CBIR]: Unsupervised and free tool for image and video dataset analysis.
- imagededup [CBIR]: Simple tool to find and remove duplicate images from datasets.
- image-match [CBIR]: Fast image retrieval system capable of searching over billions of images.
- Bringing-Old-Photos-Back-to-Life [ImgEnh]: Microsoft's CVPR 2020 oral paper implementation for restoring old and damaged photos.
- image-quality-assessment [AesAss]: Idealo's NIMA model to predict the aesthetic and technical quality of images.
- aesthetics [AesAss]: Image aesthetics toolkit using Fisher Vectors.
- openpose [PosEst]: Real-time multi-person keypoint detection for body, face, hands, and feet.
- RobustVideoMatting [VidMat]: Robust video matting supporting PyTorch, TensorFlow, ONNX, and CoreML.
- PsychoPy [EyeTrk]: Library for running psychology and neuroscience experiments.
- pytorch-cnn-visualizations [XAI]: PyTorch implementations of convolutional neural network visualization techniques.
- Captum [XAI]: PyTorch team's library for model interpretability and understanding.
- Alibi [XAI]: Algorithms for explaining machine learning models.
- iNNvestigate [XAI]: TensorFlow toolbox for investigating neural network predictions.
- keras-vis [XAI]: Neural network visualization toolkit for Keras.
- Keract [XAI]: Keras tool for extracting layer outputs and gradients.
- pytorch-grad-cam [XAI]: Advanced AI explainability for computer vision in PyTorch.
- SHAP [XAI]: Game-theoretic approach to explain the output of any machine learning model.
- TensorWatch [XAI]: Microsoft's debugging, monitoring, and visualization tool for Python ML and data science.
- WeightWatcher [XAI]: Open-source diagnostic tool for analyzing deep neural networks without needing training or test data.
- DALLE2-pytorch [TexImg]: PyTorch implementation of OpenAI's DALL-E 2 text-to-image synthesis network.
- imagen-pytorch [TexImg]: PyTorch implementation of Google's Imagen text-to-image neural network.
- PyTorch - CV Datasets, Meta
- Tensorflow - CV Datasets, Google
- CVonline: Image Databases, Edinburgh University, Thanks to Robert Fisher!
- Kaggle
- PaperWithCode, Meta
- RoboFlow
- VisualData
- CUHK Computer Vision
- VGG - University of Oxford
Tags: Popular individuals [Individual], Conferences and events [Conferences], University research groups [University], Interactive talks and podcasts [Talks], Research paper explanations [Papers]
- @AurelienGeron [Individual]: Aurélien Géron, former lead of YouTube's video classification team and author of the O'Reilly book Hands-On Machine Learning with Scikit-Learn and TensorFlow.
- @howardjeremyp [Individual]: Jeremy Howard, former president and chief scientist of Kaggle, and co-founder of fast.ai.
- @PieterAbbeel [Individual]: Pieter Abbeel, professor of electrical engineering and computer sciences, University of California, Berkeley.
- @pascalpoupart3507 [Individual]: Pascal Poupart, professor in the David R. Cheriton School of Computer Science at the University of Waterloo.
- @MatthiasNiessner [Individual]: Matthias Niessner, professor at the Technical University of Munich and head of the Visual Computing Lab.
- @MichaelBronsteinGDL [Individual]: Michael Bronstein, DeepMind Professor of AI, University of Oxford / Head of Graph Learning Research, Twitter.
- @DeepFindr [Individual]: Videos about all kinds of machine learning and data science topics.
- @deeplizard [Individual]: Videos about building collective intelligence.
- @YannicKilcher [Individual]: Yannic Kilcher, videos about machine learning research papers, programming, issues of the AI community, and the broader impact of AI in society.
- @sentdex [Individual]: sentdex, Python programming tutorials in machine learning, finance, data analysis, robotics, web development, game development, and more.
- @AAmini [Individual]: Alexander Amini, research affiliate at MIT, videos about deep learning and data science.
- @WhatsAI [Individual]: Louis-François Bouchard, PhD at MILA, videos about AI.
- mrdbourke [Individual]: Daniel Bourke, ML engineer in healthcare, videos about AI.
- marksaroufim [Individual]: Mark Saroufim, PyTorch engineer at Meta (Facebook), videos about AI.
- NicholasRenotte [Individual]: Nicholas Renotte, videos about computer vision, natural language processing, and reinforcement learning applications.
- abhishekkrthakur [Individual]: Abhishek Thakur, world's first quadruple Grand Master on Kaggle, videos about applied machine learning, deep learning, and data science.
- @AladdinPersson [Individual]: Aladdin Persson, clear implementations of ML and CV papers from scratch in PyTorch and TensorFlow.
- @CodeEmporium [Individual]: The Code Emporium, intuitive explanations of ML concepts and architectures.
- @AICoffeeBreak [Individual]: AI Coffee Break with Letitia, short, accessible walkthroughs of recent AI and CV research.
- @mildlyoverfitted [Individual]: Mildly Overfitted, hands-on CV and ML tutorials with clean code.
- @SmithaKolan [Individual]: Smitha Kolan, computer vision tutorials focused on practical applications.
- @KapilSachdeva [Individual]: Kapil Sachdeva, in-depth explanations of ML research and engineering.
- @alfcnz [Individual]: Alfredo Canziani, assistant professor at NYU, deep learning theory and practice.
- @arp_ai [Individual]: Jay Alammar, applied ML and computer vision projects.
- @bmvabritishmachinevisionas8529 [Conferences]: BMVA, the British Machine Vision Association.
- @ComputerVisionFoundation [Conferences]: Computer Vision Foundation (CVF), co-sponsor of major computer vision conferences (e.g. CVPR and ICCV).
- @cvprtum [University]: Computer Vision Group at the Technical University of Munich.
- @UCFCRCV [University]: Center for Research in Computer Vision at the University of Central Florida.
- @dynamicvisionandlearninggr1022 [University]: Dynamic Vision and Learning research group, Technical University of Munich.
- @TubingenML [University]: Machine learning groups at the University of Tübingen.
- @computervisiontalks4659 [Talks]: Computer Vision Talks.
- @freecodecamp [Talks]: Videos to learn how to code.
- @LondonMachineLearningMeetup [Talks]: Largest machine learning community in Europe.
- @LesHouches-iu6nv [Talks]: Summer school on the Statistical Physics of Machine Learning held in Les Houches, July 4-29, 2022.
- @MachineLearningStreetTalk [Talks]: Top AI podcast on Spotify.
- @WeightsBiases [Talks]: Weights & Biases team's conversations with industry experts and researchers.
- @PreserveKnowledge [Talks]: Canadian higher-education media organization focused on advances in mathematics, computer science, and artificial intelligence.
- @TwoMinutePapers [Papers]: Two Minute Papers, AI papers explained in a few minutes.
- @TheAIEpiphany [Papers]: Aleksa Gordić, ex-Google DeepMind and ex-Microsoft engineer explaining AI papers.
- @bycloudAI [Papers]: bycloud, covers the latest AI tech and research papers for fun.
- Vision Science, announcements about industry/academic jobs in computer vision around the world (in English).
- bull-i3, posts about job opportunities in computer vision in France (in French).
Entries in this list are included because they are:
- Genuinely educational — they help you understand something, not just use it
- Well-maintained (or historically significant if archived)
- Accessible — free or widely available where possible
Entries marked ⚠️ legacy or 🗄️ archived in the libraries section are included for historical or educational value despite no longer being actively developed.
This list is maintained by a computer vision researcher and university academic. Suggestions and pull requests are welcome. Please check CONTRIBUTING.md.
- Frida de Sigley
- CORE Conference Ranking
- Scimago Journal Ranking
- benthecoder/yt-channels-DS-AI-ML-CS
- anomaly-detection-resources, Anomaly detection related books, papers, videos, and toolboxes
- awesome-satellite-imagery-datasets, List of satellite image training datasets with annotations for computer vision and deep learning
- awesome-Face_Recognition, Computer vision papers about faces.
- the-incredible-pytorch, Curated list of tutorials, papers, projects, communities and more relating to PyTorch