Recognition to Cognition Networks (code for the model in "From Recognition to Cognition: Visual Commonsense Reasoning", CVPR 2019)
Updated May 6, 2021 - Python
Official PyTorch implementation of LaMI: Augmenting Large Language Models via Late Multi-Image Fusion (ACL 2026)
A list of research papers on knowledge-enhanced multimodal learning
Vision-Zephyr: a multimodal LLM for Visual Commonsense Reasoning (CLIP-ViT + Zephyr-7B with visual prompting); code, training scripts, and VCR evaluation.
Neuro-symbolic visual reasoning engine with Grounding DINO, abductive inference, and Gemini-powered self-evolving rules