
# Large Models Learning

## Model Key Points

- **CLIP** (vision-language model): contrastive pre-training, zero-shot transfer, dual image/text encoders (see the loss sketch after this list)
- **SigLIP** (vision-language model): sigmoid pairwise loss for improved training efficiency over CLIP (also covered in the loss sketch below)
- **Gemma 3** (vision-language model on a decoder-only LLM): SigLIP vision encoder + GQA + 5:1 local/global attention interleaving
- **Gemma 4** (multimodal open model family): hybrid local/global attention + Per-Layer Embeddings (PLE) + variable-resolution vision token budgets + dense/MoE deployment scaling
- **DeepSeek-VL** (vision-language model on a decoder-only LLM): hybrid vision encoder (SigLIP for semantics + SAM-B for high-res detail) → fixed-token high-res processing; gradual, modality-balanced pretraining preserves language strength
- **DeepSeek-VL2** (vision-language model on an MoE decoder-only LLM): single SigLIP encoder with dynamic tiling (global thumbnail + local tiles) → arbitrary high resolutions and aspect ratios at a controlled token count; DeepSeekMoE backbone with MLA
- **DeepSeek-V2** (decoder-only Transformer): MLA + DeepSeekMoE (see the MLA sketch after this list)
- **DeepSeek-V3** (decoder-only Transformer): MLA + DeepSeekMoE with auxiliary-loss-free load balancing + multi-token prediction (MTP)
- **DeepSeek-V3.2** (decoder-only Transformer, long context + agentic RL): DeepSeek Sparse Attention (DSA: Lightning Indexer → top-k KV selection, O(L·k) core attention at 128K) + MLA with MQA-mode integration for efficient sparse KV sharing; scaled post-training RL via GRPO (>10% of pretraining compute); large-scale agent/tool-use task synthesis in verified environments
- **DeepSeek-R1** (reasoning MoE on DeepSeek-V3-Base): R1-Zero shows pure RL can induce long-CoT reasoning; R1 adds cold-start SFT + multi-stage RL to improve readability, language consistency, and general assistant behavior
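
The contrast between the first two bullets comes down to the training objective. Below is a minimal PyTorch sketch of both losses; the tensor names (`img_emb`, `txt_emb`) and the default temperature/bias values are illustrative assumptions, not the exact training setup of either model.

```python
# Sketch of CLIP's symmetric contrastive loss vs. SigLIP's pairwise
# sigmoid loss. `img_emb` and `txt_emb` are assumed to be L2-normalized
# [batch, dim] embeddings from the two encoders.
import torch
import torch.nn.functional as F

def clip_loss(img_emb, txt_emb, temperature=0.07):
    # Similarity matrix: logits[i, j] = sim(image_i, text_j) / T.
    logits = img_emb @ txt_emb.t() / temperature
    targets = torch.arange(logits.size(0), device=logits.device)
    # Symmetric cross-entropy: match images to texts and texts to images.
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

def siglip_loss(img_emb, txt_emb, t=10.0, b=-10.0):
    # Every (i, j) pair is an independent binary classification:
    # label +1 on the diagonal (matched pair), -1 elsewhere.
    logits = img_emb @ txt_emb.t() * t + b
    labels = 2 * torch.eye(logits.size(0), device=logits.device) - 1
    # Log-sigmoid of signed logits; no batch-wide softmax normalization.
    return -F.logsigmoid(labels * logits).mean()
```

Because the sigmoid loss treats each pair independently, it avoids CLIP's batch-wide softmax normalization, which is what makes SigLIP cheaper to shard and more stable at large batch sizes.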
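
For the DeepSeek-V2/V3 bullets, the key MLA idea is that the KV cache stores a small shared latent rather than full per-head keys and values. The sketch below shows only that compression path, with made-up dimensions; the decoupled RoPE key branch and query compression used by the actual models are omitted.

```python
# Minimal sketch of Multi-head Latent Attention (MLA) KV compression.
# Dimensions are hypothetical, chosen only to make the shapes concrete.
import torch.nn as nn

d_model, d_latent, n_heads, d_head = 4096, 512, 32, 128

W_dkv = nn.Linear(d_model, d_latent, bias=False)          # down-projection
W_uk = nn.Linear(d_latent, n_heads * d_head, bias=False)  # key up-projection
W_uv = nn.Linear(d_latent, n_heads * d_head, bias=False)  # value up-projection

def cache_step(h):
    # h: [batch, seq, d_model]. Only the latent is stored in the KV cache,
    # so per-token cache cost is d_latent instead of 2 * n_heads * d_head.
    return W_dkv(h)

def attend_step(c_kv):
    # Reconstruct full per-head keys/values from the cached latent on demand.
    b, s, _ = c_kv.shape
    k = W_uk(c_kv).view(b, s, n_heads, d_head)
    v = W_uv(c_kv).view(b, s, n_heads, d_head)
    return k, v
```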

## LLM Knowledge System

A topic-based map of this repo. This section is organized by knowledge domains rather than learning phases.

### Visual Map

```mermaid
flowchart TD
    A[LLM Knowledge System]

    A --> B[Foundations]
    A --> C[Architecture and Scaling]
    A --> D[Adaptation and Alignment]
    A --> E[Inference and Serving]
    A --> G[Model Case Studies]

    B --> B1[SVD / dtypes / AdamW]
    B --> B2[Attention: MHA / MQA / GQA]
    B --> B3[RoPE / SwiGLU]

    C --> C1[FlashAttention / MLA]
    C --> C2[DeepSeekMoE]
    C --> C3[TP / PP / EP]

    D --> D1[LoRA / QLoRA / DoRA]
    D --> D2[Specialized LoRA Variants]
    D --> D3[SFT / RLHF / DPO / PPO / GRPO]

    E --> E1[Speculative Decoding]
    E --> E2[Continuous Batching / PagedAttention]
    E --> E3[AWQ / GPTQ / TensorRT-LLM]
    E --> E4[Hallucination Mitigation]

    G --> G1[DeepSeek-V2]
    G --> G2[DeepSeek-V3]
    G --> G3[DeepSeek-V3.2]

    G1 -. combines .-> C1
    G1 -. combines .-> C2
    G2 -. combines .-> C1
    G2 -. combines .-> C2
```

The diagram gives a high-level overview; the sections below act as the detailed index.

| Domain | Focus | Core Topics |
| --- | --- | --- |
| Foundations | Math, optimization, losses, normalization, and Transformer building blocks | SVD, dtypes, AdamW, Sigmoid, GELU, LayerNorm, RMSNorm, BatchNorm, MHA/MQA/GQA, RoPE, SwiGLU |
| Architecture & Scaling | Efficient training and large-scale model design | FlashAttention, MLA, DeepSeekMoE, TP/PP/EP |
| Adaptation & Alignment | Task adaptation and preference learning | LoRA family, SFT, RLHF, DPO, PPO, GRPO |
| Agent Systems | Retrieval, memory, tool use, API interfaces, and task orchestration | Agent Basics, Memory Systems, RAG Systems, OpenAI API Interfaces |
| Inference & Serving | Latency, memory, and deployment efficiency | Speculative Decoding, Continuous Batching, Quantization, TensorRT-LLM, Hallucination Mitigation |
| VLA & Robotics | Vision-language-action policies and embodied control | RT-1 |
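
As a worked example for the dtype and memory topics in the Foundations and Inference & Serving rows, here is a back-of-the-envelope KV-cache size estimate; the model configuration is hypothetical.

```python
# Rough KV-cache sizing: 2x for keys and values, dtype_bytes=2 assumes
# fp16/bf16 storage.
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch, dtype_bytes=2):
    return 2 * layers * kv_heads * head_dim * seq_len * batch * dtype_bytes

# e.g. a 7B-class model: 32 layers, 8 KV heads (GQA), head_dim 128,
# 4096-token context, batch 1 -> 512 MiB.
print(kv_cache_bytes(32, 8, 128, 4096, 1) / 2**20, "MiB")
```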
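
GRPO, listed under Adaptation & Alignment, replaces PPO's learned value model with group-relative reward normalization. Below is a minimal sketch of that advantage computation; the surrounding policy-gradient loop is omitted.

```python
# Group-relative advantages: sample several completions per prompt,
# score them, and standardize rewards within each group so no critic
# network is needed.
import torch

def grpo_advantages(rewards: torch.Tensor) -> torch.Tensor:
    # rewards: [num_prompts, group_size], one row of scalar rewards per prompt.
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True)
    # Each completion's advantage is its reward relative to its own group.
    return (rewards - mean) / (std + 1e-6)
```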

### 1. Foundations

### 2. Architecture & Scaling

### 3. Adaptation & Alignment

### 4. Inference & Serving

### 5. Agent Systems

### 6. VLA & Robotics

- Vision-language-action policies: RT-1

## File Structure of /docs

```text
docs/
|-- Agent_Systems/
|   |-- Agent_Basics.md
|   |-- MCP_Protocol.md
|   |-- Memory_Systems.md
|   |-- OpenAI_API_Interface_Format.md
|   |-- RAG_Systems.md
|   |-- Skill_Systems.md
|   `-- Tool_Registry_and_Function_Calling.md
|-- Activation_Layers/
|   |-- GELU.md
|   |-- Sigmoid.md
|   `-- SwiGLU.md
|-- Attention_Machanisms/
|   |-- FlashAttention.md
|   |-- GQA.md
|   |-- MHA.md
|   |-- MLA.md
|   |-- MQA.md
|   `-- SVD_Attention.md
|-- Inference_Optimization/
|   |-- continuous_batching.md
|   |-- hallucination_mitigation.md
|   |-- quantization_inference.md
|   |-- speculative_decoding.md
|   `-- tensorrt_multilora.md
|-- Large_Models/
|   |-- CLIP.md
|   |-- DeepSeek_R1.md
|   |-- DeepSeek_V2.md
|   |-- DeepSeek_V3.md
|   |-- DeepSeek_V32.md
|   |-- DeepSeek_VL.md
|   |-- DeepSeek_VL2.md
|   |-- Gemma_3.md
|   |-- Gemma_4.md
|   `-- SigLIP.md
|-- Math/
|   |-- Memory_Estimation.md
|   |-- SVD.md
|   `-- dtypes.md
|-- MoE/
|   `-- DeepSeekMoE.md
|-- Norm/
|   |-- BatchNorm.md
|   |-- RMSNorm.md
|   `-- LayerNorm.md
|-- Optimizer/
|   `-- AdamW.md
|-- PEFT/
|   |-- DoRA.md
|   |-- LoRA.md
|   |-- QLoRA.md
|   `-- Specialized_LoRA.md
|-- Parallelism/
|   |-- EP.md
|   |-- PP.md
|   `-- TP.md
|-- Position_Embeding/
|   `-- RoPE.md
|-- Preference_Alignment/
|   |-- DPO.md
|   |-- GRPO.md
|   |-- PPO.md
|   |-- RLHF.md
|   `-- SFT.md
|-- Loss/
|-- VLAs/
|   `-- RT_1.md
`-- Resource/
    |-- Text_Color_Table.md
    `-- pics/
        `-- ...
```

## Learning Resource Recommendation

## About

Preparation materials for large model application engineers.
