AIPaperNotes

Notes, written in Chinese, on papers read day to day and on related reproduction results.

For more notes, please follow the blog: https://nopsled.blog.csdn.net/

Paper Catalog

Architecture

  • Diffusion:
  • Flow:
    • Flow Matching [Link]
  • MOE:
    • Switch Transformer [Link]
    • DeepSeekMoE [Link]
    • Loss-Free Balancing [Link]
  • Vision Transformer:

Embedding

  • NV-EMBED [Link]
  • Qwen3 Embedding [Link]

LLM

  • Agent: LLM-based single- and multi-agent models and systems

    • DeepResearch:
    • Memory:
      • Dynamic Cheatsheet [Link]
      • EgoMem [Link]
      • ReasoningBank [Link]
    • Multi Agent Optimization
    • RAG
    • Reflection:
      • Reflexion [Link]
      • Metacognitive Reuse [Link]
    • Router:
    • Visual Agent
  • Base Model: Large Language Model

    • DeepSeek
      • DeepSeek-V2 [Link]
      • DeepSeek-V3 [Link]
      • DeepSeek-V3.2 [Link]
    • Google
      • Gemma 3
      • Gemma 4
    • Moonshot AI
      • KIMI LINEAR [Link]
    • Zhipu AI
    • OpenAI
  • Dataset: Data construction and processing for model training

    • Pretrain:
    • SFT:
  • Long Sequence

  • Prompt: Prompt Engineering

    • Context Learning
    • Skills
      • Extending Claude’s capabilities with skills and MCP servers [Link]
      • Building agents with Skills: Equipping agents for specialized work [Link]
  • Omni: LLM-based omni-modal models

    • Qwen2.5-Omni [Link]
    • M3-Agent [Link]
  • Quantization: Compression of model weights, optimizer states, and activations

  • Speech: Speech LLM

    • ALM: Audio LLM for audio input
      • Audio Flamingo 3 [Link]
  • Survey

  • Training: LLM training

    • Pretrain
      • FIM (fill-in-the-middle) [Link]
    • RL
      • RLHF: Reinforcement Learning from Human Feedback
      • RLRF: Reinforcement Learning with Rich Feedback
      • RLVR: Reinforcement Learning with Verifiable Rewards
        • DeepSeek-R1 [Link]
        • Dr.GRPO [Link]
        • DAPO [Link]
        • GCG [Link]
        • LUFFY [Link]
        • GSPO [Link]
        • DeepSeek-R1 v2 [Link]
        • Truncated Importance Sampling (TIS)
    • SFT:
      • EAFT
    • Speculative Decoding or MTP: Speculative Decoding or Multi-token Prediction
      • Better & Faster Large Language Models via Multi-token Prediction [Link]
      • CAFT [Link]
      • EAGLE3 [Link]
  • VLM: Visual LLM

    • LLaVA [Link]
    • Qwen-VL [Link]
    • Qwen2-VL [Link]
    • Qwen2.5-VL [Link]
    • Qwen3-VL [Link]
    • MiniCPM-V 4.5 [Link]
    • DeepSeek-OCR [Link]
    • DeepSeek-OCR2
    • Kimi K2.5 [Link]

Visual Encoder

  • Image Segment Pretraining

  • Language-Image Representation Learning: