🚧 Development Branch - This is the main development repository for LUFFY (Learning to Reason Under Off-Policy Guidance)
LUFFY is a reinforcement learning framework that bridges the gap between zero-RL and imitation learning by incorporating off-policy reasoning traces into the training process. This repository contains the core implementation and development work.
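The core idea, mixing fresh on-policy rollouts with off-policy reasoning traces in the same training batch, can be sketched as follows. This is a minimal illustration only: the names are hypothetical, and LUFFY's actual batching (in `luffy/verl`) additionally handles importance weighting and advantage shaping.

```python
import random

def build_mixed_batch(on_policy_rollouts, off_policy_traces, off_policy_ratio=0.5):
    """Mix freshly sampled rollouts with off-policy reasoning traces.

    Illustrative sketch, not LUFFY's real API: the actual trainer applies
    importance corrections to the off-policy samples rather than treating
    the two sources identically.
    """
    # Number of off-policy traces to mix in, relative to on-policy batch size.
    n_off = int(len(on_policy_rollouts) * off_policy_ratio)
    chosen = random.sample(off_policy_traces, k=min(n_off, len(off_policy_traces)))
    mixed = list(on_policy_rollouts) + chosen
    random.shuffle(mixed)  # avoid a fixed on-policy/off-policy ordering
    return mixed
```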
This repository is under active development. Many features are currently being implemented or need refactoring.
```shell
# Clone the repository
git clone <repository-url>
cd LUFFY

# Install dependencies
pip install -r luffy/requirements.txt

# Note: Some functionality is incomplete - check TODO list below for details
```

```
LUFFY/
├── luffy/              # Core framework
│   ├── deepscaler/     # Scaling utilities (⚠️ API integration needed)
│   ├── verl/           # RL training components (⚠️ Some features incomplete)
│   └── ...
├── data/               # Training data and scripts
├── eval_scripts/       # Evaluation utilities
├── exp_scripts/        # Experiment scripts
└── README.md           # This file
```
- This is a development version with incomplete implementations
- Many functions contain TODO markers indicating pending work
- API integrations (OpenAI, Gemini) are currently placeholder implementations
- FSDP and distributed training features need completion
- API Integration: OpenAI and Gemini API implementations need completion
- Reward System: Parallel processing and validation for reward computation
- FSDP Training: Model loading and distributed training setup
- Data Processing: Batch dimension operations and tensor reshaping
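For the reward-system item above, the intended shape is per-sample scoring fanned out across workers. A minimal sketch under stated assumptions: `score_one` is a stand-in for LUFFY's real reward functions (which parse and verify model answers), and the concurrency pattern, not the scorer, is the point.

```python
from concurrent.futures import ThreadPoolExecutor

def score_one(solution, ground_truth):
    """Toy reward: 1.0 on exact match, else 0.0.

    Placeholder for a real per-sample scorer (e.g. math-answer verification).
    """
    return 1.0 if solution.strip() == ground_truth.strip() else 0.0

def compute_rewards_parallel(solutions, ground_truths, max_workers=8):
    """Score samples concurrently; Executor.map preserves input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(score_one, solutions, ground_truths))
```

Order preservation matters here: rewards must line up index-for-index with the rollouts they score.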
- luffy/deepscaler/utils.py:45 - Add logging for API calls and errors
- luffy/deepscaler/utils.py:46 - Support batch processing for multiple prompts
- luffy/deepscaler/utils.py:47 - Add timeout configuration for API calls
- luffy/deepscaler/utils.py:107 - Implement Vertex AI initialization and authentication
- luffy/deepscaler/utils.py:108 - Configure safety settings for content generation
- luffy/deepscaler/utils.py:109 - Set up GenerativeModel with proper system instructions
- luffy/deepscaler/utils.py:110 - Implement retry logic with exponential backoff
- luffy/deepscaler/utils.py:111 - Add comprehensive error handling for API access issues
- luffy/deepscaler/utils.py:112 - Handle rate limiting and quota management
- luffy/deepscaler/utils.py:113 - Implement response validation and text extraction
- luffy/deepscaler/utils.py:114 - Add support for different generation configurations
- luffy/test.py:1590 - add smaller page sizes when Dao-AILab/flash-attention#824 is merged
- luffy/verl/examples/split_placement/split_monkey_patch.py:141 - make a canonical logger that supports various backend
- luffy/verl/tests/e2e/check_results.py:21 - this function needs error handling
- luffy/verl/tests/model/test_transformer.py:22 - (sgm): add more models for test
- luffy/verl/tests/model/test_transformer.py:50 - (sgm): we can construct the position_ids_rmpad here
- luffy/verl/tests/model/test_transformer.py:111 - (sgm): we can construct the position_ids_rmpad here
- luffy/verl/tests/model/test_transformers_ulysses.py:34 - (sgm): add more models for test
- luffy/verl/tests/model/test_transformers_ulysses.py:81 - (sgm): we can construct the position_ids_rmpad here
- luffy/verl/tests/model/test_transformers_ulysses.py:159 - (sgm): we can construct the position_ids_rmpad here
- luffy/verl/tests/ray/test_high_level_scheduling_api.py:25 - pass *args and **kwargs is bug prone and not very convincing
- luffy/verl/tests/ray/test_worker_group_basics.py:43 - pass *args and **kwargs is bug prone and not very convincing
- luffy/verl/verl/mix_src/mix_fsdp_worker.py:54 - (sgm): support FSDP hybrid shard for larger model
- luffy/verl/verl/mix_src/mix_fsdp_worker.py:83 - it seems that manual offload is slower than FSDP offload
- luffy/verl/verl/mix_src/mix_fsdp_worker.py:123 - (zhangchi.usc1992): 1. support create from random initialized model. 2. Support init with FSDP directly
- luffy/verl/verl/mix_src/mix_fsdp_worker.py:199 - (zhangchi.usc1992, shengguangming) fix me. Current, auto_wrap_policy causes HFRollout to hang in Gemma
- luffy/verl/verl/mix_src/mix_fsdp_worker.py:207 - add transformer policy
- luffy/verl/verl/mix_src/mix_fsdp_worker.py:226 - add more optimizer args into config
- luffy/verl/verl/mix_src/mix_fsdp_worker.py:252 - (sgm): support FSDP hybrid shard for larger model
- luffy/verl/verl/mix_src/mix_fsdp_worker.py:263 - a sharding manager that does nothing?
- luffy/verl/verl/mix_src/mix_fsdp_worker.py:391 - here, we should return all metrics
- luffy/verl/verl/mix_src/mix_fsdp_worker.py:517 - support DCP and save sharded checkpoints
- luffy/verl/verl/mix_src/mix_trainer.py:90 - add other ways to estimate advantages
- luffy/verl/verl/mix_src/mix_trainer.py:168 - support each role have individual ray_worker_group_cls,
- luffy/verl/verl/mix_src/mix_trainer.py:293 - we have to make sure the batch size is divisible by the dp size
- luffy/verl/verl/mix_src/mix_trainer.py:599 - make a canonical logger that supports various backend
- luffy/verl/verl/mix_src/mix_trainer.py:637 - add response length
- luffy/verl/verl/mix_src/mix_trainer_acc_rebatch.py:63 - we have to make sure the batch size is divisible by the dp size
- luffy/verl/verl/mix_src/mix_trainer_acc_rebatch.py:437 - make a canonical logger that supports various backend
- luffy/verl/verl/mix_src/mix_trainer_acc_rebatch.py:592 - check path
- luffy/verl/verl/mix_src/mix_trainer_acc_rebatch.py:628 - from remote not implemented yet
- luffy/verl/verl/mix_src/mix_vllm_rollout.py:43
- luffy/verl/verl/models/llama/megatron/layers/parallel_attention.py:380 - llama does not have dropout in the config??
- luffy/verl/verl/models/llama/megatron/layers/parallel_decoder.py:78 - add sequence parallel operator reduce_scatter here
- luffy/verl/verl/models/llama/megatron/layers/parallel_decoder.py:86 - add sequence parallel operator all_gather here
- luffy/verl/verl/models/llama/megatron/layers/parallel_decoder.py:90 - add sequence parallel operator reduce_scatter here
- luffy/verl/verl/third_party/vllm/vllm_v_0_4_2/parallel_state.py:236 - this will hang
- luffy/verl/verl/third_party/vllm/vllm_v_0_4_2/parallel_state.py:245 - will hang when used with device mesh
- luffy/verl/verl/third_party/vllm/vllm_v_0_4_2/parallel_state.py:247 - init using device mesh
- luffy/verl/verl/third_party/vllm/vllm_v_0_4_2/spmd_gpu_executor.py:62 - (sgm): verl not support speculative decode now
- luffy/verl/verl/third_party/vllm/vllm_v_0_4_2/spmd_gpu_executor.py:208 - (sgm): not implemented async executor yet
- luffy/verl/verl/third_party/vllm/vllm_v_0_4_2/tokenizer.py:61 - (sgm): the lora tokenizer is also passed, but may be different
- luffy/verl/verl/third_party/vllm/vllm_v_0_5_4/arg_utils.py:53 - (sgm): check this
- luffy/verl/verl/third_party/vllm/vllm_v_0_5_4/arg_utils.py:54 - (sgm): check this
- luffy/verl/verl/third_party/vllm/vllm_v_0_5_4/arg_utils.py:143 - (shengguangming): delete the unused args
- luffy/verl/verl/third_party/vllm/vllm_v_0_5_4/arg_utils.py:226 - (woosuk): Support fine-grained seeds (e.g., seed per request).
- luffy/verl/verl/third_party/vllm/vllm_v_0_5_4/arg_utils.py:366 - spec config
- luffy/verl/verl/third_party/vllm/vllm_v_0_5_4/hf_weight_loader.py:32
- luffy/verl/verl/third_party/vllm/vllm_v_0_5_4/llm.py:148 - check usagecontext
- luffy/verl/verl/third_party/vllm/vllm_v_0_5_4/llm.py:205 - (sgm): we can optimize it by making the dataloader yield List[int] without padding.
- luffy/verl/verl/third_party/vllm/vllm_v_0_5_4/llm.py:221 - (shengguangming): can be optimized by rewriting the Sampler._get_logprobs() logits
- luffy/verl/verl/third_party/vllm/vllm_v_0_5_4/llm_engine_sp.py:143 - (woosuk): Print more configs in debug mode.
- luffy/verl/verl/third_party/vllm/vllm_v_0_5_4/llm_engine_sp.py:160 - (shengguangming): maybe we can choose init here or from arguments
- luffy/verl/verl/third_party/vllm/vllm_v_0_5_4/llm_engine_sp.py:262 - (sgm): add for verl but we may not tokenizer in Rollout
- luffy/verl/verl/third_party/vllm/vllm_v_0_5_4/llm_engine_sp.py:271 - check whether we should rebuild the CUDAGraph every iter when offload/load KVCache
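Several of the `luffy/deepscaler/utils.py` items above (timeouts, retry with exponential backoff, rate limiting) share one pattern. A hedged sketch, not the repository's actual code: `call_with_backoff` is a hypothetical helper, and a real implementation would also inspect provider-specific rate-limit errors from the OpenAI or Vertex AI SDKs.

```python
import random
import time

def call_with_backoff(fn, *args, max_retries=5, base_delay=1.0, max_delay=30.0,
                      retryable=(TimeoutError, ConnectionError), **kwargs):
    """Retry fn on transient errors with exponential backoff plus jitter.

    Hypothetical helper for the utils.py API TODOs; retryable exception
    types would be replaced with the SDK's rate-limit/timeout errors.
    """
    for attempt in range(max_retries):
        try:
            return fn(*args, **kwargs)
        except retryable:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            time.sleep(delay + random.uniform(0, 0.1 * delay))  # jitter
```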
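The two `mix_trainer` items about batch size being divisible by the data-parallel size amount to enforcing one invariant before sharding. A minimal illustration with an invented helper name; the real trainers may instead drop samples or raise, rather than pad.

```python
def pad_to_divisible(batch, dp_size):
    """Pad a batch (list of samples) so len(batch) % dp_size == 0.

    Illustrative sketch: pads by repeating the last sample so every
    data-parallel rank receives an equal shard.
    """
    remainder = len(batch) % dp_size
    if remainder:
        batch = batch + [batch[-1]] * (dp_size - remainder)
    return batch
```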
- Pick a TODO item from the list above
- Implement the functionality
- Test your implementation
- Update this README when TODOs are completed