🔗 Maintained fork of the archived Lilac project.
Osmanthus is a production-ready fork of the archived Lilac project. It is designed for exploration, curation, and quality control of datasets for LLMs, with a focus on modern embedding infrastructure and Windows stability.
Osmanthus continues the mission of providing "Better data, better AI" by maintaining the core registry-based signal architecture while decoupling from defunct hosted services.
- Modern GGUF Support: Enhanced llama-cpp-python integration for state-of-the-art GGUF embeddings.
- Independent Identity: Full decoupling from the defunct "Lilac Garden" infrastructure.
- Windows Optimized: Critical stability fixes for high-performance embedding pipelines on Windows systems.
- Botanical UI: A premium, high-density design system focusing on "Cockpit Mode" utility.
- Interactive Exploration: Search, filter, cluster, and annotate your data with an LLM-powered interface.
- On-Device Performance: Runs entirely on your local machine using open-source LLMs.
- Data Hygiene: Detect PII, remove duplicates, and analyze text statistics to lower training costs.
- Centralized Insights: Understand how your data evolves across the entire ML lifecycle.
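
To make the "Data Hygiene" item above concrete, here is a minimal standard-library sketch of exact-duplicate removal and basic text statistics. It is only an illustration of the idea, not how Osmanthus implements these features; the function names `normalize`, `dedupe`, and `text_stats` are hypothetical and do not come from the Osmanthus API.

```python
import hashlib
import re
from statistics import mean


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def dedupe(texts: list[str]) -> list[str]:
    """Keep the first occurrence of each normalized text, preserving order."""
    seen: set[str] = set()
    unique: list[str] = []
    for text in texts:
        # Hash the normalized form so the seen-set stays small for long documents.
        digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(text)
    return unique


def text_stats(texts: list[str]) -> dict[str, float]:
    """Row count plus mean length in characters and whitespace-separated words."""
    if not texts:
        return {"rows": 0, "mean_chars": 0.0, "mean_words": 0.0}
    return {
        "rows": len(texts),
        "mean_chars": mean(len(t) for t in texts),
        "mean_words": mean(len(t.split()) for t in texts),
    }


# Illustrative data only: the first two rows differ just in case and spacing.
docs = ["Hello  world", "hello world", "Another document here"]
unique_docs = dedupe(docs)
stats = text_stats(unique_docs)
```

Dropping rows like these before training is where the cost saving comes from: fewer tokens are processed for the same information. Near-duplicate detection and PII detection need more than this sketch covers.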
    # Install directly from the fork repository
    pip install git+https://github.com/user177013/lilac.git

Start the Osmanthus webserver using the new CLI:

    osmanthus start ~/my_project

Or from Python:

    import osmanthus as osman
    osman.start_server(project_dir='~/my_project')

The server will be available at http://localhost:5432/.
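
When the server is launched from a script, it can be useful to confirm that it is accepting connections before opening the browser. The following is a minimal sketch using only the Python standard library, assuming the default address `localhost:5432` mentioned above; `wait_for_port` is a hypothetical helper and not part of Osmanthus.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 5432,
                  timeout: float = 10.0, interval: float = 0.25) -> bool:
    """Return True once a TCP connection to (host, port) succeeds, False on timeout.

    Hypothetical helper for illustration; not part of the Osmanthus API.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling `wait_for_port()` after starting the server returns `True` as soon as the port answers. This only checks that something is listening on the port, not that it is Osmanthus.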
For detailed guides on loading datasets from HuggingFace, Parquet, JSON, and more, please refer to the docs/ folder.
Osmanthus is licensed under the Apache License, Version 2.0. This project is an independent fork and is not affiliated with the original Lilac AI Inc. team.
Created with a focus on performance and independence.
