Building the next frontier of agent evaluation. Co-founder and CTO of ellamind, and co-founder of DiscoResearch, our open-source research and development community.
- elluminate — Agentic evaluation platform. Offline criteria-based scoring, experiment comparison, and quality gates before production.
- ellarun — The agent runtime. Deploy any AI agent to production with security, credential brokering, and full audit trails.
Selected outputs from ellamind research:
- propella-1 — A family of small multilingual LLMs (57 languages) for annotating text documents across six categories to filter, select, and curate LLM training data at scale. Outperforms much larger general-purpose baselines. DATA-FM @ ICLR 2026 (Spotlight). Paper · Dataset
- sui-1: Summarization with Unique Identifiers — A 24B-parameter LLM for abstractive summarization with inline citations. Every claim is traceable to its source sentence; supports documents with more than 2M tokens and outperforms all tested open-weight baselines, including models with 3x more parameters. Paper
- base-eval — Curated lm-evaluation-harness task configurations for evaluating English and German base models. 47 benchmarks, 730+ task configs, validated against reference models for early-stage pretraining and in-loop evaluation.
- inference-hive — Distributed LLM inference at scale for SLURM clusters. Configure cluster, server, and data settings, then scale across thousands of GPUs with near-linear throughput.
We're also part of EU-funded consortia building sovereign, open foundation models for Europe: OpenEuroLLM, LLMs4EU, SOOFI, and LLM4KMU.
- LeoLM: German LLM: I used large-scale continued pretraining to transfer the English-language capabilities of Llama-2 to German. Together with LAION and Hessian.AI, we released LeoLM (Linguistically Enhanced Open Language Model) at several model scales. See our blog post for more info: https://laion.ai/blog/leo-lm/
- Vision-Language Explanations: Transformers are hard to interpret, but they are great at producing text, so why not have a model explain its own decisions? A large research project investigating natural language explanations for multimodal transformer applications. arXiv preprint: https://arxiv.org/abs/2212.04231
- B. Plüster, C. Weber, L. Qu and S. Wermter, "Hearing Faces: Target Speaker Text-to-Speech Synthesis from a Face," 2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2021, pp. 757-764, doi: 10.1109/ASRU51503.2021.9687866.
Older Projects
- KOSMOS-1 Reimplementation: The KOSMOS-1 paper (a multimodal foundation model) was super interesting to me at the time, but there was no code to be found anywhere. This is a very rudimentary reimplementation of its core aspects.
- Tagesschau: Simple scrape of Tagesschau news articles.
- DiscoveredWeekly contains the source code for my website discoveredweekly.com, where users can log in with their Spotify account and have their new Discover Weekly playlist copied automatically every Monday, so no valuable song suggestions are ever lost.
- AutoObjectRemoval combines instance segmentation using Detectron2 with Flow-Guided Video Completion to automatically mask and remove objects from videos.
- VideoSilenceRemover is a tool for automatically cutting silent segments out of a video. I created it for a friend to make the boring parts of his job easier.
- DirectoryStats is a Python CLI for efficiently counting large numbers of files and subdirectories. I needed this to keep track of directory size while creating the dataset for my thesis project.
- PaypalTransactionVisualizer is a Jupyter notebook that shows you some interesting insights into your past spending with PayPal. I implemented this project mostly to understand my own spending habits, but also to practice using Jupyter and some interesting Python features.
- YoutubeHistoryVisualizer is a notebook along similar lines that shows you some stats about the YouTube videos you've watched in the past. It works with data from Google Takeout.
- ColorFlow is an Android game written in Java, which was a cool side project. The repo is not well maintained and was used primarily as my own version control. Check out the game in the Play Store.
The best way to reach me is via e-mail: bjoern@ellamind.com.




