UPSTREAM PR #21242: fix: tool call parsing for LFM2 and LFM2.5 models#1325
Overview
Analysis of 124,195 functions across 15 binaries reveals negligible performance impact from the LFM2/LFM2.5 parsing refactoring. 115 functions were modified (0.09%), 192 added, 0 removed. All changes are compiler-generated STL code artifacts; no performance-critical inference paths were modified.

Power Consumption Changes:

Function Analysis
All modified functions are STL template instantiations (std::vector, std::map, std::_Rb_tree) with no source code changes. Performance variations result from compiler code-generation differences between builds.

Most Significant Changes:

Other analyzed functions show similar compiler-induced variations in non-critical initialization code. No changes were detected in inference hot paths.

Additional Findings
Source code changes limited to

🔎 Full breakdown: Loci Inspector
Force-pushed 126cd1f to a8215be
Force-pushed e800934 to a024d9c
Force-pushed 7638ab4 to f1b46d5
Note
Source pull request: ggml-org/llama.cpp#21242
Overview
Currently, LFM2 & LFM2.5 tool calling is broken in llama.cpp (issue ggml-org/llama.cpp#20245). Commit ggml-org/llama.cpp#20251 introduced a dedicated parser for LFM2; however, LFM2 and LFM2.5 use different tool-calling Jinja templates. This PR fixes the tool-calling parser to handle the expected format for both cases:
<|tool_call_start|>[name(arg="val")]<|tool_call_end|>
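To illustrate the format above, here is a minimal Python sketch of extracting tool calls delimited by `<|tool_call_start|>`/`<|tool_call_end|>`. This is a hypothetical illustration, not the actual llama.cpp parser (which is implemented in C++ in `common/chat.cpp`); the function name and regexes are assumptions for demonstration only.

```python
import re

# Hypothetical sketch: match [name(arg="val", ...)] between the
# LFM2 tool-call sentinel tokens.
TOOL_CALL_RE = re.compile(
    r"<\|tool_call_start\|>\[(\w+)\((.*?)\)\]<\|tool_call_end\|>", re.DOTALL
)
# Match keyword arguments of the form key="value" (with escaped quotes allowed).
ARG_RE = re.compile(r'(\w+)\s*=\s*"((?:[^"\\]|\\.)*)"')


def parse_tool_calls(text: str):
    """Return a list of (name, {arg: value}) tuples found in model output."""
    calls = []
    for name, arg_str in TOOL_CALL_RE.findall(text):
        args = dict(ARG_RE.findall(arg_str))
        calls.append((name, args))
    return calls
```

For example, `parse_tool_calls('<|tool_call_start|>[get_weather(city="Paris")]<|tool_call_end|>')` yields `[("get_weather", {"city": "Paris"})]`. A real parser must additionally handle non-string argument types and malformed output, which is where the per-template differences between LFM2 and LFM2.5 matter.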
Added
`common_chat_params_init_lfm2_5`

Testing
`LFM2.5-1.2B-Instruct-BF16.gguf` and `LFM2-8B-A1B-Q4_0.gguf`

Requirements