You're looking for thinking_budget_tokens. You can include it in the request body ({"thinking_budget_tokens": N}), and it is used as long as you haven't specified a budget on the command line.

Relevant code:

{
    int reasoning_budget = opt.reasoning_budget;
    if (reasoning_budget == -1 && body.contains("thinking_budget_tokens")) {
        reasoning_budget = json_value(body, "thinking_budget_tokens", -1);
    }
    if (!chat_params.thinking_end_tag.empty()) {
        llama_params["reasoning_budget_tokens"] = reasoning_budget;
        llama_params["reasoning_budget_start_tag"] = chat_params.thinking_start_tag;
        llam…
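
To make the precedence in that snippet easier to see, here is a minimal standalone sketch. The helper name resolve_reasoning_budget is hypothetical and not part of the server code. It also assumes, based only on the comparison in the snippet, that -1 means "no budget given on the command line". The request-body value is modeled as a std::optional<int> rather than a JSON object so the example has no dependencies.

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper (not from the server source): mirrors the precedence
// shown in the snippet above.
//   cli_budget  - budget from the command line; -1 is assumed to mean "not set"
//   body_budget - "thinking_budget_tokens" from the request body, if present
int resolve_reasoning_budget(int cli_budget, std::optional<int> body_budget) {
    // The request-body value is only consulted when the command line
    // left the budget at its -1 sentinel.
    if (cli_budget == -1 && body_budget.has_value()) {
        return *body_budget;
    }
    // Otherwise the command-line value is used (which may itself be -1).
    return cli_budget;
}
```

Under these assumptions, resolve_reasoning_budget(-1, 512) yields the per-request value, while resolve_reasoning_budget(1024, 512) keeps the command-line value and ignores the request body.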

Answer selected by SpeedyCraftah