Can llama_state_* save/restore be used across different n_ctx? Which params must match?
#15569
Replies: 1 comment
The state blob carries no compatibility metadata, and nothing is validated before restore.
n_ctx: restore works if the destination n_ctx ≥ the source n_ctx. If it is smaller, the blob overflows the buffer and an assertion fails (nread <= state_size, the same crash as #20473).
Params that must match: n_embd, n_layer, n_head_kv, type_k/type_v, rope_freq_base, n_vocab. Any mismatch means a crash or silent garbage.
Watch out for type_k/type_v: enabling -fa can change KV quantization implicitly, so the same model with the same n_ctx can produce a different blob size.
Size to pass to llama_state_set_data(): the actual byte length of the saved blob, not llama_state_get_size() on the destination (which reflects the destination's capacity, not the blob's).
Related: #21145
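Since the blob itself records none of this, one workaround is to store your own small header next to it and check it before calling llama_state_set_data(). The following is a minimal sketch of that idea, not anything llama.cpp provides: `state_header`, `state_check`, and `state_header_check` are hypothetical names, the field list simply mirrors the parameters named above, and the n_ctx rule (destination ≥ source) is taken from this reply rather than verified against the library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sidecar header saved alongside the state blob.
 * Not part of llama.cpp; the caller fills it from its own model/context config. */
typedef struct {
    uint32_t n_ctx;
    uint32_t n_embd;
    uint32_t n_layer;
    uint32_t n_head_kv;
    uint32_t n_vocab;
    int32_t  type_k;          /* integer value of the KV cache type used for K */
    int32_t  type_v;          /* integer value of the KV cache type used for V */
    float    rope_freq_base;
    uint64_t blob_size;       /* bytes actually written at save time */
} state_header;

typedef enum {
    STATE_OK = 0,
    STATE_ERR_EMPTY,          /* saved blob has zero length */
    STATE_ERR_N_CTX,          /* destination context is smaller than the source */
    STATE_ERR_MODEL_SHAPE,    /* n_embd, n_layer, n_head_kv or n_vocab differ */
    STATE_ERR_KV_TYPE,        /* type_k or type_v differ */
    STATE_ERR_ROPE            /* rope_freq_base differs */
} state_check;

/* Compare the header saved with the blob against one describing the
 * destination context. dst->blob_size is ignored. */
state_check state_header_check(const state_header *saved, const state_header *dst) {
    assert(saved != NULL && dst != NULL);

    if (saved->blob_size == 0) {
        return STATE_ERR_EMPTY;
    }
    /* Assumed rule from the reply above: destination n_ctx must be >= source. */
    if (dst->n_ctx < saved->n_ctx) {
        return STATE_ERR_N_CTX;
    }
    if (dst->n_embd != saved->n_embd || dst->n_layer != saved->n_layer ||
        dst->n_head_kv != saved->n_head_kv || dst->n_vocab != saved->n_vocab) {
        return STATE_ERR_MODEL_SHAPE;
    }
    if (dst->type_k != saved->type_k || dst->type_v != saved->type_v) {
        return STATE_ERR_KV_TYPE;
    }
    /* Exact comparison is intended: both values come from configuration,
     * not from arithmetic. */
    if (dst->rope_freq_base != saved->rope_freq_base) {
        return STATE_ERR_ROPE;
    }
    return STATE_OK;
}
```

If the check returns STATE_OK, the size to hand to llama_state_set_data(ctx, buf, size) would be `saved.blob_size`, i.e. the length recorded at save time, rather than llama_state_get_size() of the destination context.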
Hi! I’d like to use the state APIs and wanted to clarify the compatibility contract.
APIs involved
- llama_state_get_size(ctx), llama_state_get_data(ctx, buf, size) (or llama_state_save_file(path, ctx, …))
- llama_state_set_data(ctx, buf, size) (or llama_state_load_file(path, ctx, …))

Questions
1. If a state was saved from a context created with llama_context_params where n_ctx = A, can it be restored into a context created with n_ctx = B where A != B?
   - B > A, B < A, or only when B == A?
2. Beyond n_ctx, which fields in llama_context_params must match for llama_state_set_data to succeed and reproduce the same continuation? For example:
   - type_k / type_v (KV precision)
   - n_seq_max-related behavior
3. Sizing on restore: is the intended pattern to pass the serialized blob’s byte length to llama_state_set_data(ctx, buf, saved_size) rather than calling llama_state_get_size(dst_ctx) on the destination?

Any authoritative guidance (or doc pointers) on which parameters must match for a valid restore would be super helpful. Thanks!