llama-server context-shift

When running ml-intern through the CLI against llama-server, compaction doesn't happen automatically. Instead, llama-server has a server-side option called `--context-shift`, which is disabled by default. Here's the command I used:

```
llama-server --model [path-to-gguf] --host 0.0.0.0 --port 8081 --n-gpu-layers 99 --ctx-size 262144 --context-shift
```

It would be nice to enable this for all local servers if we can do it on our end @lewtun
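
For illustration, here is a rough sketch of what "enabling it on our end" could look like if ml-intern were the one launching the server. That is an assumption: I don't know whether ml-intern starts llama-server itself or only connects to an already-running one. The helper name `build_llama_server_command` and its defaults are hypothetical; the flags are the same ones from the command above.

```python
import shlex


def build_llama_server_command(
    model_path,
    host="0.0.0.0",
    port=8081,
    n_gpu_layers=99,
    ctx_size=262144,
    context_shift=True,
):
    """Build the llama-server argv list (hypothetical helper, not ml-intern API)."""
    args = [
        "llama-server",
        "--model", model_path,
        "--host", host,
        "--port", str(port),
        "--n-gpu-layers", str(n_gpu_layers),
        "--ctx-size", str(ctx_size),
    ]
    # --context-shift is off by default on the server, so opt in explicitly.
    if context_shift:
        args.append("--context-shift")
    return args


if __name__ == "__main__":
    # Print a copy-pasteable command line; the model path is a placeholder.
    print(shlex.join(build_llama_server_command("[path-to-gguf]")))
```

The list form can be passed straight to `subprocess.Popen`. If the server is started by the user rather than by ml-intern, this wouldn't apply and the flag would still have to be set by hand.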

llama-server context-shift #259

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

llama-server context-shift #259

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions