support mlx backend #30
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f0c5cc3af8
    int promptUncomputed = Math.Max(0, seq.PromptTokens.Count - seq.NumComputedTokens);
    bool isPrefill = promptUncomputed > 0;
    int want = isPrefill
        ? Math.Min(promptUncomputed, _cfg.MaxPrefillChunkSize)
        : 1; // decode step
Recompute full sequence after preemption
After ResetForPreemption() sets NumComputedTokens back to 0, the scheduler still derives prefill work from PromptTokens.Count - NumComputedTokens, so resumed requests only replay the original prompt and skip already-generated output tokens. If a sequence is preempted mid-generation, its KV state is rebuilt without prior decoded tokens, and subsequent sampling continues from the wrong context (duplicated/divergent continuations). The prefill budget for preempted/rerun sequences should be based on total logical tokens that must be recomputed, not just prompt length.
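A minimal sketch of the budget this comment asks for, not the PR's actual code. `SketchSequence`, `OutputTokens`, and `PrefillBudget.TokensToSchedule` are hypothetical names; only `PromptTokens`, `NumComputedTokens`, and the chunk-size limit come from the reviewed diff. It assumes `NumComputedTokens` counts tokens whose KV entries are cached, and that the most recently sampled token is not counted until the next step feeds it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the scheduler's per-request state.
// OutputTokens is an assumed name for the already-generated tokens.
public sealed class SketchSequence
{
    public List<int> PromptTokens { get; } = new List<int>();
    public List<int> OutputTokens { get; } = new List<int>();
    public int NumComputedTokens { get; set; }
}

public static class PrefillBudget
{
    // Returns how many tokens the next step should process for this sequence.
    // maxPrefillChunkSize is assumed to be >= 1.
    public static int TokensToSchedule(SketchSequence seq, int maxPrefillChunkSize)
    {
        // Count prompt AND generated tokens, so a preempted sequence
        // (NumComputedTokens reset to 0) replays its full context.
        int logicalTokens = seq.PromptTokens.Count + seq.OutputTokens.Count;
        int uncomputed = Math.Max(0, logicalTokens - seq.NumComputedTokens);

        // More than one pending token means prefill or post-preemption replay;
        // otherwise this is an ordinary single-token decode step.
        return uncomputed > 1
            ? Math.Min(uncomputed, maxPrefillChunkSize)
            : 1;
    }
}
```

With a 4-token prompt, 3 generated tokens, and a chunk size of 5, a preempted sequence would be scheduled for 5 tokens and then 2, instead of stopping after the 4 prompt tokens.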
    _shutdownCts.Cancel();
    _commands.Writer.TryComplete();
    try { _worker.Join(2000); } catch { /* best effort */ }
Complete in-flight handles when disposing engine
Dispose() cancels the worker and stops reading commands, but it never completes or faults outstanding InferenceRequestHandles in _handles. During model reload (InferenceEngineHost.Reset) or shutdown with active requests, token readers and Completion awaiters can hang indefinitely because their channels/TCS are never resolved. Disposal should drain/abort remaining requests and complete their handles before returning.
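One way the drain step could look, as a sketch under stated assumptions rather than the engine's real types. `SketchRequestHandle` and `SketchEngine` are hypothetical; the real `InferenceRequestHandle` is assumed to expose a token channel and a completion task, as the comment implies. Faulting with `ObjectDisposedException` is an illustrative choice; cancelling the handles would work as well.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical handle: a token stream plus a completion task.
public sealed class SketchRequestHandle
{
    private readonly Channel<int> _tokens = Channel.CreateUnbounded<int>();
    private readonly TaskCompletionSource<bool> _done =
        new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);

    public ChannelReader<int> Tokens => _tokens.Reader;
    public Task Completion => _done.Task;

    // Resolves both the channel and the task so no reader or awaiter hangs.
    public void Abort(Exception reason)
    {
        _tokens.Writer.TryComplete(reason);
        _done.TrySetException(reason);
    }
}

public sealed class SketchEngine : IDisposable
{
    private readonly ConcurrentDictionary<long, SketchRequestHandle> _handles =
        new ConcurrentDictionary<long, SketchRequestHandle>();
    private long _nextId;

    public SketchRequestHandle Submit()
    {
        var handle = new SketchRequestHandle();
        _handles[Interlocked.Increment(ref _nextId)] = handle;
        return handle;
    }

    public void Dispose()
    {
        // The real engine would cancel and join its worker before this point.
        var reason = new ObjectDisposedException(nameof(SketchEngine));
        foreach (var pair in _handles)
        {
            // TryRemove makes a repeated Dispose() call a harmless no-op.
            if (_handles.TryRemove(pair.Key, out var handle))
            {
                handle.Abort(reason);
            }
        }
    }
}
```

Draining after the worker has stopped avoids racing with a worker that is still writing tokens to the same handles.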
    if (args[i].StartsWith("--", StringComparison.Ordinal))
    {
        string suggestion = SuggestFlagCorrection(args[i]);
        string suffix = suggestion != null ? $" Did you mean '{suggestion}'?" : string.Empty;
        throw new ArgumentException($"Unknown option '{args[i]}'.{suffix}");
Allow ASP.NET host flags to pass through parsing
The new unknown-flag trap throws for any --* option not explicitly recognized by ServerOptionsBuilder, which now rejects standard ASP.NET host arguments like --urls/--environment before WebApplication.CreateBuilder(args) can consume them. This is a behavioral regression for deployments that rely on built-in host CLI switches; custom option validation should ignore or forward framework-owned flags instead of hard-failing on all unknown -- options.
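A hedged sketch of the pass-through idea, not `ServerOptionsBuilder` itself. `SketchOptionsParser`, `Split`, and the `--model` flag are hypothetical. The host-flag list holds only the two switches named in this comment, and a real allowlist would need to cover whichever host keys the deployment uses.

```csharp
using System;
using System.Collections.Generic;

public static class SketchOptionsParser
{
    // Hypothetical application-owned flag, for illustration only.
    private static readonly HashSet<string> OwnFlags =
        new HashSet<string>(StringComparer.Ordinal) { "--model" };

    // Framework-owned switches named in the review; assumed incomplete.
    private static readonly HashSet<string> HostFlags =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "--urls", "--environment" };

    // Splits args into those the app parses and those left for the host builder.
    public static (List<string> Own, List<string> Forwarded) Split(string[] args)
    {
        var own = new List<string>();
        var forwarded = new List<string>();

        for (int i = 0; i < args.Length; i++)
        {
            string arg = args[i];
            if (!arg.StartsWith("--", StringComparison.Ordinal))
            {
                own.Add(arg); // value belonging to a preceding app flag
                continue;
            }

            // Support both "--flag value" and "--flag=value".
            int eq = arg.IndexOf('=');
            string name = eq < 0 ? arg : arg.Substring(0, eq);

            if (HostFlags.Contains(name))
            {
                forwarded.Add(arg);
                bool hasSeparateValue = eq < 0 && i + 1 < args.Length
                    && !args[i + 1].StartsWith("--", StringComparison.Ordinal);
                if (hasSeparateValue) forwarded.Add(args[++i]);
                continue;
            }

            if (OwnFlags.Contains(name))
            {
                own.Add(arg);
                continue;
            }

            throw new ArgumentException($"Unknown option '{arg}'.");
        }

        return (own, forwarded);
    }
}
```

Typos in application flags still fail fast, while host switches reach `WebApplication.CreateBuilder(args)` unchanged, since the original `args` array can still be handed to the builder.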
TensorSharp Test Matrix: No report artifacts produced.