support mlx backend #30

Merged: zhongkaifu merged 3 commits into main from feature/support_mlx_backend on May 27, 2026

@zhongkaifu
Owner

No description provided.

@chatgpt-codex-connector (bot) left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f0c5cc3af8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +108 to +112
int promptUncomputed = Math.Max(0, seq.PromptTokens.Count - seq.NumComputedTokens);
bool isPrefill = promptUncomputed > 0;
int want = isPrefill
    ? Math.Min(promptUncomputed, _cfg.MaxPrefillChunkSize)
    : 1; // decode step
P1: Recompute full sequence after preemption

After ResetForPreemption() sets NumComputedTokens back to 0, the scheduler still derives prefill work from PromptTokens.Count - NumComputedTokens, so resumed requests only replay the original prompt and skip already-generated output tokens. If a sequence is preempted mid-generation, its KV state is rebuilt without prior decoded tokens, and subsequent sampling continues from the wrong context (duplicated/divergent continuations). The prefill budget for preempted/rerun sequences should be based on total logical tokens that must be recomputed, not just prompt length.
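The suggested fix can be sketched as follows. This is a minimal illustration, not the PR's code: `SequenceSketch`, its `OutputTokens` list, and `PrefillBudget.TokensToSchedule` are hypothetical stand-ins, and it assumes `NumComputedTokens` counts tokens whose KV entries are currently cached. The only change from the reviewed snippet is that the uncomputed count is taken over prompt plus generated tokens.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the PR's sequence type; only the members the
// review mentions, plus an assumed OutputTokens list.
public sealed class SequenceSketch
{
    public List<int> PromptTokens { get; } = new List<int>();
    public List<int> OutputTokens { get; } = new List<int>(); // assumed name
    public int NumComputedTokens { get; set; }

    // Mirrors the behaviour described in the review: the KV cache is dropped.
    public void ResetForPreemption() => NumComputedTokens = 0;
}

public static class PrefillBudget
{
    // Returns how many tokens to schedule for this sequence in one step.
    public static int TokensToSchedule(SequenceSketch seq, int maxPrefillChunkSize)
    {
        // Count every logical token (prompt and already-generated output),
        // so a preempted sequence replays its full context.
        int totalLogical = seq.PromptTokens.Count + seq.OutputTokens.Count;
        int uncomputed = Math.Max(0, totalLogical - seq.NumComputedTokens);

        // Anything still uncomputed is (re)prefilled in chunks; otherwise
        // fall back to a single decode step, as in the original snippet.
        return uncomputed > 0
            ? Math.Min(uncomputed, maxPrefillChunkSize)
            : 1;
    }
}
```

With a 5-token prompt and 3 generated tokens, a preempted sequence would be asked to recompute 8 tokens (subject to the chunk limit) rather than 5.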


Comment on lines +112 to +114
_shutdownCts.Cancel();
_commands.Writer.TryComplete();
try { _worker.Join(2000); } catch { /* best effort */ }
P1: Complete in-flight handles when disposing engine

Dispose() cancels the worker and stops reading commands, but it never completes or faults outstanding InferenceRequestHandles in _handles. During model reload (InferenceEngineHost.Reset) or shutdown with active requests, token readers and Completion awaiters can hang indefinitely because their channels/TCS are never resolved. Disposal should drain/abort remaining requests and complete their handles before returning.
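One way to resolve outstanding handles on disposal is sketched below. The types `RequestHandleSketch` and `EngineSketch` are hypothetical and much simpler than the PR's `InferenceRequestHandle` and engine; the sketch only assumes that each handle exposes a token channel and a completion task, as the review describes. The worker cancel and join from the diff are omitted.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical handle: a token stream plus a completion signal.
public sealed class RequestHandleSketch
{
    private readonly Channel<int> _tokens = Channel.CreateUnbounded<int>();
    private readonly TaskCompletionSource<bool> _done =
        new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);

    public ChannelReader<int> Tokens => _tokens.Reader;
    public Task Completion => _done.Task;

    // Faults both the token stream and the completion task so that no
    // reader or awaiter is left waiting.
    public void Abort(Exception reason)
    {
        _tokens.Writer.TryComplete(reason);
        _done.TrySetException(reason);
    }
}

// Hypothetical engine that tracks in-flight handles by id.
public sealed class EngineSketch : IDisposable
{
    private readonly ConcurrentDictionary<long, RequestHandleSketch> _handles =
        new ConcurrentDictionary<long, RequestHandleSketch>();
    private long _nextId;

    public RequestHandleSketch Submit()
    {
        var handle = new RequestHandleSketch();
        _handles[Interlocked.Increment(ref _nextId)] = handle;
        return handle;
    }

    public void Dispose()
    {
        // A real Dispose would first cancel and join the worker, as in the diff.
        var reason = new ObjectDisposedException(nameof(EngineSketch));
        foreach (long id in _handles.Keys)
        {
            if (_handles.TryRemove(id, out RequestHandleSketch handle))
                handle.Abort(reason);
        }
    }
}
```

Faulting with `ObjectDisposedException` is a design choice for the sketch; completing the handles with a cancellation or a dedicated "aborted" result would serve equally well, as long as every awaiter is released.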


Comment on lines +382 to +386
if (args[i].StartsWith("--", StringComparison.Ordinal))
{
    string suggestion = SuggestFlagCorrection(args[i]);
    string suffix = suggestion != null ? $" Did you mean '{suggestion}'?" : string.Empty;
    throw new ArgumentException($"Unknown option '{args[i]}'.{suffix}");
P2: Allow ASP.NET host flags to pass through parsing

The new unknown-flag trap throws for any --* option not explicitly recognized by ServerOptionsBuilder, which now rejects standard ASP.NET host arguments like --urls/--environment before WebApplication.CreateBuilder(args) can consume them. This is a behavioral regression for deployments that rely on built-in host CLI switches; custom option validation should ignore or forward framework-owned flags instead of hard-failing on all unknown -- options.
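A possible shape for the pass-through is sketched below. Everything here is hypothetical: `ArgSplitter`, the `--model` and `--backend` application flags, and the allow-list of host flags are illustrative only and do not come from `ServerOptionsBuilder`. The allow-list is not exhaustive, since the host's command-line configuration can accept other keys as well.

```csharp
using System;
using System.Collections.Generic;

public static class ArgSplitter
{
    // Illustrative, non-exhaustive list of framework-owned switches.
    private static readonly HashSet<string> HostFlags =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "--urls", "--environment", "--contentRoot", "--applicationName",
        };

    // Hypothetical application flags, standing in for the real option set.
    private static readonly HashSet<string> KnownFlags =
        new HashSet<string>(StringComparer.Ordinal) { "--model", "--backend" };

    // Splits args into application-owned and host-owned lists; throws only
    // for "--" options that neither side recognizes.
    public static (List<string> App, List<string> Host) Split(string[] args)
    {
        var app = new List<string>();
        var host = new List<string>();
        for (int i = 0; i < args.Length; i++)
        {
            string arg = args[i];
            string name = arg.Split('=', 2)[0]; // handles "--flag=value"

            if (HostFlags.Contains(name))
            {
                host.Add(arg);
                // "--flag value" form: the value is the next argument.
                bool hasInlineValue = arg.Contains('=');
                if (!hasInlineValue && i + 1 < args.Length
                    && !args[i + 1].StartsWith("--", StringComparison.Ordinal))
                {
                    host.Add(args[++i]);
                }
            }
            else if (arg.StartsWith("--", StringComparison.Ordinal)
                     && !KnownFlags.Contains(name))
            {
                throw new ArgumentException($"Unknown option '{arg}'.");
            }
            else
            {
                app.Add(arg);
            }
        }
        return (app, host);
    }
}
```

The application-owned list would go to the custom option parser, while the original `args` (or the host-owned list) can still be handed to `WebApplication.CreateBuilder(args)`.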


@zhongkaifu zhongkaifu merged commit cd6ab84 into main May 27, 2026
0 of 2 checks passed
@zhongkaifu zhongkaifu deleted the feature/support_mlx_backend branch May 27, 2026 06:17
@github-actions

TensorSharp Test Matrix

No report artifacts produced.
