Improve all samples with cache-awareness, add 4 new samples, fix SDK versions, and prepare repo for public sharing by leestott · Pull Request #546 · microsoft/Foundry-Local

leestott · 2026-03-23T23:37:37Z

Summary

This PR improves every existing sample across all languages (C#, JavaScript, Python, Rust) with cache-awareness and visual feedback, adds 4 brand-new samples, fixes SDK version inconsistencies across the repo, and addresses repo hygiene issues for public sharing readiness.

93 files changed — 67 new files, 26 modified files.

What's Changed

1. New Samples (4)

`samples/js/local-cag/` — Context-Augmented Generation (12 files)

Offline CAG-powered support agent for gas field engineers. Pre-loads domain documents (valve inspections, PPE requirements, emergency shutdown procedures, etc.) directly into the context window — no vector database, no embeddings, no retrieval pipeline needed.

Express web server with streaming chat UI
Full document context pre-loading at startup
Model auto-selection with cache awareness
Domain-specific gas field safety documentation included
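
As a rough illustration of the pre-loading idea described above, the sketch below concatenates a set of documents into one system prompt at startup. The function name `buildSystemPrompt`, the document shape, and the character budget are assumptions made for this example and are not taken from the sample's code.

```javascript
// Hypothetical sketch of context-augmented generation (CAG):
// all documents are placed into the prompt up front, with no retrieval step.
function buildSystemPrompt(docs, maxChars = 8000) {
  const header = "Answer using only the reference documents below.\n";
  let prompt = header;
  for (const doc of docs) {
    const section = `\n## ${doc.title}\n${doc.body}\n`;
    // Stop adding documents once the (assumed) context budget is exhausted.
    if (prompt.length + section.length > maxChars) break;
    prompt += section;
  }
  return prompt;
}

// Illustrative in-memory documents; a real app would read them from disk.
const docs = [
  { title: "Valve inspection", body: "Check the seals before each shift." },
  { title: "PPE requirements", body: "Wear a hard hat and gas monitor." },
];

const systemPrompt = buildSystemPrompt(docs);
const messages = [
  { role: "system", content: systemPrompt },
  { role: "user", content: "What PPE do I need?" },
];
console.log(messages[0].content);
```

The trade-off is that every request carries the full document set, so this approach only suits corpora that fit comfortably inside the model's context window.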

`samples/js/local-rag/` — Retrieval-Augmented Generation (11 files)

Offline RAG-powered support agent using SQLite + term-frequency vectors for document retrieval. Demonstrates the full RAG pipeline running 100% locally.

Document ingestion with chunking (npm run ingest)
SQLite-backed vector store with term-frequency ranking
Express web server with streaming chat UI
Same gas field domain docs for direct comparison with CAG approach
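
The term-frequency ranking mentioned above can be sketched as follows. This is a minimal illustration that assumes whitespace/alphanumeric tokenization and cosine similarity over raw counts; the helper names are hypothetical and the sample's own `vectorStore.js` and `chunker.js` may differ.

```javascript
// Hypothetical sketch: rank text chunks by cosine similarity of raw
// term-frequency vectors (no IDF weighting).
function termFrequency(text) {
  const tf = new Map();
  for (const token of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    tf.set(token, (tf.get(token) ?? 0) + 1);
  }
  return tf;
}

function cosineSimilarity(a, b) {
  let dot = 0;
  for (const [term, count] of a) dot += count * (b.get(term) ?? 0);
  const norm = (v) => Math.sqrt([...v.values()].reduce((s, c) => s + c * c, 0));
  const denom = norm(a) * norm(b);
  return denom === 0 ? 0 : dot / denom;
}

function rankChunks(query, chunks, topK = 2) {
  const q = termFrequency(query);
  return chunks
    .map((text) => ({ text, score: cosineSimilarity(q, termFrequency(text)) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

const chunks = [
  "Close the valve before starting the valve inspection.",
  "Hard hats are required PPE on site.",
  "Emergency shutdown requires isolating the gas supply.",
];
console.log(rankChunks("valve inspection steps", chunks, 1));
```

Because there is no IDF term, common words weigh as much as rare ones; that is the distinction between term-frequency and TF-IDF ranking.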

`samples/python/agent-framework/` — Microsoft Agent Framework Integration (24 files)

Full-featured agent framework sample showing Foundry Local as the LLM backend for agentic AI workflows.

5 interactive demos: weather tools, code reviewer, math agent, sentiment analyzer, multi-agent debate
Tool calling with automatic function dispatch
RAG pipeline with document ingestion
Flask web UI with streaming responses
Orchestrator pattern for multi-step reasoning
Comprehensive README with architecture diagrams

`samples/cs/whisper-transcription/` — ASP.NET Core Whisper Transcription (13 files)

Production-quality audio transcription service using Foundry Local's Whisper model via WinML.

ASP.NET Core Minimal API with proper service architecture
Drag-and-drop audio upload UI
Real-time recording via MediaRecorder API
Health checks for Foundry service availability
Error handling middleware
Clean separation: FoundryModelService, TranscriptionService, FoundryHealthCheck

2. Cache-Awareness Improvements (All Existing Samples)

Every existing sample was updated to check the local model cache before attempting downloads. This provides:

Visual feedback — users see whether their model is already cached or needs downloading
Faster startup — skips unnecessary download operations
Better UX — clear progress indicators with ✓ Model already cached or ⏳ Downloading...
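
A minimal sketch of the check-before-download flow, under stated assumptions: the `model` object here (with `isCached`, `download`, and `load`) is a stand-in defined inside the example, not a documented SDK interface, and the progress callback is assumed to receive a percentage.

```javascript
// Hypothetical sketch of a cache-aware startup. The `model` shape
// (isCached, download, load) is an assumed stand-in, not a documented SDK API.
async function ensureModelReady(model, log = console.log) {
  if (model.isCached) {
    log("✓ Model already cached");
  } else {
    log("⏳ Downloading...");
    await model.download((percent) => log(`  ${Math.round(percent)}%`));
  }
  await model.load();
  return model;
}

// Fake model used only to make the sketch runnable without any SDK.
function makeFakeModel(isCached) {
  return {
    isCached,
    loaded: false,
    async download(onProgress) {
      for (const p of [0, 50, 100]) onProgress(p);
      this.isCached = true;
    },
    async load() {
      this.loaded = true;
    },
  };
}

ensureModelReady(makeFakeModel(false)).then((m) => console.log("loaded:", m.loaded));
```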

C# samples updated (6 files):

AudioTranscriptionExample/Program.cs
FoundryLocalWebServer/Program.cs
HelloFoundryLocalSdk/Program.cs
ModelManagementExample/Program.cs
ToolCallingFoundryLocalSdk/Program.cs
ToolCallingFoundryLocalWebServer/Program.cs

JavaScript samples updated (7 files):

audio-transcription-example/app.js
copilot-sdk-foundry-local/src/app.ts and src/tool-calling.ts
langchain-integration-example/app.js
native-chat-completions/app.js
tool-calling-foundry-local/src/app.js
web-server-example/app.js

Python samples updated (4 files):

hello-foundry-local/src/app.py
summarize/summarize.py
functioncalling/fl_tools.ipynb
functioncalling/README.md

Notebooks updated (1 file):

rag/rag_foundrylocal_demo.ipynb — significant rewrite with cache detection, clearer cell structure, and improved RAG pipeline

3. SDK API Correctness Fixes (7 files)

Validated all samples against the latest public SDK APIs (JS SDK `sdk/js/src`, Python SDK `sdk_legacy/python`, C# SDK `sdk/cs/src`) and fixed:

File	Issue	Fix
`js/local-cag/src/modelSelector.js`	Used private `selectedVariant._modelInfo`	Switched to public `model.variants` / `variant.modelInfo` / `model.isCached`
`js/local-rag/src/chatEngine.js`	`progress * 100` yielded 0–10000 (SDK reports 0–100)	Changed to `Math.round(progress)` for display, `progress / 100` for normalized value
`python/summarize/summarize.py`	`load_model(cached_models[0].id)` inconsistent with alias pattern	Changed to `load_model(cached_models[0].alias)`
`python/agent-framework/foundry_boot.py`	Fragile `str(m)` substring match for model ID resolution	Replaced with `manager.get_model_info(alias).id`
`python/agent-framework/web.py`	`drain()` buffered all SSE events before yielding	Replaced with incremental `__anext__()` loop for real-time streaming
`cs/whisper-transcription/TranscriptionService.cs`	`CancellationToken.None` hardcoded	Threaded `CancellationToken` through method and into all async calls
`cs/whisper-transcription/FoundryModelService.cs`	`progress % 10 == 0` unreliable for float	Replaced with `Math.Floor(progress / 10)` threshold bucket approach
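
The two progress-related fixes in the table (normalizing a 0 to 100 value, and replacing a modulo test with threshold buckets) can be sketched together. This is an illustration in JavaScript with a hypothetical `makeBucketReporter` helper; it assumes the callback receives a possibly fractional percentage.

```javascript
// Hypothetical sketch: report download progress once per 10% bucket.
// Assumes the callback receives a percentage from 0 to 100, possibly
// fractional (e.g. 9.97, 10.02), so `progress % 10 === 0` would rarely fire.
function makeBucketReporter(log, bucketSize = 10) {
  let lastBucket = -1;
  return (progress) => {
    const bucket = Math.floor(progress / bucketSize);
    if (bucket > lastBucket) {
      lastBucket = bucket;
      log(`Downloading... ${Math.round(progress)}%`);
    }
    // Normalized 0 to 1 value for UI widgets such as progress bars.
    return progress / 100;
  };
}

const report = makeBucketReporter(console.log);
for (const p of [0.4, 3.2, 9.97, 10.02, 14.5, 27.3, 100]) report(p);
```

Tracking the last bucket seen means a message is emitted the first time progress crosses each threshold, regardless of whether the exact multiple of ten is ever reported.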

4. Review Feedback Fixes — Round 2 (3 files)

File	Issue	Fix
`cs/whisper-transcription/FoundryModelService.cs`	`InitializeAsync()` not thread-safe — concurrent ASP.NET requests could double-initialize	Added `SemaphoreSlim` with double-check locking pattern
`python/summarize/README.md`	Claimed default model is `phi-4-mini` but code uses first cached model	Aligned README with actual behavior
`js/local-rag/README.md`	Claimed "TF-IDF" throughout but implementation uses raw term-frequency (no IDF)	Replaced all "TF-IDF" references with "term-frequency"
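
The C# fix above relies on `SemaphoreSlim` with double-checked locking. As a loose JavaScript analog (a different technique, shown only to illustrate the same "initialize once under concurrent callers" goal), the in-flight promise can be cached so that every caller awaits the same initialization. The helper name `createOnceInitializer` is hypothetical.

```javascript
// Hypothetical JavaScript analog of the double-initialization guard.
// The C# fix uses SemaphoreSlim; in single-threaded JavaScript, caching the
// in-flight promise gives concurrent callers the same one-time guarantee.
function createOnceInitializer(initFn) {
  let pending = null;
  return function initialize() {
    if (pending === null) {
      pending = Promise.resolve()
        .then(initFn)
        .catch((err) => {
          pending = null; // allow a retry after a failed initialization
          throw err;
        });
    }
    return pending;
  };
}

let initCount = 0;
const initialize = createOnceInitializer(async () => {
  initCount += 1;
  return { ready: true };
});

Promise.all([initialize(), initialize(), initialize()]).then(() => {
  console.log("init ran", initCount, "time(s)");
});
```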

5. Review Feedback Fixes — Round 3 (7 files)

File	Issue	Fix
`python/agent-framework/README.md`	Troubleshooting referenced `FLASK_PORT` env var that doesn't exist in code	Changed to `--port <number>` CLI flag which matches `__main__.py`
`js/local-rag/package.json`	`"tfidf"` keyword misleading — implementation is term-frequency only	Changed keyword to `"term-frequency"`
`python/agent-framework/web.py`	`asyncio.new_event_loop()` without `set_event_loop()` — breaks on Python 3.10+	Added `asyncio.set_event_loop(loop)` after creation and cleared it in a `finally` block
`cs/whisper-transcription/FoundryModelService.cs`	`EnsureModelReadyAsync` lacked `CancellationToken`	Added `CancellationToken ct = default` parameter, threaded through `IsCachedAsync(ct)`, `DownloadAsync(..., ct)`, `LoadAsync(ct)`
`cs/whisper-transcription/TranscriptionService.cs`	Caller didn't pass `ct` to `EnsureModelReadyAsync`	Now passes `ct` from `TranscribeAsync`
`js/local-cag/src/config.js`	`host` hardcoded to `"127.0.0.1"` despite README documenting `HOST` env var	Changed to `process.env.HOST || "127.0.0.1"`
`js/local-rag/src/config.js`	All config values hardcoded — `FOUNDRY_MODEL`, `PORT`, `HOST` env vars documented but not read	Added `process.env.FOUNDRY_MODEL`, `parseInt(process.env.PORT, 10)`, `process.env.HOST` with sensible defaults
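
The env-var pattern from the last two rows can be sketched as below. The `loadConfig` helper, the default model alias, and the default port are placeholders chosen for this example; only the `process.env.HOST || "127.0.0.1"` and `parseInt(process.env.PORT, 10)` expressions come from the table.

```javascript
// Hypothetical sketch of env-driven configuration with defaults.
// The default model alias and port below are placeholders, not the sample's values.
function loadConfig(env = process.env) {
  const port = parseInt(env.PORT, 10);
  return {
    model: env.FOUNDRY_MODEL || "example-model-alias",
    // parseInt yields NaN for missing or non-numeric input, so fall back explicitly.
    port: Number.isNaN(port) ? 3000 : port,
    host: env.HOST || "127.0.0.1",
  };
}

console.log(loadConfig({}));
console.log(loadConfig({ PORT: "8080", HOST: "0.0.0.0" }));
```

Passing the environment in as a parameter keeps the function easy to test without mutating `process.env`.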

6. SDK Version Fixes

File	Before	After	Issue
`samples/js/local-cag/package.json`	`^0.9.0`	`^0.5.1`	Version 0.9.0 doesn't exist on npm
`samples/js/local-rag/package.json`	`^0.9.0`	`^0.5.1`	Version 0.9.0 doesn't exist on npm
`samples/js/copilot-sdk-foundry-local/package.json`	`"latest"`	`^0.5.1`	Unpinned — could break at any time
`samples/js/chat-and-audio-foundry-local/package.json`	`"latest"`	`^0.5.1`	Unpinned — could break at any time
`samples/js/electron-chat-application/package.json`	(missing)	`^0.5.1`	`foundry-local-sdk` not listed despite `import` in `main.js`
`samples/python/summarize/requirements.txt`	`>=0.3.1`	`>=0.5.1`	Outdated min version
`samples/python/hello-foundry-local/requirements.txt`	(file missing)	Created with `>=0.5.1`	No requirements.txt existed at all
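
For readers unfamiliar with why `^0.5.1` is safer than `"latest"`: under npm's caret semantics a range such as `^0.5.1` accepts `>=0.5.1 <0.6.0`. The sketch below illustrates that rule for plain `x.y.z` versions only (no prerelease tags); the function names are hypothetical and this is not a replacement for the `semver` package.

```javascript
// Hypothetical sketch of npm caret-range semantics for plain x.y.z versions.
// Caret allows changes that do not modify the left-most non-zero component,
// so ^0.5.1 means >=0.5.1 <0.6.0, while ^1.2.3 means >=1.2.3 <2.0.0.
function parse(version) {
  return version.split(".").map(Number);
}

function compare(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

function satisfiesCaret(version, base) {
  const v = parse(version);
  const lo = parse(base);
  let hi;
  if (lo[0] > 0) hi = [lo[0] + 1, 0, 0];
  else if (lo[1] > 0) hi = [0, lo[1] + 1, 0];
  else hi = [0, 0, lo[2] + 1];
  return compare(v, lo) >= 0 && compare(v, hi) < 0;
}

console.log(satisfiesCaret("0.5.3", "0.5.1")); // true
console.log(satisfiesCaret("0.6.0", "0.5.1")); // false
```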

7. Repo Hygiene

SUPPORT.md — Replaced the default GitHub template (which contained `TODO` and `REPO MAINTAINER: INSERT INSTRUCTIONS HERE` placeholders) with actual content pointing to GitHub Issues, docs, and samples.

Validation Performed

Check	Result
SDK API correctness — validated all samples against latest SDK source in `sdk/js/src`, `sdk_legacy/python`, `sdk/cs/src`	✅ 7 issues fixed
Thread safety — FoundryModelService.InitializeAsync uses SemaphoreSlim	✅ Fixed
CancellationToken propagation — EnsureModelReadyAsync threads ct through all async calls	✅ Fixed
Event loop safety — web.py sets event loop for Python 3.10+ compatibility	✅ Fixed
Env var consistency — config.js files in local-cag and local-rag read documented env vars	✅ Fixed
README accuracy — all READMEs match actual implementation behavior	✅ Fixed
Security scan — searched all samples for hardcoded secrets, API keys, tokens	✅ Clean — all `api_key` references are programmatic
SDK version consistency — cross-referenced every dependency file against published SDK versions	✅ Fixed (7 issues resolved)
.gitignore coverage — verified no build artifacts can be committed	✅ Comprehensive
README coverage — checked every sample has documentation	✅ 35 README.md files
License files — verified legal files present	✅ All present
No TODO/FIXME in shipping code	✅ Clean
No committed build artifacts	✅ Clean
C# compile errors — checked TranscriptionService.cs and FoundryModelService.cs	✅ No errors

SDK Version Matrix (Current State)

Language	Package	Version	Source
C#	`Microsoft.AI.Foundry.Local`	0.9.0	Central `Directory.Packages.props`
C#	`Microsoft.AI.Foundry.Local.WinML`	0.9.0	Central `Directory.Packages.props`
JavaScript	`foundry-local-sdk`	^0.5.1	All sample `package.json` files
Python	`foundry-local-sdk`	>=0.5.1	All sample `requirements.txt` files
Rust	`foundry-local-sdk`	0.1.0	Path reference to `sdk/rust/`

Notes

Four JS samples (native-chat-completions, web-server-example, audio-transcription-example, langchain-integration-example) are intentionally single-file with no package.json; their READMEs instruct users to run npm install manually.
Rust samples all use `path = "../../../sdk/rust"`, which always resolves to the latest local SDK.
The Python functioncalling notebook uses `!pip install foundry-local-sdk` without a version pin, which is standard for notebooks.


vercel · 2026-03-23T23:37:42Z

@leestott is attempting to deploy a commit to the MSFT-AIP Team on Vercel.

A member of the Team first needs to authorize it.

Copilot

Pull request overview

This PR updates the repository’s samples to be more “cache-aware” (skip redundant model downloads and provide clearer progress UX), adds several new end-to-end samples (JS local CAG/RAG, Python agent framework, C# Whisper transcription), and tightens repo hygiene/version consistency in preparation for public sharing.

Changes:

Added new JS offline CAG and offline RAG samples with web UIs + model init progress reporting.
Added a new Python “agent-framework” sample (multi-agent orchestration + Flask SSE UI) and smoke tests.
Updated multiple existing samples/notebooks/docs to use cache checks, clearer lifecycle steps, and pinned SDK versions (plus SUPPORT.md refresh).

Reviewed changes

Copilot reviewed 93 out of 93 changed files in this pull request and generated 7 comments.


File	Description
samples/rag/rag_foundrylocal_demo.ipynb	Updates notebook to use Foundry Local C# SDK lifecycle + SDK-managed endpoint.
samples/rag/README.md	Documents SDK-based lifecycle and removes hardcoded endpoint/variant guidance.
samples/python/summarize/summarize.py	Adds cache-aware model selection/download UX for summarize CLI.
samples/python/summarize/requirements.txt	Bumps minimum `foundry-local-sdk` version.
samples/python/summarize/README.md	Adds feature notes for cache-awareness + UX improvements.
samples/python/hello-foundry-local/src/app.py	Adds cache-check + explicit lifecycle steps before streaming chat.
samples/python/hello-foundry-local/requirements.txt	Adds missing requirements file with SDK + OpenAI deps.
samples/python/hello-foundry-local/README.md	Adds cache-aware feature notes + clarifies run steps.
samples/python/functioncalling/fl_tools.ipynb	Adds explicit lifecycle (start/cache/download/load) before tool-calling demo.
samples/python/functioncalling/README.md	Fixes notebook link + adds prerequisites/features.
samples/python/agent-framework/tests/test_smoke.py	Adds smoke tests for imports, doc loading, env override, demo registry.
samples/python/agent-framework/src/app/web.py	Flask web UI + SSE endpoints for orchestrator + demos.
samples/python/agent-framework/src/app/tool_demo.py	Standalone tool-calling validation for direct + LLM-driven tools.
samples/python/agent-framework/src/app/orchestrator.py	Implements sequential/concurrent/hybrid orchestration as async generators.
samples/python/agent-framework/src/app/foundry_boot.py	Bootstrapper for Foundry Local endpoint/model selection + env override.
samples/python/agent-framework/src/app/documents.py	Loads/chunks local docs into retriever context.
samples/python/agent-framework/src/app/demos/weather_tools.py	Adds multi-tool weather demo.
samples/python/agent-framework/src/app/demos/sentiment_analyzer.py	Adds sentiment/emotion/key-phrase tools demo.
samples/python/agent-framework/src/app/demos/registry.py	Central demo registry for web UI listing/routing.
samples/python/agent-framework/src/app/demos/multi_agent_debate.py	Adds multi-agent debate demo.
samples/python/agent-framework/src/app/demos/math_agent.py	Adds math/tools demo (includes expression evaluation).
samples/python/agent-framework/src/app/demos/code_reviewer.py	Adds code review tools demo.
samples/python/agent-framework/src/app/demos/__init__.py	Exposes demos + registry helpers for import/registration.
samples/python/agent-framework/src/app/agents.py	Agent factories + shared tool functions.
samples/python/agent-framework/src/app/__main__.py	CLI entry (web/cli modes) + orchestrator runner.
samples/python/agent-framework/src/app/__init__.py	Defines package root.
samples/python/agent-framework/requirements.txt	Declares runtime dependencies for the new sample.
samples/python/agent-framework/pyproject.toml	Packaging metadata + deps + dev extras (pytest).
samples/python/agent-framework/data/orchestration_patterns.md	Sample docs for retriever context.
samples/python/agent-framework/data/foundry_local_overview.md	Sample docs for retriever context.
samples/python/agent-framework/data/agent_framework_guide.md	Sample docs for retriever context.
samples/python/agent-framework/README.md	Full sample documentation + quickstart + structure.
samples/python/agent-framework/.env.example	Environment template for model/docs/log level.
samples/js/web-server-example/app.js	Adds cache check + progress bar before downloading models.
samples/js/tool-calling-foundry-local/src/app.js	Adds cache check + progress bar before downloading models.
samples/js/native-chat-completions/app.js	Adds cache check + reusable progress bar for model download.
samples/js/local-rag/src/vectorStore.js	New SQLite-backed TF store with inverted index + caching.
samples/js/local-rag/src/server.js	New Express server with SSE status + chat + upload + ingestion.
samples/js/local-rag/src/prompts.js	System prompts for gas-field RAG agent (full + compact).
samples/js/local-rag/src/ingest.js	New ingestion script to chunk + index docs into SQLite.
samples/js/local-rag/src/config.js	Config for model, chunking, paths, and server settings.
samples/js/local-rag/src/chunker.js	Front-matter parsing + chunking + cosine similarity helpers.
samples/js/local-rag/src/chatEngine.js	Initializes SDK/model + retrieval + streaming/non-streaming responses.
samples/js/local-rag/package.json	New package manifest for local-rag sample.
samples/js/local-rag/docs/valve-inspection.md	Domain doc for RAG ingestion.
samples/js/local-rag/docs/pressure-testing.md	Domain doc for RAG ingestion.
samples/js/local-rag/docs/ppe-requirements.md	Domain doc for RAG ingestion.
samples/js/local-rag/docs/gas-leak-detection.md	Domain doc for RAG ingestion.
samples/js/local-rag/docs/emergency-shutdown.md	Domain doc for RAG ingestion.
samples/js/local-rag/README.md	New sample documentation (setup/ingest/architecture).
samples/js/local-cag/src/server.js	New Express server for CAG sample + init status SSE.
samples/js/local-cag/src/prompts.js	System prompts for gas-field CAG agent (full + compact).
samples/js/local-cag/src/modelSelector.js	Auto model selection based on RAM + caching preference.
samples/js/local-cag/src/context.js	Loads docs + keyword scoring + builds selected context per query.
samples/js/local-cag/src/config.js	Config for model selection, RAM budget, server, and context size.
samples/js/local-cag/src/chatEngine.js	Initializes SDK/model + injects preloaded context per query.
samples/js/local-cag/package.json	New package manifest for local-cag sample.
samples/js/local-cag/docs/valve-inspection.md	Domain doc for CAG startup context.
samples/js/local-cag/docs/pressure-testing.md	Domain doc for CAG startup context.
samples/js/local-cag/docs/ppe-requirements.md	Domain doc for CAG startup context.
samples/js/local-cag/docs/gas-leak-detection.md	Domain doc for CAG startup context.
samples/js/local-cag/docs/emergency-shutdown.md	Domain doc for CAG startup context.
samples/js/local-cag/README.md	New sample documentation (setup/architecture/config).
samples/js/langchain-integration-example/app.js	Adds cache check + progress bar before downloading models.
samples/js/electron-chat-application/package.json	Adds missing `foundry-local-sdk` dependency.
samples/js/copilot-sdk-foundry-local/src/tool-calling.ts	Pins SDK version + cache-aware model download.
samples/js/copilot-sdk-foundry-local/src/app.ts	Pins SDK version + cache-aware model download.
samples/js/copilot-sdk-foundry-local/package.json	Pins `foundry-local-sdk` version.
samples/js/chat-and-audio-foundry-local/package.json	Pins `foundry-local-sdk` version.
samples/js/audio-transcription-example/app.js	Adds cache check + progress bar before downloading models.
samples/cs/whisper-transcription/wwwroot/styles.css	New UI styling for Whisper transcription sample.
samples/cs/whisper-transcription/wwwroot/index.html	New drag/drop UI for uploading and transcribing audio.
samples/cs/whisper-transcription/wwwroot/app.js	Client-side upload/transcribe/copy + health polling.
samples/cs/whisper-transcription/nuget.config	Adds package source mapping for Foundry packages.
samples/cs/whisper-transcription/appsettings.json	Adds Foundry config (model alias, log level).
samples/cs/whisper-transcription/WhisperTranscription.csproj	New ASP.NET Core project for transcription service.
samples/cs/whisper-transcription/Services/TranscriptionService.cs	Implements streaming transcription via Foundry SDK audio client.
samples/cs/whisper-transcription/Services/FoundryOptions.cs	Options binding for model alias + logging.
samples/cs/whisper-transcription/Services/FoundryModelService.cs	Initializes Foundry manager + cache-aware download + load.
samples/cs/whisper-transcription/README.md	New sample documentation + endpoints + setup.
samples/cs/whisper-transcription/Program.cs	Minimal API endpoints + swagger + error middleware.
samples/cs/whisper-transcription/Middleware/ErrorHandlingMiddleware.cs	Centralized exception-to-JSON error handling.
samples/cs/whisper-transcription/Health/FoundryHealthCheck.cs	Health check that validates model availability.
samples/cs/GettingStarted/src/ToolCallingFoundryLocalWebServer/Program.cs	Adds explicit cache check + download progress bar.
samples/cs/GettingStarted/src/ToolCallingFoundryLocalSdk/Program.cs	Adds explicit cache check + download progress bar.
samples/cs/GettingStarted/src/ModelManagementExample/Program.cs	Adds explicit cache check + download progress bar.
samples/cs/GettingStarted/src/HelloFoundryLocalSdk/Program.cs	Adds explicit cache check + download progress bar.
samples/cs/GettingStarted/src/FoundryLocalWebServer/Program.cs	Adds explicit cache check + download progress bar.
samples/cs/GettingStarted/src/AudioTranscriptionExample/Program.cs	Adds explicit cache check + download progress bar.
SUPPORT.md	Replaces template placeholders with real support guidance.


samples/js/local-cag/src/modelSelector.js

samples/python/summarize/summarize.py

samples/python/agent-framework/src/app/foundry_boot.py

samples/cs/whisper-transcription/Services/TranscriptionService.cs

samples/js/local-rag/src/chatEngine.js

samples/python/agent-framework/src/app/web.py

samples/cs/whisper-transcription/Services/FoundryModelService.cs

Copilot

Pull request overview

Copilot reviewed 93 out of 93 changed files in this pull request and generated 4 comments.


samples/python/summarize/README.md

samples/cs/whisper-transcription/Services/FoundryModelService.cs

samples/js/local-rag/README.md

samples/cs/whisper-transcription/Services/FoundryModelService.cs

… claims - FoundryModelService.cs: add SemaphoreSlim for thread-safe InitializeAsync to prevent concurrent callers from double-initializing in ASP.NET - summarize/README.md: align docs with code (uses first cached model, not phi-4-mini default) - local-rag/README.md: replace 'TF-IDF' with 'term-frequency' throughout since the implementation uses raw term-frequency maps without IDF weighting

Copilot

Pull request overview

Copilot reviewed 93 out of 93 changed files in this pull request and generated 9 comments.


samples/python/agent-framework/README.md

samples/js/local-rag/package.json

samples/python/agent-framework/src/app/web.py

samples/cs/whisper-transcription/Services/FoundryModelService.cs

samples/js/local-cag/src/config.js

samples/js/local-rag/src/config.js

samples/python/agent-framework/src/app/orchestrator.py

samples/cs/whisper-transcription/Program.cs

samples/cs/whisper-transcription/Services/TranscriptionService.cs


Copilot

Pull request overview

Copilot reviewed 93 out of 93 changed files in this pull request and generated 7 comments.

Comments suppressed due to low confidence (1)

samples/python/agent-framework/src/app/web.py:177

api_demo_run() creates a new event loop but doesn't call asyncio.set_event_loop(loop) (and doesn't clear it). For consistency with api_run() and to avoid libraries failing due to missing current event loop, set/clear the loop in a try/finally around run_until_complete.


samples/python/agent-framework/src/app/web.py

samples/js/local-cag/src/modelSelector.js

samples/cs/whisper-transcription/Program.cs

samples/js/local-rag/src/chatEngine.js

samples/python/agent-framework/src/app/orchestrator.py

samples/python/functioncalling/fl_tools.ipynb

Copilot

Pull request overview

Copilot reviewed 94 out of 94 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

samples/python/agent-framework/src/app/web.py:183

SSE responses for /api/demo/<demo_id>/run are returned with only mimetype="text/event-stream". For consistent real-time streaming (especially behind proxies), add the usual SSE headers (Cache-Control: no-cache, Connection: keep-alive, and optionally X-Accel-Buffering: no) to this Response as well.
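
The headers named in this comment are framework-independent. The review concerns a Flask endpoint; the sketch below shows the same header set in JavaScript with plain objects, using hypothetical helper names `sseHeaders` and `formatSseEvent`.

```javascript
// Hypothetical sketch of the SSE response headers named in the review comment,
// shown with plain objects so it applies to any Node.js HTTP framework.
function sseHeaders() {
  return {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    Connection: "keep-alive",
    // Optional: asks nginx-style proxies not to buffer the stream.
    "X-Accel-Buffering": "no",
  };
}

// Formats one SSE message; a blank line terminates each event.
function formatSseEvent(payload) {
  return `data: ${JSON.stringify(payload)}\n\n`;
}

console.log(sseHeaders());
console.log(JSON.stringify(formatSseEvent({ type: "text", data: "hello" })));
```

With Node's built-in `http` module, a handler could call `res.writeHead(200, sseHeaders())` and then `res.write(formatSseEvent(...))` for each event.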


samples/js/local-cag/README.md

samples/js/tool-calling-foundry-local/src/app.js

samples/python/agent-framework/src/app/web.py

Copilot

Pull request overview

Copilot reviewed 94 out of 94 changed files in this pull request and generated 4 comments.


samples/python/summarize/summarize.py

samples/python/hello-foundry-local/src/app.py

samples/cs/whisper-transcription/Middleware/ErrorHandlingMiddleware.cs

Copilot

Pull request overview

Copilot reviewed 94 out of 94 changed files in this pull request and generated 3 comments.


samples/python/functioncalling/fl_tools.ipynb

Copilot · 2026-03-25T14:35:30Z

samples/js/local-rag/src/chatEngine.js

+    // Buffer chunks from the callback and yield them as an async iterable
+    const textChunks = [];
+    let resolve;
+    let done = false;
+
+    const streamPromise = this.chatClient.completeStreamingChat(messages, (chunk) => {
+      textChunks.push(chunk);
+      if (resolve) { resolve(); resolve = null; }
+    }).then(() => {
+      done = true;
+      if (resolve) { resolve(); resolve = null; }
+    });
+
+    // Yield sources metadata first
+    yield {
+      type: "sources",
+      data: chunks.map((c) => ({
+        title: c.title,
+        category: c.category,
+        docId: c.doc_id,
+        score: Math.round(c.score * 100) / 100,
+      })),
+    };
+
+    // Yield text chunks from the SDK streaming callback buffer
+    let head = 0;
+    while (!done || head < textChunks.length) {
+      if (head >= textChunks.length && !done) {
+        await new Promise((r) => { resolve = r; });
+      }
+      while (head < textChunks.length) {
+        const chunk = textChunks[head++];
+        const content = chunk.choices?.[0]?.delta?.content;
+        if (content) {
+          yield { type: "text", data: content };
+        }
+      }


queryStream() buffers every streaming chunk into textChunks and never removes processed entries. For long responses this can grow unbounded and increase memory usage. Consider storing only the extracted delta.content strings and periodically compacting the buffer (e.g., slice/splice once head passes a threshold) to keep memory bounded.
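
One way to read the compaction suggestion above is sketched here. It is an illustration only: the `CompactingBuffer` class and its threshold are hypothetical, and it stores extracted text strings rather than whole chunk objects.

```javascript
// Hypothetical sketch: a buffer that keeps only extracted text and compacts
// itself once enough entries have been consumed. Names are illustrative.
class CompactingBuffer {
  constructor(compactThreshold = 64) {
    this.items = [];
    this.head = 0; // index of the next unread entry
    this.compactThreshold = compactThreshold;
  }

  push(text) {
    this.items.push(text);
  }

  hasNext() {
    return this.head < this.items.length;
  }

  next() {
    const value = this.items[this.head++];
    // Drop already-consumed entries so memory tracks the unread backlog.
    if (this.head >= this.compactThreshold) {
      this.items.splice(0, this.head);
      this.head = 0;
    }
    return value;
  }
}

const buffer = new CompactingBuffer(2);
["a", "b", "c"].forEach((t) => buffer.push(t));
let out = "";
while (buffer.hasNext()) out += buffer.next();
console.log(out, buffer.items.length);
```

Compacting in batches avoids an array shift on every read while still bounding how many consumed entries are retained.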

Copilot · 2026-03-25T14:35:31Z

samples/js/local-cag/src/chatEngine.js

+    // Collect streamed chunks via callback and yield them
+    const chunks = [];
+    let resolve;
+    let done = false;
+
+    const promise = this.chatClient
+      .completeStreamingChat(messages, (chunk) => {
+        const content = chunk.choices?.[0]?.delta?.content;
+        if (content) {
+          chunks.push(content);
+          if (resolve) {
+            const r = resolve;
+            resolve = null;
+            r();
+          }
+        }
+      })
+      .then(() => {
+        done = true;
+        if (resolve) {
+          const r = resolve;
+          resolve = null;
+          r();
+        }
+      });
+
+    let index = 0;
+    while (!done || index < chunks.length) {
+      if (index < chunks.length) {
+        yield { type: "text", data: chunks[index++] };
+      } else {
+        await new Promise((r) => { resolve = r; });
+      }
+    }


queryStream() accumulates streamed text in the chunks array but never clears already-yielded entries. Over time (or with many concurrent clients) this can lead to unnecessary memory growth. Consider using a bounded queue/compaction strategy (drop entries once yielded) so memory usage stays proportional to the largest in-flight gap, not total response size.
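
The "drop entries once yielded" strategy could look roughly like the sketch below. It assumes a callback-style producer similar to the diff above but does not call any SDK; `createChunkStream` and its methods are hypothetical names.

```javascript
// Hypothetical sketch: bridge a callback-based stream to an async generator
// using a queue whose entries are removed as soon as they are yielded.
function createChunkStream() {
  const queue = [];
  let wake = null;
  let done = false;

  const notify = () => {
    if (wake) {
      const w = wake;
      wake = null;
      w();
    }
  };

  return {
    push(text) { queue.push(text); notify(); },
    end() { done = true; notify(); },
    async *read() {
      while (true) {
        // shift() releases each entry once it has been handed to the consumer.
        while (queue.length > 0) yield queue.shift();
        if (done) return;
        await new Promise((r) => { wake = r; });
      }
    },
    get pending() { return queue.length; },
  };
}

async function demo() {
  const stream = createChunkStream();
  setTimeout(() => { stream.push("Hello, "); stream.push("world"); stream.end(); }, 0);
  let text = "";
  for await (const chunk of stream.read()) text += chunk;
  return { text, pending: stream.pending };
}

demo().then((result) => console.log(result));
```

With this shape, memory stays proportional to the backlog between producer and consumer rather than to the total response length.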

Improve samples with cache-awareness, add 4 new samples, fix SDK versions, and prepare repo for public sharing

b6c9e49

Copilot AI review requested due to automatic review settings March 23, 2026 23:37

Copilot started reviewing on behalf of leestott March 23, 2026 23:38

Copilot AI reviewed Mar 23, 2026

leestott marked this pull request as draft March 24, 2026 21:38

leestott requested a review from Copilot March 24, 2026 23:42

Update

78206d0

Copilot started reviewing on behalf of leestott March 24, 2026 23:43

leestott closed this Mar 24, 2026

leestott reopened this Mar 24, 2026

leestott marked this pull request as ready for review March 24, 2026 23:47

Copilot AI reviewed Mar 24, 2026

leestott requested a review from Copilot March 24, 2026 23:52

Copilot started reviewing on behalf of leestott March 24, 2026 23:55

Copilot AI reviewed Mar 24, 2026

fix: address round-3 review issues — env vars, event loop, CancellationToken, README accuracy

acf06fc

leestott requested a review from Copilot March 25, 2026 00:11

Copilot started reviewing on behalf of leestott March 25, 2026 00:12

Copilot AI reviewed Mar 25, 2026

update

050fbed

leestott requested a review from Copilot March 25, 2026 03:06

Copilot started reviewing on behalf of leestott March 25, 2026 03:07

Copilot AI reviewed Mar 25, 2026

samples/js/local-cag/README.md (outdated, resolved)

samples/js/tool-calling-foundry-local/src/app.js (resolved)

samples/python/agent-framework/src/app/web.py (outdated, resolved)

update

e373a2b

leestott requested a review from Copilot March 25, 2026 03:58

Copilot started reviewing on behalf of leestott March 25, 2026 03:59

Copilot AI reviewed Mar 25, 2026

Update

10d78b3

update

26908ec

leestott requested a review from Copilot March 25, 2026 14:31

Copilot started reviewing on behalf of leestott March 25, 2026 14:32

Copilot AI reviewed Mar 25, 2026
