diff --git a/docs.json b/docs.json
index d7af84da..1d3a6919 100644
--- a/docs.json
+++ b/docs.json
@@ -399,6 +399,16 @@
}
]
},
+ {
+ "group": "Frames",
+ "pages": [
+ "server/frames/overview",
+ "server/frames/data-frames",
+ "server/frames/control-frames",
+ "server/frames/system-frames",
+ "server/frames/llm-frames"
+ ]
+ },
{
"group": "Pipeline",
"pages": [
diff --git a/server/frames/control-frames.mdx b/server/frames/control-frames.mdx
new file mode 100644
index 00000000..73b4a987
--- /dev/null
+++ b/server/frames/control-frames.mdx
@@ -0,0 +1,363 @@
+---
+title: "Control Frames"
+description: "Reference for ControlFrame types: pipeline lifecycle, response boundaries, service settings, and runtime configuration"
+---
+
+ControlFrames signal boundaries, state changes, and configuration updates within the pipeline. They are queued and processed in order alongside DataFrames. ControlFrames are cancelled on `InterruptionFrame` unless combined with `UninterruptibleFrame`. See the [frames overview](/server/frames/overview) for base class details and the full frame hierarchy.
+
+## Pipeline Lifecycle
+
+### EndFrame
+
+Signals graceful pipeline shutdown. `EndFrame` is queued with other non-SystemFrames, which lets FrameProcessors shut down in order: queued frames ahead of the `EndFrame` are processed first.
+
+Inherits from `UninterruptibleFrame`, meaning it cannot be cancelled by `InterruptionFrame`.
+
+
+ Optional reason for the shutdown, passed along for logging or inspection.
+
+
+### StopFrame
+
+Stops the pipeline but keeps processors in a running state. Like `EndFrame`, `StopFrame` is queued with other non-SystemFrames, so frames preceding it are processed first. Useful when you need to halt frame flow without tearing down the entire processor graph.
+
+Inherits from `UninterruptibleFrame`.
+
+### OutputTransportReadyFrame
+
+Indicates that the output transport is ready to receive frames. Processors waiting on transport availability can use this as their signal to begin sending.
+
+### HeartbeatFrame
+
+Used for pipeline health monitoring. Processors can observe these to detect stalls or measure latency.
+
+
+ Timestamp value for the heartbeat.
+
+
+## Processor Pause/Resume
+
+While a processor is paused, incoming frames accumulate in its internal queue rather than being dropped. Once the processor is resumed, it drains the queue and processes all buffered frames in the order they arrived.
+
+For example, the TTS service pauses itself while synthesizing a `TTSSpeakFrame`. If new text frames arrive during synthesis, they queue up instead of producing overlapping audio. The TTS resumes when `BotStoppedSpeakingFrame` (a `SystemFrame`) arrives, and the buffered frames are processed in order.
+
+Internally, each processor has two queues: a high-priority input queue for SystemFrames and a process queue for everything else. Pausing blocks the process queue, but SystemFrames continue to flow through the input queue. This is why the typical pattern is for a processor to pause itself and then resume in response to a `SystemFrame`.
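
The two-queue behavior can be sketched in a few lines of asyncio. The class and method names below are hypothetical stand-ins for illustration, not Pipecat's actual implementation:

```python
import asyncio


class SystemFrame: ...


class DataFrame: ...


class PausableProcessor:
    """Minimal sketch of the two-queue design described above."""

    def __init__(self):
        self._input_q = asyncio.Queue()    # high-priority queue for SystemFrames
        self._process_q = asyncio.Queue()  # everything else
        self._resumed = asyncio.Event()
        self._resumed.set()
        self.handled = []

    def pause(self):
        # Blocks only the process queue; SystemFrames keep flowing.
        self._resumed.clear()

    def resume(self):
        # Buffered frames drain in arrival order.
        self._resumed.set()

    async def queue_frame(self, frame):
        q = self._input_q if isinstance(frame, SystemFrame) else self._process_q
        await q.put(frame)

    async def step(self):
        # Handle one frame if available: SystemFrames first, then
        # (only when resumed) whatever is buffered in the process queue.
        if not self._input_q.empty():
            self.handled.append(self._input_q.get_nowait())
        elif self._resumed.is_set() and not self._process_q.empty():
            self.handled.append(self._process_q.get_nowait())
```

Pausing leaves `DataFrame`s buffered while a queued `SystemFrame` is still handled; calling `resume()` drains the buffer in arrival order.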
+
+
+ `FrameProcessorResumeFrame` is a `ControlFrame`, which means it enters the
+ same process queue that pausing blocks. If DataFrames have already queued up
+ ahead of it, the resume frame will be stuck behind them and the processor will
+ stay paused. To resume a paused processor from outside, use the `SystemFrame`
+ variant `FrameProcessorResumeUrgentFrame` instead — it bypasses the process
+ queue entirely. See [System
+ Frames](/server/frames/system-frames#processor-pauseresume-urgent).
+
+
+### FrameProcessorPauseFrame
+
+Pauses a specific processor. Queued in order, so the processor finishes handling any frames ahead of it before pausing.
+
+
+ The processor to pause.
+
+
+### FrameProcessorResumeFrame
+
+Resumes a previously paused processor, releasing all buffered frames for processing.
+
+
+ The processor to resume.
+
+
+
+ Because this is a `ControlFrame`, it will be blocked behind any DataFrames
+ that queued up while the processor was paused. Use
+ `FrameProcessorResumeUrgentFrame` if the processor may have buffered frames.
+
+
+## LLM Response Boundaries
+
+These frames bracket LLM output, letting downstream processors (aggregators, TTS services, transports) know when a response starts and ends.
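
A downstream processor can treat these frames as simple brackets. The sketch below uses stub classes rather than the real Pipecat frames, purely to show the pattern:

```python
class LLMFullResponseStartFrame: ...


class LLMFullResponseEndFrame: ...


class TextFrame:
    def __init__(self, text):
        self.text = text


class ResponseAggregator:
    """Collects the TextFrames between a start and end boundary."""

    def __init__(self):
        self._parts = None   # None means no response in progress
        self.responses = []

    def process(self, frame):
        if isinstance(frame, LLMFullResponseStartFrame):
            self._parts = []                      # a response has begun
        elif isinstance(frame, TextFrame) and self._parts is not None:
            self._parts.append(frame.text)        # accumulate streamed text
        elif isinstance(frame, LLMFullResponseEndFrame):
            self.responses.append("".join(self._parts))
            self._parts = None                    # response complete
```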
+
+### LLMFullResponseStartFrame
+
+Marks the beginning of an LLM response. Followed by one or more `TextFrame`s and terminated by `LLMFullResponseEndFrame`.
+
+### LLMFullResponseEndFrame
+
+Marks the end of an LLM response.
+
+### VisionFullResponseStartFrame
+
+Beginning of a vision model response. Inherits from `LLMFullResponseStartFrame`.
+
+### VisionFullResponseEndFrame
+
+End of a vision model response. Inherits from `LLMFullResponseEndFrame`.
+
+### LLMAssistantPushAggregationFrame
+
+Forces the assistant aggregator to commit its buffered text to context immediately, rather than waiting for the normal end-of-response boundary.
+
+## LLM Context Summarization
+
+Frames that coordinate context summarization: compressing conversation history to stay within token limits.
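
To make the goal concrete, here is a minimal, hypothetical sketch of folding a finished summary back into a message list while preserving the most recent turns; the real aggregator drives this behavior through the frames documented in this section:

```python
def apply_summary(messages, summary, keep_last=4):
    """Replace older messages with a single summary message, keeping the
    most recent `keep_last` messages verbatim (illustrative sketch only)."""
    recent = messages[-keep_last:] if keep_last > 0 else []
    summary_message = {
        "role": "system",
        "content": f"Summary of the earlier conversation: {summary}",
    }
    return [summary_message] + recent
```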
+
+### LLMSummarizeContextFrame
+
+Triggers manual context summarization. Push this frame to request that the LLM summarize the current conversation context.
+
+
+ Optional configuration controlling summarization behavior.
+
+
+### LLMContextSummaryRequestFrame
+
+Internal request from the aggregator to the LLM service, asking it to produce a summary. You typically won't push this yourself — the aggregator creates it in response to `LLMSummarizeContextFrame` or automatic summarization triggers.
+
+
+ Unique identifier for this summarization request.
+
+
+
+ The conversation context to summarize.
+
+
+
+ Minimum number of recent messages to preserve after summarization.
+
+
+
+ Target token count for the summarized context.
+
+
+
+ Prompt instructing the LLM how to summarize.
+
+
+
+ Optional timeout in seconds for the summarization request.
+
+
+### LLMContextSummaryResultFrame
+
+The LLM's summarization result, sent back to the aggregator.
+
+Inherits from `UninterruptibleFrame` to ensure the result is never dropped.
+
+
+ Matches the originating request.
+
+
+
+ The generated summary text.
+
+
+
+ Index of the last message included in the summary.
+
+
+
+ Error message if summarization failed, otherwise `None`.
+
+
+## LLM Thought Frames
+
+Bracket extended thinking output from LLMs that support it (e.g., Claude with extended thinking enabled).
+
+### LLMThoughtStartFrame
+
+Marks the beginning of LLM extended thinking content.
+
+
+ Whether to append thought content to the conversation context. Raises
+ `ValueError` if set to `True` without specifying `llm`.
+
+
+
+ Identifier for the LLM producing the thought. Required when
+ `append_to_context` is `True`.
+
+
+### LLMThoughtEndFrame
+
+Marks the end of LLM extended thinking content.
+
+
+ Thought signature, if provided by the LLM. Anthropic models include a
+ signature that must be preserved when appending thoughts back to context.
+
+
+## Function Calling
+
+### FunctionCallInProgressFrame
+
+Indicates that a function call is currently executing.
+
+Inherits from `UninterruptibleFrame`, ensuring it reaches downstream processors even during interruption.
+
+
+ Name of the function being called.
+
+
+
+ Unique identifier for this tool call.
+
+
+
+ Arguments passed to the function.
+
+
+
+ Whether the function call should be cancelled if the user interrupts.
+
+
+## TTS State
+
+### TTSStartedFrame
+
+Signals the beginning of a TTS audio response.
+
+
+ Identifier linking this TTS output to its originating context.
+
+
+### TTSStoppedFrame
+
+Signals the end of a TTS audio response.
+
+
+ Identifier linking this TTS output to its originating context.
+
+
+## Service Settings
+
+Runtime settings updates for LLM, TTS, STT, and other services. These let you change service configuration mid-conversation without rebuilding the pipeline. Push an `LLMUpdateSettingsFrame`, `TTSUpdateSettingsFrame`, or `STTUpdateSettingsFrame` to update the corresponding service. See the [Changing Service Settings at Runtime](/server/frames/overview#changing-service-settings-at-runtime) pattern for an example.
+
+### ServiceUpdateSettingsFrame
+
+Base frame for runtime service settings updates.
+
+Inherits from `UninterruptibleFrame`.
+
+
+ Dictionary of settings to update.
+
+
+
+ Typed settings delta. Takes precedence over the `settings` dict when both are
+ provided.
+
+
+
+ Target a specific service instance. When `None`, the frame applies to the
+ first matching service in the pipeline.
+
+
+### LLMUpdateSettingsFrame
+
+Update LLM service settings at runtime. Inherits from `ServiceUpdateSettingsFrame`.
+
+### TTSUpdateSettingsFrame
+
+Update TTS service settings at runtime. Inherits from `ServiceUpdateSettingsFrame`.
+
+### STTUpdateSettingsFrame
+
+Update STT service settings at runtime. Inherits from `ServiceUpdateSettingsFrame`.
+
+## Audio Processing
+
+### VADParamsUpdateFrame
+
+Update Voice Activity Detection parameters at runtime.
+
+
+ New VAD parameters to apply.
+
+
+### FilterControlFrame
+
+Base frame for audio filter control. Subclass this for custom filter commands.
+
+### FilterUpdateSettingsFrame
+
+Update audio filter settings. Inherits from `FilterControlFrame`.
+
+
+ Filter settings to update.
+
+
+### FilterEnableFrame
+
+Enable or disable an audio filter. Inherits from `FilterControlFrame`.
+
+
+ `True` to enable the filter, `False` to disable it.
+
+
+### MixerControlFrame
+
+Base frame for audio mixer control.
+
+### MixerUpdateSettingsFrame
+
+Update audio mixer settings. Inherits from `MixerControlFrame`.
+
+
+ Mixer settings to update.
+
+
+### MixerEnableFrame
+
+Enable or disable an audio mixer. Inherits from `MixerControlFrame`.
+
+
+ `True` to enable the mixer, `False` to disable it.
+
+
+## Service Switching
+
+### ServiceSwitcherFrame
+
+Base frame for service switching operations.
+
+### ManuallySwitchServiceFrame
+
+Request a manual switch to a different service instance. Inherits from `ServiceSwitcherFrame`.
+
+
+ The service to switch to.
+
+
+### ServiceSwitcherRequestMetadataFrame
+
+Request that a service re-emit its metadata. Useful after switching services to ensure downstream processors have current configuration.
+
+
+ The service to request metadata from.
+
+
+## Task Frames
+
+Task frames are pushed upstream to the pipeline task, which converts them into the appropriate downstream frame. This indirection lets processors request pipeline-level actions without needing direct access to the pipeline task.
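
The conversion the pipeline task performs can be sketched with stub classes (illustrative only, not Pipecat's real types):

```python
class TaskFrame: ...


class EndTaskFrame(TaskFrame):
    def __init__(self, reason=None):
        self.reason = reason


class StopTaskFrame(TaskFrame): ...


class EndFrame:
    def __init__(self, reason=None):
        self.reason = reason


class StopFrame: ...


def convert_task_frame(frame):
    # Map an upstream task request to the downstream frame the pipeline
    # task would push, carrying the reason along for EndTaskFrame.
    if isinstance(frame, EndTaskFrame):
        return EndFrame(reason=frame.reason)
    if isinstance(frame, StopTaskFrame):
        return StopFrame()
    return None
```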
+
+### TaskFrame
+
+Base frame for task control.
+
+### EndTaskFrame
+
+Request graceful pipeline shutdown. The pipeline task converts this into an `EndFrame` and pushes it downstream. Inherits from `TaskFrame` and `UninterruptibleFrame`.
+
+
+ Optional reason for the shutdown request.
+
+
+### StopTaskFrame
+
+Request pipeline stop while keeping processors alive. Converted to a `StopFrame` downstream. Inherits from `TaskFrame` and `UninterruptibleFrame`.
diff --git a/server/frames/data-frames.mdx b/server/frames/data-frames.mdx
new file mode 100644
index 00000000..540b67a0
--- /dev/null
+++ b/server/frames/data-frames.mdx
@@ -0,0 +1,382 @@
+---
+title: "Data Frames"
+description: "Reference for DataFrame types: audio, image, text, transcription, and transport messages"
+---
+
+## Overview
+
+DataFrames carry the main content flowing through a pipeline: audio chunks, text, images, transcriptions, and messages. They are queued and processed in order with other DataFrames and ControlFrames, and any pending DataFrames are discarded when a user interrupts. See the [Frames overview](/server/frames/overview) for base class details, mixin fields, and frame properties common to all frames.
+
+## Audio Frames
+
+These frames carry raw audio through the pipeline toward the output transport. Each inherits the `audio`, `sample_rate`, `num_channels`, and `num_frames` fields from the [`AudioRawFrame`](/server/frames/overview#audiorawframe) mixin.
+
+### OutputAudioRawFrame
+
+A chunk of raw audio destined for the output transport. Use the inherited `transport_destination` field when your transport supports multiple audio tracks.
+
+Inherits from `AudioRawFrame`.
+
+### TTSAudioRawFrame
+
+Audio generated by a TTS service, ready for playback.
+
+Inherits from `OutputAudioRawFrame`.
+
+
+ Identifier for the TTS context that generated this audio.
+
+
+### SpeechOutputAudioRawFrame
+
+Audio from a continuous speech stream. The stream may contain silence frames intermixed with speech, so downstream processors may need to distinguish between the two.
+
+Inherits from `OutputAudioRawFrame`.
+
+## Image Frames
+
+Frames for carrying image data to the output transport. Each inherits `image`, `size`, and `format` from the [`ImageRawFrame`](/server/frames/overview#imagerawframe) mixin.
+
+### OutputImageRawFrame
+
+An image for display by the output transport. Supports the `transport_destination` field for transports with multiple video tracks.
+
+Inherits from `ImageRawFrame`.
+
+
+ The `sync_with_audio` field (default `False`) is set internally, not via the
+ constructor. When `True`, the image is queued with audio frames so it displays
+ only after all preceding audio has been sent. When `False`, the transport
+ displays it immediately.
+
+
+### URLImageRawFrame
+
+An output image with an associated download URL, typically from a third-party image generation service.
+
+Inherits from `OutputImageRawFrame`.
+
+
+ URL where the image can be downloaded.
+
+
+### AssistantImageRawFrame
+
+An image generated by the assistant for both display and inclusion in LLM context. The superclass handles display; the additional fields here carry the original image data in a format suitable for direct use in LLM context messages.
+
+Inherits from `OutputImageRawFrame`.
+
+
+ Original image data for use in LLM context messages without further encoding.
+
+
+
+ MIME type of the original image data.
+
+
+### SpriteFrame
+
+An animated sprite composed of multiple image frames. The transport plays the images at the framerate specified by the transport's `camera_out_framerate` parameter.
+
+
+ Ordered list of image frames that make up the sprite animation.
+
+
+## Text Frames
+
+Text content at various stages of processing: raw text, LLM output, aggregated results, TTS input, and transcriptions.
+
+### TextFrame
+
+The fundamental text container. Emitted by LLM services, consumed by context aggregators, TTS services, and other processors.
+
+
+ The text content.
+
+
+
+ Several non-constructor fields control downstream behavior:
+
+ - `skip_tts` (default `None`): when set, tells the TTS service to skip this text
+ - `includes_inter_frame_spaces` (default `False`): indicates whether leading/trailing spaces are already included
+ - `append_to_context` (default `True`): whether this text should be appended to the LLM context
+
+
+### LLMTextFrame
+
+Text generated by an LLM service. Behaves like a `TextFrame` with `includes_inter_frame_spaces` set to `True`, since LLM services include all necessary spacing.
+
+Inherits from `TextFrame`.
+
+### AggregatedTextFrame
+
+Multiple text frames combined into a single frame for processing or output.
+
+Inherits from `TextFrame`.
+
+
+ Method used to aggregate the text frames.
+
+
+
+ Identifier for the TTS context associated with this text.
+
+
+### VisionTextFrame
+
+Text output from a vision model. Functionally identical to `LLMTextFrame` but distinguished by type for routing purposes.
+
+Inherits from `LLMTextFrame`.
+
+### TTSTextFrame
+
+Text that has been sent to a TTS service for synthesis.
+
+Inherits from `AggregatedTextFrame`.
+
+
+ Identifier for the TTS context that generated this text.
+
+
+### Transcriptions
+
+Frames produced by speech-to-text services at different stages of recognition. All inherit from `TextFrame`, so they flow through text aggregators and other `TextFrame` handlers.
+
+#### TranscriptionFrame
+
+A non-interim transcription result from an STT service: the service's best recognition of what the user said, as opposed to the streaming partial results in `InterimTranscriptionFrame`.
+
+
+ Identifier for the user who spoke.
+
+
+
+ When the transcription occurred.
+
+
+
+ Detected or specified language of the speech.
+
+
+
+ Raw result object from the STT service.
+
+
+
+ Whether the STT service has explicitly committed this transcription via a
+ finalize signal. Some services (AssemblyAI, Deepgram, Soniox, Speechmatics)
+ support this; others don't, so it defaults to `False`. Turn detection
+ strategies can use this flag to trigger the bot's response immediately rather
+ than waiting for a timeout.
+
+
+#### InterimTranscriptionFrame
+
+A partial, in-progress transcription. These frames update frequently while the user is still speaking, and are superseded by a `TranscriptionFrame` once the STT service produces its result.
+
+
+ The partial transcription text.
+
+
+
+ Identifier for the user who spoke.
+
+
+
+ When the interim transcription occurred.
+
+
+
+ Detected or specified language of the speech.
+
+
+
+ Raw result object from the STT service.
+
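
A typical consumer treats interim frames as a live caption that the final `TranscriptionFrame` supersedes. A sketch using stub classes (not the real Pipecat frames):

```python
class InterimTranscriptionFrame:
    def __init__(self, text):
        self.text = text


class TranscriptionFrame:
    def __init__(self, text):
        self.text = text


class CaptionTracker:
    """Interims overwrite a live caption; a final result commits it."""

    def __init__(self):
        self.live = ""        # latest interim text, updated frequently
        self.committed = []   # finalized utterances

    def process(self, frame):
        if isinstance(frame, InterimTranscriptionFrame):
            self.live = frame.text
        elif isinstance(frame, TranscriptionFrame):
            self.committed.append(frame.text)
            self.live = ""    # the final result supersedes interims
```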
+
+#### TranslationFrame
+
+A translated transcription, typically placed in the transport's receive queue when a participant speaks in a different language.
+
+
+ Identifier for the user who spoke.
+
+
+
+ When the translation occurred.
+
+
+
+ Target language of the translation.
+
+
+## TTS Frames
+
+### TTSSpeakFrame
+
+Sends text to the pipeline's TTS service as a standalone utterance, independent of any LLM response turn. The TTS service creates a fresh audio context for each `TTSSpeakFrame`, whereas `TextFrame`s produced during an LLM response are grouped under the same turn context.
+
+
+ The text to be spoken.
+
+
+
+ Whether to append the spoken text to the LLM context.
+
+
+## Transport Message Frames
+
+### OutputTransportMessageFrame
+
+A transport-specific message payload for sending data through the output transport. The message format depends on the transport implementation.
+
+
+ The transport message payload.
+
+
+## DTMF Frames
+
+### OutputDTMFFrame
+
+A DTMF (Dual-Tone Multi-Frequency) keypress queued for output. Inherits the `button` field from the `DTMFFrame` mixin, which holds the keypad entry that was pressed.
+
+Inherits from `DTMFFrame`.
+
+
+ The DTMF keypad entry to send.
+
+
+
+ For transports that support multiple dial-out destinations, set the
+ `transport_destination` field (inherited from `Frame`) to specify which
+ destination receives the DTMF tone.
+
+
+## LLM Context Management
+
+Frames that modify or trigger processing of the LLM conversation context.
+
+### LLMMessagesAppendFrame
+
+Appends messages to the current conversation context without replacing existing ones.
+
+
+ List of message dictionaries to append.
+
+
+
+ Whether the LLM should process the updated context immediately. When `None`,
+ the default behavior of the context aggregator applies.
+
+
+### LLMMessagesUpdateFrame
+
+Replaces the current context messages entirely with a new set.
+
+
+ List of message dictionaries to replace the current context.
+
+
+
+ Whether the LLM should process the updated context immediately. When `None`,
+ the default behavior of the context aggregator applies.
+
+
+### LLMRunFrame
+
+Triggers LLM processing with the current context. Push this frame when you want the LLM to generate a response using whatever context has already been assembled.
+
+### LLMContextAssistantTimestampFrame
+
+Records when an assistant message was created. Used internally to track timing of assistant responses in the conversation context.
+
+
+ Timestamp when the assistant message was created.
+
+
+## LLM Thinking
+
+### LLMThoughtTextFrame
+
+A chunk of thought or reasoning text from the LLM. This is a `DataFrame`, not a `TextFrame` subclass — TTS services and text aggregators will not process it.
+
+
+ The text (or text chunk) of the thought.
+
+
+## LLM Tool Configuration
+
+Frames for configuring LLM function calling behavior and output settings at runtime.
+
+### LLMSetToolsFrame
+
+Sets the available tools for LLM function calling. The format of tool definitions typically follows JSON Schema conventions, though the exact structure depends on the LLM provider.
+
+
+ List of tool/function definitions for the LLM.
+
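
For illustration, a single tool definition in the common JSON-Schema style might look like the following; the exact field names are provider-dependent and this example is hypothetical:

```python
# A hypothetical weather tool; structure and field names vary by provider.
weather_tool = {
    "name": "get_weather",
    "description": "Get the current weather for a location.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "City and state, e.g. 'Austin, TX'",
            },
        },
        "required": ["location"],
    },
}
```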
+
+### LLMSetToolChoiceFrame
+
+Configures how the LLM selects tools during function calling.
+
+
+ Tool choice setting: `"none"` disables tool use, `"auto"` lets the LLM decide,
+ `"required"` forces a tool call, or a dict specifying a particular tool.
+
+
+### LLMEnablePromptCachingFrame
+
+Toggles prompt caching for LLMs that support it.
+
+
+ Whether to enable prompt caching.
+
+
+### LLMConfigureOutputFrame
+
+Configures how the LLM produces output. Useful for scenarios where you want the LLM to generate tokens that update context but should not be spoken aloud.
+
+
+ When `True`, LLM tokens are added to context but not passed to TTS.
+
+
+## Function Call Results
+
+### FunctionCallResultFrame
+
+Contains the result of a completed function call execution.
+
+Inherits from `UninterruptibleFrame` to ensure the result always reaches the context aggregator.
+
+
+ Name of the function that was executed.
+
+
+
+ Unique identifier for the function call.
+
+
+
+ Arguments that were passed to the function.
+
+
+
+ The result returned by the function.
+
+
+
+ Whether to run the LLM after this result. Overrides the default behavior.
+
+
+
+ Additional properties for result handling.
+
diff --git a/server/frames/llm-frames.mdx b/server/frames/llm-frames.mdx
new file mode 100644
index 00000000..e0664779
--- /dev/null
+++ b/server/frames/llm-frames.mdx
@@ -0,0 +1,64 @@
+---
+title: "LLM Frames"
+description: "LLM context frame and function calling helper dataclasses"
+---
+
+This page documents LLM-specific types that don't belong on a base-type page: `LLMContextFrame` (which inherits directly from `Frame`) and the helper dataclasses used by function calling frames. All other LLM-related frames are documented on their base-type pages. See [Related Frames](#related-frames) below for links.
+
+## LLMContextFrame
+
+Contains a complete LLM context. Acts as a signal to LLM services to ingest the provided context and generate a response.
+
+Inherits directly from `Frame` (not `DataFrame`, `ControlFrame`, or `SystemFrame`).
+
+
+ The LLM context containing messages, tools, and configuration.
+
+
+## Function Calling Helper Dataclasses
+
+These are plain dataclasses used as fields within function calling frames, not frames themselves.
+
+### FunctionCallFromLLM
+
+Represents a function call returned by the LLM, ready for execution.
+
+
+ The name of the function to call.
+
+
+
+ A unique identifier for the function call.
+
+
+
+ The arguments to pass to the function.
+
+
+
+ The LLM context at the time the function call was made.
+
+
+### FunctionCallResultProperties
+
+Configures how a function call result is handled after execution.
+
+
+ Whether to run the LLM after receiving this result.
+
+
+
+ Async callback to execute when the context is updated with the result.
+
+
+## Related Frames
+
+LLM-related frames organized by base type:
+
+- **Data Frames**: [Context Management](/server/frames/data-frames#llm-context-management), [Thinking](/server/frames/data-frames#llm-thinking), [Tool Configuration](/server/frames/data-frames#llm-tool-configuration), [Function Call Results](/server/frames/data-frames#function-call-results)
+- **Control Frames**: [Response Boundaries](/server/frames/control-frames#llm-response-boundaries), [Context Summarization](/server/frames/control-frames#llm-context-summarization), [Thought Frames](/server/frames/control-frames#llm-thought-frames), [Function Calling](/server/frames/control-frames#function-calling), [Service Settings](/server/frames/control-frames#service-settings)
+- **System Frames**: [Function Calling](/server/frames/system-frames#function-calling)
diff --git a/server/frames/overview.mdx b/server/frames/overview.mdx
new file mode 100644
index 00000000..6f86ef6e
--- /dev/null
+++ b/server/frames/overview.mdx
@@ -0,0 +1,300 @@
+---
+title: "Frames"
+description: "Frame categories, processing behavior, and common patterns for Pipecat pipelines"
+---
+
+## Overview
+
+Frames are the fundamental units of data in Pipecat. Every piece of information that moves through a pipeline — audio, text, images, control signals — is wrapped in a frame. Frame processors receive frames, act on them, and push new or modified frames along to the next processor.
+
+All frames inherit from the base `Frame` class and are Python [dataclasses](https://docs.python.org/3/library/dataclasses.html).
+
+## Frame Categories
+
+Pipecat has three base frame types, each with different processing behavior:
+
+| Base Type | Processing | Interruption Behavior |
+| -------------- | ------------------------------------------------------------- | -------------------------------------- |
+| `DataFrame` | Queued, processed in order with non-SystemFrames | Cancelled on user interruption |
+| `ControlFrame` | Queued, processed in order with non-SystemFrames | Cancelled on user interruption |
+| `SystemFrame` | Higher priority, queued, processed in order with SystemFrames | **Not** cancelled on user interruption |
+
+### DataFrame
+
+Data frames carry the main content flowing through a pipeline: audio chunks, text, images, and LLM messages. They are queued and processed in order with other DataFrames and ControlFrames. If a user interrupts (starts speaking while the bot is responding), any pending data frames are discarded so the new input can be handled immediately.
+
+Examples: `TextFrame`, `OutputAudioRawFrame`, `LLMMessagesAppendFrame`, `TTSSpeakFrame`
+
+### ControlFrame
+
+ControlFrames signal processing boundaries and configuration changes: response start/end markers, settings updates, and state transitions. They are queued and processed in order alongside DataFrames, and like DataFrames, any pending ControlFrames are discarded when a user interrupts unless combined with `UninterruptibleFrame`.
+
+Examples: `EndFrame`, `LLMFullResponseStartFrame`, `TTSStartedFrame`, `ServiceUpdateSettingsFrame`
+
+### SystemFrame
+
+SystemFrames are high-priority signals that must always be delivered: interruptions, user input, error notifications, and pipeline lifecycle events. They are queued and processed in order with other SystemFrames. Unlike DataFrames and ControlFrames, they are never discarded when a user interrupts.
+
+Examples: `StartFrame`, `CancelFrame`, `InterruptionFrame`, `UserStartedSpeakingFrame`, `InputAudioRawFrame`
+
+## Frame Properties
+
+Every frame has these properties set automatically:
+
+
+ Unique identifier for the frame instance.
+
+
+
+ Human-readable name combining class name and instance count (e.g.,
+ `TextFrame#3`). Useful for debugging.
+
+
+
+ Presentation timestamp in nanoseconds. Used for audio/video synchronization.
+
+
+
+ Dictionary for arbitrary frame metadata.
+
+
+
+ Name of the transport source that created this frame.
+
+
+
+ Name of the transport destination for this frame. Used when a transport
+ supports multiple output tracks.
+
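
As a rough sketch of how such properties might be auto-populated (illustrative only, not Pipecat's actual base class), a dataclass can assign `id` and `name` in `__post_init__` using a per-class counter:

```python
import itertools
from dataclasses import dataclass, field

_COUNTERS = {}


def _frame_name(cls):
    # Per-class counter that yields names like "TextFrame#3".
    counter = _COUNTERS.setdefault(cls.__name__, itertools.count(1))
    return f"{cls.__name__}#{next(counter)}"


@dataclass
class Frame:
    """Sketch of auto-populated frame properties."""

    metadata: dict = field(default_factory=dict)
    id: int = field(init=False)
    name: str = field(init=False)

    def __post_init__(self):
        self.id = id(self)  # stand-in for a unique identifier
        self.name = _frame_name(type(self))


@dataclass
class TextFrame(Frame):
    text: str = ""
```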
+
+## Frame Direction
+
+Frames flow through the pipeline in one of two directions:
+
+```python
+from pipecat.processors.frame_processor import FrameDirection
+
+class FrameDirection(Enum):
+ DOWNSTREAM = 1 # Input → Output (default)
+ UPSTREAM = 2 # Output → Input
+```
+
+**Downstream** is the default. In a typical voice AI pipeline, audio enters from the transport input, gets transcribed, runs through the LLM, converts to speech, and reaches the transport output.
+
+**Upstream** lets processors send information back toward the start of the pipeline. The most common example: the assistant context aggregator at the end of the pipeline pushes context frames upstream so they flow back to the LLM.
+
+### Pushing Frames
+
+Within a frame processor, call `push_frame()` to send a frame to the next processor:
+
+```python
+# Push downstream (default)
+await self.push_frame(frame, FrameDirection.DOWNSTREAM)
+
+# Push upstream
+await self.push_frame(frame, FrameDirection.UPSTREAM)
+```
+
+### Broadcasting Frames
+
+To send a frame in **both** directions simultaneously, use `broadcast_frame()`:
+
+```python
+# Create and push instances upstream and downstream
+await self.broadcast_frame(UserStartedSpeakingFrame)
+```
+
+Each direction receives its own frame instance, linked by `broadcast_sibling_id`.
+
+To broadcast an existing frame instance (when you are not the original creator of the frame), use `broadcast_frame_instance()`:
+
+```python
+# Broadcast an existing frame instance in both directions
+await self.broadcast_frame_instance(frame)
+```
+
+This creates two new instances by shallow-copying all fields from the original frame except `id` and `name`, which get fresh values.
+
+Prefer `broadcast_frame()` when possible, as it is more efficient.
+
+## Mixins
+
+Mixins add cross-cutting behavior or shared data fields to frames without changing their base type.
+
+### UninterruptibleFrame
+
+Occasionally a `DataFrame` or `ControlFrame` is too important to discard during an interruption. Adding the `UninterruptibleFrame` mixin protects it: the frame stays in internal queues and any task processing it will not be cancelled.
+
+```python
+@dataclass
+class FunctionCallResultFrame(DataFrame, UninterruptibleFrame):
+ """Must be delivered even if the user interrupts."""
+ ...
+```
+
+Examples: `EndFrame`, `StopFrame`, `FunctionCallResultFrame`, `FunctionCallInProgressFrame`
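
The interruption contract amounts to a queue-clearing rule. A minimal sketch with stub classes:

```python
class UninterruptibleFrame: ...


class DataFrame: ...


class FunctionCallResultFrame(DataFrame, UninterruptibleFrame): ...


def clear_on_interruption(pending):
    # Drop pending frames on interruption, keeping any that carry the
    # UninterruptibleFrame mixin.
    return [f for f in pending if isinstance(f, UninterruptibleFrame)]
```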
+
+### AudioRawFrame
+
+Carries raw audio fields shared by both input and output audio frames.
+
+
+ Raw audio bytes in PCM format.
+
+
+
+ Audio sample rate in Hz (e.g., 16000).
+
+
+
+ Number of audio channels (e.g., 1 for mono).
+
+
+
+ Number of audio frames. Calculated automatically from the audio data.
+
+
+### ImageRawFrame
+
+Carries raw image fields shared by both input and output image frames.
+
+
+ Raw image bytes.
+
+
+
+ Image dimensions as (width, height).
+
+
+
+ Image format (e.g., `"RGB"`, `"RGBA"`).
+
+
+## Common Patterns
+
+Pipecat prefers pushing frames over calling methods directly between processors. Routing data through the pipeline as frames ensures correct processing order, which is critical for real-time use cases.
+
+Most frames are produced and consumed by Pipecat's built-in services. The patterns below cover the frames you're most likely to push yourself in application code.
+
+### Starting a Conversation
+
+Add an initial message to the context, then push `LLMRunFrame` to kick off processing:
+
+```python
+@transport.event_handler("on_client_connected")
+async def on_client_connected(transport, client):
+ context.add_message({"role": "user", "content": "Please introduce yourself."})
+ await task.queue_frames([LLMRunFrame()])
+```
+
+### Injecting a Prompt
+
+`LLMMessagesAppendFrame` adds messages to the context without replacing what's already there. Set `run_llm=True` to trigger a response immediately:
+
+```python
+message = {
+ "role": "user",
+ "content": "The user has been quiet. Ask if they're still there.",
+}
+await aggregator.push_frame(LLMMessagesAppendFrame([message], run_llm=True))
+```
+
+### Speaking Without the LLM
+
+`TTSSpeakFrame` sends text directly to the TTS service as a standalone utterance, bypassing the LLM entirely:
+
+```python
+@llm.event_handler("on_function_calls_started")
+async def on_function_calls_started(service, function_calls):
+ await tts.queue_frame(TTSSpeakFrame("Let me check on that."))
+```
+
+### Ending a Conversation
+
+Push `EndTaskFrame` upstream to gracefully shut down the pipeline. Pair it with a `TTSSpeakFrame` to say goodbye first:
+
+```python
+await aggregator.push_frame(
+ TTSSpeakFrame("It seems like you're busy. Have a nice day!")
+)
+await aggregator.push_frame(EndTaskFrame(), FrameDirection.UPSTREAM)
+```
+
+### Changing Service Settings at Runtime
+
+Push settings frames to adjust LLM, TTS, or STT configuration mid-conversation:
+
+```python
+await task.queue_frame(
+ LLMUpdateSettingsFrame(delta=OpenAILLMService.Settings(temperature=0.1))
+)
+```
+
+### Updating Tools at Runtime
+
+Add or replace available function-calling tools while the conversation is active:
+
+```python
+new_tools = ToolsSchema(
+ standard_tools=[weather_function, restaurant_function]
+)
+await task.queue_frames([LLMSetToolsFrame(tools=new_tools)])
+```
+
+### Playing Sound Effects
+
+Load audio files and push `OutputAudioRawFrame` directly from a custom processor:
+
+```python
+with wave.open("ding.wav") as f:
+    ding = OutputAudioRawFrame(
+        f.readframes(f.getnframes()), f.getframerate(), f.getnchannels()
+    )
+
+class SoundEffect(FrameProcessor):
+ async def process_frame(self, frame, direction):
+ await super().process_frame(frame, direction)
+ if isinstance(frame, LLMFullResponseEndFrame):
+ await self.push_frame(ding)
+ await self.push_frame(frame, direction)
+```
+
+### Reacting to LLM Response Boundaries
+
+`LLMFullResponseStartFrame` and `LLMFullResponseEndFrame` bracket every LLM response. Custom processors can watch for these to trigger side effects:
+
+```python
+class ResponseLogger(FrameProcessor):
+ async def process_frame(self, frame, direction):
+ await super().process_frame(frame, direction)
+ if isinstance(frame, LLMFullResponseStartFrame):
+ logger.info("LLM response started")
+ elif isinstance(frame, LLMFullResponseEndFrame):
+ logger.info("LLM response finished")
+ await self.push_frame(frame, direction)
+```
+
+## Frame Type Reference
+
+The individual reference pages below document every frame class, organized by function:
+
+
+
+ Audio, image, text, transcription, and transport message frames that carry
+ content through the pipeline.
+
+
+ Pipeline lifecycle, LLM response boundaries, TTS state, service settings,
+ and filter/mixer configuration.
+
+
+ Interruptions, user/bot speaking state, VAD events, errors, metrics, and raw
+ input frames.
+
+
+ LLM context frame, function calling helper dataclasses, and links to
+ LLM-related frames on other pages.
+
+
diff --git a/server/frames/system-frames.mdx b/server/frames/system-frames.mdx
new file mode 100644
index 00000000..deb3d24e
--- /dev/null
+++ b/server/frames/system-frames.mdx
@@ -0,0 +1,478 @@
+---
+title: "System Frames"
+description: "Reference for SystemFrame types: pipeline lifecycle, interruptions, speaking state, input, and diagnostics"
+---
+
+SystemFrames have higher priority than DataFrames and ControlFrames and are never cancelled during user interruptions. They are queued and processed in order with other SystemFrames. They carry signals that must always be delivered: pipeline startup and teardown, error notifications, user input, and speaking state changes. See the [frames overview](/server/frames/overview) for base class details, mixin fields, and frame properties common to all frames.
+
+## Pipeline Lifecycle
+
+### StartFrame
+
+The first frame pushed into a pipeline, initializing all processors. Every processor receives this before any DataFrames or ControlFrames arrive.
+
+
+ Input audio sample rate in Hz.
+
+
+
+ Output audio sample rate in Hz.
+
+
+
+ Whether user interruptions are allowed. Deprecated since 0.0.99: use
+ interruption strategies instead.
+
+
+
+ Enable performance metrics collection from processors.
+
+
+
+ Enable tracing for pipeline execution.
+
+
+
+ Enable usage metrics (token counts, API calls) from services.
+
+
+
+ List of interruption strategies for the pipeline. Deprecated since 0.0.99.
+
+
+
+ When `True`, only report time-to-first-byte for the initial response rather
+ than every response.
+
+
+
+ Optional tracing context for distributed tracing integration.
+
+
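+You rarely construct a `StartFrame` yourself; `PipelineTask` builds one from its `PipelineParams` and pushes it automatically. A minimal sketch (the field values are illustrative):
+
+```python
+from pipecat.pipeline.task import PipelineParams, PipelineTask
+
+params = PipelineParams(
+    audio_in_sample_rate=16000,
+    audio_out_sample_rate=24000,
+    enable_metrics=True,
+    enable_usage_metrics=True,
+)
+# task = PipelineTask(pipeline, params=params)  # pipeline defined elsewhere
+```
+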
+### CancelFrame
+
+Stops the pipeline immediately, skipping any queued non-SystemFrames. Use this when you need to abort without waiting for pending work to drain, for example when the user has left the session.
+
+
+ Optional reason for the cancellation.
+
+
+## Errors
+
+### ErrorFrame
+
+Carries an error notification, typically pushed upstream so earlier processors can react.
+
+
+ Human-readable error message.
+
+
+
+ Whether this error is fatal and requires the bot to shut down.
+
+
+
+ The processor that raised the error.
+
+
+
+ The underlying exception, if one was caught.
+
+
+### FatalErrorFrame
+
+An unrecoverable error requiring the bot to shut down. The `fatal` field is always `True`.
+
+Inherits from `ErrorFrame`.
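+
+A sketch of pushing a non-fatal error from a custom processor (`WeatherFetcher` and its `_fetch` helper are hypothetical; `push_error` delivers the `ErrorFrame` upstream):
+
+```python
+from pipecat.frames.frames import ErrorFrame
+from pipecat.processors.frame_processor import FrameProcessor
+
+class WeatherFetcher(FrameProcessor):
+    async def process_frame(self, frame, direction):
+        await super().process_frame(frame, direction)
+        try:
+            await self._fetch(frame)  # hypothetical network call
+        except ConnectionError as e:
+            # Non-fatal: upstream processors decide how to react
+            await self.push_error(ErrorFrame(f"weather API unreachable: {e}"))
+        await self.push_frame(frame, direction)
+```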
+
+## Processor Pause/Resume (Urgent)
+
+These are the `SystemFrame` variants of `FrameProcessorPauseFrame` and `FrameProcessorResumeFrame`. As SystemFrames, they flow through the high-priority input queue rather than the process queue, so they are not blocked by paused state or buffered frames. This makes `FrameProcessorResumeUrgentFrame` the correct way to resume a processor externally — the `ControlFrame` variant (`FrameProcessorResumeFrame`) would get stuck behind any DataFrames that queued up during the pause. See [Control Frames](/server/frames/control-frames#processor-pauseresume) for the full explanation.
+
+### FrameProcessorPauseUrgentFrame
+
+Pauses a processor immediately, without waiting for queued frames to drain first.
+
+
+ The processor to pause.
+
+
+### FrameProcessorResumeUrgentFrame
+
+Resumes a paused processor immediately, releasing buffered frames. Use this instead of `FrameProcessorResumeFrame` when the processor may have frames queued up.
+
+
+ The processor to resume.
+
+
+## Interruptions
+
+### InterruptionFrame
+
+Interrupts the pipeline, discarding pending DataFrames and ControlFrames. Typically triggered when the user starts speaking during a bot response.
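+
+Custom processors that buffer work often want to reset when the user barges in. A sketch (the buffering processor itself is hypothetical):
+
+```python
+from pipecat.frames.frames import InterruptionFrame
+from pipecat.processors.frame_processor import FrameProcessor
+
+class SentenceBuffer(FrameProcessor):
+    def __init__(self):
+        super().__init__()
+        self._buffer = ""
+
+    async def process_frame(self, frame, direction):
+        await super().process_frame(frame, direction)
+        if isinstance(frame, InterruptionFrame):
+            # The user barged in: drop any partial text we were holding
+            self._buffer = ""
+        await self.push_frame(frame, direction)
+```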
+
+## User Speaking State
+
+### UserStartedSpeakingFrame
+
+Indicates that a user turn has begun. By this point, transcriptions are usually already flowing through the pipeline.
+
+
+ Whether this event was emulated rather than detected by VAD. Deprecated since
+ 0.0.99.
+
+
+### UserStoppedSpeakingFrame
+
+Marks the end of a user turn. The bot's response is triggered separately by the turn detection system.
+
+
+ Whether this event was emulated rather than detected by VAD. Deprecated since
+ 0.0.99.
+
+
+### UserSpeakingFrame
+
+Emitted by the VAD processor while the user is actively speaking. Useful for UI feedback or suppressing idle timeouts.
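+
+A sketch of a processor that logs user turn boundaries, e.g. to drive a talking indicator in a UI (`TalkingIndicator` is a hypothetical name):
+
+```python
+from loguru import logger
+
+from pipecat.frames.frames import UserStartedSpeakingFrame, UserStoppedSpeakingFrame
+from pipecat.processors.frame_processor import FrameProcessor
+
+class TalkingIndicator(FrameProcessor):
+    async def process_frame(self, frame, direction):
+        await super().process_frame(frame, direction)
+        if isinstance(frame, UserStartedSpeakingFrame):
+            logger.info("User started speaking")
+        elif isinstance(frame, UserStoppedSpeakingFrame):
+            logger.info("User stopped speaking")
+        await self.push_frame(frame, direction)
+```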
+
+### UserMuteStartedFrame
+
+Broadcast when one or more [user mute strategies](/server/utilities/turn-management/user-mute-strategies) activate. User mute temporarily suppresses user input while the bot is speaking to prevent interruptions. While muted, the `LLMUserAggregator` drops incoming user frames (`InputAudioRawFrame`, `TranscriptionFrame`, `InterimTranscriptionFrame`, `UserStartedSpeakingFrame`, `UserStoppedSpeakingFrame`, VAD signals, and `InterruptionFrame`). Lifecycle frames (`StartFrame`, `EndFrame`, `CancelFrame`) are never muted.
+
+### UserMuteStoppedFrame
+
+Broadcast when all active [user mute strategies](/server/utilities/turn-management/user-mute-strategies) deactivate, allowing user input to be processed again.
+
+## VAD Events
+
+These frames are emitted directly by the Voice Activity Detection (VAD) processor and carry timing metadata. Higher-level speaking-state frames (`UserStartedSpeakingFrame`, `UserStoppedSpeakingFrame`) are derived from these.
+
+### VADUserStartedSpeakingFrame
+
+VAD confirmed that speech has started.
+
+
+ Timestamp in seconds when speech onset was detected.
+
+
+
+ Wall-clock time when the frame was created.
+
+
+### VADUserStoppedSpeakingFrame
+
+VAD confirmed that speech has ended.
+
+
+ Timestamp in seconds when speech ended.
+
+
+
+ Wall-clock time when the frame was created.
+
+
+### SpeechControlParamsFrame
+
+Notifies processors that VAD or turn detection parameters have changed at runtime.
+
+
+ Updated VAD parameters.
+
+
+
+ Updated turn detection parameters.
+
+
+## Bot Speaking State
+
+### BotStartedSpeakingFrame
+
+Emitted by the output transport when the bot begins speaking. Broadcast in both directions so processors on either side of the transport can react.
+
+### BotStoppedSpeakingFrame
+
+Emitted by the output transport when the bot finishes speaking. Also broadcast in both directions.
+
+### BotSpeakingFrame
+
+Emitted continuously while the bot is speaking. Processors can use this to suppress idle timeouts or drive visual indicators.
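+
+One common use is gating side effects while the bot has the floor. A sketch (`BotSpeechGate` is hypothetical, and the flag is assumed to be checked elsewhere in the same processor):
+
+```python
+from pipecat.frames.frames import BotStartedSpeakingFrame, BotStoppedSpeakingFrame
+from pipecat.processors.frame_processor import FrameProcessor
+
+class BotSpeechGate(FrameProcessor):
+    def __init__(self):
+        super().__init__()
+        self._bot_speaking = False
+
+    async def process_frame(self, frame, direction):
+        await super().process_frame(frame, direction)
+        if isinstance(frame, BotStartedSpeakingFrame):
+            self._bot_speaking = True
+        elif isinstance(frame, BotStoppedSpeakingFrame):
+            self._bot_speaking = False
+        await self.push_frame(frame, direction)
+```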
+
+## Connection Status
+
+### BotConnectedFrame
+
+The bot has joined the transport room. Only relevant for SFU-based transports: Daily, LiveKit, HeyGen, and Tavus.
+
+### ClientConnectedFrame
+
+A client or participant has connected to the transport.
+
+## Input Frames
+
+Input frames carry raw data from transport sources into the pipeline. As `SystemFrame`s, they are never discarded during interruptions. Incoming user data must always be processed.
+
+### InputAudioRawFrame
+
+Raw audio received from the transport. Inherits the `audio`, `sample_rate`, `num_channels`, and `num_frames` fields from the [`AudioRawFrame`](/server/frames/overview#audiorawframe) mixin.
+
+Inherits from `AudioRawFrame`.
+
+### UserAudioRawFrame
+
+Audio from a specific user in a multi-participant session.
+
+Inherits from `InputAudioRawFrame`.
+
+
+ Identifier for the user who produced this audio.
+
+
+### InputImageRawFrame
+
+Raw image received from the transport. Inherits `image`, `size`, and `format` from the [`ImageRawFrame`](/server/frames/overview#imagerawframe) mixin.
+
+Inherits from `ImageRawFrame`.
+
+### UserImageRawFrame
+
+An image from a specific user, optionally tied to a pending image request.
+
+Inherits from `InputImageRawFrame`.
+
+
+ Identifier for the user who produced this image.
+
+
+
+ Optional text associated with the image.
+
+
+
+ Whether to append this image to the LLM context.
+
+
+
+ The original request frame that triggered this image capture.
+
+
+### InputTextRawFrame
+
+Text received from the transport, such as a user typing in a chat interface. Inherits the `text` field from `TextFrame`.
+
+Inherits from `TextFrame`.
+
+## DTMF Input
+
+### InputDTMFFrame
+
+A DTMF keypress received from the transport. Inherits the `button` field from the `DTMFFrame` mixin.
+
+Inherits from `DTMFFrame`.
+
+### OutputDTMFUrgentFrame
+
+A DTMF keypress for immediate output, bypassing the normal frame queue.
+
+Inherits from `DTMFFrame`.
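+
+A sketch of sending a tone, e.g. to navigate an IVR menu (the helper function is illustrative, assuming `KeypadEntry` is the enum of valid buttons):
+
+```python
+from pipecat.frames.frames import KeypadEntry, OutputDTMFUrgentFrame
+
+async def press_one(task):
+    # Bypasses the output queue so the tone plays immediately
+    await task.queue_frame(OutputDTMFUrgentFrame(button=KeypadEntry.ONE))
+```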
+
+## Transport Messages
+
+### InputTransportMessageFrame
+
+A message received from an external transport. The message format is transport-specific.
+
+
+ The transport message payload.
+
+
+### OutputTransportMessageUrgentFrame
+
+An outbound transport message that bypasses the normal queue for immediate delivery.
+
+
+ The transport message payload.
+
+
+## Function Calling
+
+### FunctionCallsStartedFrame
+
+Signals that one or more function calls are about to begin executing.
+
+
+ Sequence of function calls that will be executed.
+
+
+### FunctionCallCancelFrame
+
+Signals that a function call was cancelled, typically due to user interruption when the function's `cancel_on_interruption` flag is set.
+
+
+ Name of the function that was cancelled.
+
+
+
+ Unique identifier for the cancelled function call.
+
+
+## User Interaction
+
+### UserImageRequestFrame
+
+Requests an image from a specific user, typically to capture a camera frame for vision processing.
+
+
+ Identifier for the user to capture from.
+
+
+
+ Optional text prompt associated with the image request.
+
+
+
+ Whether to append the resulting image to the LLM context.
+
+
+
+ Specific video source to capture from.
+
+
+
+ Function name if this request originated from a tool call.
+
+
+
+ Tool call identifier if this request originated from a tool call.
+
+
+
+ Callback to invoke with the captured image result.
+
+
+### STTMuteFrame
+
+Mutes or unmutes the STT service. While muted, incoming audio is not sent to the STT provider.
+
+
+ `True` to mute, `False` to unmute.
+
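+A sketch of toggling STT from application code (the helper is illustrative; the `STTMuteFilter` utility manages this frame for the common mute strategies):
+
+```python
+from pipecat.frames.frames import STTMuteFrame
+
+async def set_stt_muted(task, mute: bool):
+    # While muted, incoming audio is not forwarded to the STT provider
+    await task.queue_frame(STTMuteFrame(mute=mute))
+```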
+
+### UserIdleTimeoutUpdateFrame
+
+Updates the user idle timeout at runtime. Set to `0` to disable idle detection entirely.
+
+
+ New idle timeout in seconds. `0` disables detection.
+
+
+## Diagnostics
+
+### MetricsFrame
+
+Performance metrics collected from processors. Emitted when metrics reporting is enabled via `StartFrame`.
+
+
+ List of metrics data entries.
+
+
+## Service Metadata
+
+### ServiceMetadataFrame
+
+Base metadata frame broadcast by services at startup, providing information about service capabilities and configuration.
+
+
+ Name of the service that emitted this metadata.
+
+
+### STTMetadataFrame
+
+Metadata from an STT service, including latency characteristics used for turn detection tuning.
+
+Inherits from `ServiceMetadataFrame`.
+
+
+ P99 latency in seconds for time-to-final-segment. Used by turn detectors to
+ calibrate wait times.
+
+
+## RTVI
+
+Frames for the [Real-Time Voice Inference (RTVI)](/server/frameworks/rtvi) protocol, which bridges clients and the pipeline. These frames handle custom messaging between the client and server.
+
+### RTVIServerMessageFrame
+
+Sends a server message to the connected client.
+
+
+ The message data to send to the client.
+
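+A sketch of pushing a custom event to the client (the helper and the message shape are illustrative; we assume `RTVIServerMessageFrame` is importable from `pipecat.frames.frames`):
+
+```python
+from pipecat.frames.frames import RTVIServerMessageFrame
+
+async def notify_client(task, stage: str):
+    # The client receives this as a server message
+    await task.queue_frame(
+        RTVIServerMessageFrame(data={"type": "status", "stage": stage})
+    )
+```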
+
+### RTVIClientMessageFrame
+
+A message received from the client, expecting a server response via `RTVIServerResponseFrame`.
+
+
+ Unique identifier for the client message.
+
+
+
+ The message type.
+
+
+
+ Optional message data from the client.
+
+
+### RTVIServerResponseFrame
+
+Responds to an `RTVIClientMessageFrame`. Include the original client message frame to ensure the response is properly correlated. Set the `error` field to respond with an error instead of a normal response.
+
+
+ The original client message this response is for.
+
+
+
+ Response data to send to the client.
+
+
+
+ Error message. When set, the client receives an `error-response` instead of a
+ `server-response`.
+
+
+## Task Frames
+
+Task frames provide a system-priority mechanism for requesting pipeline actions from outside the normal frame flow. They are converted into their corresponding standard frames when processed.
+
+### TaskSystemFrame
+
+Base class for system-priority task frames.
+
+### CancelTaskFrame
+
+Requests immediate pipeline cancellation. Converted to a `CancelFrame` when processed by the pipeline.
+
+Inherits from `TaskSystemFrame`.
+
+
+ Optional reason for the cancellation request.
+
+
+### InterruptionTaskFrame
+
+Requests a pipeline interruption. Converted to an `InterruptionFrame` when processed.
+
+Inherits from `TaskSystemFrame`.
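+
+A sketch of requesting cancellation from inside a processor, mirroring the `EndTaskFrame` pattern shown earlier: task frames are pushed upstream so the `PipelineTask` receives and converts them (the helper function is illustrative):
+
+```python
+from pipecat.frames.frames import CancelTaskFrame
+from pipecat.processors.frame_processor import FrameDirection
+
+async def cancel_from(processor, reason: str):
+    # The task converts this into a CancelFrame when it arrives
+    await processor.push_frame(CancelTaskFrame(reason=reason), FrameDirection.UPSTREAM)
+```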