Configuration for collection source (bucket(s) or collection).

Collections can process data from two types of sources:

1. **Bucket source**: process raw objects from one or more buckets (first-stage processing)
   - Use this to create your initial collections from uploaded data
   - Can specify multiple buckets to consolidate data from different sources
   - All buckets must have compatible schemas (validated at creation)
   - Example: videos from multiple regions → frame extraction collection
2. **Collection source**: process documents from another collection (decomposition trees)
   - Use this to create multi-stage processing pipelines
   - Example: frames collection → scene detection collection

Multi-bucket requirements:
- All buckets must have compatible schemas (same fields, types, and required status)
- Schema compatibility is validated when the collection is created
- Documents track which specific bucket they came from via `root_bucket_id`
- Useful for consolidating data from multiple regions, teams, or environments

The source determines:
- What data the feature extractor receives as input
- The `input_schema` available for `input_mappings` and `field_passthrough`
- The lineage tracking in output documents

Examples:
- Single bucket: `{"type": "bucket", "bucket_ids": ["bkt_products"]}`
- Multi-bucket: `{"type": "bucket", "bucket_ids": ["bkt_us", "bkt_eu", "bkt_asia"]}`
- Collection: `{"type": "collection", "collection_id": "col_frames"}`
| Name | Type | Description | Notes |
|---|---|---|---|
| type | SourceType | REQUIRED. Type of source for this collection. 'bucket': Process objects from one or more buckets (first-stage processing). 'collection': Process documents from another collection (downstream processing). Use 'bucket' for initial data ingestion, 'collection' for decomposition trees. | |
| bucket_ids | List[str] | List of bucket IDs when type='bucket'. REQUIRED when type='bucket'. NOT ALLOWED when type='collection'. Can specify one or more buckets to process. Single bucket: Use array with one element ['bkt_id']. Multiple buckets: All buckets MUST have compatible schemas. Schema compatibility validated at collection creation. Compatible schemas have: 1) Same field names, 2) Same field types, 3) Same required status. Documents will include root_bucket_id to track which bucket they came from. Use cases: multi-region data, multi-team consolidation, environment aggregation. | [optional] |
| source_namespace_id | str | Namespace ID where the source buckets reside. Use this to process buckets from a different namespace within the same organization. When omitted, buckets are looked up in the current (collection's) namespace. Only valid when type='bucket'. | [optional] |
| collection_id | str | Collection ID when type='collection' (single collection). Use this OR collection_ids (not both). REQUIRED when type='collection' and processing single collection. NOT ALLOWED when type='bucket'. The collection will process documents from this upstream collection. The upstream collection's output_schema becomes this collection's input_schema. This enables decomposition trees (multi-stage pipelines). Example: Process frames collection → create scenes collection. | [optional] |
| collection_ids | List[str] | List of collection IDs when type='collection' (multiple collections). Use this OR collection_id (not both). REQUIRED when type='collection' and processing multiple collections. NOT ALLOWED when type='bucket'. Used for operations that consolidate multiple upstream collections. Example: Clustering across multiple collections → cluster output collection. All collections must have compatible schemas for consolidation operations. | [optional] |
| inherited_bucket_ids | List[str] | List of original bucket IDs that source collections originated from. OPTIONAL. Only used when type='collection'. Tracks the complete lineage chain: buckets → collections → derived collections. Extracted from upstream collection metadata at collection creation time. Enables tracing derived collections (like cluster outputs) back to original data sources. Example: Cluster output collection inherits bucket IDs from its source collections. Format: List of bucket IDs with 'bkt_' prefix. | [optional] |
| source_filters | SourceFiltersOutput | Optional filters to apply to source data. When specified, only objects/documents matching these filters will be processed by this collection. Filters are evaluated at batch creation time. Uses same LogicalOperator model as list APIs for consistency. | [optional] |
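The compatibility rule described for `bucket_ids` (same field names, types, and required status) can be sketched in plain Python. This is an illustration only, assuming each schema is represented as a dict of field name → `(type, required)` pairs; the authoritative validation happens server-side at collection creation:

```python
def schemas_compatible(schemas: list[dict]) -> bool:
    """Return True if every schema has identical fields, types, and required flags.

    Each schema is assumed (for this sketch) to be a dict mapping
    field name -> (type_name, required).
    """
    if not schemas:
        return True
    first = schemas[0]
    return all(schema == first for schema in schemas[1:])

# Hypothetical schemas for three regional buckets:
us = {"video": ("file", True), "region": ("string", True)}
eu = {"video": ("file", True), "region": ("string", True)}
asia = {"video": ("file", True), "region": ("string", False)}  # 'required' differs

print(schemas_compatible([us, eu]))        # True  - identical schemas
print(schemas_compatible([us, eu, asia]))  # False - 'region' required flag differs
```

A mismatch in any of the three dimensions (name, type, required status) makes the buckets incompatible, which is why the `asia` schema above fails the check.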
```python
from mixpeek.models.source_config_output import SourceConfigOutput

# TODO update the JSON string below
json = "{}"
# create an instance of SourceConfigOutput from a JSON string
source_config_output_instance = SourceConfigOutput.from_json(json)
# print the JSON string representation of the object
print(source_config_output_instance.to_json())
# convert the object into a dict
source_config_output_dict = source_config_output_instance.to_dict()
# create an instance of SourceConfigOutput from a dict
source_config_output_from_dict = SourceConfigOutput.from_dict(source_config_output_dict)
```
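The mutual-exclusion rules in the properties table (`bucket_ids` only with `type='bucket'`; exactly one of `collection_id` or `collection_ids` with `type='collection'`) can be checked client-side before sending a config. This is a minimal sketch over plain dicts, not part of the SDK — the server performs the authoritative validation:

```python
def check_source_config(cfg: dict) -> list[str]:
    """Return a list of rule violations for a source config dict (sketch only)."""
    errors = []
    source_type = cfg.get("type")
    if source_type == "bucket":
        if not cfg.get("bucket_ids"):
            errors.append("bucket_ids is required when type='bucket'")
        if "collection_id" in cfg or "collection_ids" in cfg:
            errors.append("collection fields are not allowed when type='bucket'")
    elif source_type == "collection":
        if "bucket_ids" in cfg:
            errors.append("bucket_ids is not allowed when type='collection'")
        # exactly one of collection_id / collection_ids must be set
        if bool(cfg.get("collection_id")) == bool(cfg.get("collection_ids")):
            errors.append("exactly one of collection_id or collection_ids is required")
    else:
        errors.append("type must be 'bucket' or 'collection'")
    return errors

print(check_source_config({"type": "bucket", "bucket_ids": ["bkt_us", "bkt_eu"]}))  # []
print(check_source_config({"type": "collection"}))  # missing collection_id/collection_ids
```

The same shapes shown in the examples above (single bucket, multi-bucket, single collection) all pass this check; a config that mixes bucket and collection fields does not.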