StageDefsQueryPreprocessing

Configuration for query preprocessing: decomposing a large file at query time. When a query input is a large file (video, PDF, long text), preprocessing decomposes it using the same extractor pipeline that indexed the data, generates N embeddings (one per chunk), runs N parallel searches, and fuses the results into a single ranked list. This is "ingestion applied to the query": the decomposition and embedding are the same, but the vectors are used for search instead of storage.
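The flow described above can be sketched in a few lines. This is only an illustration of the decompose, embed, search, fuse sequence, not the service's implementation: `preprocess_and_search` and the four callables passed into it are hypothetical names, and the thread pool is just one assumed way to run the N searches in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence

Vector = List[float]
Hit = Dict[str, object]  # e.g. {"id": "doc-1", "score": 0.9}


def preprocess_and_search(
    query_file: str,
    decompose: Callable[[str], Sequence[str]],      # hypothetical: extractor pipeline
    embed: Callable[[str], Vector],                 # hypothetical: embedding model
    search: Callable[[Vector], List[Hit]],          # hypothetical: one vector search
    fuse: Callable[[List[List[Hit]]], List[Hit]],   # hypothetical: fusion strategy
) -> List[Hit]:
    """Decompose the query, embed each chunk, search per chunk, fuse the results."""
    chunks = decompose(query_file)
    vectors = [embed(chunk) for chunk in chunks]  # N embeddings, one per chunk
    if not vectors:
        return []
    # Run the N searches concurrently; pool.map keeps results in chunk order.
    with ThreadPoolExecutor(max_workers=len(vectors)) as pool:
        per_chunk_hits = list(pool.map(search, vectors))
    return fuse(per_chunk_hits)  # single ranked list
```

Passing the four steps in as callables keeps the sketch independent of any particular extractor, embedding model, or vector store.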

Properties

| Name | Type | Description | Notes |
| --- | --- | --- | --- |
| feature_uri | str | Feature URI for the extractor pipeline to use for decomposition. If None, inherits from the parent search's feature_uri. | [optional] |
| params | Dict[str, object] | Extractor-specific parameter overrides. Same params as ingestion: split_method, time_split_interval, chunk_size, chunk_overlap, etc. | [optional] |
| max_chunks | int | Maximum number of chunks to search with. Caps parallel queries and embedding calls to control cost. Chunks are evenly sampled across the file if the extractor produces more than max_chunks. | [optional] [default to 20] |
| aggregation | str | Fusion strategy for combining results from N chunk queries. 'rrf': Reciprocal Rank Fusion (balanced, recommended). 'max': Keep highest score per document (best for 'find this exact moment'). 'avg': Average scores (best for 'find similar overall content'). | [optional] [default to 'rrf'] |
| dedup_field | str | Optional payload field to deduplicate results by. E.g., '_internal.document_id' to collapse chunks from the same parent document. | [optional] |
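To make `max_chunks` and `aggregation` more concrete, here is a minimal sketch of even sampling and the three fusion strategies. It is an assumption-laden illustration rather than the service's code: `sample_evenly` and `fuse` are hypothetical helpers, the RRF constant `k = 60` is the conventional default and not something this document specifies, and 'avg' here averages only over the chunk queries in which a document actually appears.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Ranked = List[Tuple[str, float]]  # (document id, score), best first


def sample_evenly(chunks: Sequence, max_chunks: int = 20) -> list:
    """Pick at most max_chunks items spread evenly across the sequence."""
    n = len(chunks)
    if max_chunks <= 0:
        return []
    if n <= max_chunks:
        return list(chunks)
    step = n / max_chunks  # > 1, so the selected indices are distinct
    return [chunks[int(i * step)] for i in range(max_chunks)]


def fuse(results: List[Ranked], aggregation: str = "rrf", k: int = 60) -> Ranked:
    """Combine per-chunk ranked lists into one list, best first."""
    if aggregation not in ("rrf", "max", "avg"):
        raise ValueError(f"unknown aggregation: {aggregation!r}")
    per_doc: Dict[str, List[float]] = defaultdict(list)
    for ranked in results:
        for rank, (doc_id, score) in enumerate(ranked, start=1):
            # RRF ignores raw scores and uses 1 / (k + rank) instead.
            per_doc[doc_id].append(1.0 / (k + rank) if aggregation == "rrf" else score)
    if aggregation == "rrf":
        fused = {d: sum(v) for d, v in per_doc.items()}
    elif aggregation == "max":
        fused = {d: max(v) for d, v in per_doc.items()}
    else:  # "avg": assumption, averaged over the lists where the document appears
        fused = {d: sum(v) / len(v) for d, v in per_doc.items()}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
```

Under these assumptions, 'rrf' rewards documents that rank well across several chunk queries, 'max' lets one strong chunk match carry a document, and 'avg' dilutes a single strong match with weaker ones.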

Example

from mixpeek.models.stage_defs_query_preprocessing import StageDefsQueryPreprocessing

# TODO update the JSON string below
json = "{}"
# create an instance of StageDefsQueryPreprocessing from a JSON string
stage_defs_query_preprocessing_instance = StageDefsQueryPreprocessing.from_json(json)
# print the JSON string representation of the object
print(stage_defs_query_preprocessing_instance.to_json())

# convert the object into a dict
stage_defs_query_preprocessing_dict = stage_defs_query_preprocessing_instance.to_dict()
# create an instance of StageDefsQueryPreprocessing from a dict
stage_defs_query_preprocessing_from_dict = StageDefsQueryPreprocessing.from_dict(stage_defs_query_preprocessing_dict)
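The empty `"{}"` placeholder above can be swapped for a populated payload. The sketch below builds one using only the field names from the Properties table; every value is illustrative, not a recommended setting, and `feature_uri` is left out so that it inherits from the parent search.

```python
import json

# Illustrative values only; field names come from the Properties table.
payload = {
    "params": {"chunk_size": 512, "chunk_overlap": 64},  # extractor overrides
    "max_chunks": 10,                        # cap on parallel chunk queries
    "aggregation": "max",                    # one of 'rrf', 'max', 'avg'
    "dedup_field": "_internal.document_id",  # collapse chunks per parent document
}
payload_json = json.dumps(payload)
print(payload_json)
```

With the SDK installed, `payload_json` could be passed to `StageDefsQueryPreprocessing.from_json` in place of the placeholder string.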
