Fix VideoClassificationText #2044

Open
paulnoirel wants to merge 2 commits into develop from PLT-3599-Fix-VideoClassificationAnnotation-text-per-frame
@paulnoirel paulnoirel commented Feb 26, 2026

Description

Summary

Fixes NDJSON serialization for VideoClassificationAnnotation with Text values. Previously, frame information and secondary text segments were silently dropped -- only the first annotation's text was emitted as a plain string with no frames. After this fix, the output includes per-segment text values with their frame ranges, matching the format produced by TemporalClassificationText.

Problem

import labelbox.types as lb_types

annotations = [
    lb_types.VideoClassificationAnnotation(
        name="free_text_per_frame", frame=9, segment_index=0,
        value=lb_types.Text(answer="sample text 1"),
    ),
    lb_types.VideoClassificationAnnotation(
        name="free_text_per_frame", frame=15, segment_index=0,
        value=lb_types.Text(answer="sample text 1"),
    ),
    lb_types.VideoClassificationAnnotation(
        name="free_text_per_frame", frame=40, segment_index=0,
        value=lb_types.Text(answer="sample text 2"),
    ),
    lb_types.VideoClassificationAnnotation(
        name="free_text_per_frame", frame=50, segment_index=0,
        value=lb_types.Text(answer="sample text 2"),
    ),
]

Before -- frames lost, second text segment dropped:

{
  "name": "free_text_per_frame",
  "answer": "sample text 1",
  "uuid": "73a61cfa-...",
  "dataRow": {"globalKey": "my-video-global-key"}
}

After -- all segments and frame ranges preserved:

{
  "name": "free_text_per_frame",
  "answer": [
    {"value": "sample text 1", "frames": [{"start": 9, "end": 15}]},
    {"value": "sample text 2", "frames": [{"start": 40, "end": 50}]}
  ],
  "dataRow": {"globalKey": "my-video-global-key"}
}

Changes

  • classification.py: Added NDVideoText and NDVideoTextAnswer classes (standalone BaseModel subclasses with no dependency on the temporal pipeline).
  • label.py: Added a Text branch in _create_video_annotations that groups annotations by text value, computes segment-aware frame ranges, and yields NDVideoText.
  • test_video.py: Added two tests covering multi-text and single-text scenarios.
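
The grouping step described for label.py can be sketched roughly as follows. This is a minimal standard-library illustration, not the code in the PR: `FrameText`, `frame_ranges_by_text`, and `to_answer` are hypothetical names, and the rule used here (one range per text value and segment index, spanning the lowest to the highest frame) is an assumption inferred from the Before/After example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class FrameText:
    """Hypothetical stand-in for VideoClassificationAnnotation with a Text value."""

    frame: int
    text: str
    segment_index: Optional[int] = None


def frame_ranges_by_text(
    annotations: List[FrameText],
) -> Dict[str, List[Dict[str, int]]]:
    # Assumed rule: frames sharing a text value and a segment index
    # belong to the same range.
    buckets: Dict[Tuple[str, Optional[int]], List[int]] = defaultdict(list)
    for ann in annotations:
        buckets[(ann.text, ann.segment_index)].append(ann.frame)

    # Collapse each bucket into one {"start", "end"} range. Dicts keep
    # insertion order, so text values stay in first-seen order.
    ranges: Dict[str, List[Dict[str, int]]] = {}
    for (text, _segment), frames in buckets.items():
        ranges.setdefault(text, []).append(
            {"start": min(frames), "end": max(frames)}
        )
    return ranges


def to_answer(
    ranges: Dict[str, List[Dict[str, int]]],
) -> List[Dict[str, object]]:
    # Shape of the "answer" list shown in the After example.
    return [{"value": text, "frames": frames} for text, frames in ranges.items()]
```

Feeding the four annotations from the Problem section through this sketch gives two entries, one per text value, which is the same shape as the After output.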

Scope

  • Only affects VideoClassificationAnnotation + Text. Video Radio, Checklist, and object annotations are unchanged.
  • Non-video text annotations are unchanged.
  • No breaking changes to the Python API -- users keep using VideoClassificationAnnotation(value=Text(...)).

Fixes # (issue)

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Document change (fix typo or modifying any markdown files, code comments or anything in the examples folder only)

All Submissions

  • Have you followed the guidelines in our Contributing document?
  • Have you provided a description?
  • Are your changes properly formatted?

New Feature Submissions

  • Does your submission pass tests?
  • Have you added thorough tests for your new feature?
  • Have you commented your code, particularly in hard-to-understand areas?
  • Have you added a Docstring?

Changes to Core Features

  • Have you written new tests for your core changes, as applicable?
  • Have you successfully run tests with your changes locally?
  • Have you updated any code comments, as applicable?

Note

Medium Risk
Changes the emitted NDJSON shape for video Text classifications from a single string to a structured list with frame ranges, which may impact downstream parsers expecting the old format. Scope is limited to VideoClassificationAnnotation + Text and is covered by new tests.

Overview
Fixes video free-text classification export so NDJSON no longer drops frame information or later text segments.

VideoClassificationAnnotation with Text is now serialized through a new NDVideoText model as a single row whose answer is a list of {value, frames} entries. The entries are computed by grouping annotations by text value and deriving segment-aware frame ranges. Tests cover the multi-text and single-text cases.
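
For downstream parsers affected by the shape change, a tolerant reader might look something like this. It is a sketch under assumptions: `normalize_text_answer` is a hypothetical helper that is not part of the SDK, and it is only meant for video Text rows (Radio and Checklist answers have different shapes and are not handled).

```python
from typing import Any, Dict, List


def normalize_text_answer(row: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the answer of a video Text row as a list of {value, frames}."""
    answer = row.get("answer")
    if isinstance(answer, str):
        # Old shape: a plain string with no frame information.
        return [{"value": answer, "frames": []}]
    if isinstance(answer, list):
        # New shape: one entry per text value, each with its frame ranges.
        return [
            {"value": item["value"], "frames": list(item.get("frames", []))}
            for item in answer
        ]
    raise TypeError(f"Unexpected answer type: {type(answer).__name__}")
```

A consumer that normalizes on read this way can accept rows produced before and after the fix, at the cost of empty frame lists for old rows.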

Written by Cursor Bugbot for commit fcc1107. This will update automatically on new commits.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Bugbot Autofix prepared fixes for both issues found in the latest run.

  • ✅ Fixed: Empty dataRow when only external_id is provided
    • NDVideoText now builds data_row via DataRow(id=data.uid, global_key=data.global_key), so an external_id-only row raises the same clear validation error instead of serializing an empty dataRow.
  • ✅ Fixed: NDVideoText drops feature_schema_id and requires name
    • NDVideoText now supports optional name, preserves feature_schema_id as schemaId, and validates that at least one identifier is set to match other classification serializers.

Push these changes by commenting:

@cursor push d470be00b2
Preview (d470be00b2)
diff --git a/libs/labelbox/src/labelbox/data/serialization/ndjson/classification.py b/libs/labelbox/src/labelbox/data/serialization/ndjson/classification.py
--- a/libs/labelbox/src/labelbox/data/serialization/ndjson/classification.py
+++ b/libs/labelbox/src/labelbox/data/serialization/ndjson/classification.py
@@ -223,10 +223,21 @@
       {"name": "...", "answer": [{"value": "text", "frames": [{"start": 1, "end": 5}]}], ...}
     """
 
-    name: str
+    name: Optional[str] = None
+    schema_id: Optional[Cuid] = Field(
+        default=None, serialization_alias="schemaId"
+    )
     answer: List[NDVideoTextAnswer]
-    dataRow: Dict[str, str]
+    data_row: DataRow = Field(serialization_alias="dataRow")
 
+    model_config = ConfigDict(populate_by_name=True)
+
+    @model_validator(mode="after")
+    def must_set_one(self):
+        if self.schema_id is None and self.name is None:
+            raise ValueError("Schema id or name are not set. Set either one.")
+        return self
+
     @classmethod
     def from_video_text_group(
         cls,
@@ -235,14 +246,10 @@
         data: "GenericDataRowData",
     ) -> "NDVideoText":
         first = annotation_group[0]
-        data_row = {}
-        if data.global_key:
-            data_row["globalKey"] = data.global_key
-        elif data.uid:
-            data_row["id"] = data.uid
         return cls(
             name=first.name,
-            dataRow=data_row,
+            schema_id=first.feature_schema_id,
+            data_row=DataRow(id=data.uid, global_key=data.global_key),
             answer=[
                 NDVideoTextAnswer(value=text_val, frames=ranges)
                 for text_val, ranges in frame_ranges_by_text.items()

diff --git a/libs/labelbox/tests/data/serialization/ndjson/test_video.py b/libs/labelbox/tests/data/serialization/ndjson/test_video.py
--- a/libs/labelbox/tests/data/serialization/ndjson/test_video.py
+++ b/libs/labelbox/tests/data/serialization/ndjson/test_video.py
@@ -1,4 +1,5 @@
 import json
+import pytest
 from labelbox.data.annotation_types.classification.classification import (
     Checklist,
     ClassificationAnnotation,
@@ -722,6 +723,59 @@
     assert answer[0]["frames"] == [{"start": 9, "end": 15}]
 
 
+def test_video_classification_text_with_external_id_raises():
+    label = Label(
+        data=GenericDataRowData(external_id="sample-video-external-id"),
+        annotations=[
+            VideoClassificationAnnotation(
+                name="free_text",
+                frame=9,
+                segment_index=0,
+                value=Text(answer="sample text"),
+            )
+        ],
+    )
+
+    with pytest.raises(ValueError, match="Must set either id or global_key"):
+        list(NDJsonConverter.serialize([label]))
+
+
+def test_video_classification_text_with_feature_schema_id_only():
+    label = Label(
+        data=GenericDataRowData(global_key="sample-video-schema-id-only"),
+        annotations=[
+            VideoClassificationAnnotation(
+                feature_schema_id="ckrb1sfjx099a0y914hl319ie",
+                frame=9,
+                segment_index=0,
+                value=Text(answer="sample text"),
+            ),
+            VideoClassificationAnnotation(
+                feature_schema_id="ckrb1sfjx099a0y914hl319ie",
+                frame=15,
+                segment_index=0,
+                value=Text(answer="sample text"),
+            ),
+        ],
+    )
+
+    serialized = list(NDJsonConverter.serialize([label]))
+    free_text_rows = [
+        r
+        for r in serialized
+        if r.get("schemaId") == "ckrb1sfjx099a0y914hl319ie"
+    ]
+    assert len(free_text_rows) == 1
+
+    row = free_text_rows[0]
+    assert row["schemaId"] == "ckrb1sfjx099a0y914hl319ie"
+    assert "name" not in row
+    assert row["dataRow"] == {"globalKey": "sample-video-schema-id-only"}
+    assert row["answer"] == [
+        {"value": "sample text", "frames": [{"start": 9, "end": 15}]}
+    ]
+
+
 def test_video_classification_nesting_bbox():
     bbox_annotation = [
         VideoObjectAnnotation(

        if data.global_key:
            data_row["globalKey"] = data.global_key
        elif data.uid:
            data_row["id"] = data.uid

Empty dataRow when only external_id is provided

Low Severity

NDVideoText.from_video_text_group builds data_row manually checking only data.global_key and data.uid. If a user creates a GenericDataRowData with only external_id, data_row stays {}, producing a silently empty "dataRow": {} in the output. All other annotation types use the DataRow class, which has a validator that raises a clear ValueError("Must set either id or global_key") in this situation. This new path silently drops the data row identifier, which could lead to API rejection or annotation loss.
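
The difference between the two behaviours can be illustrated without the SDK. The function below is a hypothetical stand-in for what the DataRow validator is described as doing; only the error message is taken from this comment, and the precedence of global_key over id mirrors the flagged code rather than any confirmed DataRow behaviour.

```python
from typing import Dict, Optional


def build_data_row(
    uid: Optional[str] = None, global_key: Optional[str] = None
) -> Dict[str, str]:
    """Build the dataRow payload, failing loudly when no identifier is usable."""
    if global_key:
        return {"globalKey": global_key}
    if uid:
        return {"id": uid}
    # The flagged code path would have fallen through and returned {} here.
    raise ValueError("Must set either id or global_key")
```

With a check like this, an external_id-only data row fails at serialization time instead of producing "dataRow": {}.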

        elif data.uid:
            data_row["id"] = data.uid
        return cls(
            name=first.name,

NDVideoText drops feature_schema_id and requires name

Medium Severity

NDVideoText declares name: str (required, non-optional) and has no schema_id/feature_schema_id field. FeatureSchema allows name=None when feature_schema_id is set, and all other annotation types go through NDAnnotation which accepts either. With this change, a VideoClassificationAnnotation using only feature_schema_id as its identifier will crash at NDVideoText(name=None, …) with a Pydantic validation error. Even when both are provided, feature_schema_id is silently dropped from the serialized output, unlike the NDText/NDRadio/NDChecklist paths that preserve it as schemaId.
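
The behaviour asked for here (accept either identifier, keep schemaId in the output, omit whichever is unset) can be sketched as a plain function. `build_video_text_row` is a hypothetical helper for illustration only; the PR itself uses a Pydantic model, and the error message is copied from the autofix diff.

```python
from typing import Any, Dict, List, Optional


def build_video_text_row(
    answer: List[Dict[str, Any]],
    data_row: Dict[str, str],
    name: Optional[str] = None,
    schema_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble one serialized row, requiring a name or a schema id."""
    if name is None and schema_id is None:
        raise ValueError("Schema id or name are not set. Set either one.")

    row: Dict[str, Any] = {}
    # Emit only the identifiers that were provided.
    if name is not None:
        row["name"] = name
    if schema_id is not None:
        row["schemaId"] = schema_id
    row["answer"] = answer
    row["dataRow"] = data_row
    return row
```

A row built with only schema_id then carries "schemaId" and has no "name" key, which is what the schema-id-only test in the autofix preview asserts.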

@paulnoirel paulnoirel marked this pull request as draft February 26, 2026 10:30