questionable TP alignment of TD annotation #379
Bug Description
I am worried that the TP to which the TD is aligned may not be the TP whose frame was actually the source of the image processed by the VLM.
I ran the captioner on input MMIF from SWT. It produced this in its output view:
{
"@type": "http://mmif.clams.ai/vocabulary/TextDocument/v1",
"properties": {
"document": "d1",
"origin": "v_1:tf_18",
"provenance": "derived",
"mime": "application/json",
"text": {
"@value": "AL GALLETTA BLUEBERRY GROWER",
"@language": "en"
},
"id": "v_2:td_6"
}
},
{
"@type": "http://mmif.clams.ai/vocabulary/Alignment/v1",
"properties": {
"source": "v_0:tp_919",
"target": "v_2:td_6",
"id": "v_2:al_6"
}
},
Here are the TF and TP annotations that are referenced:
{
"@type": "http://mmif.clams.ai/vocabulary/TimeFrame/v6",
"properties": {
"label": "chyron and person",
"classification": {
"chyron and person": 0.9387489855289459
},
"targets": [
"v_0:tp_919",
"v_0:tp_920",
"v_0:tp_921",
"v_0:tp_922",
"v_0:tp_923",
"v_0:tp_924",
"v_0:tp_925",
"v_0:tp_926"
],
"representatives": [
"v_0:tp_919"
],
"timeUnit": "milliseconds",
"id": "v_1:tf_18"
}
},
{
"@type": "http://mmif.clams.ai/vocabulary/TimePoint/v5",
"properties": {
"timePoint": 459026,
"label": "IN",
"classification": {
"GLOTW": 3.5924065741710365e-05,
"CR": 3.7618651731463615e-06,
"IN": 0.947616696357727,
"KU": 0.001030643587000668,
"B": 2.926498436818542e-25,
"S": 1.0204522123136162e-11,
"M": 5.168930283794282e-10,
"Y": 1.1250751413172111e-05,
"F": 1.7080129310897973e-08,
"E": 0.0006756898364983499,
"P": 0.050621531903743744,
"-": 4.5278543439053465e-06
},
"id": "v_0:tp_919"
}
},
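For anyone triaging this, the bookkeeping side of the question can be checked mechanically. Below is a minimal sketch, assuming the annotations from all views have been flattened into one list of plain dicts keyed by their view-prefixed `id` (as in the excerpts above). The helper names `short_type` and `check_alignments` are hypothetical and not part of any CLAMS library. It only confirms that the Alignment's source TP belongs to the TF named in the TD's `origin`; it cannot tell which frame the app actually extracted.

```python
def short_type(at_type):
    """Return the bare type name, e.g. '.../vocabulary/TimePoint/v5' -> 'TimePoint'."""
    return at_type.rstrip("/").split("/")[-2]


def check_alignments(annotations):
    """For each Alignment (TP -> TD), report how the TP relates to the TD's origin TF.

    `annotations` is assumed to be a flat list of annotation dicts gathered
    from every view, each with view-prefixed ids such as 'v_0:tp_919'.
    """
    by_id = {a["properties"]["id"]: a for a in annotations}
    report = []
    for ann in annotations:
        if short_type(ann["@type"]) != "Alignment":
            continue
        props = ann["properties"]
        tp = by_id.get(props["source"])
        td = by_id.get(props["target"])
        if tp is None or td is None:
            continue  # alignment points at something outside this list
        tf = by_id.get(td["properties"].get("origin"))
        if tf is None:
            continue  # the TD has no resolvable origin TF
        tf_props = tf["properties"]
        report.append({
            "alignment": props["id"],
            "time_point_ms": tp["properties"]["timePoint"],
            "in_targets": props["source"] in tf_props.get("targets", []),
            "is_representative": props["source"] in tf_props.get("representatives", []),
        })
    return report
```

For the annotations quoted above, this should report that `v_0:tp_919` is both a target and the representative of `v_1:tf_18`, so the annotation references themselves look consistent. The open question is the frame extraction step.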
However, this is the frame I get from the video when seeking to 459026 ms (found at 00459025):

Note that there is no text in that image, not even faintly. However, here are some nearby frames:
(from 00459892)
(from 00461661)
Question: Is it possible that the app is pulling an image from later than the time point it is seeking?
Here is the cataloging aid where we discovered this.
Reproduction steps
Full MMIF file: cpb-aacip-259-9c6s1g2d_NJN_News_pre-1984_2.mmif.json
Media file: cpb-aacip-259-9c6s1g2d.mp4
Expected behavior
The VLM should perform captioning on the exact frame referenced by the TP annotation.
Log output
Screenshots
No response
Additional context
No response