gemma-2-9b-it:
- doesn't support a system message:
$ python3 simulstreaming_translate.py --model-dir ct2_gemma-2-9b-it/ --tokenizer-dir gemma-2-9b-it/ --src-lang en --tgt-lang he --input-jsonl tn.jsonl
INFO: System prompt: You are simultaneous interpreter from English to Hebrew. We are at a conference. It is important that you translate only what you hear, nothing else!
INFO: Init prompt src: ['My', 'hovercraft', 'is', 'full', 'of', 'eels.']
INFO: Init prompt tgt: הרחפת שלי מלאה בצלופחים.
Loading the model...
...done
INFO: Reading tn.jsonl in jsonl format, computationally aware simulation.
INPUT: Začínají telev
IS FINAL: False
SRC My hovercraft is full of eels. Začínají
FORCED TGT הרחפת שלי מלאה בצלופחים.
Traceback (most recent call last):
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/simulstreaming_translate.py", line 616, in <module>
main_simulation_from_file()
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/simulstreaming_translate.py", line 607, in main_simulation_from_file
simulation_update(simul, rows, timer)
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/simulstreaming_translate.py", line 546, in simulation_update
out_handler(out, row, timer)
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/simulstreaming_translate.py", line 519, in handle_outputs
for r in format_outputs(out_seq, in_row, timer, is_final=is_final):
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/simulstreaming_translate.py", line 506, in format_outputs
for status, confirmed, unconfirmed in out_seq:
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/simulstreaming_translate.py", line 275, in process_iter
out = self.llmtranslator.translate(src, forced_tgt)
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/simulstreaming_translate.py", line 74, in translate
prompt_tokens = self.build_prompt(dialog)
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/simulstreaming_translate.py", line 55, in build_prompt
base_toks = self.tokenizer.apply_chat_template(dialog[:2], tokenize=True, add_generation_prompt=True)["input_ids"]
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/p3-check/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3132, in apply_chat_template
rendered_chat, generation_indices = render_jinja_template(
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/p3-check/lib/python3.10/site-packages/transformers/utils/chat_template_utils.py", line 537, in render_jinja_template
rendered_chat = compiled_template.render(
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/p3-check/lib/python3.10/site-packages/jinja2/environment.py", line 1295, in render
self.environment.handle_exception()
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/p3-check/lib/python3.10/site-packages/jinja2/environment.py", line 942, in handle_exception
raise rewrite_traceback_stack(source=source)
File "<template>", line 1, in top-level template code
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/p3-check/lib/python3.10/site-packages/jinja2/sandbox.py", line 401, in call
return __context.call(__obj, *args, **kwargs)
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/p3-check/lib/python3.10/site-packages/transformers/utils/chat_template_utils.py", line 445, in raise_exception
raise jinja2.exceptions.TemplateError(message)
jinja2.exceptions.TemplateError: System role not supported
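A possible workaround for chat templates that reject the system role (a sketch, not tested against the actual Gemma template): fold the leading system message into the first user turn before calling apply_chat_template. The dialog is assumed to follow the usual {"role": ..., "content": ...} convention.

```python
def merge_system_into_user(dialog):
    """Fold a leading system message into the first user message,
    for chat templates (like Gemma's) that reject the system role."""
    if not dialog or dialog[0]["role"] != "system":
        return list(dialog)
    system, rest = dialog[0], list(dialog[1:])
    if rest and rest[0]["role"] == "user":
        # Prepend the system text to the first user turn.
        rest[0] = {
            "role": "user",
            "content": system["content"] + "\n\n" + rest[0]["content"],
        }
    else:
        # No user turn to merge into: demote the system message to a user one.
        rest.insert(0, {"role": "user", "content": system["content"]})
    return rest
```

In build_prompt one could try the original dialog first and retry with the merged dialog when a jinja2 TemplateError is raised.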
gemma-2-9b:
- doesn't have a chat template:
$ python3 simulstreaming_translate.py --model-dir ct2_gemma-2-9b/ --tokenizer-dir gemma-2-9b/ --src-lang en --tgt-lang he --input-jsonl tn.jsonl
INFO: System prompt: You are simultaneous interpreter from English to Hebrew. We are at a conference. It is important that you translate only what you hear, nothing else!
INFO: Init prompt src: ['My', 'hovercraft', 'is', 'full', 'of', 'eels.']
INFO: Init prompt tgt: הרחפת שלי מלאה בצלופחים.
Loading the model...
...done
INFO: Reading tn.jsonl in jsonl format, computationally aware simulation.
INPUT: Začínají telev
IS FINAL: False
SRC My hovercraft is full of eels. Začínají
FORCED TGT הרחפת שלי מלאה בצלופחים.
Traceback (most recent call last):
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/simulstreaming_translate.py", line 616, in <module>
main_simulation_from_file()
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/simulstreaming_translate.py", line 607, in main_simulation_from_file
simulation_update(simul, rows, timer)
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/simulstreaming_translate.py", line 546, in simulation_update
out_handler(out, row, timer)
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/simulstreaming_translate.py", line 519, in handle_outputs
for r in format_outputs(out_seq, in_row, timer, is_final=is_final):
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/simulstreaming_translate.py", line 506, in format_outputs
for status, confirmed, unconfirmed in out_seq:
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/simulstreaming_translate.py", line 275, in process_iter
out = self.llmtranslator.translate(src, forced_tgt)
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/simulstreaming_translate.py", line 74, in translate
prompt_tokens = self.build_prompt(dialog)
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/simulstreaming_translate.py", line 55, in build_prompt
base_toks = self.tokenizer.apply_chat_template(dialog[:2], tokenize=True, add_generation_prompt=True)["input_ids"]
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/p3-check/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3112, in apply_chat_template
chat_template = self.get_chat_template(chat_template, tools)
File "/lnet/work/people/machacek/smluvni-2024/alignatt-whisper.202412/SimulStreaming/p3-check/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 3294, in get_chat_template
raise ValueError(
ValueError: Cannot use chat template functions because tokenizer.chat_template is not set and no template argument was passed! For information about writing templates and setting the tokenizer.chat_template attribute, please see the documentation at https://huggingface.co/docs/transformers/main/en/chat_templating
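For the base (non-instruct) checkpoint, which ships no chat template at all, a plain completion-style prompt may be more appropriate than a chat template anyway. A minimal sketch; the role labels and the trailing "Assistant:" cue are assumptions, not anything the base model was trained on:

```python
def build_plain_prompt(dialog):
    """Render a chat-style dialog as plain completion text for a base
    model whose tokenizer has no chat_template set."""
    parts = []
    for msg in dialog:
        if msg["role"] == "system":
            # System text becomes a plain leading instruction.
            parts.append(msg["content"])
        else:
            parts.append(f"{msg['role'].capitalize()}: {msg['content']}")
    # Trailing cue asks the base model to continue the assistant turn.
    return "\n".join(parts) + "\nAssistant:"
```

The resulting string can then be tokenized with a plain tokenizer(text)["input_ids"] call instead of apply_chat_template.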
So we should find a systematic solution that covers both of these cases.
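One systematic option (a sketch, assuming only the standard apply_chat_template API): try the chat template as-is, retry with the system message folded into the first user turn, and fall back to a plain completion-style prompt when no usable template exists.

```python
def build_prompt_tokens(tokenizer, dialog):
    """Tokenize a dialog, degrading gracefully for tokenizers that
    reject the system role or have no chat template at all."""
    def fold_system(d):
        # Merge a leading system message into the following user turn.
        if len(d) > 1 and d[0]["role"] == "system" and d[1]["role"] == "user":
            merged = {"role": "user",
                      "content": d[0]["content"] + "\n\n" + d[1]["content"]}
            return [merged] + list(d[2:])
        return list(d)

    if getattr(tokenizer, "chat_template", None):
        for candidate in (dialog, fold_system(dialog)):
            try:
                return tokenizer.apply_chat_template(
                    candidate, tokenize=True,
                    add_generation_prompt=True,
                    return_dict=True)["input_ids"]
            except Exception:
                # e.g. jinja2 TemplateError: "System role not supported"
                continue
    # No usable chat template: plain completion-style prompt.
    text = "\n".join(m["content"] for m in dialog) + "\n"
    return tokenizer(text)["input_ids"]
```

This keeps the gemma-2-9b-it and gemma-2-9b cases on the same code path as models whose templates accept a system role.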