Releases: JSALT2024/Sign_LLaVA

keywords [Please go ahead]

02 Aug 01:57

Pre-release

Release of a model trained on a 0.9/0.1 mix of the one_keyword_included and translation tasks, MAE only.

Full Changelog: checkpoint-all_tasks...checkpoint-keywords

Jirka: This model is trained on the MAE modality ONLY; passing in other modalities does nothing. It was trained mostly on the 1-keyword task and a little bit on translation.

Jirka: I needed to patch the sign_public_api.py file to make it run. It's attached under this release.

Installation into Demo

To install this checkpoint into a set-up demo backend, just go to the demo/backend folder and run:

rm -rf models/Sign_LLaVA
git clone git@github.com:JSALT2024/Sign_LLaVA.git models/Sign_LLaVA
(cd models/Sign_LLaVA && git reset --hard 8734172efc7a27209fff6252675cdcc35dfb8277)
.venv/bin/python3 -m pip install --no-deps --editable ./models/Sign_LLaVA
rm models/Sign_LLaVA/llava/sign_public_api.py
(cd models/Sign_LLaVA/llava && wget "https://github.com/JSALT2024/Sign_LLaVA/releases/download/checkpoint-keywords/sign_public_api.py")
rm -rf checkpoints/Sign_LLaVA
mkdir -p checkpoints/Sign_LLaVA
(cd checkpoints/Sign_LLaVA && wget "https://github.com/JSALT2024/Sign_LLaVA/releases/download/checkpoint-keywords/mae_keywords09_lowlr.zip")
(cd checkpoints/Sign_LLaVA && unzip mae_keywords09_lowlr.zip)
mv checkpoints/Sign_LLaVA/mae_keywords09_lowlr/* checkpoints/Sign_LLaVA/
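The steps above can be wrapped into a small reusable helper, since the checkpoints in the other releases differ only in the release tag, the pinned commit, the zip name, and whether sign_public_api.py needs patching. This is just a sketch: the function name and the DRY_RUN switch are illustrative, not part of the repo, and releases whose zip unpacks flat would skip the final mv.

```shell
# Hypothetical helper mirroring the install recipe above.
# Usage: install_sign_llava_checkpoint TAG COMMIT ZIP PATCH_API(yes|no)
install_sign_llava_checkpoint() {
  tag="$1"; commit="$2"; zip="$3"; patch_api="$4"
  base="https://github.com/JSALT2024/Sign_LLaVA/releases/download/$tag"
  dir="${zip%.zip}"

  # With DRY_RUN set, print each command instead of executing it.
  run() { if [ -n "${DRY_RUN:-}" ]; then echo "+ $*"; else eval "$*"; fi; }

  run "rm -rf models/Sign_LLaVA"
  run "git clone git@github.com:JSALT2024/Sign_LLaVA.git models/Sign_LLaVA"
  run "(cd models/Sign_LLaVA && git reset --hard $commit)"
  run ".venv/bin/python3 -m pip install --no-deps --editable ./models/Sign_LLaVA"
  if [ "$patch_api" = yes ]; then
    run "rm models/Sign_LLaVA/llava/sign_public_api.py"
    run "(cd models/Sign_LLaVA/llava && wget $base/sign_public_api.py)"
  fi
  run "rm -rf checkpoints/Sign_LLaVA"
  run "mkdir -p checkpoints/Sign_LLaVA"
  run "(cd checkpoints/Sign_LLaVA && wget $base/$zip)"
  run "(cd checkpoints/Sign_LLaVA && unzip $zip)"
  run "mv checkpoints/Sign_LLaVA/$dir/* checkpoints/Sign_LLaVA/"
}

# Dry run for this release: prints the commands without touching anything.
DRY_RUN=1 install_sign_llava_checkpoint \
  checkpoint-keywords 8734172efc7a27209fff6252675cdcc35dfb8277 \
  mae_keywords09_lowlr.zip yes
```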

Then run the model test:

.venv/bin/python3 -m app.debug.test_sign_llava

And the test should end with:

...
The LLM says: "Please go ahead and provide the ASL video, and I'll do my best to translate it into English."
The result form the LLM seems ok.
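When scripting the setup, the end-of-test check can be automated by saving the test output and grepping it for the expected phrase. A minimal sketch; the function name and log path are hypothetical, and the strings it matches are taken from the output shown above:

```shell
# Return 0 if the saved test log contains an LLM response with the expected phrase.
check_sign_llava_log() {
  log="$1"; expected="$2"
  grep -qF 'The LLM says:' "$log" && grep -qF "$expected" "$log"
}

# Usage: save the test output, then check for this release's expected phrase:
#   .venv/bin/python3 -m app.debug.test_sign_llava 2>&1 | tee test.log
#   check_sign_llava_log test.log "Please go ahead" && echo "smoke test passed"
```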

context2 [How to Make a S'mores]

02 Aug 02:23

Pre-release

Release of a model trained with context length = 2

Full Changelog: checkpoint-all_tasks...checkpoint-context

Jirka: This model is trained on the MAE modality ONLY; passing in other modalities does nothing. It was trained on translation with context from the previous 2 clips.

Jirka: I needed to patch the sign_public_api.py file to make it run. It's attached under this release.

This model behaves very similarly to the first [Would you like some water?] model from Xuan. The LLM works OK without visuals; with visuals it tends to repeat one sentence indefinitely, and the sentence usually does not resemble the embeddings. The embeddings collapse into one or two clusters, around the tokens kannst and _ptr.

Installation into Demo

To install this checkpoint into a set-up demo backend, just go to the demo/backend folder and run:

rm -rf models/Sign_LLaVA
git clone git@github.com:JSALT2024/Sign_LLaVA.git models/Sign_LLaVA
(cd models/Sign_LLaVA && git reset --hard 8734172efc7a27209fff6252675cdcc35dfb8277)
.venv/bin/python3 -m pip install --no-deps --editable ./models/Sign_LLaVA
rm models/Sign_LLaVA/llava/sign_public_api.py
(cd models/Sign_LLaVA/llava && wget "https://github.com/JSALT2024/Sign_LLaVA/releases/download/checkpoint-context/sign_public_api.py")
rm -rf checkpoints/Sign_LLaVA
mkdir -p checkpoints/Sign_LLaVA
(cd checkpoints/Sign_LLaVA && wget "https://github.com/JSALT2024/Sign_LLaVA/releases/download/checkpoint-context/context2.zip")
(cd checkpoints/Sign_LLaVA && unzip context2.zip)
mv checkpoints/Sign_LLaVA/context2/* checkpoints/Sign_LLaVA/

Then run the model test:

.venv/bin/python3 -m app.debug.test_sign_llava

And the test should end with:

...
The LLM says: "Here is the translation of the given ASL video into English:\n\nVideo Title: How to Make a S'mores\n\nHost: Hi, I'm here to show you how to make a s'mores.\n\nStep 1: First, you're going to need some graham crackers, some chocolate bars, and some marshmallows.\n\nStep 2: Next, you're going to need a fire. You can use a fire pit, a campfire, or even a fire in your backyard.\n\nStep 3: Once you have your fire going, you're going to need to toast your marshmallow. You can use a marshmallow roasting stick or a skewer.\n\nStep 4: Once your marshmallow is toasted, you're going to place it on top of your graham cracker.\n\nStep 5: Next, you're going to place a piece of chocolate on top of the marshmallow.\n\nStep 6: Finally, you're going to place another graham"
The result form the LLM seems ok.

2024-07-31-overfit [drill that arms drill]

01 Aug 02:27

Release of overfit model - all modalities.

Jirka: This model was trained with all modalities, exclusively on the video attached below, so that it would overfit and predict exactly what is expected for that one video. The text-embedding layer overfitted so much that the video is translated roughly correctly even without any prompt, and asking any other prompt (e.g. "What is your purpose?") will not even repeat the system prompt. The LLM itself is broken, but it does translate the video somehow (because it overfitted).

Installation into Demo

To install this checkpoint into a set-up demo backend, just go to the demo/backend folder and run:

rm -rf models/Sign_LLaVA
git clone git@github.com:JSALT2024/Sign_LLaVA.git models/Sign_LLaVA
(cd models/Sign_LLaVA && git reset --hard 658c608105337d9691a459b68f220bb3175d7a0b)
.venv/bin/python3 -m pip install --no-deps --editable ./models/Sign_LLaVA
rm -rf checkpoints/Sign_LLaVA
mkdir -p checkpoints/Sign_LLaVA
(cd checkpoints/Sign_LLaVA && wget "https://github.com/JSALT2024/Sign_LLaVA/releases/download/checkpoint-overfit/overfit.zip")
(cd checkpoints/Sign_LLaVA && unzip overfit.zip)

Then run the model test:

.venv/bin/python3 -m app.debug.test_sign_llava

And the test should end with:

...
The LLM says: 'drill that arms drill that drill that will get drill that will get drill that will get'
The result form the LLM seems ok.

all_tasks [Let's get started]

01 Aug 16:00

Pre-release

Release of a model trained on all tasks, MAE only.

Jirka: This model is trained on the MAE modality ONLY; passing in other modalities does nothing. It was trained mostly on keyword detection.

Jirka: I needed to patch the sign_public_api.py file to make it run. It's attached under this release.

Installation into Demo

To install this checkpoint into a set-up demo backend, just go to the demo/backend folder and run:

rm -rf models/Sign_LLaVA
git clone git@github.com:JSALT2024/Sign_LLaVA.git models/Sign_LLaVA
(cd models/Sign_LLaVA && git reset --hard 658c608105337d9691a459b68f220bb3175d7a0b)
.venv/bin/python3 -m pip install --no-deps --editable ./models/Sign_LLaVA
rm models/Sign_LLaVA/llava/sign_public_api.py
(cd models/Sign_LLaVA/llava && wget "https://github.com/JSALT2024/Sign_LLaVA/releases/download/checkpoint-all_tasks/sign_public_api.py")
rm -rf checkpoints/Sign_LLaVA
mkdir -p checkpoints/Sign_LLaVA
(cd checkpoints/Sign_LLaVA && wget "https://github.com/JSALT2024/Sign_LLaVA/releases/download/checkpoint-all_tasks/all_tasks-7851619.zip")
(cd checkpoints/Sign_LLaVA && unzip all_tasks-7851619.zip)
mv checkpoints/Sign_LLaVA/all_tasks-7851619/* checkpoints/Sign_LLaVA/

Then run the model test:

.venv/bin/python3 -m app.debug.test_sign_llava

And the test should end with:

...
The LLM says: "Let's get started."                                   
The result form the LLM seems ok.

Checkpoints [Would you like some water?]

26 Jul 15:29

A test checkpoint of SignLLaVA.

Installation into Demo

To install this checkpoint into a set-up demo backend, just go to the demo/backend folder and run:

rm -rf models/Sign_LLaVA
git clone git@github.com:JSALT2024/Sign_LLaVA.git models/Sign_LLaVA
(cd models/Sign_LLaVA && git reset --hard 19cd35769b898517e2eacbbebaa4791cb84e000d)
.venv/bin/python3 -m pip install --no-deps --editable ./models/Sign_LLaVA
rm -rf checkpoints/Sign_LLaVA
mkdir -p checkpoints/Sign_LLaVA
(cd checkpoints/Sign_LLaVA && wget "https://github.com/JSALT2024/Sign_LLaVA/releases/download/checkpoint/test_ckpt_July_26_2024_11am.zip")
(cd checkpoints/Sign_LLaVA && unzip test_ckpt_July_26_2024_11am.zip)
mv checkpoints/Sign_LLaVA/test_ckpt_July_26_2024_11am/* checkpoints/Sign_LLaVA/

Then run the model test:

.venv/bin/python3 -m app.debug.test_sign_llava

And the test should end with:

...
The LLM says: 'Would you like some water?'
The result form the LLM seems ok.