Conversation
I'm not very familiar with vision models, but I wonder if there is a particular reason to duplicate
Your code is incomplete and does not compile. Do you have any updates since the last commit? Procedure:

```shell
cd /tmp
git clone https://github.com/qlylangyu/llama.cpp llama.cpp-internvl
cd llama.cpp-internvl
git checkout internvl
# edit examples/CMakeLists.txt and add the line "add_subdirectory(internvl)"
mkdir build
cd build
cmake ..
make llama-internvl-cli
```

Error log:
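The `examples/CMakeLists.txt` edit described in the build steps above can be sketched as follows. This is a hypothetical reconstruction: the source file name and linked libraries are assumptions modeled on llama.cpp's other example programs, not the fork's actual files; only the `add_subdirectory(internvl)` line and the `llama-internvl-cli` target name come from the steps above.

```cmake
# examples/CMakeLists.txt -- register the new example subdirectory:
add_subdirectory(internvl)

# examples/internvl/CMakeLists.txt -- hypothetical sketch of the target that
# `make llama-internvl-cli` expects (source file name is an assumption):
set(TARGET llama-internvl-cli)
add_executable(${TARGET} internvl-cli.cpp)
target_link_libraries(${TARGET} PRIVATE common llama ${CMAKE_THREAD_LIBS_INIT})
target_compile_features(${TARGET} PRIVATE cxx_std_11)
```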
It looks like your code is very similar to the files under
Anyway, I will check the overall model architecture and make a working version instead of this. I have tried to load the model using:

```shell
./llama-llava-cli \
    -m ./InternVL-gguf/internlm2-1.8B-chat-q4_k.gguf \
    --mmproj ./InternVL-gguf/InternViT-300M-448px-f16.gguf \
    -t 4 \
    --image ./example.jpeg \
    -p "<image>\nWhat is in this image?"
```

Output:
I have made every attempt to get your code to work, but I still get this core dump.
Using
After applying that patch, I find that the sampling process is wrong. Under the file
Backtrace:
By cross-referencing the file
Creating such a patch for the original codebase is not easy; there are significant differences. I have decided to release my changes to the forked version, and I have also generated diff files for further work. You can view the release here.
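Diff files like those mentioned above can be produced with `git diff` between the fork's branch and its starting point. Below is a self-contained sketch in a throwaway repository; the branch names `base` and `internvl-demo`, the identities, and the file contents are all illustrative, not taken from the actual fork.

```shell
set -e
# Throwaway demo repository (hypothetical; stands in for the real fork).
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "base"
git branch base                      # mark the upstream-like starting point
git checkout -q -b internvl-demo     # feature branch carrying the changes
echo "add_subdirectory(internvl)" > CMakeLists.txt
git add CMakeLists.txt
git -c user.email=demo@example.com -c user.name=demo \
    commit -q -m "add internvl example"
# Export everything the branch adds on top of the base as a patch file;
# the three-dot form diffs against the merge base of the two refs:
git diff base...internvl-demo > changes.patch
grep add_subdirectory changes.patch   # prints: +add_subdirectory(internvl)
```

The resulting `changes.patch` can be applied elsewhere with `git apply changes.patch`.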
I will have a look at a later stage of my refactoring: #11292
A C++ formatter such as clang-format, astyle, or Uncrustify would be required for the code refactoring.
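For clang-format, the style is driven by a `.clang-format` file at the repository root. The fragment below is a hypothetical minimal configuration, not the project's actual settings; every key shown is a standard clang-format style option, but the chosen values are illustrative.

```yaml
# .clang-format -- hypothetical minimal configuration (illustrative values);
# if the repository already ships its own .clang-format, use that instead.
BasedOnStyle: LLVM
IndentWidth: 4
ColumnLimit: 120
AllowShortFunctionsOnASingleLine: None
```

With this file in place, `clang-format -i path/to/file.cpp` reformats a file in place according to the configured style.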