Conversation
I'm not very familiar with vision models, but I wonder if there is a particular reason to duplicate
Your code is incomplete and does not compile. Do you have any updates since the last commit? Procedure:

```shell
cd /tmp
git clone https://github.com/qlylangyu/llama.cpp llama.cpp-internvl
cd llama.cpp-internvl
git checkout internvl
# edit examples/CMakeLists.txt and add the line "add_subdirectory(internvl)"
mkdir build
cd build
cmake ..
make llama-internvl-cli
```

Error log:
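The `examples/CMakeLists.txt` edit described in the build steps above can be sketched as follows. This is a hypothetical reconstruction: the source file name and linked libraries are assumptions modeled on llama.cpp's other example programs, not the fork's actual files; only the `add_subdirectory(internvl)` line and the `llama-internvl-cli` target name come from the steps above.

```cmake
# examples/CMakeLists.txt -- register the new example subdirectory:
add_subdirectory(internvl)

# examples/internvl/CMakeLists.txt -- hypothetical sketch of the target that
# `make llama-internvl-cli` expects (source file name is an assumption):
set(TARGET llama-internvl-cli)
add_executable(${TARGET} internvl-cli.cpp)
target_link_libraries(${TARGET} PRIVATE common llama ${CMAKE_THREAD_LIBS_INIT})
target_compile_features(${TARGET} PRIVATE cxx_std_11)
```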
It looks like your code is very similar to the files under
Anyway, I will check the overall model architecture and make a working version instead of this. I have tried to load the model using:

```shell
./llama-llava-cli \
    -m ./InternVL-gguf/internlm2-1.8B-chat-q4_k.gguf \
    --mmproj ./InternVL-gguf/InternViT-300M-448px-f16.gguf \
    -t 4 \
    --image ./example.jpeg \
    -p "<image>\nWhat is in this image?"
```

Output:
I have made every attempt to get your code to work, but I still get this core dump.
Using
After applying that patch, I find that the sampling process is wrong. Under the file
Backtrace:
By cross-referencing the file
Creating such a patch for the original codebase is not easy; there are significant differences. I have decided to release my changes to the forked version, and I have also generated diff files for further work. You can view the release here.
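Diff files like those mentioned above can be produced with `git diff` between the fork's branch and its starting point. Below is a self-contained sketch in a throwaway repository; the branch names `base` and `internvl-demo`, the identities, and the file contents are all illustrative, not taken from the actual fork.

```shell
set -e
# Throwaway demo repository (hypothetical; stands in for the real fork).
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git -c user.email=demo@example.com -c user.name=demo \
    commit -q --allow-empty -m "base"
git branch base                      # mark the upstream-like starting point
git checkout -q -b internvl-demo     # feature branch carrying the changes
echo "add_subdirectory(internvl)" > CMakeLists.txt
git add CMakeLists.txt
git -c user.email=demo@example.com -c user.name=demo \
    commit -q -m "add internvl example"
# Export everything the branch adds on top of the base as a patch file;
# the three-dot form diffs against the merge base of the two refs:
git diff base...internvl-demo > changes.patch
grep add_subdirectory changes.patch   # prints: +add_subdirectory(internvl)
```

The resulting `changes.patch` can be applied elsewhere with `git apply changes.patch`.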
I will have a look at a later stage of my refactoring: #11292
A C++ formatter such as clang-format, astyle, or Uncrustify would be required for the code refactoring.
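For clang-format, the style is driven by a `.clang-format` file at the repository root. The fragment below is a hypothetical minimal configuration, not the project's actual settings; every key shown is a standard clang-format style option, but the chosen values are illustrative.

```yaml
# .clang-format -- hypothetical minimal configuration (illustrative values);
# if the repository already ships its own .clang-format, use that instead.
BasedOnStyle: LLVM
IndentWidth: 4
ColumnLimit: 120
AllowShortFunctionsOnASingleLine: None
```

With this file in place, `clang-format -i path/to/file.cpp` reformats a file in place according to the configured style.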