Conversation
Training finishes for TP size 1 and both the sdpa and flex attention backends. Loss/acc curves look OK; going to add qwen3_vl_moe eagle3 support into sglang/vllm (so I can eval) before adding TP size > 1 support.
Force-pushed from 47ac5c9 to 1f08199.
Marking this ready for review. Adds training graphs:
When will verification be supported? |
There's a branch of sglang here that you can run; it's currently being cleaned up for upstreaming. We were able to confirm an accept length of almost
Can you rebase this PR to the latest main? |
Yes, I'll get to it this week.
Sorry for the delay @FrankLeeeee, I finished the rebase.
@KerwinKai this PR seems to overlap with yours, can you take a look? |
Yes, I'll try. |
I initialize the target model using Qwen3VLForConditionalGeneration from the Transformers library, but the class definition does not include set_aux_hidden_states_layers, causing an error. How should I modify Qwen3VLForConditionalGeneration? I noticed there is a class Eagle3TargetModel(ABC), but I'm not sure how to use it.
you should specify backend as |
Thanks a lot! But I finally resolved the 'set_aux_hidden_states_layers' issue by |
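For anyone hitting the same error: a minimal sketch of what a `set_aux_hidden_states_layers` hook could look like on a target model. All class names and internals below (the mixin, the recording dict, the dummy model) are illustrative assumptions, not SpecForge's or Transformers' actual implementation.

```python
# Hedged sketch: one way to let a target model that lacks
# set_aux_hidden_states_layers record hidden states from selected
# decoder layers for an EAGLE3 draft model. Names are illustrative.

class AuxHiddenStateMixin:
    """Record hidden states from selected layers."""

    def set_aux_hidden_states_layers(self, layer_ids):
        # Layers whose outputs the draft model will consume.
        self._aux_layer_ids = set(layer_ids)
        self._aux_hidden_states = {}

    def record_aux_hidden_state(self, layer_id, hidden_state):
        if layer_id in getattr(self, "_aux_layer_ids", set()):
            self._aux_hidden_states[layer_id] = hidden_state

    def pop_aux_hidden_states(self):
        # Return recorded states in layer order and reset the buffer.
        states = [self._aux_hidden_states[i] for i in sorted(self._aux_hidden_states)]
        self._aux_hidden_states = {}
        return states


class DummyTargetModel(AuxHiddenStateMixin):
    """Stand-in for the real target model in this sketch."""

    def forward(self, num_layers=4):
        h = 0.0
        for layer_id in range(num_layers):
            h = h + layer_id  # placeholder "layer" computation
            self.record_aux_hidden_state(layer_id, h)
        return h


model = DummyTargetModel()
model.set_aux_hidden_states_layers([1, 3])
model.forward()
print(model.pop_aux_hidden_states())  # → [1.0, 6.0]
```

In the real model you would call the recording hook from each decoder layer's forward pass; this sketch only shows the bookkeeping shape.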
Any update on this issue? |
```
[rank0]: File "/mnt/hwnas/xuyingbing/SpecForge/specforge/core/eagle3.py", line 380, in _prepare_data
```
target_model should be Qwen3VLForConditionalGeneration instead of HFEagle3TargetModel |
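A sketch of the suggested fix: make sure the object you call `set_aux_hidden_states_layers` on is the underlying HF model, not a wrapper around it. The `.model` attribute and all class names here are assumptions for illustration, not SpecForge's actual layout.

```python
# Hedged sketch: unwrap a wrapper object so attribute lookups such as
# set_aux_hidden_states_layers resolve on the real HF model class.
# FakeHFModel / FakeWrapper / the `.model` attribute are illustrative.

class FakeHFModel:
    """Stand-in for Qwen3VLForConditionalGeneration."""

    def set_aux_hidden_states_layers(self, layer_ids):
        self.aux_layers = list(layer_ids)


class FakeWrapper:
    """Stand-in for a wrapper class (assumed to hold the model in .model)."""

    def __init__(self, model):
        self.model = model


def resolve_target_model(obj):
    # Prefer the object itself if it already supports the method,
    # otherwise fall back to an inner wrapped model.
    if hasattr(obj, "set_aux_hidden_states_layers"):
        return obj
    inner = getattr(obj, "model", None)
    if inner is not None and hasattr(inner, "set_aux_hidden_states_layers"):
        return inner
    raise TypeError("target model does not support aux hidden state layers")


wrapped = FakeWrapper(FakeHFModel())
target = resolve_target_model(wrapped)
target.set_aux_hidden_states_layers([2, 5])
print(target.aux_layers)  # → [2, 5]
```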
Any update on this issue?
closing because #431 exists! |

Motivation
**Draft PR. This is currently WIP.**
Add eagle3 support for qwen3_vl and qwen3_vl_moe models.
Modifications
Related Issues
Accuracy Test
Benchmark & Profiling
Checklist