Why does the training code snippet contain a BEV feature but no language output? #2

@HuangChiEn

Hi, I have two questions about the codebase:

  1. Why does the training code snippet contain a BEV feature? (CarLLaVA uses only RGB images as input.)

    labels["input_bev_latent"] = bev

  2. Most VLMs sacrifice speed for explainable (language) output, which enhances the reasoning and robustness of the predicted waypoints. However, I didn't see any language output in the ETA codebase. Why is that?
