Document with Layout #1

@AppJan

Hello, I have a question about the "Document with Layout" code in your README. The code doesn't run as written: the Image class is not defined. How should I modify it, or am I doing something wrong?

Also, can this approach achieve functionality similar to DeepSeek-OCR, that is, not only converting a PDF to markdown but also returning the corresponding layout information?

import llava

# Load model
model = llava.load("./easy_deepocr_sam_clip")

prompt = [
    Image("document.pdf"), 
    "<|grounding|>Convert the document to markdown."
]
response = model.generate_content(prompt)
