Allow different back-ends (mupdf?) and possibly reduce dependencies #26

@frabjous

I was a bit surprised to see that the script uses ghostscript, poppler, and imagemagick. I would have thought that either of the first two provides a more or less feature-complete set of tools for working with PDFs, so you wouldn't need both.

I haven't had a chance to look at the script in detail, but from what I can tell, poppler's pdfinfo and pdfseparate are mainly used to get the number of pages, get the page sizes, and split pages. Ghostscript can already do all of these (look into, e.g., -sDEVICE=bbox -o /dev/null). Conversely, if you're mainly using ghostscript to rasterize the pages, you could do that with poppler's pdftoppm instead, which can export directly to both png and (compressed or uncompressed) tiff on its own, so if you used that, you might not need ghostscript.
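To make the overlap concrete, here is a rough sketch (in Python rather than the script's own language) of how the same "render every page to PNG" step could be expressed with each tool. The function name `rasterize_command`, the backend labels, and the output naming are made up for illustration; the flags are the ones I'd expect to use for pdftoppm, mutool draw, and gs, but I haven't checked them against what the script currently does.

```python
def rasterize_command(backend, pdf_path, out_prefix, dpi=300):
    """Return an argv list that renders every page of pdf_path to PNG.

    Hypothetical helper: backend is one of "poppler", "mupdf", "ghostscript".
    Nothing is executed here; the caller would pass the list to subprocess.run.
    """
    res = str(dpi)
    if backend == "poppler":
        # pdftoppm appends the page number and ".png" to the prefix itself.
        return ["pdftoppm", "-png", "-r", res, pdf_path, out_prefix]
    if backend == "mupdf":
        # %d in the output name is replaced by the page number.
        return ["mutool", "draw", "-r", res, "-o", out_prefix + "-%d.png", pdf_path]
    if backend == "ghostscript":
        # png16m is 24-bit colour; %03d is replaced by the page number.
        return [
            "gs", "-q", "-dNOPAUSE", "-dBATCH",
            "-sDEVICE=png16m", "-r" + res,
            "-sOutputFile=" + out_prefix + "-%03d.png",
            pdf_path,
        ]
    raise ValueError("unknown backend: " + repr(backend))


if __name__ == "__main__":
    for name in ("poppler", "mupdf", "ghostscript"):
        print(name, "->", " ".join(rasterize_command(name, "in.pdf", "page")))
```

Keeping the command construction separate from running it would make it easy to swap tools without touching the rest of the pipeline.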

As yet another option, you could do either or both of these things with mutool from the mupdf project. That might be my preference, since in my experience mupdf's libraries are faster than either ghostscript's or poppler's.

In any case, it seems that the number of dependencies could be reduced.

Or the correct solution might be to have the script detect which of these are installed, and use what is available, or allow options for the user to decide which back-end to use.
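A minimal sketch of what that detection could look like, written in Python just to show the idea (the real script would do the equivalent in its own language). The names `BACKENDS` and `pick_backend` and the preference order are my own assumptions, not anything taken from the script.

```python
import shutil

# Hypothetical preference order: backend label -> executable to look for.
BACKENDS = {
    "mupdf": "mutool",
    "poppler": "pdftoppm",
    "ghostscript": "gs",
}


def pick_backend(requested=None, which=shutil.which):
    """Return the backend to use.

    requested: a backend label chosen by the user, or None to auto-detect.
    which: lookup function (injectable so the logic can be tested
           without any of the tools installed).
    """
    if requested is not None:
        if requested not in BACKENDS:
            raise ValueError("unknown backend: " + repr(requested))
        if which(BACKENDS[requested]) is None:
            raise RuntimeError(BACKENDS[requested] + " not found on PATH")
        return requested
    # Auto-detect: take the first backend whose executable is installed.
    for name, exe in BACKENDS.items():
        if which(exe) is not None:
            return name
    raise RuntimeError("no supported PDF back-end found on PATH")


if __name__ == "__main__":
    try:
        print("using back-end:", pick_backend())
    except RuntimeError as err:
        print("error:", err)
```

An explicit user choice fails loudly rather than silently falling back, which seems like the less surprising behaviour for a command-line option.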

If someone else doesn't pounce on this, I might try implementing it myself, though I'm not sure when I'll get around to it, and admittedly this isn't super high-priority. But it might be fun. (If someone else beats me to it, so be it!)
