Store the PIP cache in a specific directory - #1149 #1153

Draft
akuker wants to merge 2 commits into develop from akuker_pip_caching_develop

Member

akuker commented Apr 28, 2023

Updated easyinstall to download the SOURCE pip packages to $HOME/.pip_cache. The web startup script was updated to install the packages from this directory.
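A minimal sketch of the two halves described above, not the actual diff in this PR. The function names and the PIP_CACHE_DIR variable are hypothetical; the pip flags (`download --no-binary :all: --dest`, `install --no-index --find-links`) are standard pip options. One caveat to keep in mind: building source distributions offline may also require their build backends to be present in the cache directory.

```shell
#!/usr/bin/env bash
# Hypothetical names for illustration: PIP_CACHE_DIR, cache_pip_packages,
# install_from_cache. None of them are taken from the PR itself.
PIP_CACHE_DIR="${PIP_CACHE_DIR:-$HOME/.pip_cache}"

# Installer side (online): download source distributions into the cache dir.
cache_pip_packages() {
    local requirements="$1"
    mkdir -p "$PIP_CACHE_DIR" || return 1
    pip3 download --no-binary :all: --dest "$PIP_CACHE_DIR" -r "$requirements"
}

# Startup side (possibly offline): never contact an index, and resolve
# every requirement from the files already in the cache dir.
install_from_cache() {
    local requirements="$1"
    pip3 install --no-index --find-links "$PIP_CACHE_DIR" -r "$requirements"
}
```

In this sketch the installer would call `cache_pip_packages requirements.txt` once, and the web startup script would call `install_from_cache requirements.txt` on each start.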

Member Author

akuker commented Apr 28, 2023

@rdmark - I'm curious what you think about this change. With the start.sh changes, all of the pip packages will need to be downloaded by easyinstall.sh first. Is this going to be a problem?

We could potentially add some error handling: if the pip install from the cached dir fails, re-run the command and pull from the internet.
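The fallback idea could look roughly like this. It is only a sketch: `install_requirements` and PIP_CACHE_DIR are hypothetical names, and the real start.sh may structure its pip call differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper: try the local cache first, then fall back to the
# package index if the offline install fails for any reason.
install_requirements() {
    local requirements="$1"
    local cache_dir="${PIP_CACHE_DIR:-$HOME/.pip_cache}"

    if pip3 install --no-index --find-links "$cache_dir" -r "$requirements"; then
        return 0
    fi

    echo "Install from $cache_dir failed; retrying from the package index" >&2
    pip3 install -r "$requirements"
}
```

The function returns the exit status of whichever pip call ran last, so a caller can still detect the case where both attempts fail.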

Member

rdmark commented Apr 28, 2023

It's an interesting solution! I can see how we bypass the network instability factor this way.

A drawback is that the libraries won't ever be refreshed unless you run easyinstall again, but I guess that's acceptable?

Some of the libraries have C extensions that need to be compiled, right? How does this work on an RPi? Does pip3 kick off the compilation at runtime?

Member

rdmark commented Jun 4, 2023

After thinking about it for a while, I would argue that giving easyinstall.sh the responsibility to manage the caching of python libraries is a step in the wrong direction. It might make troubleshooting even hairier if, for example, the cached libraries end up in a bad state. Plus it constitutes feature creep for easyinstall. Let's leave the python lib management to pip. ;)

Just my 2 cents!

Member Author

akuker commented Jun 20, 2023

No objections by me! I'm going to cancel this PR for now. We can dust it off again later if needed.

akuker closed this Jun 20, 2023

Member

rdmark commented Oct 23, 2023

I want to try this again! Our users are endlessly struggling with python dependency hell. A hard lockdown to cached libs seems a more attractive option at this point.

rdmark reopened this Oct 23, 2023
rdmark force-pushed the akuker_pip_caching_develop branch from ae7e56e to 6dbef07 October 23, 2023 12:16

Contributor

uweseimet left a comment

On Linux there is already a shared default folder for cached data. Data are usually cached in $HOME/.cache/APPLICATION_NAME, e.g. $HOME/.cache/piscsi.
If possible, this convention should be followed. Not sure, though, whether this is applicable in this case. I wonder how other python applications deal with this.

rdmark marked this pull request as draft October 29, 2023 06:25

Member

rdmark commented Oct 29, 2023

Moved the cache dir to $HOME/.cache/piscsi. I think that's a fine convention.
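For reference, the $HOME/.cache convention corresponds to the XDG Base Directory default for cache data. A small sketch of resolving the directory while also honoring a user-set XDG_CACHE_HOME follows. Honoring XDG_CACHE_HOME is an assumption added here, not something the PR is known to do, and `piscsi_cache_dir` is a hypothetical function name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the cache directory, falling back to
# $HOME/.cache when XDG_CACHE_HOME is unset or empty.
piscsi_cache_dir() {
    printf '%s\n' "${XDG_CACHE_HOME:-$HOME/.cache}/piscsi"
}

# Possible usage in a script:
#   PIP_CACHE_DIR="$(piscsi_cache_dir)"
#   mkdir -p "$PIP_CACHE_DIR"
```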

The current blocker is that one dependent library has a C wrapper that can't be built from source with pip, so this method doesn't work fully out of the box. Some more research and testing on a real RPi is needed. I'm not targeting this for the upcoming "Bookworm" release.

cefiar commented Nov 16, 2023

For binary packages you can manually specify the platform(s) to download binaries for storage in the local file cache. You'd probably want to separate them out from the existing requirements.txt package list though.

eg: For amd64 it's python3 -m pip download --only-binary=:all: --platform linux_x86_64 SomePkg

There are other options that you may want to specify as well, such as python version, ABI, etc. See https://pip.pypa.io/en/stable/cli/pip_download/ - Scroll down to the examples, specifically example 6 for multiple platforms.
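A sketch of wrapping that command so the platform tag becomes a parameter. `download_binary_wheels` is a hypothetical name, and the `linux_armv7l` tag in the usage comment is an assumed example for a 32-bit RPi OS; the right tag for a given board and Python build would need to be checked on real hardware.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around "pip download" for prebuilt wheels.
# Usage: download_binary_wheels DEST_DIR PLATFORM_TAG [pip args/packages...]
download_binary_wheels() {
    local dest="$1" platform="$2"
    shift 2
    # --only-binary=:all: is needed because pip cannot build sdists for a
    # platform other than the one it is running on.
    python3 -m pip download \
        --only-binary=:all: \
        --platform "$platform" \
        --dest "$dest" \
        "$@"
}

# Example call (tag and package name are placeholders):
#   download_binary_wheels "$HOME/.cache/piscsi" linux_armv7l SomePkg
```

Extra options such as `--python-version` or `--abi` can be passed through after the platform tag, since all remaining arguments are forwarded to pip unchanged.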

Other notes on this:

  • May want to consider wrapping the pip pieces in their own functions in a separate shell file, then using source in bash to bring it into the easyinstall.sh and web/start.sh scripts (so it's all in one place).
  • Could wrap the download functions in a test to see if pypi.org is resolvable by DNS (eg: getent hosts pypi.org and check for an exit level not 0) before doing the download, to avoid pip taking ages to fail to connect when not online. This could allow you to try to download the latest files when online (eg: for dev tests) and fail through to the local cache when offline.
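Both notes could be combined along these lines. This is a sketch under assumptions: the file name pip_helpers.sh and the function names are hypothetical, and it only implements the "refresh when online, fall through to the local cache when offline" idea as described.

```shell
#!/usr/bin/env bash
# pip_helpers.sh (hypothetical file name), meant to be sourced by both
# easyinstall.sh and web/start.sh rather than executed directly.

# Succeeds only if pypi.org resolves via DNS; getent exits non-zero otherwise.
pypi_reachable() {
    getent hosts pypi.org > /dev/null 2>&1
}

# Refresh the cache when online, then always install from the cache only.
install_with_optional_refresh() {
    local cache_dir="$1" requirements="$2"

    if pypi_reachable; then
        pip3 download --dest "$cache_dir" -r "$requirements" \
            || echo "Download failed; using the existing cache" >&2
    fi

    pip3 install --no-index --find-links "$cache_dir" -r "$requirements"
}
```

Each script would then pull the helpers in with `. ./pip_helpers.sh` (path adjusted as needed) and call `install_with_optional_refresh "$HOME/.cache/piscsi" requirements.txt`.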

Member

rdmark commented Dec 21, 2025

Debian is packaging more and more of these python libraries. I think one path forward is to depend on Debian's packaging.

5 participants