GIS-sys/DockerLLMNoInternet


About

Repository for running LLMs via Docker, especially on machines without Internet access (for example, a remote server reachable only over a local network or via a USB drive)

Preparations:

Install Docker on the system without Internet access: for example, follow the instructions on the official website

Pipeline

Build

Go to the root of this repository and run:

docker build --progress=plain -t fips-llm .

You may have to prefix the command with sudo (i.e. "sudo docker build ...") for this to succeed. Be warned that the build downloads a large amount of data

You need to rerun this only if you change one of the following:

  • Dockerfile
  • environment.yaml
  • requirements.txt
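Whether any of these three files changed since the last build can be checked mechanically. A minimal sketch (a hypothetical helper, not part of this repository) that stores a checksum of the trigger files and only asks for a rebuild when it changes:

```shell
# needs_rebuild: hypothetical helper that decides whether `docker build`
# must be rerun, by hashing the three files the image depends on and
# comparing against a stamp left by the previous check.
needs_rebuild() {
    stamp=".build-stamp"
    current=$(cat Dockerfile environment.yaml requirements.txt 2>/dev/null | sha256sum | cut -d' ' -f1)
    previous=$(cat "$stamp" 2>/dev/null)
    if [ "$current" != "$previous" ]; then
        echo "$current" > "$stamp"
        echo "rebuild needed"    # then run: docker build --progress=plain -t fips-llm .
    else
        echo "image up to date"
    fi
}
```

Run needs_rebuild from the repository root: it prints "rebuild needed" on the first call and after any edit to the three files, and "image up to date" otherwise.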

Run

On a machine with access to the internet:

docker run --rm -v ./src/code:/app/run -v ./src/model_cache:/app/model_cache -v ./src/huggingface:/root/.cache/huggingface -p 8080:8080 --gpus all fips-llm python -u -c 'from ai import Models; print(Models.process(model_name='\''Qwen/Qwen3-0.6B'\'', max_new_tokens=32768, prompt='\''Answer shortly: what is 2+2*2?'\''))'

On a machine without the internet:

docker run --rm -v ./src/code:/app/run -v ./src/model_cache:/app/model_cache -v ./src/huggingface:/root/.cache/huggingface -p 8080:8080 --gpus all fips-llm /bin/bash -c "HF_HUB_OFFLINE=1 python -u -c 'from ai import Models; print(Models.process(model_name='\''Qwen/Qwen3-0.6B'\'', max_new_tokens=32768, prompt='\''Answer shortly: what is 2+2*2?'\''))'"

To run just a web server:

docker run --rm -v ./src/code:/app/run -v ./src/model_cache:/app/model_cache -v ./src/huggingface:/root/.cache/huggingface -p 8080:8080 --gpus all fips-llm /bin/bash -c "HF_HUB_OFFLINE=1 uvicorn main:app --host 0.0.0.0 --port 8080"

On Windows you might need to use backslash paths for the mounts, changing ./src/... to .\src\..., like this:

docker run --rm -v .\src\code:/app/run -v .\src\model_cache:/app/model_cache -v .\src\huggingface:/root/.cache/huggingface -p 8080:8080 --gpus all fips-llm python -u -c 'from ai import Models; print(Models.process(model_name="""Qwen/Qwen3-0.6B""", max_new_tokens=32768, prompt="""Answer shortly: what is 2+2*2?"""))'

If you run the container without an explicit command, it will launch a web server for interactive use. Go to http://localhost:8080
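Loading a model can take a while, so scripts that talk to the server may want to poll until it answers. A minimal sketch, assuming curl is available and that the root path responds once the server is up (the function name and retry count are illustrative):

```shell
# wait_for_server: poll a URL until it answers, so a script can block
# until the web server inside the container is ready.
wait_for_server() {
    url="$1"
    tries="${2:-30}"    # how many one-second attempts before giving up
    i=0
    while [ "$i" -lt "$tries" ]; do
        if curl -fsS "$url" >/dev/null 2>&1; then
            echo "server up"
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    echo "server did not come up" >&2
    return 1
}
```

For example, `wait_for_server http://localhost:8080/ 60` waits up to about a minute.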

You need to rerun the online step only if a change to main.py downloads something new into one of the volumes (such as src/model_cache/ for models or src/huggingface/ for the transformers cache)

Package

Docker image

If you have run docker build in this iteration, save the Docker image:

docker save -o fips-llm.tar fips-llm

Transfer the fips-llm.tar file to the target machine and load it there:

docker load -i fips-llm.tar
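Large archives can be silently truncated in transit, and docker load then fails with a confusing error. A hedged sketch of a checksum-verified load (the helper name is illustrative, and the docker load line is commented out so the snippet is safe to dry-run): on the source machine run `sha256sum fips-llm.tar > fips-llm.tar.sha256`, copy both files, then on the target:

```shell
# verify_and_load: check the image archive survived the transfer intact
# before handing it to Docker. Expects ARCHIVE and ARCHIVE.sha256 side by side.
verify_and_load() {
    archive="$1"
    if sha256sum -c "$archive.sha256" >/dev/null 2>&1; then
        echo "checksum ok"
        # docker load -i "$archive"   # uncomment on the real target machine
    else
        echo "archive corrupted in transfer" >&2
        return 1
    fi
}
```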

Volumes

Only transfer the volume folders if their contents changed. You don't need to rebuild the Docker image each time the volumes change

Main script

Only transfer this if main.py was changed. You don't need to rebuild the Docker image each time main.py changes

In general

You could just package and deliver the whole src/ folder, but it will be very large, so choose wisely
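For example, if only the code changed, you could archive src/ while leaving out the heavy cache folders. A sketch using GNU tar's --exclude (the first two lines just create a demo layout so the snippet is self-contained; in the real repository they are unnecessary):

```shell
# Package only the lightweight parts of src/ for transfer, skipping the
# heavy model_cache/ and huggingface/ volumes.
mkdir -p src/code src/model_cache src/huggingface   # demo layout only
touch src/code/ai.py src/model_cache/weights.bin     # demo files only
tar --preserve-permissions -czf code-only.tar.gz \
    --exclude='src/model_cache' \
    --exclude='src/huggingface' \
    src/
tar -tzf code-only.tar.gz    # inspect what actually went into the archive
```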

Useful info

  • Use https://transfer.it/start for transferring huge files and folders between computers. BEWARE that this doesn't preserve symlinks

  • To preserve symlinks, use archives to save folder:

    tar --preserve-permissions -czvf src.tar.gz src/

    and to extract that later on the remote machine:

    tar -xzvf src.tar.gz
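You can convince yourself that the tar round trip keeps symlinks intact by running it on throwaway paths (the demo/ directory and file names below are made up for the demonstration):

```shell
# Demonstrate that archiving and extracting with tar preserves a symlink.
mkdir -p demo/src
echo "weights" > demo/src/model.bin
ln -sf model.bin demo/src/latest.bin              # relative symlink inside the folder
tar --preserve-permissions -czf demo/src.tar.gz -C demo src
mkdir -p demo/out
tar -xzf demo/src.tar.gz -C demo/out
[ -L demo/out/src/latest.bin ] && echo "symlink preserved"
```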
