
feat(docker): add Docker distribution with GHCR CI pipeline#10

Open
anderskev wants to merge 31 commits into main from feat/docker-distribution

@anderskev
Member

Summary

Add Docker-based distribution so users can run Pearl with docker compose up — no Elixir/Erlang toolchain required. Includes a multi-stage Dockerfile, GitHub Actions CI for building and pushing images to GHCR, and numerous robustness fixes discovered during Docker integration testing.

Changes

Added

  • Multi-stage Dockerfile (build + minimal debian-slim runtime) with non-root user and healthcheck
  • pearl service in docker-compose.yml alongside existing db service
  • GitHub Actions workflow for Docker image build and push to GHCR on main/tags
  • Pearl.Release module for runtime database creation and migrations
  • Docker entrypoint script with start, migrate, eval, and remote commands
  • Environment variable seeding for LLM provider settings on first boot
  • .dockerignore to keep build context lean
  • Node.js 22 LTS and Claude Code CLI in runtime image for Claude provider support
  • Claude CLI config volume mount (~/.claude) for authentication
  • Docker quickstart section in README

Changed

  • Decoupled wiki generation from LiveView lifecycle — tasks now survive page navigation via PubSub
  • Bumped Docker base image to Elixir 1.19.5 / OTP 28.3.1
  • runtime.exs now supports PHX_SCHEME and PHX_URL_PORT for flexible URL config

Fixed

  • LLM JSON parsing when responses include conversational preamble before JSON
  • Embedding pipeline aborts after 3 consecutive failures instead of retrying all batches
  • OpenRouter embedding requests retry with exponential backoff on transient errors
  • FK violation race condition during repo deletion while embedding is in progress
  • Earmark warnings from malformed LLM markdown output
  • Task.Supervisor.start_child compatibility across Elixir/OTP versions
  • Friendly error message when Claude CLI is not installed
  • Asset compilation order (Elixir compile before assets.deploy for colocated hooks)

Motivation

Pearl requires Elixir, Erlang, Node.js, and PostgreSQL with pgvector — a non-trivial setup for users who just want to try it. A single docker compose up with an .env file for API keys removes that barrier entirely.

Testing

  • Manual testing: docker compose up from clean state, full clone → index → wiki generation flow
  • Unit tests added for Pearl.Release module
  • Unit tests added for JSON extraction fallback in wiki generator
  • Existing test suite passes (mix test)

Manual Testing Steps

  1. Clone the repo and create a .env with OPENROUTER_API_KEY=sk-...
  2. Run docker compose up
  3. Visit http://localhost:4000
  4. Clone a repository and generate a wiki

Generated with Claude Code

anderskev and others added 27 commits February 11, 2026 18:43
Outlines approach for distributing Pearl as a pre-built GHCR Docker
image so users without hex.pm access can run it via `docker compose up`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Used by the Docker entrypoint to run Ecto migrations at container
startup without requiring the Mix toolchain.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two-stage build: Elixir/Node.js compilation stage produces a BEAM
release, runtime stage uses debian-slim with only the release binary
and runtime dependencies. Runs as non-root user with healthcheck.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds the pearl application service alongside the existing db service.
Configures DATABASE_URL, SECRET_KEY_BASE, LLM provider env vars, and
volume mounts for repo storage. Service waits for db health check.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Updates application.ex to read LLM_PROVIDER, LLM_MODEL, and
EMBEDDING_MODEL env vars at startup and seed Pearl.Settings when
values are still at defaults. Also updates runtime.exs to support
PHX_SCHEME and PHX_URL_PORT for flexible Docker URL configuration.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Builds on pushes to main and version tags, pushes to ghcr.io.
PR builds validate the Dockerfile without pushing. Uses Docker
Buildx with GitHub Actions cache for faster rebuilds.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Prevents local Docker compose overrides from being committed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds a Quick Start section for running Pearl via Docker with no
local toolchain required. Documents LLM provider configuration
via env vars and .env files, and data persistence via volumes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The phoenix-colocated module is generated during mix compile and must
be available before esbuild runs. Reorders Dockerfile build steps to
compile the application first, then install npm deps and deploy assets.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add Pearl.Release.create_db/0 to create the database if it doesn't
  exist, called by the entrypoint before migrations
- Make seed_settings_from_env async with error handling so it doesn't
  block application startup if the database is temporarily unavailable
- Update entrypoint to call create_db before migrate

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Mount ~/.claude into the container at /app/.claude (read-only) to
enable the Claude Code CLI provider when running via Docker Compose.
The mount path matches the pearl user's home directory (/app) set
in the Dockerfile.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Dockerfile uses bin/docker-entrypoint.sh instead. This file was
a leftover from an earlier implementation step.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
MFA tuples passed to start_child with link: true caused
BadFunctionError on Elixir 1.18/OTP 27 in Docker. Anonymous
functions work across all versions.

Co-Authored-By: Claude <noreply@anthropic.com>
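
A minimal sketch of the anonymous-function form described in the commit above. The supervisor here is an unnamed one started just for the example (the PR itself uses Pearl.TaskSupervisor), and the message shape `{:done, result}` is an assumption made for illustration.

```elixir
# Start a throwaway Task.Supervisor for the example (hypothetical; not Pearl's).
{:ok, sup} = Task.Supervisor.start_link()

parent = self()

# Pass a zero-arity anonymous function instead of an {module, function, args} tuple.
{:ok, pid} =
  Task.Supervisor.start_child(sup, fn ->
    send(parent, {:done, 1 + 1})
  end)

# Wait briefly for the task to report back.
receive do
  {:done, result} -> IO.puts("task #{inspect(pid)} finished with #{result}")
after
  1_000 -> IO.puts("task did not report back in time")
end
```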
Switch from Earmark.as_html!/2 to as_html/2 so unclosed code fences
in generated wiki content log warnings in the return tuple instead
of printing to stderr.

Co-Authored-By: Claude <noreply@anthropic.com>
Check repo existence before each batch insert to prevent FK violation
when a user deletes a repo while the indexing pipeline is running.

Co-Authored-By: Claude <noreply@anthropic.com>
Match local dev environment to avoid version-specific behavior
differences in the Docker build.

Co-Authored-By: Claude <noreply@anthropic.com>
The existing repo-existence check in process_batch/3 reduces but does
not eliminate the race window between checking and inserting. Wrap
Repo.insert_all in a try/rescue so a foreign_key_violation returns
{:error, :repo_deleted} instead of crashing the Task.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Transient errors (429, 5xx) from OpenRouter during embedding would
fail the entire batch.  Retry up to 3 times with exponential backoff
before giving up.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
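
The retry behaviour described in this commit can be sketched roughly as follows. The module name, the `{:http, status}` error shape, and the 10 ms base delay are assumptions for illustration, not the values used in Pearl's OpenRouter provider.

```elixir
defmodule BackoffSketch do
  # Hypothetical limits; the real provider defines its own constants.
  @max_retries 3
  @base_delay_ms 10

  # `fun` is any zero-arity function returning {:ok, value} or {:error, reason}.
  def with_retry(fun, attempt \\ 0) do
    case fun.() do
      {:ok, value} ->
        {:ok, value}

      {:error, reason} ->
        if retryable?(reason) and attempt < @max_retries do
          Process.sleep(delay_ms(attempt))
          with_retry(fun, attempt + 1)
        else
          {:error, reason}
        end
    end
  end

  # Delay doubles on each attempt: 10, 20, 40 ms with the base above.
  def delay_ms(attempt), do: @base_delay_ms * Integer.pow(2, attempt)

  # Only rate limits (429) and server errors (5xx) are treated as transient.
  defp retryable?({:http, status}), do: status == 429 or status in 500..599
  defp retryable?(_other), do: false
end
```

With these settings a persistently failing call is tried four times in total (the first attempt plus three retries) before the error is returned.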
Switches from Enum.reduce to Enum.reduce_while so that 3 consecutive
embedding batch failures halt the pipeline early instead of retrying
every remaining batch.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
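
A rough sketch of the early-halt idea, assuming a hypothetical `embed_fun` that returns `:ok` or `{:error, reason}` per batch. The accumulator layout and return values are illustrative and do not mirror Pearl's pipeline.

```elixir
defmodule BatchSketch do
  @max_consecutive_failures 3

  # Processes batches in order; stops as soon as three failures occur in a row.
  def run(batches, embed_fun) do
    batches
    |> Enum.reduce_while({0, 0}, fn batch, {done, failures} ->
      case embed_fun.(batch) do
        :ok ->
          # A success resets the consecutive-failure counter.
          {:cont, {done + 1, 0}}

        {:error, _reason} when failures + 1 >= @max_consecutive_failures ->
          {:halt, {:aborted, done}}

        {:error, _reason} ->
          {:cont, {done, failures + 1}}
      end
    end)
    |> case do
      {:aborted, done} -> {:error, {:too_many_failures, done}}
      {done, _failures} -> {:ok, done}
    end
  end
end
```

Unlike a plain Enum.reduce, the remaining batches are never attempted once the halt tuple is returned.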
The catch-all format_error rendered :cli_not_found as a cryptic
inspect string.  Add a dedicated clause that tells the user to install
the CLI or switch provider.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
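
For illustration, a dedicated clause ahead of a catch-all might look like the sketch below. The module name and the message wording are assumptions; the actual text shown to users is not quoted in this PR.

```elixir
defmodule ErrorSketch do
  # Specific clause first, so it wins over the catch-all below.
  def format_error(:cli_not_found) do
    "Claude CLI is not installed. Install it or switch to another provider."
  end

  # Catch-all: falls back to an inspect string for unknown reasons.
  def format_error(reason), do: "Unexpected error: #{inspect(reason)}"
end

IO.puts(ErrorSketch.format_error(:cli_not_found))
IO.puts(ErrorSketch.format_error({:timeout, 5000}))
```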
Adds Node.js 22 LTS and @anthropic-ai/claude-code to the runtime
stage so the ClaudeCode provider can find the claude binary on PATH
when running inside Docker.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Switch from linked Task processes with trap_exit to PubSub-based
progress reporting. Generation tasks now survive page navigation and
LiveView disconnects. On application startup, reset any repos stuck
in an in-progress status to "failed" so they can be retried.

Also hardens Claude Code CLI port handling with Port.monitor and
safe_close, and adds ANTHROPIC_API_KEY passthrough and init: true
to docker-compose for proper signal handling.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
parse_structure_response/1 failed when the LLM ignored the JSON-only
prompt instruction and returned text like "I'll analyze this
repository..." before the JSON object. Added a regex fallback that
extracts the outermost {...} when the initial parse fails.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@anderskev anderskev added the enhancement New feature or request label Feb 13, 2026
@coderabbitai

coderabbitai Bot commented Feb 13, 2026

Walkthrough

Adds a Docker-first distribution and CI: a multi-stage Dockerfile, pearl/.dockerignore, bin/docker-entrypoint.sh, docker-compose.yml with a new pearl service and pearl_repos volume, and a GitHub Actions workflow to build and push images to ghcr.io. Updates README with a Docker quick start and LLM provider/env examples. Introduces Pearl.Release (create_db/migrate/rollback), runtime URL env config, reset_orphaned_repos, asynchronous env-based settings seeding, provider retry/robustness improvements, resilient RAG batching, PubSub-based LiveView progress, and tests for release and wiki JSON parsing.

🚥 Pre-merge checks | ✅ 3 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 72.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the main change: adding Docker distribution with a GHCR CI pipeline, which is the primary focus of this PR. |
| Description check | ✅ Passed | The description is comprehensive and directly related to the changeset, covering the Docker implementation, robustness fixes, testing, and motivation. |
| Merge Conflict Detection | ✅ Passed | ✅ No merge conflicts detected when merging into main |


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@docker-compose.yml`:
- Around line 40-42: The existing volume mount ~/.claude:/app/.claude only
covers older Claude CLI layouts; update docker-compose to support newer
XDG-style config paths by either adding additional read-only mounts such as
~/.config/claude:/app/.config/claude:ro and/or
~/.config/claude-code:/app/.config/claude-code:ro alongside the current
~/.claude:/app/.claude:ro, or set the CLAUDE_CONFIG_DIR environment variable
(CLAUDE_CONFIG_DIR=/app/.claude) in the service definition so the CLI is
explicitly pointed to the mounted config directory regardless of version.

In `@docs/plans/2026-02-11-docker-distribution.md`:
- Around line 30-43: Update the document to reflect the actual versions used:
replace the base image string `hexpm/elixir:1.15.7-erlang-26.2.5` with the
implemented `hexpm/elixir:1.19.5-erlang-28.3.1` (or mark it as configurable),
and change references to "Node.js 24" to "Node.js 22 LTS" (or note that Node is
22 in the Dockerfile and that Node version may vary), ensuring the strings
`hexpm/elixir:1.15.7-erlang-26.2.5` and `"Node.js 24"` from the plan are updated
or annotated to match the Dockerfile implementation.

In `@pearl/lib/pearl/repositories/repositories.ex`:
- Around line 27-45: The reset_orphaned_repos function uses DateTime.utc_now()
when updating the RepoRecord.updated_at field, causing an Ecto type mismatch
because the schema uses naive_datetime timestamps; replace DateTime.utc_now()
with NaiveDateTime.utc_now() in the Repo.update_all call (in the function
reset_orphaned_repos) so the updated_at value matches the RepoRecord
timestamps() naive_datetime type.
🧹 Nitpick comments (11)
pearl/lib/pearl/providers/openrouter.ex (1)

172-182: Consider adding jitter to the exponential backoff.

The current implementation uses pure exponential backoff. Adding randomized jitter helps avoid thundering herd problems when multiple processes retry simultaneously after a transient failure.

♻️ Optional: Add jitter to backoff delay
  defp maybe_retry_embed(texts, model, key, attempt, reason) do
    if attempt < @embed_max_retries do
-     delay = @embed_base_delay_ms * Integer.pow(2, attempt)
+     base_delay = @embed_base_delay_ms * Integer.pow(2, attempt)
+     jitter = :rand.uniform(div(base_delay, 2))
+     delay = base_delay + jitter

      Logger.warning(
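
As a standalone sketch of the jitter calculation suggested above: `:rand.uniform/1` requires an argument of at least 1, so this version clamps the jitter range in case the base delay is very small. The module and function names are hypothetical; `base_ms` and `attempt` stand in for the module attribute and argument in the diff.

```elixir
defmodule JitterSketch do
  # Exponential delay plus up to 50% random jitter.
  def delay_ms(base_ms, attempt) do
    base_delay = base_ms * Integer.pow(2, attempt)

    # Clamp to 1 so :rand.uniform/1 never receives 0.
    jitter = :rand.uniform(max(div(base_delay, 2), 1))

    base_delay + jitter
  end
end

IO.inspect(JitterSketch.delay_ms(1_000, 2))
```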
pearl/lib/pearl/providers/claude_code.ex (1)

208-240: Consider adding port monitoring to the streaming path.

The synchronous path now handles unexpected port death via :DOWN messages, but the streaming path in chat_stream/3 would block for up to 10 minutes in next_stream_chunk/1 if the port dies unexpectedly. You could carry a monitor ref in the stream state and handle :DOWN in next_stream_chunk/1 to fail fast.

This is a nice-to-have since the timeout provides a safety net, and adding monitoring to Stream.resource adds complexity.

pearl/lib/pearl/wiki/generator.ex (1)

83-88: Consider using a non-greedy regex for more precise JSON extraction.

The greedy .+ pattern will match from the first { to the last } in the text. For most LLM outputs with a single JSON object, this works fine. However, if the LLM outputs something like Here is option 1: {} and option 2: {"pages": [...]}, the regex would capture everything between the first and last brace.

A non-greedy pattern or balanced brace matching would be more robust:

♻️ Optional: Use non-greedy pattern
 defp extract_json_object(text) do
-  case Regex.run(~r/\{.+\}/s, text) do
+  case Regex.run(~r/\{.+?\}/s, text) do
     [match] -> match
     _ -> nil
   end
 end

Note: Non-greedy won't handle nested braces correctly either. For truly robust extraction, you'd need balanced brace counting, but the current greedy approach is likely sufficient for typical LLM outputs since the outermost braces of valid JSON will be captured.
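
A minimal sketch of the balanced-brace counting mentioned in the note, assuming the goal is to return the first complete `{...}` span. It tracks string literals and backslash escapes so that braces inside JSON strings are ignored. The module and function names are hypothetical, and in the "option 1: {}" case it would still return the empty object, so it is not a drop-in replacement for extract_json_object/1.

```elixir
defmodule BraceSketch do
  # Returns the first balanced {...} span in `text`, or nil if there is none.
  def first_object(text) do
    case :binary.match(text, "{") do
      {start, _len} ->
        rest = binary_part(text, start, byte_size(text) - start)
        scan(rest, 0, 0, false)

      :nomatch ->
        nil
    end
  end

  # Walks the binary byte by byte; JSON structural characters are all ASCII,
  # so multi-byte UTF-8 sequences simply fall through to the default clause.
  defp scan(bin, pos, depth, in_str) when pos < byte_size(bin) do
    case {:binary.at(bin, pos), in_str} do
      # Inside a string, a backslash escapes the next byte: skip both.
      {?\\, true} -> scan(bin, pos + 2, depth, true)
      # A quote toggles string state.
      {?", _} -> scan(bin, pos + 1, depth, not in_str)
      {?{, false} -> scan(bin, pos + 1, depth + 1, false)
      # Closing the outermost brace ends the span.
      {?}, false} when depth == 1 -> binary_part(bin, 0, pos + 1)
      {?}, false} -> scan(bin, pos + 1, depth - 1, false)
      _ -> scan(bin, pos + 1, depth, in_str)
    end
  end

  # Ran off the end without closing the outermost brace.
  defp scan(_bin, _pos, _depth, _in_str), do: nil
end

IO.puts(BraceSketch.first_object(~s(Intro {"a": {"b": "}"}} tail)))
```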

pearl/lib/pearl_web/live/home_live.ex (1)

36-36: Consider using LiveView streams for the repos collection.

The @repos assign is a regular list. Per coding guidelines, LiveView collections should use streams to avoid memory ballooning. While the current implementation works, converting to streams would improve scalability for users with many repositories.

♻️ Suggested approach for streams

In mount/3:

-     repos: repos,
+     socket
+     |> stream(:repos, repos)

Then update references throughout the template and event handlers to use stream operations like stream_insert/3 for updates and stream_delete/3 for removals.

As per coding guidelines: "Always use LiveView streams for collections instead of assigning regular lists to avoid memory ballooning and runtime termination".

pearl/test/pearl/release_test.exs (1)

11-16: Consider testing rollback behavior more thoroughly.

The current test only verifies the function exists. While avoiding actual rollbacks in the test database is reasonable, you could test the function with a mock or at least verify it returns the expected tuple shape when called on an already-at-target version.

💡 Optional: Test rollback returns expected tuple
   describe "rollback/2" do
-    test "accepts repo and version" do
-      # Just verify it doesn't crash with current version
-      # We don't actually roll back in tests, just check the function exists
-      assert is_function(&Pearl.Release.rollback/2, 2)
+    test "returns ok tuple when rolling back to current version" do
+      # Rolling back to version 0 is a no-op that returns the expected tuple shape
+      assert {:ok, _, _} = Pearl.Release.rollback(Pearl.Repo, 0)
     end
   end
docker-compose.yml (1)

25-26: Consider documenting the insecure default more prominently.

The fallback SECRET_KEY_BASE is intentionally insecure for local development convenience. While the value is clearly marked as needing change, users running in production might miss this. Consider adding a comment in docker-compose.yml or ensuring the README's .env example is more prominent.

💡 Optional: Add inline comment
     environment:
       DATABASE_URL: "ecto://postgres:postgres@db:5432/pearl_prod"
+      # ⚠️ CHANGE THIS for production! Generate with: mix phx.gen.secret
       SECRET_KEY_BASE: "${SECRET_KEY_BASE:-super-secret-key-base-that-is-at-least-64-bytes-long-for-production-use-please-change}"
pearl/lib/pearl/application.ex (2)

56-58: The 2-second sleep may be unnecessary.

Since Pearl.Repo is started in the supervision tree before this code runs, the Repo should already be available. The arbitrary delay could mask timing issues rather than solve them properly. Consider removing it or replacing with a more deterministic check.

♻️ Alternative: Remove sleep or use retry logic
   defp seed_settings_from_env do
     Task.start(fn ->
-      # Brief delay to let Repo connections stabilize
-      Process.sleep(2_000)
-
       env_to_setting = [

If there's a genuine race condition, consider retry logic instead:

defp seed_settings_from_env do
  Task.start(fn ->
    seed_with_retry(3)
  end)
end

defp seed_with_retry(0), do: :ok
defp seed_with_retry(attempts) do
  # ... seeding logic ...
rescue
  _ -> 
    Process.sleep(500)
    seed_with_retry(attempts - 1)
end

82-86: Blanket error suppression may hide legitimate issues.

Catching all exceptions and exits silently could mask configuration or database problems. Consider logging at debug level when seeding fails so operators can diagnose issues.

💡 Optional: Log failures at debug level
             rescue
-              _ -> :ok
+              e -> 
+                require Logger
+                Logger.debug("Failed to seed #{setting_key} from #{env_var}: #{inspect(e)}")
             catch
-              :exit, _ -> :ok
+              :exit, reason -> 
+                require Logger
+                Logger.debug("Failed to seed #{setting_key} from #{env_var}: exit #{inspect(reason)}")
             end
Dockerfile (2)

24-31: Add --no-install-recommends to build stage apt-get commands.

While the build stage is discarded, adding --no-install-recommends speeds up builds by avoiding unnecessary packages.

🐳 Proposed fix
 # Install build dependencies
 RUN apt-get update -y && \
-    apt-get install -y build-essential git curl && \
+    apt-get install -y --no-install-recommends build-essential git curl && \
     apt-get clean && rm -rf /var/lib/apt/lists/*

 # Install Node.js 22 LTS (needed for npm deps used by esbuild/tailwind)
 RUN curl -fsSL https://deb.nodesource.com/setup_22.x | bash - && \
-    apt-get install -y nodejs && \
+    apt-get install -y --no-install-recommends nodejs && \
     apt-get clean && rm -rf /var/lib/apt/lists/*

99-100: Pin the Claude Code CLI version for reproducible builds.

Using @latest means builds may produce different results over time as the CLI is updated. Pin to a specific version instead.

🐳 Proposed fix
 # Install Claude Code CLI globally
-RUN npm install -g @anthropic-ai/claude-code@latest
+# Pin version for reproducible builds; update periodically
+RUN npm install -g @anthropic-ai/claude-code@1.0.100
docs/plans/2026-02-11-docker-distribution.md (1)

136-142: Add language identifier to fenced code block.

Per markdownlint, fenced code blocks should specify a language for proper syntax highlighting.

📝 Proposed fix
-```
+```text
 _build/
 deps/
 .git/
 node_modules/
 .elixir_ls/
 ```

anderskev and others added 3 commits February 12, 2026 21:59
The cache-to export is the only output for PR builds (push: false),
so a transient 502 from GitHub's cache API fails the entire build.
Adding ignore-error=true makes cache write failures non-fatal.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
DateTime.utc_now() doesn't dump correctly for Ecto :naive_datetime
fields, which would raise CastError at runtime when resetting
orphaned repos on startup.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
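
To make the distinction concrete, the snippet below builds both kinds of timestamp in plain Elixir. It assumes, as I understand Ecto's `:naive_datetime` type, that the field wants a NaiveDateTime without microseconds, hence the truncation; no Ecto call is made here.

```elixir
# Zone-less timestamp, truncated to whole seconds.
naive = NaiveDateTime.utc_now() |> NaiveDateTime.truncate(:second)

# Zone-aware timestamp, which is a different struct with time zone fields.
aware = DateTime.utc_now()

IO.inspect(naive.__struct__)
IO.inspect(aware.__struct__)
IO.inspect(Map.has_key?(naive, :time_zone))
IO.inspect(Map.has_key?(aware, :time_zone))
```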
Set CLAUDE_CONFIG_DIR so the mounted ~/.claude volume works regardless
of CLI version. Update the design doc to reflect actual Elixir 1.19.5,
OTP 28.3.1, and Node.js 22 LTS versions used in the Dockerfile.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@docs/plans/2026-02-11-docker-distribution.md`:
- Around line 88-90: The design doc's volumes section shows the
host-to-container mount as "~/.claude:/root/.claude:ro" but the implementation
uses "~/.claude:/app/.claude:ro"; update the design doc to match the actual
mount used by the compose file by changing the mount path to
"~/.claude:/app/.claude:ro" (or vice versa if you intend to change the
implementation), ensuring the volumes entry and any references to
"/root/.claude" vs "/app/.claude" are consistent with the deployed
docker-compose configuration.
🧹 Nitpick comments (1)
docs/plans/2026-02-11-docker-distribution.md (1)

134-142: Add language specifier to fenced code block.

The .dockerignore content block should specify a language for consistent formatting.

📝 Suggested fix
 ### .dockerignore

-```
+```text
 _build/
 deps/
 .git/
 node_modules/
 .elixir_ls/
 ```

Comment on lines +88 to +90
volumes:
- ~/.claude:/root/.claude:ro
- pearl_repos:/app/repos

⚠️ Potential issue | 🟡 Minor

Design doc shows different mount path than implementation.

The design document shows ~/.claude:/root/.claude:ro but the actual docker-compose.yml mounts to /app/.claude:

# Actual implementation in docker-compose.yml
- ~/.claude:/app/.claude:ro

This should be updated for consistency with the implementation.

📝 Suggested fix
    volumes:
-     - ~/.claude:/root/.claude:ro
+     - ~/.claude:/app/.claude:ro
      - pearl_repos:/app/repos

- Wrap generation in try/rescue/catch to return structured generation_crashed errors.
- Load the app explicitly with Application.load(@app) in create_db.

Iteration: 1

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pearl/lib/pearl_web/live/home_live.ex (1)

28-39: ⚠️ Potential issue | 🟡 Minor

Keep generating derived from active work.
With multiple in-progress repos, setting generating to false when one finishes can re-enable the form while others are still running. Consider recomputing generating from progress_by_repo (or repo statuses) whenever progress updates or completes.
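
One way to read this suggestion, sketched with a plain map standing in for the socket assigns: drop the finished repo from progress_by_repo and recompute the flag from whatever is left. The module and function names are hypothetical.

```elixir
defmodule ProgressSketch do
  # Removes a finished repo and derives `generating` from the remaining entries.
  def finish(assigns, repo_id) do
    progress = Map.delete(assigns.progress_by_repo, repo_id)
    %{assigns | progress_by_repo: progress, generating: map_size(progress) > 0}
  end
end

assigns = %{progress_by_repo: %{1 => "Cloning...", 2 => "Analyzing..."}, generating: true}

# Repo 1 finishes while repo 2 is still running: the flag stays true.
after_first = ProgressSketch.finish(assigns, 1)
IO.inspect(after_first.generating)

# Repo 2 finishes as well: nothing is active, so the flag flips to false.
after_second = ProgressSketch.finish(after_first, 2)
IO.inspect(after_second.generating)
```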

🤖 Fix all issues with AI agents
In `@pearl/lib/pearl_web/live/home_live.ex`:
- Around line 14-26: The repos assign is a regular list causing memory growth;
convert it to a LiveView stream using stream/3 and replace list operations with
stream_insert/3 and stream_delete/3: initialize the stream with
stream(:repos, repos) (use in_progress_statuses and default_progress_message to
seed the stream), update template to use phx-update="stream" and remove
length(@repos) logic, and replace usages of prepend_or_update_repo/2 and
update_repo_in_list/3 in all handle_event and handle_info callbacks with the
appropriate stream_* calls (use stream_insert/3 to add/prepend and also to
modify an existing entry, since re-inserting an item with the same id updates
it; there is no stream_update function; use stream_delete/3 to remove).
Ensure you still subscribe to Phoenix.PubSub for in_progress_repos and derive
progress_by_repo from the stream entries rather than a separate list.
- Around line 236-254: When Task.Supervisor.start_child(Pearl.TaskSupervisor, fn
-> generate_wiki_task(repo_id, repo) end) returns {:error, reason}, you need to
remove the "Starting..." entry for repo_id from the progress_by_repo map so the
UI spinner and disabled delete button are cleared; update the error branch that
currently does {:noreply, assign(socket, generating: false, error:
format_error(reason))} to also update progress_by_repo (using repo_id) to delete
that key before returning, mirroring the cleanup performed in other completion
paths.

In `@pearl/lib/pearl/release.ex`:
- Around line 60-61: The release helper currently uses
Application.ensure_all_started/1 in load_app which boots the whole app; change
load_app to call Application.load(`@app`) instead (matching the pattern used in
create_db) so migrations use Ecto.Migrator.with_repo/2 without starting the full
supervision tree; update the load_app function to reference Application.load and
keep repos/ and create_db as-is.

Comment on lines +14 to +26
in_progress_statuses = ~w(pending cloning analyzing generating)
in_progress_repos = Enum.filter(repos, &(&1.status in in_progress_statuses))

if connected?(socket) do
  for repo <- in_progress_repos do
    Phoenix.PubSub.subscribe(Pearl.PubSub, "repo:progress:#{repo.id}")
  end
end

progress_by_repo =
  for repo <- in_progress_repos, into: %{} do
    {repo.id, default_progress_message(repo.status)}
  end

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's examine the full home_live.ex file to understand the context
cat -n pearl/lib/pearl_web/live/home_live.ex | head -100

Repository: existential-birds/pearl

Length of output: 3969


🏁 Script executed:

# Also check the file size to understand scope
wc -l pearl/lib/pearl_web/live/home_live.ex

Repository: existential-birds/pearl

Length of output: 108


🏁 Script executed:

# Let's get the complete file
cat -n pearl/lib/pearl_web/live/home_live.ex

Repository: existential-birds/pearl

Length of output: 22402


Convert repos collection to LiveView streams.

The repos list is dynamically updated throughout the component (prepended, updated, and deleted), yet it's assigned as a regular list. This can cause memory ballooning when many repositories accumulate. Switch to using stream/3, stream_insert/3, and stream_delete/3 instead.

When converting, also update the template to:

  • Remove length(@repos) check and track count with a separate assign
  • Change for {repo, idx} <- Enum.with_index(@repos) to use stream iteration with phx-update="stream" on the parent container

Replace the helper functions prepend_or_update_repo/2 and update_repo_in_list/3 with appropriate stream operations across all handle_event and handle_info callbacks.


Comment on lines +236 to 254
Phoenix.PubSub.subscribe(Pearl.PubSub, "repo:progress:#{repo_id}")

# Fetch metadata in parallel (runs independently of LiveView)
Task.Supervisor.start_child(
  Pearl.TaskSupervisor,
  fn -> fetch_metadata_task(repo_id, repo) end
)

# Main generation task (runs independently of LiveView)
case Task.Supervisor.start_child(
       Pearl.TaskSupervisor,
       fn -> generate_wiki_task(repo_id, repo) end
     ) do
  {:ok, _pid} ->
    {:noreply, socket}

  {:error, reason} ->
    Process.exit(metadata_pid, :shutdown)
    {:noreply, assign(socket, generating: false, error: format_error(reason))}
end

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

wc -l pearl/lib/pearl_web/live/home_live.ex

Repository: existential-birds/pearl

Length of output: 108


🏁 Script executed:

cat -n pearl/lib/pearl_web/live/home_live.ex | head -100

Repository: existential-birds/pearl

Length of output: 3969


🏁 Script executed:

# Look for the specific section and surrounding context
cat -n pearl/lib/pearl_web/live/home_live.ex | sed -n '200,280p'

Repository: existential-birds/pearl

Length of output: 3194


🏁 Script executed:

# Search for progress_by_repo usage throughout the file
rg "progress_by_repo" pearl/lib/pearl_web/live/home_live.ex -A 2 -B 2

Repository: existential-birds/pearl

Length of output: 1873


🏁 Script executed:

# Check if there's a template file for this LiveView
fd "home_live" pearl/lib/pearl_web/live/ --type f

Repository: existential-birds/pearl

Length of output: 104


🏁 Script executed:

cat -n pearl/lib/pearl_web/live/home_live.ex | sed -n '290,340p'

Repository: existential-birds/pearl

Length of output: 1957


🏁 Script executed:

cat -n pearl/lib/pearl_web/live/home_live.ex | sed -n '140,165p'

Repository: existential-birds/pearl

Length of output: 1564


🏁 Script executed:

cat -n pearl/lib/pearl_web/live/home_live.ex | sed -n '340,360p'

Repository: existential-birds/pearl

Length of output: 818


Clear progress state when the generation task fails to start.

If Task.Supervisor.start_child/2 returns {:error, reason}, progress_by_repo retains the "Starting..." entry, leaving the spinner visible and the delete button disabled. Remove the entry from the map in the error branch to match the cleanup pattern used in other completion paths.

Suggested fix
              {:error, reason} ->
-                {:noreply, assign(socket, generating: false, error: format_error(reason))}
+                progress_by_repo = Map.delete(socket.assigns.progress_by_repo, repo_id)
+
+                {:noreply,
+                 assign(socket,
+                   generating: false,
+                   progress_by_repo: progress_by_repo,
+                   error: format_error(reason)
+                 )}
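
The cleanup in the error branch can be checked in isolation. This is a minimal sketch, assuming a plain map stands in for `socket.assigns`; the module name `ProgressCleanupSketch` and the function `on_start_error/3` are hypothetical, and `inspect/1` stands in for the PR's `format_error/1`.

```elixir
defmodule ProgressCleanupSketch do
  # Hypothetical stand-in for the {:error, reason} branch. It works on a plain
  # map rather than a LiveView socket so it can run without Phoenix.
  def on_start_error(assigns, repo_id, reason) do
    %{
      assigns
      | generating: false,
        # Drop the stale "Starting..." entry for this repo only.
        progress_by_repo: Map.delete(assigns.progress_by_repo, repo_id),
        # Assumption: inspect/1 replaces the real format_error/1 helper.
        error: inspect(reason)
    }
  end
end

assigns = %{
  generating: true,
  error: nil,
  progress_by_repo: %{1 => "Starting...", 2 => "Embedding..."}
}

updated = ProgressCleanupSketch.on_start_error(assigns, 1, :max_children)
IO.inspect(updated.progress_by_repo)
# prints %{2 => "Embedding..."}
```

Only the failed repo's entry is removed, so progress for other repositories stays visible.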
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-Phoenix.PubSub.subscribe(Pearl.PubSub, "repo:progress:#{repo_id}")
-# Fetch metadata in parallel (runs independently of LiveView)
-Task.Supervisor.start_child(
-  Pearl.TaskSupervisor,
-  fn -> fetch_metadata_task(repo_id, repo) end
-)
-# Main generation task (runs independently of LiveView)
-case Task.Supervisor.start_child(
-       Pearl.TaskSupervisor,
-       fn -> generate_wiki_task(repo_id, repo) end
-     ) do
-  {:ok, _pid} ->
-    {:noreply, socket}
-  {:error, reason} ->
-    Process.exit(metadata_pid, :shutdown)
-    {:noreply, assign(socket, generating: false, error: format_error(reason))}
-end
+Phoenix.PubSub.subscribe(Pearl.PubSub, "repo:progress:#{repo_id}")
+# Fetch metadata in parallel (runs independently of LiveView)
+Task.Supervisor.start_child(
+  Pearl.TaskSupervisor,
+  fn -> fetch_metadata_task(repo_id, repo) end
+)
+# Main generation task (runs independently of LiveView)
+case Task.Supervisor.start_child(
+       Pearl.TaskSupervisor,
+       fn -> generate_wiki_task(repo_id, repo) end
+     ) do
+  {:ok, _pid} ->
+    {:noreply, socket}
+  {:error, reason} ->
+    progress_by_repo = Map.delete(socket.assigns.progress_by_repo, repo_id)
+    {:noreply,
+     assign(socket,
+       generating: false,
+       progress_by_repo: progress_by_repo,
+       error: format_error(reason)
+     )}
+end
🤖 Prompt for AI Agents
In `@pearl/lib/pearl_web/live/home_live.ex` around lines 236 - 254, When
Task.Supervisor.start_child(Pearl.TaskSupervisor, fn ->
generate_wiki_task(repo_id, repo) end) returns {:error, reason}, you need to
remove the "Starting..." entry for repo_id from the progress_by_repo map so the
UI spinner and disabled delete button are cleared; update the error branch that
currently does {:noreply, assign(socket, generating: false, error:
format_error(reason))} to also update progress_by_repo (using repo_id) to delete
that key before returning, mirroring the cleanup performed in other completion
paths.

Comment on lines +60 to +61
defp repos, do: Application.fetch_env!(@app, :ecto_repos)
defp load_app, do: Application.ensure_all_started(@app)
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n pearl/lib/pearl/release.ex

Repository: existential-birds/pearl

Length of output: 1961


🌐 Web query:

Ecto.Migrator.with_repo Application.load vs Application.ensure_all_started release tasks

💡 Result:

When you run migrations from a release (e.g. bin/my_app eval "MyApp.Release.migrate"), the key difference is:

  • Application.load(:my_app) loads your app’s .app metadata and configuration without starting its supervision tree. This is why Phoenix’s official “release migrations” snippet calls Application.load/1 before reading :ecto_repos from config. Starting is intentionally avoided so you don’t boot the Endpoint/web server just to migrate. [1][2]

  • Application.ensure_all_started(:my_app) loads and starts your app and all dependencies. In a Phoenix app that typically means it starts your full supervision tree (often including the Endpoint), which you usually don’t want during “migration-only” commands unless you’ve explicitly designed a “minimal start” mode. Phoenix docs explicitly call out using ensure_all_started/1 only when you want to start the whole app, and suggest a “minimal” flag approach if you need it. [1]

How this interacts with Ecto.Migrator.with_repo/3:

  • Ecto.Migrator.with_repo/3 already ensures the repo is started for migrations: it starts the applications required to run the repo (and can temporarily start/stop the repo if it wasn’t already running). So in typical release migration tasks, you often only need Application.load(:my_app) (to discover repos/config), then with_repo. [2]

Practical rule of thumb for release tasks:

  • Migration/rollback task: Application.load(@app) + Ecto.Migrator.with_repo(...) (optionally start only specific deps like :ssl if your DB connection needs it). [1][2]
  • Tasks that truly require the running app (jobs, caches, pubsub, etc.): Application.ensure_all_started(@app) (ideally with a “minimal mode” so you don’t start the Endpoint). [1]

Sources: Phoenix “Deploying with Releases” guide (shows Application.load/1 for migrations and explains when to use ensure_all_started/1) [1], Ecto SQL Ecto.Migrator.with_repo/3 docs (explains it starts required apps / repo lifecycle) [2].


Replace Application.ensure_all_started/1 with Application.load/1 in release tasks.

Application.ensure_all_started/1 boots the entire supervision tree (Endpoint, jobs, etc.), which is unnecessary for migrations and can block container startup. Ecto.Migrator.with_repo/2 automatically starts the repo when needed. The create_db/0 function already uses the correct pattern with Application.load/1 on line 21; apply the same fix to load_app/0 on line 61 to maintain consistency and avoid starting unwanted services during migration.

✅ Suggested fix
-defp load_app, do: Application.ensure_all_started(@app)
+defp load_app, do: Application.load(@app)
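
For context, a migration-only release task following the `Application.load/1` pattern might look like the sketch below. It is modeled on the Phoenix "Deploying with Releases" snippet cited above, not on the actual `Pearl.Release` module; the module name `ReleaseSketch` and the app name `:pearl` are assumptions, and running `migrate/0` requires `ecto_sql` and a configured repo.

```elixir
defmodule ReleaseSketch do
  # Assumption: the OTP application is named :pearl.
  @app :pearl

  # Runs all pending migrations for every configured repo.
  def migrate do
    load_app()

    for repo <- repos() do
      # with_repo/2 starts the repo (and the apps it needs) for the duration
      # of the function, then stops it again if it was not already running.
      {:ok, _, _} = Ecto.Migrator.with_repo(repo, &Ecto.Migrator.run(&1, :up, all: true))
    end
  end

  defp repos, do: Application.fetch_env!(@app, :ecto_repos)

  # Loads the app spec and config only; no supervision tree is started.
  defp load_app, do: Application.load(@app)
end
```

Because only the repo is started, the Endpoint and background workers stay down while migrations run.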
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-defp repos, do: Application.fetch_env!(@app, :ecto_repos)
-defp load_app, do: Application.ensure_all_started(@app)
+defp repos, do: Application.fetch_env!(@app, :ecto_repos)
+defp load_app, do: Application.load(@app)
🤖 Prompt for AI Agents
In `@pearl/lib/pearl/release.ex` around lines 60 - 61, The release helper
currently uses Application.ensure_all_started/1 in load_app which boots the
whole app; change load_app to call Application.load(@app) instead (matching the
pattern used in create_db) so migrations use Ecto.Migrator.with_repo/2 without
starting the full supervision tree; update the load_app function to reference
Application.load and keep repos/ and create_db as-is.

Labels

enhancement New feature or request
