fix: resolve Android/browser node startup and connection reliability issues #2

Open
imattau wants to merge 114 commits into master from vco-social

Conversation

@imattau
Owner

@imattau imattau commented Mar 17, 2026

Summary

  • Android node startup: sled 0.34 uses memory-mapped files (mmap), which fail on some Android filesystems. Added a DynKadStore enum that wraps either SledStore or kad::store::MemoryStore; it tries sled first and falls back to the in-memory store on failure, so the node always starts on Android.
  • Removed circuit relay reservation: Removed known_relay_peers HashSet and reservation code that was causing libp2p connection drops after ~60s.
  • Fixed "Node not responding" on startup: Emit Ready event (with peer ID) immediately after keypair load, before the swarm finishes building, so the UI gets the peer ID right away.
  • Browser identity persistence: Persist libp2p Ed25519 private key to localStorage under vco.libp2p_private_key so the browser node keeps the same peer ID across page reloads.
  • Browser relay auto-connect: Use stored relay address as fallback in NodeClient.connect() so relay is dialed on startup without user action.
  • Android dev port mismatch: Added scripts/android-dev.mjs which finds a free port, passes a --config JSON override to tauri android dev, and exports VITE_PORT. Added android:dev npm script. vite.config.ts now reads VITE_PORT for dynamic port binding.
  • Rust dead code: Replaced unused SyncStreamEvent enum with Infallible type alias.
  • Relay auto-dial on startup: SocialContext.tsx now auto-dials the relay address from settings when the node becomes ready.
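
To illustrate the browser identity persistence item, here is a minimal sketch of a load-or-create routine. Only the storage key `vco.libp2p_private_key` comes from this PR. The `KeyValueStorage` interface, the `loadOrCreateEncodedKey` name, and the injected `generateEncodedKey` callback are hypothetical, and the sketch assumes the key is stored as an already-encoded string.

```typescript
// Minimal subset of the Web Storage API, so the sketch also runs outside a browser.
interface KeyValueStorage {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Storage key named in the PR summary.
const PRIVATE_KEY_STORAGE_KEY = "vco.libp2p_private_key";

// Hypothetical helper: return the stored key if present, otherwise generate one,
// persist it, and return it. The encoding of the key string is left to the caller.
function loadOrCreateEncodedKey(
  storage: KeyValueStorage,
  generateEncodedKey: () => string,
): string {
  const stored = storage.getItem(PRIVATE_KEY_STORAGE_KEY);
  if (stored !== null && stored.length > 0) {
    return stored; // same key, hence same peer ID, across reloads
  }
  const fresh = generateEncodedKey();
  storage.setItem(PRIVATE_KEY_STORAGE_KEY, fresh);
  return fresh;
}
```

In a browser the first argument would be `window.localStorage`, which already satisfies the interface.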

Test plan

  • Run cargo check in src-tauri/ — passes clean
  • Start desktop app; verify peer ID appears in UI immediately on launch
  • Restart browser app; verify same peer ID is shown (identity persisted)
  • On Android device/emulator: verify node starts (no sled mmap panic in logcat)
  • Verify relay connection is established automatically on startup (no manual dial needed)
  • Run npm run android:dev and verify port is consistent between Vite and Android WebView

🤖 Generated with Claude Code

imattau added 30 commits March 2, 2026 12:42
- Implemented native Rust libp2p node with identity persistence, MDNS discovery, and Kademlia DHT.
- Integrated real DHT resolution and record publication for peer profiles and media chunks.
- Enhanced SocialContext with cryptographic envelope verification (signature/CID) and robust error handling.
- Finalized ComposePost with functional Emoji, Geolocation, and Hashtag tools.
- Secured application with restricted CSP in tauri.conf.json.
- Added comprehensive README.md with multi-platform build instructions (Desktop, Android, Web).
- Cleaned up legacy mocks and wired UI components to real swarm events.
- Switched to anyhow::Result for Send compatibility in start_node.
- Fixed invalid event permission identifiers in capabilities/default.json.
- Updated deprecated Kademlia config and cleaned up unused imports/code.
- Restored correct brace nesting in vco_node.rs.
imattau and others added 30 commits March 15, 2026 10:39
Scale down headings (text-3xl md:text-5xl), subtitles (text-base md:text-xl),
notification actor names (text-sm md:text-lg), and empty-state elements.
Add min-w-0/shrink-0 on flex children, reduce padding on mobile,
and use rounded-3xl md:rounded-[3rem] consistently.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…bility

adb logcat | grep VCO now captures these logs alongside Rust output.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Rust serde serializes struct fields as peer_id, network_load, channel_id
but TypeScript types expect peerId, networkLoad, channelId. This caused
peerId to always be undefined, keeping the UI stuck at "Node not responding"
even though the node was running and sending stats events.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
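
The mismatch described above can be reproduced in a few lines. This is only an illustration: other commits in this PR mention camelCase serde on the Rust side, so the real fix likely lives there, and the `snakeToCamel` and `camelizeKeys` helpers below are hypothetical, as is the example payload.

```typescript
// Convert one snake_case identifier to camelCase, e.g. "peer_id" -> "peerId".
function snakeToCamel(key: string): string {
  return key.replace(/_([a-z0-9])/g, (_match, ch: string) => ch.toUpperCase());
}

// Shallowly rename the keys of a plain object (nested objects are left untouched).
function camelizeKeys(payload: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(payload)) {
    out[snakeToCamel(key)] = value;
  }
  return out;
}

// Example payload shaped like the snake_case fields named in the commit message.
const rawStats: Record<string, unknown> = {
  peer_id: "example-peer-id",
  network_load: 0.2,
  channel_id: "general",
};

// Reading rawStats["peerId"] yields undefined, which is the symptom described above.
const normalizedStats = camelizeKeys(rawStats);
```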
The poll snapshot is taken synchronously after calling getStats() IPC,
before the async stats response arrives. This caused isReady to read
as false even when a valid peerId was already cached from a prior event.

Now isReady is true if client.isReady || (peerId is set and valid).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
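
The readiness rule above can be written as a small pure function. This is a sketch: the `ReadinessInputs` shape is hypothetical, and "valid" is assumed here to mean a non-empty string, which the commit message does not spell out.

```typescript
// Hypothetical snapshot of the two inputs the rule depends on.
interface ReadinessInputs {
  clientIsReady: boolean;
  cachedPeerId: string | null | undefined;
}

// Assumption: a peer ID counts as valid when it is a non-empty, non-blank string.
function isValidPeerId(peerId: string | null | undefined): peerId is string {
  return typeof peerId === "string" && peerId.trim().length > 0;
}

// isReady is true if the client reports ready OR a valid peer ID is already cached.
function computeIsReady(inputs: ReadinessInputs): boolean {
  return inputs.clientIsReady || isValidPeerId(inputs.cachedPeerId);
}
```

With this rule, a poll snapshot taken before the async stats response arrives still reads as ready as long as an earlier event cached a peer ID.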
Grid cells used absolute-positioned images but had no defined height,
causing them to collapse to 0px on mobile. Fixed by adding aspect-ratio
classes directly to each cell (aspect-video for single, aspect-square
for multi-image grids) so the absolute img has a non-zero parent height.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… concurrency

Social App (libp2p v0.54):
- Unify event protocol with snake_case tags and camelCase fields for seamless TS integration.
- Integrate circuit relay v2 client into SwarmBuilder for NAT traversal.
- Add descriptive 'dialing' and 'error' events to provide immediate UI feedback.
- Support biometric unlock on mobile platforms with platform-specific cfg wrapping.
- Implement robust mock networking in NodeClient for browser-based development.
- Improve Mesh UI with active connection tags and refined toast notifications.

Relay Node:
- Enhance server to handle multiple concurrent libp2p connections.
- Add /health endpoint and improved configuration management.
- Update stress tests and e2e suites for concurrent connection validation.
Covers SyncResponder (new), sync-handler pull-phase extension, and
delta-sync integration test with 3 test cases.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements Negentropy range-proof reconciliation on the relay side:
- SyncResponder: new server-role responder that hydrates from store,
  exchanges range proofs, identifies missing hashes, and streams delta
  envelopes to the client with a zero-length sentinel signalling EOF
- sync-handler: extended with a backward-compatible pull phase that
  peeks at the first frame to detect new-protocol vs legacy clients
- delta-sync.test.ts: 3 integration tests proving delta delivery,
  relay deduplication, and client completeness after sync

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
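
A zero-length sentinel frame can be sketched as follows, assuming the 4-byte big-endian length prefix that a later commit in this PR describes for the sync stream. The function names and the `DecodeResult` shape are hypothetical.

```typescript
const HEADER_BYTES = 4;

// Prefix a payload with its length as a 4-byte big-endian unsigned integer.
function encodeFrame(payload: Uint8Array): Uint8Array {
  const out = new Uint8Array(HEADER_BYTES + payload.length);
  new DataView(out.buffer).setUint32(0, payload.length, false);
  out.set(payload, HEADER_BYTES);
  return out;
}

// A frame with length 0 and no body signals end of stream.
const EOF_FRAME = encodeFrame(new Uint8Array(0));

interface DecodeResult {
  frames: Uint8Array[]; // complete payloads found before any sentinel
  done: boolean;        // true once the zero-length sentinel was seen
  rest: Uint8Array;     // undecoded trailing bytes (e.g. a partial frame)
}

// Decode as many complete frames as the buffer holds, stopping at the sentinel.
function decodeFrames(buf: Uint8Array): DecodeResult {
  const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
  const frames: Uint8Array[] = [];
  let offset = 0;
  while (buf.length - offset >= HEADER_BYTES) {
    const len = view.getUint32(offset, false);
    if (len === 0) {
      return { frames, done: true, rest: buf.subarray(offset + HEADER_BYTES) };
    }
    if (buf.length - offset - HEADER_BYTES < len) break; // body not fully received yet
    const start = offset + HEADER_BYTES;
    frames.push(buf.subarray(start, start + len));
    offset = start + len;
  }
  return { frames, done: false, rest: buf.subarray(offset) };
}
```

A reader would keep appending incoming bytes to `rest` and call `decodeFrames` again until `done` is true.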
Two root causes fixed:

1. server.ts: set inboundConnectionThreshold to maxConnections (default
   is 5 per remote host, which blocked all loopback clients beyond the
   5th — a relay must accept many connections from the same IP).

2. sync-handler.ts: treat "transport payload" errors as clean stream-end
   in both the pull-phase and push-phase catch blocks. The vco-transport
   channel throws "Stream ended before receiving a transport payload."
   when the remote closes without sending; this is normal for push-only
   sessions and should not be logged as an unexpected error.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
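
The second fix amounts to classifying one specific error as a normal end of stream. A minimal sketch, assuming the match is done on the "transport payload" substring quoted above; `isCleanStreamEnd` and `handleStreamError` are hypothetical names, not the functions in sync-handler.ts.

```typescript
// Substring of the message thrown when the remote closes without sending.
const CLEAN_END_MARKER = "transport payload";

// True when the error is the expected "remote closed without sending" case.
function isCleanStreamEnd(err: unknown): boolean {
  return err instanceof Error && err.message.includes(CLEAN_END_MARKER);
}

// Intended for a catch block: stay silent on a clean end, log anything else.
function handleStreamError(
  err: unknown,
  log: (message: string) => void,
): "clean-end" | "error" {
  if (isCleanStreamEnd(err)) {
    return "clean-end";
  }
  log(`unexpected sync stream error: ${String(err)}`);
  return "error";
}
```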
Tapping the ScanLine icon inside the multiaddr text field opens the
device camera via @tauri-apps/plugin-barcode-scanner (mobile) to scan
a QR code. The decoded multiaddr populates the input so the user can
review and tap Dial Peer. Desktop falls back to a graceful toast.

- Add @tauri-apps/plugin-barcode-scanner ^2.4.4 to package.json
- Register tauri-plugin-barcode-scanner in Cargo.toml and lib.rs (mobile-only)
- Add barcode-scanner:default to capabilities/default.json
- Add handleQrScan handler with permission/cancel/desktop error handling
- Wrap dial input in relative container; overlay ScanLine icon button

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Covers all 10 gaps identified by the relay delta-sync test, across
4 design sections: Rust stream handler, IPC protocol, TypeScript
bisect loop + VcoStore writes, and trigger/UX.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- R1: move sync_respond definition to Section 2 (IPC), remove duplicate from Section 1
- R2: add sync_sessions field to VcoNodeState so sync_respond can reach session map
- R3: document libp2p_stream::Control acquisition before swarm event loop
- R4: clarify getAllHeaderHashes returns Uint8Array[] with hex-decode inside method
- R5: fix fragile dial_success trigger — match against stored relayAddr, not /p2p/ substring
- R6: remove unused encodeEnvelopeProto import (push phase is out of scope)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
8-task TDD plan covering Rust stream handler, Tauri commands, TypeScript
bisect loop, VcoStore writes, SettingsView UX, and final verification.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Closes all 10 gaps identified in the delta-sync spec:

Rust (vco_node.rs, lib.rs):
- Add libp2p-stream to Cargo.toml; wire StreamBehaviour into VcoBehaviour
  on both mobile and desktop build paths
- Add sync_sessions: Mutex<HashMap<String, UnboundedSender<Vec<u8>>>> to
  VcoNodeState alongside swarm_tx
- Add NodeCommand::SyncWithRelay { relay_addr, session_id }
- Add NodeEvent variants: SyncSessionReady, SyncFrame, SyncComplete,
  SyncError (all with camelCase serde)
- Implement stream open sequence: parse peer_id from multiaddr, dial relay,
  open /vco/sync/3.2.0 stream via libp2p_stream::Control, register session,
  emit SyncSessionReady
- Implement 4-byte BE length-prefix frame read loop; emit SyncFrame per
  frame; zero-length sentinel triggers SyncComplete; IO errors emit SyncError
- Implement write half: drain write_rx channel into stream write half
- Add sync_with_relay and sync_respond Tauri commands in lib.rs; register
  both in invoke_handler!; initialise sync_sessions in state setup

TypeScript (NodeClient.ts):
- Extend NodeEvent union with sync_session_ready/sync_frame/sync_complete/
  sync_error members
- Add AsyncQueue<T> helper to bridge event-driven IPC into sequential await
- Add public fields: syncInProgress, lastSyncAt, relayAddr (from localStorage)
- Add syncWithRelay(relayAddr): guards concurrency, invokes sync_with_relay,
  awaits sync_session_ready (10 s timeout), calls _runBisectLoop
- Add _runBisectLoop(sessionId): direct port of runClientDeltaSync from the
  relay delta-sync test — uses SyncRangeProofProtocol + computeRangeFingerprint
  from @vco/vco-sync; calls VcoStore.storeEnvelope on each received envelope;
  emits synthetic envelope events for live UI updates
- Wire sync_frame events into per-session AsyncQueue in handleEvent
- Auto-trigger syncWithRelay on dial_success when event.addr matches relayAddr
- Write gossipsub envelopes to VcoStore.storeEnvelope('pending') in handleEvent

TypeScript (VcoStore.ts):
- Add headerHash?: string (hex) to StoredEnvelope interface
- Add getAllHeaderHashes(): reads all envelopes and returns Uint8Array[] for
  computeRangeFingerprint
- Add storeEnvelope(env, syncStatus): single write path that encodes envelope
  proto, builds StoredEnvelope with headerHash populated, calls putEnvelope

UI (SettingsView.tsx):
- Add relay address text input wired to localStorage vco.relay_addr and
  NodeClient.relayAddr in memory
- Add "Sync now" button (visible when isReady && relayAddr non-null),
  disabled with spinner while syncInProgress
- Add last-sync status line showing relative time ("Last synced N minutes ago"
  or "Never synced")

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
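
The AsyncQueue<T> helper mentioned above bridges pushed events into sequential awaits. The following is one plausible shape for such a helper, not the code from NodeClient.ts; the method names `push` and `next` are assumptions, and the 10 s timeout is left out for brevity.

```typescript
// Unbounded FIFO queue: producers call push(), a consumer awaits next().
class AsyncQueue<T> {
  private items: T[] = [];
  private waiters: Array<(value: T) => void> = [];

  // Deliver to the oldest waiting consumer if there is one, otherwise buffer.
  push(item: T): void {
    const waiter = this.waiters.shift();
    if (waiter) {
      waiter(item);
    } else {
      this.items.push(item);
    }
  }

  // Resolve immediately from the buffer, or wait for the next push().
  next(): Promise<T> {
    if (this.items.length > 0) {
      return Promise.resolve(this.items.shift() as T);
    }
    return new Promise<T>((resolve) => {
      this.waiters.push(resolve);
    });
  }

  // Number of buffered items not yet consumed.
  get size(): number {
    return this.items.length;
  }
}
```

An event handler would call `queue.push(frame)` for each sync_frame event, while the bisect loop runs `const frame = await queue.next()` in order.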
Reposts/boosts were silently dropped from the feed when the original
post arrived in a previous sync session and was absent from the current
in-memory batch.

- VcoStore: bump DB_VERSION to 4 and add a `by_header_hash` IDB index
  on the `envelopes` object store; add `getEnvelopeByCid(cidHex)` for
  efficient point-lookup by headerHash.
- FeedProcessor.process(): accept an optional `extraPostsByCid` map that
  is merged into the batch-local post cache before processing, keeping
  the method pure/synchronous.
- SocialContext: in `handleInboundEnvelope`, detect incoming repost
  envelopes and pre-seed `extraPostsByCid` via an async VcoStore lookup
  before calling FeedProcessor, so cross-session originals resolve.
- Tests: update VcoStore DB version assertion; add
  "repost of an envelope from a prior batch resolves from VcoStore".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
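
The `extraPostsByCid` idea can be shown with a reduced model: the caller does the async lookup, and the processing step stays pure and synchronous. The `Post`, `Repost`, and `FeedItem` types and the `resolveReposts` function are hypothetical simplifications, not the FeedProcessor API.

```typescript
interface Post { cidHex: string; text: string; }
interface Repost { originalCidHex: string; by: string; }
interface FeedItem { by: string; original: Post; }

// Resolve reposts against the current batch plus any pre-seeded originals.
function resolveReposts(
  batchPosts: Post[],
  reposts: Repost[],
  extraPostsByCid: ReadonlyMap<string, Post> = new Map<string, Post>(),
): FeedItem[] {
  // Start from the pre-seeded posts, then let batch-local posts take precedence.
  const cache = new Map<string, Post>(extraPostsByCid);
  for (const post of batchPosts) {
    cache.set(post.cidHex, post);
  }
  const items: FeedItem[] = [];
  for (const repost of reposts) {
    const original = cache.get(repost.originalCidHex);
    if (original !== undefined) {
      items.push({ by: repost.by, original });
    }
    // Reposts whose original is unknown are skipped here.
  }
  return items;
}
```

Without the third argument, a repost of a post from an earlier session finds nothing in the batch and is dropped, which matches the bug described above.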
Removed libp2p-stream v0.4.0-alpha which required libp2p-swarm v0.47.0,
incompatible with libp2p v0.54's libp2p-swarm v0.45.1. Replaced the
stream_control.open_stream() API with a direct tokio::net::TcpStream
connection extracted from the relay multiaddr, preserving the same
bidirectional framed read/write sync session behaviour.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Upgrade Android Gradle Plugin from 8.5.1 to 8.6.0 and compileSdk/targetSdk
from 34 to 35 in all three Gradle files (build.gradle.kts, app/build.gradle.kts,
buildSrc/build.gradle.kts) to satisfy androidx.camera 1.5.1 requirements
pulled in by tauri-plugin-barcode-scanner.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…bView crash

@vco/vco-sync and @vco/vco-transport import Node.js-only APIs (node:crypto,
Buffer, libp2p native modules) that do not exist in the Android WebView
renderer. Statically importing them at module load time caused a bundle-time
crash that produced a white screen before React could render.

Fix: convert the top-level static imports of SyncRangeProofProtocol,
computeRangeFingerprint (from @vco/vco-sync) and decodeEnvelopeProto
(from @vco/vco-core) into dynamic imports inside the methods that use them
(_runBisectLoop and handleEvent). Also externalize @vco/vco-sync and
@vco/vco-transport in vite.config.ts so Rollup never attempts to bundle
their Node.js-only dependency chains into the WebView bundle at all.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
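
The deferred-loading pattern can be sketched with a small memoizing wrapper. The `lazy` helper is hypothetical, and a stand-in loader replaces the real `() => import("@vco/vco-sync")` call so the example runs without that package.

```typescript
// Wrap a loader so it runs at most once, on first use rather than at module load.
function lazy<T>(loader: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = loader();
    }
    return cached;
  };
}

// Stand-in for a heavy module; in the app the loader would be a dynamic import().
let loadCount = 0;
const loadHeavyModule = lazy(async () => {
  loadCount += 1;
  return { name: "stand-in module" };
});

// Nothing is loaded until a method that needs the module calls the wrapper.
async function methodThatNeedsModule(): Promise<string> {
  const mod = await loadHeavyModule();
  return mod.name;
}
```

Because the import happens inside the method, a platform that never calls it never evaluates the module.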
CAMERA permission was missing from AndroidManifest.xml and
handleQrScan() called scan() directly without checking or requesting
camera permission first. Add the manifest declaration and gate the
scan behind checkPermissions()/requestPermissions().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- vco-relay server.ts: prefer LAN TCP addresses in /address recommended
  field so QR codes encode a universally connectable address on local
  networks (not QUIC/UDP or WebSocket)
- SettingsView.tsx: QR scan now populates both dialAddr and relayAddr so
  the relay sync field is populated immediately after scanning
- vco_node.rs: replace raw TcpStream-from-user-multiaddr approach with
  swarm dial + pending_syncs map; ConnectionEstablished extracts the
  actual confirmed TCP address from the endpoint, enabling QUIC transport
  negotiation while still opening the framed sync stream to the real addr

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- test audit:
  - vco-crypto: added tests for X25519 and AES-GCM.
  - vco-core: added edge case tests for envelope validation and ZKP.
  - vco-social: implemented E2EE, DB error handling, and biometric security tests.
  - vco-sync: added negentropy-adapter tests and fixed orchestrator tests.
  - improved test stability by standardizing on node environment with robust polyfills.

- security remediation:
  - KeyringService: increased PBKDF2 iterations to 600,000 with automatic migration.
  - E2EEService: implemented HKDF-SHA256 for secure symmetric key derivation.
  - BiometricService: removed insecure plaintext password storage in localStorage.
  - vco-crypto: added HKDF-SHA256 and SHA-256 support.
  - verified absence of common vulnerabilities (eval, XSS, insecure communication).
- Introduced Platform interface and StandardPlatform to centralize environment-specific dependencies (Tauri, localStorage, crypto, etc.).
- Refactored core services (VcoStore, KeyringService, BiometricService, NodeClient) to use the Platform abstraction.
- Implemented robust schema decoding in FeedProcessor.process using explicit schema checks.
- Standardized social schema URIs to v1 across constants and tests.
- Developed MockPlatform and refactored all major vco-social tests to use it, eliminating dozens of manual polyfills and mocks.
- Updated VcoStore.storeEnvelope to support explicit channelId for better testability.
Rust backend:
- Replace raw TcpStream sync transport with libp2p substream
  (/vco/sync/1.0.0 protocol via SyncStreamBehaviour) — sync now runs
  over the encrypted/muxed swarm connection, preventing premature idle
  timeout that caused the UI to show disconnected after sync
- Configure ping keep-alive at 30s interval (below 60s idle timeout)
  so relay connection persists between syncs
- Emit stats event on ConnectionClosed so UI reflects disconnection
  immediately without waiting for the next poll tick

TypeScript:
- NodeClient: emit sync_error + zero-length sentinel on bad sync frames
  (was silently swallowed); move sessionQueues.delete to finally block
- NodeClient: replace stale synthetic stats emission in syncWithRelay
  finally block with getStats() for authoritative state
- NetworkService: remove synchronous snapshot() from poll tick — was
  racing against async getStats() response and pushing empty connections
- FeedProcessor: normalize CID reads with fromHex (was atob on hex);
  add repostedOriginalCids dedup so reposts don't appear twice
- FeedView: use toHex(item.cid) for React keys (was Uint8Array.toString)
- SocialContext: normalize all putEnvelope CID writes to hex (was base64);
  fix FeedProcessor.process() argument order (followingSet was receiving
  creatorIdHex string); refresh frozen placeholder display names when
  profiles resolve; fix loadMoreFeed pagination to use storage-time cursor

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
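
The CID normalization above relies on hex helpers along these lines. The names `toHex` and `fromHex` appear in the commit message, but the bodies below are a generic sketch rather than the project's implementation.

```typescript
// Encode bytes as lowercase hex, two characters per byte.
function toHex(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i++) {
    out += bytes[i].toString(16).padStart(2, "0");
  }
  return out;
}

// Decode a hex string; rejects odd lengths and non-hex characters.
function fromHex(hex: string): Uint8Array {
  if (hex.length % 2 !== 0 || !/^[0-9a-fA-F]*$/.test(hex)) {
    throw new Error(`invalid hex string: "${hex}"`);
  }
  const out = new Uint8Array(hex.length / 2);
  for (let i = 0; i < out.length; i++) {
    out[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return out;
}
```

Using one encoding for both writes and reads keeps store lookups and React keys consistent, for example `key={toHex(item.cid)}`.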
…issues

- Remove circuit relay reservation code that caused libp2p connection drops
- Emit peer ID in Ready event immediately on keypair load (before swarm
  starts) so UI no longer shows "Node not responding" on startup
- Persist libp2p Ed25519 private key in localStorage so browser identity
  survives page reload (key: vco.libp2p_private_key)
- Auto-connect to relay on browser startup using stored relay address
- Add DynKadStore enum wrapping SledStore or kad::MemoryStore: sled 0.34
  uses mmap which fails on some Android filesystems; DynKadStore::open()
  tries sled first and falls back gracefully to in-memory store so the
  node always starts on Android
- Replace SyncStreamEvent enum with Infallible to fix dead code warning
- Add scripts/android-dev.mjs: finds a free port, passes --config JSON
  override to tauri android dev, and exports VITE_PORT to fix dev port
  mismatch between Vite and the Android WebView
- Read VITE_PORT env var in vite.config.ts for dynamic port binding
- Auto-dial relay from settings on SocialContext startup

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
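
Reading VITE_PORT for the dev server can be sketched as a small validated lookup. The `resolveDevPort` name is hypothetical, and the 1420 fallback is an assumption based on the usual Tauri template default, not a value confirmed by this PR.

```typescript
// Assumed fallback when VITE_PORT is not set (common Tauri template default).
const DEFAULT_DEV_PORT = 1420;

// Parse VITE_PORT from an env-like object, falling back when it is unset or blank.
function resolveDevPort(
  env: Record<string, string | undefined>,
  fallback: number = DEFAULT_DEV_PORT,
): number {
  const raw = env["VITE_PORT"];
  if (raw === undefined || raw.trim() === "") {
    return fallback;
  }
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`VITE_PORT must be an integer in 1..65535, got "${raw}"`);
  }
  return port;
}
```

In vite.config.ts this could feed `server: { port: resolveDevPort(process.env), strictPort: true }`, so Vite fails fast instead of silently picking a different port than the one the Android WebView was told to use.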
…kService

- Update Services.test.ts to use event handler pattern matching new
  NetworkService.startPolling implementation (event-driven via onEvent)
- Add Swarm.test.ts test for invalid multiaddress format error in browser mode
- Broaden error toast in SettingsView to show all error events, not just dial errors
- Include @libp2p/websockets dependency in package-lock.json

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>