🛠️ CausalFlow Development Log

"Coding is not just about writing lines of code, it's about composing a symphony of logic."

This log documents the journey of building CausalFlow, a Bayesian Network Workbench, from concept to a fully functional interactive platform.

📅 Timeline & Milestones

Phase 1: The Foundation - Data Ingestion

(Focus: Robustness & Validation)

The first challenge was to ensure the system starts with high-quality data. We adopted a "Strict Validator" approach.

  • Action: Implemented DataProcessor in Python with rigorous checks using Pandas.
  • Key Decision: Rejected any column with more than 15 unique values, treating it as continuous, to keep the model a discrete Bayesian network that stays interpretable. This "fail-fast" strategy surfaces the problem at upload time instead of producing confusing results later.
  • Outcome: A solid backend endpoint /api/upload_csv that returns structured metadata (columns, states).
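To make the "Strict Validator" rule concrete, here is a minimal sketch of the check, assuming the threshold of 15 unique values from the log. The function name `validate_discrete`, the error messages, and the shape of the returned metadata are illustrative assumptions, not the actual DataProcessor code.

```python
import io

import pandas as pd

MAX_STATES = 15  # threshold from the log; anything above is treated as continuous


def validate_discrete(df: pd.DataFrame, max_states: int = MAX_STATES) -> dict:
    """Return column/state metadata, or raise ValueError on the first bad column.

    Hypothetical helper: the real DataProcessor may be structured differently.
    """
    if df.empty:
        raise ValueError("CSV contains no rows")
    states = {}
    for column in df.columns:
        unique_values = df[column].dropna().unique()
        if len(unique_values) > max_states:
            # Fail fast: reject the upload instead of silently binning values.
            raise ValueError(
                f"Column '{column}' has {len(unique_values)} unique values "
                f"(limit {max_states}); it looks continuous"
            )
        states[column] = sorted(str(value) for value in unique_values)
    return {"columns": list(df.columns), "states": states}


# Usage with an in-memory CSV (made-up sample data).
sample_csv = "grass,sprinkler\nwet,on\ndry,off\nwet,off\n"
metadata = validate_discrete(pd.read_csv(io.StringIO(sample_csv)))
```

Raising on the first offending column keeps the error message specific, which fits the fail-fast intent described above.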

Phase 2: The Skeleton - Network Construction

(Focus: Interactivity & Graph Theory)

Visualizing the Directed Acyclic Graph (DAG) was crucial. We chose React Flow for its flexibility.

  • Action: Built the FlowEditor component allowing users to drag-and-drop nodes.
  • Challenge: Preventing cycles in the graph (Bayesian Networks must be Acyclic).
  • Solution: Implemented a cycle detection algorithm in the backend. Every time a user connects two nodes, the backend validates the edge before confirming.
  • Feature: Added "Auto Learn" using pgmpy's HillClimbSearch to suggest a structure for users who don't know where to start.
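The edge validation can be sketched as a reachability check: adding an edge from `source` to `target` creates a cycle exactly when `source` is already reachable from `target`. The function name and the edge-list representation below are assumptions for illustration; the log does not say which algorithm the backend actually uses.

```python
from collections import defaultdict


def would_create_cycle(edges, source, target):
    """Return True if adding source -> target to the DAG in `edges` creates a cycle.

    `edges` is a list of (parent, child) pairs. Hypothetical helper, not the
    project's real validator.
    """
    if source == target:
        return True  # a self-loop is the smallest possible cycle

    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    # Iterative depth-first search from `target`, looking for `source`.
    stack, seen = [target], set()
    while stack:
        node = stack.pop()
        if node == source:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children[node])
    return False


# Usage: A -> B -> C already exists.
existing = [("A", "B"), ("B", "C")]
closes_loop = would_create_cycle(existing, "C", "A")  # C -> A would close the loop
shortcut_ok = would_create_cycle(existing, "A", "C")  # A -> C is just a shortcut
```

Checking only the new edge avoids re-validating the whole graph on every connection, since the graph was acyclic before the edit.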

Phase 2.5: Bridging the Gap - Hybrid Modeling

(Focus: User Experience)

We realized users might not always have a CSV ready, or might want to add hypothetical variables.

  • Action: Implemented "Manual Node Creation" in the UI.
  • UX Improvement: Designed a sidebar that dynamically switches between "Upload Mode" and "Build Mode", ensuring the "Add Node" button is always accessible without cluttering the interface.
  • Debug: Fixed z-index issues where the graph edges would overlap the sidebar, ensuring a polished, layered UI.

Phase 3: The Brain - Inference Engine

(Focus: Computation & Visualization)

This phase is the heart of the project: making the math "come alive".

  • Action: Integrated pgmpy's VariableElimination for exact inference.
  • Visualization: Built custom React Flow nodes that use Recharts to display each variable's probability distribution inside the node.
  • Innovation: The "Four-Color State System":
    • Gray: No data.
    • Blue: Prior probabilities (the baseline).
    • Red: Evidence (what we observed).
    • Green: Posterior (how the world changed).
  • Result: Users can click any bar to set evidence and watch the posteriors update across the entire network in real time.
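To show what moves a node from the prior to the posterior state, here is a hand-rolled sketch of exact inference by enumeration on a two-node network (Rain → WetGrass). This is not pgmpy's VariableElimination, which does the same job efficiently on larger graphs; the network, the probabilities, and the function name are all made up for illustration.

```python
# Hypothetical network: Rain -> WetGrass, with made-up probabilities.
P_RAIN = {"yes": 0.2, "no": 0.8}  # prior P(Rain)
P_GRASS_GIVEN_RAIN = {            # conditional table P(WetGrass | Rain)
    "yes": {"wet": 0.9, "dry": 0.1},
    "no": {"wet": 0.1, "dry": 0.9},
}


def posterior_rain(observed_grass):
    """Return P(Rain | WetGrass = observed_grass) via Bayes' rule."""
    # Unnormalized joint P(Rain = r, WetGrass = observed) for each state r.
    joint = {
        rain: P_RAIN[rain] * P_GRASS_GIVEN_RAIN[rain][observed_grass]
        for rain in P_RAIN
    }
    evidence_probability = sum(joint.values())  # P(WetGrass = observed)
    return {rain: p / evidence_probability for rain, p in joint.items()}


# Setting evidence WetGrass = "wet" raises belief in rain above its 0.2 prior.
posterior = posterior_rain("wet")
```

In terms of the color system above, `P_RAIN` is what a node would show in the prior state, the observed WetGrass node is the evidence, and `posterior` is what the Rain node would show once that evidence is set.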

UI Polish & Final Touches

(Focus: Aesthetics & Usability)

  • Refinement: Dynamic sidebar that handles both file uploads and node management gracefully.
  • Z-Index War: Solved a classic CSS layering issue where sidebar borders were clipping action buttons.
  • Drag & Drop: Enabled node dragging even after model training to allow users to organize their thoughts spatially.

💻 Tech Stack Summary

| Layer | Technology | Role |
|---|---|---|
| Frontend | React 18 + Vite | The reactive UI framework. |
| Canvas | React Flow | Handling the complex node-link interactions. |
| Backend | FastAPI (Python) | High-performance API for statistical computing. |
| Math | pgmpy + Pandas | The heavy lifting for Bayesian analysis. |
| State | Zustand | Keeping the graph and inference state in sync. |

🚀 Future Roadmap

  • Export: Support for BIF / XMLBIF formats to interoperate with other tools (GeNIe, Netica).
  • Continuous Variables: Support for Hybrid Bayesian Networks.
  • Undo/Redo: Time-travel debugging for graph edits.

Log by shuqi - 2026-02-07