StatViz is a browser-based, node-driven data analysis workspace for CSV datasets. It helps users move from raw tabular data to insights, hypotheses, statistical tests, and interpretable result nodes on an interactive canvas.
Capstone Project — Dipan Bag, Spring 2026
StatViz is designed around a visible analysis graph:
- `Dataset` node for the uploaded file
- `Dataset Summary` node for completeness, preview charts, and dataset-level details
- `Insight` nodes for AI-suggested analytical directions
- `Hypothesis` nodes for testable claims
- `Result` nodes for statistical outputs and interpretation
- `Next Step` / follow-up nodes for continued analysis
Instead of hiding the workflow behind menus, the app keeps the reasoning trail visible.
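
A minimal sketch of how such a visible analysis graph could be represented, assuming React Flow style node and edge objects. The node `type` strings, the ids, and the `makeNode` and `reasoningTrail` helpers are hypothetical and are not taken from the StatViz source.

```javascript
// Hypothetical node factory. React Flow nodes carry id, type, position, and data;
// the position is a placeholder because layout would be computed separately.
function makeNode(id, type, label) {
  return { id, type, position: { x: 0, y: 0 }, data: { label } };
}

const sketchNodes = [
  makeNode("dataset", "dataset", "Exercise.csv"),
  makeNode("summary", "datasetSummary", "Dataset Summary"),
  makeNode("insight-1", "insight", "Group difference"),
  makeNode("hyp-1", "hypothesis", "Testable claim"),
  makeNode("result-1", "result", "Test output"),
];

// Edges point from parent to child, so each node has at most one parent here.
const sketchEdges = [
  { id: "e1", source: "dataset", target: "summary" },
  { id: "e2", source: "summary", target: "insight-1" },
  { id: "e3", source: "insight-1", target: "hyp-1" },
  { id: "e4", source: "hyp-1", target: "result-1" },
];

// Walk parent links from a node back to the root to recover the reasoning trail.
function reasoningTrail(nodeId, edges) {
  const parentOf = new Map(edges.map((e) => [e.target, e.source]));
  const trail = [nodeId];
  const seen = new Set(trail);
  let current = nodeId;
  while (parentOf.has(current)) {
    current = parentOf.get(current);
    if (seen.has(current)) break; // guard against accidental cycles
    seen.add(current);
    trail.unshift(current);
  }
  return trail;
}

console.log(reasoningTrail("result-1", sketchEdges).join(" -> "));
// dataset -> summary -> insight-1 -> hyp-1 -> result-1
```
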
- CSV upload in the browser
  - drag and drop
  - click anywhere on the blank canvas to upload
  - use the built-in sample exercise dataset directly from the empty canvas
- Dataset description and summary
  - AI-generated editable dataset description
  - completeness section focused on columns with missing values
  - mixed visual preview cards for numeric and categorical columns
  - `More Details` branch with dataset-health metrics and short AI focus guidance
- AI-generated insights
  - relationship insights
  - group-difference insights
  - distribution-shape insights
  - outlier-candidate insights
- Hypothesis generation
  - create hypotheses from insight nodes
  - create custom hypotheses manually
  - inline-edit hypothesis statements before testing
- Statistical testing
  - in-browser supported tests via `jstat`, including:
    - Pearson correlation
    - Welch’s two-sample t-test
    - chi-square test of independence
    - one-way ANOVA
  - AI-assisted fallback when a test is unsupported or estimated
- Result workflow
  - AI-assisted result summaries
  - chart-based result interpretation
  - accept / reject on result nodes
  - re-run test from the result node
  - accepted results can generate a `Next Step` node and a follow-up editable hypothesis
  - rejected results can generate an alternative sibling hypothesis
- Ask AI
  - dataset-aware right-sidebar assistant
  - can reason over the current graph, results, and branches
  - supports scoped follow-ups through graph context
- Quick analysis summary
  - fixed top-right summary toggle
  - short AI-generated overview of the analysis done so far
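
To make one of the supported tests listed above concrete, here is a minimal plain-JavaScript sketch of the Welch two-sample t statistic and its Welch-Satterthwaite degrees of freedom. It is not the StatViz implementation: the function names are hypothetical, and turning `t` and `df` into a p-value would additionally need a Student t CDF, which a statistics library such as jstat can supply.

```javascript
// Arithmetic mean of a non-empty numeric array.
function mean(xs) {
  return xs.reduce((sum, x) => sum + x, 0) / xs.length;
}

// Unbiased sample variance (divides by n - 1).
function sampleVariance(xs) {
  const m = mean(xs);
  return xs.reduce((sum, x) => sum + (x - m) ** 2, 0) / (xs.length - 1);
}

// Welch's t statistic and degrees of freedom for two independent samples.
// Does not assume equal variances or equal group sizes.
function welchT(a, b) {
  if (a.length < 2 || b.length < 2) {
    throw new Error("each group needs at least two values");
  }
  const va = sampleVariance(a) / a.length; // squared standard error, group a
  const vb = sampleVariance(b) / b.length; // squared standard error, group b
  const t = (mean(a) - mean(b)) / Math.sqrt(va + vb);
  const df =
    (va + vb) ** 2 /
    (va ** 2 / (a.length - 1) + vb ** 2 / (b.length - 1));
  return { t, df };
}

const { t, df } = welchT([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]);
console.log(t.toFixed(4), df.toFixed(4)); // -1.8974 5.8824
```

If both groups have zero variance the statistic is undefined (division by zero), so a real implementation would need to handle that case before reporting a result.
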
| Layer | Library / Service |
|---|---|
| UI framework | React + Vite |
| Canvas / graph | @xyflow/react (React Flow) |
| State management | Zustand |
| Charts | Recharts + custom SVG charts |
| Statistics | jstat |
| Layout | @dagrejs/dagre |
| AI services | OpenAI Chat Completions API |
| Styling | Plain CSS |
cd frontend
npm install
npm run dev

Then open:
http://localhost:5173
The active app route is:
/statviz
Note:
- the app uses a user-provided OpenAI API key
- the key is stored in browser session state for the running session
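
As a rough illustration of how a user-provided key might be attached to a request, the sketch below builds a Chat Completions call without sending it. The `buildChatRequest` helper and the default model name are assumptions made for this example; how StatViz actually stores the key and issues requests is not shown in this README.

```javascript
// Build the URL and fetch options for an OpenAI Chat Completions request.
// The model name is a placeholder assumption, not StatViz's configured model.
function buildChatRequest(apiKey, messages, model = "gpt-4o-mini") {
  if (!apiKey) throw new Error("an OpenAI API key is required");
  return {
    url: "https://api.openai.com/v1/chat/completions",
    options: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Usage (not executed here, since it needs a real key and network access):
//   const { url, options } = buildChatRequest(key, [
//     { role: "user", content: "Describe this dataset in two sentences." },
//   ]);
//   const response = await fetch(url, options);
//   const json = await response.json();
//   const text = json.choices[0].message.content;
```

Keeping the key out of the function body and passing it in per call means it never needs to be written into source files or build output.
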
The empty canvas includes a Use Sample Dataset option.
The app expects the sample exercise CSV at:
frontend/public/sample/exercise/Exercise.csv
This sample is also referenced by the landing page and shared sample-dataset config.
- Open StatViz.
- Upload a CSV or use the sample dataset.
- Review the dataset description and summary.
- Open `More Details` if needed for dataset-health metrics.
- Generate insight nodes from the summary.
- Generate or author a hypothesis.
- Run the suggested test.
- Review the result node and charts.
- Accept or reject the result.
- Continue with a next-step recommendation or an alternative sibling hypothesis.
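
The last two steps can be sketched as a pure function over a simple graph object. This is an illustrative assumption about the wiring, not the store logic in `useDataModeStore.js`: the `applyDecision` name, the id scheme, and the node `type` strings are all hypothetical.

```javascript
// Return a new graph with the nodes that follow an accept or reject decision.
// graph = { nodes: [{ id, type }], edges: [{ id, source, target }] }
function applyDecision(graph, resultId, decision) {
  const toResult = graph.edges.find((e) => e.target === resultId);
  if (!toResult) throw new Error(`no parent hypothesis for ${resultId}`);
  const hypothesisId = toResult.source;

  // Copy the arrays so the input graph is left untouched.
  const nodes = [...graph.nodes];
  const edges = [...graph.edges];
  const addNode = (id, type, parentId) => {
    nodes.push({ id, type });
    edges.push({ id: `${parentId}->${id}`, source: parentId, target: id });
  };

  if (decision === "accept") {
    // Accepted: a next-step node, then a follow-up hypothesis beneath it.
    addNode(`next-${resultId}`, "nextStep", resultId);
    addNode(`followup-${resultId}`, "hypothesis", `next-${resultId}`);
  } else if (decision === "reject") {
    // Rejected: an alternative hypothesis attached to the same parent as the
    // tested one. If the hypothesis has no parent, fall back to attaching
    // the alternative beneath the hypothesis itself.
    const toHypothesis = graph.edges.find((e) => e.target === hypothesisId);
    const siblingParent = toHypothesis ? toHypothesis.source : hypothesisId;
    addNode(`alt-${hypothesisId}`, "hypothesis", siblingParent);
  } else {
    throw new Error(`unknown decision: ${decision}`);
  }
  return { nodes, edges };
}
```

Returning a new graph instead of mutating the input keeps the function easy to test and fits state containers that expect immutable updates.
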
frontend/src/
├── app/
├── pages/
│   └── LandingPage.jsx
├── sampleDatasets.js
├── constants/
├── modes/data/
│   ├── DataModeApp.jsx
│   ├── DataModeApp.css
│   ├── store/
│   │   ├── useDataModeStore.js
│   │   └── analysisContext.js
│   ├── components/
│   │   ├── DataCanvas.jsx
│   │   ├── UploadPopup.jsx
│   │   ├── ApiKeyModal.jsx
│   │   └── ChatPanel.jsx
│   ├── nodes/
│   │   ├── DatasetNode.jsx
│   │   ├── DatasetSummaryNode.jsx
│   │   ├── DatasetDetailsNode.jsx
│   │   ├── InsightNode.jsx
│   │   ├── HypothesisNode.jsx
│   │   ├── CustomHypothesisNode.jsx
│   │   ├── ResultNode.jsx
│   │   ├── NextStepNode.jsx
│   │   ├── InterpretationNode.jsx
│   │   ├── ColumnChart.jsx
│   │   ├── charts/
│   │   │   ├── InsightChart.jsx
│   │   │   ├── ResultChart.jsx
│   │   │   ├── chartData.js
│   │   │   └── charts.css
│   │   ├── nodes.css
│   │   └── index.js
│   ├── api/
│   │   ├── descriptionService.js
│   │   ├── datasetDetailsService.js
│   │   ├── insightService.js
│   │   ├── hypothesisService.js
│   │   ├── customHypothesisService.js
│   │   ├── followupService.js
│   │   ├── analysisSummaryService.js
│   │   ├── chartTypeService.js
│   │   ├── statisticsService.js
│   │   └── chatTools.js
│   └── utils/
│       ├── csvParser.js
│       ├── layoutGraph.js
│       └── mockGraph.js
└── main.jsx
- The app is browser-first: parsing, charting, graph state, and supported statistics happen client-side.
- AI is used for description, insight generation, hypothesis generation, follow-ups, summaries, and interpretation.
- Some result charts and statistical explanation surfaces are still evolving as the visualization system is refined.
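
To illustrate the browser-first point, here is a small sketch of parsing CSV text and finding the columns with missing values entirely client-side. It is deliberately naive and is not the project's `csvParser.js`: it assumes no quoted fields or embedded commas, treats only empty cells as missing, and the function names and sample columns are made up for the example.

```javascript
// Naive CSV parser: splits on newlines and commas, no support for quoted fields.
function parseSimpleCsv(text) {
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== "");
  if (lines.length === 0) return [];
  const header = lines[0].split(",").map((name) => name.trim());
  return lines.slice(1).map((line) => {
    const cells = line.split(",");
    // Cells beyond the end of a short row are treated as empty strings.
    return Object.fromEntries(
      header.map((name, i) => [name, (cells[i] ?? "").trim()])
    );
  });
}

// Report only the columns that have at least one empty cell.
function columnsWithMissing(rows) {
  const counts = {};
  for (const row of rows) {
    for (const [name, value] of Object.entries(row)) {
      if (value === "") counts[name] = (counts[name] ?? 0) + 1;
    }
  }
  return Object.entries(counts).map(([column, missing]) => ({
    column,
    missing,
    completeness: 1 - missing / rows.length,
  }));
}

const sampleCsv = "id,diet,pulse\n1,low fat,85\n2,,90\n3,no fat,\n4,,88";
console.log(columnsWithMissing(parseSimpleCsv(sampleCsv)));
// diet: 2 missing (completeness 0.5); pulse: 1 missing (completeness 0.75)
```
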
The project is set up for static frontend deployment through GitHub Pages.
The hosted route uses:
/mindmapper/statviz
SPA routing is supported through the 404.html redirect pattern used in the frontend public/ folder.
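
For readers unfamiliar with the 404.html redirect pattern, the sketch below shows the URL rewrite used by one widely circulated variant: the unknown path is folded into a query string so that GitHub Pages serves the app's index page, which can then restore the route. Whether StatViz uses exactly this variant is an assumption; the `buildRedirectUrl` name, the example hostname, and keeping one path segment (for a project site served under `/mindmapper`) are illustrative.

```javascript
// Convert a deep link into a URL that points at the site root with the
// original route encoded after "/?/". `loc` is a window.location-like object.
function buildRedirectUrl(loc, pathSegmentsToKeep) {
  const origin =
    loc.protocol + "//" + loc.hostname + (loc.port ? ":" + loc.port : "");
  // Leading path segments to keep, e.g. the repository name on a project site.
  const base = loc.pathname
    .split("/")
    .slice(0, 1 + pathSegmentsToKeep)
    .join("/");
  // The rest of the path becomes the encoded route.
  const route = loc.pathname
    .slice(1)
    .split("/")
    .slice(pathSegmentsToKeep)
    .join("/")
    .replace(/&/g, "~and~");
  const query = loc.search
    ? "&" + loc.search.slice(1).replace(/&/g, "~and~")
    : "";
  return origin + base + "/?/" + route + query + loc.hash;
}

const exampleLocation = {
  protocol: "https:",
  hostname: "example.github.io",
  port: "",
  pathname: "/mindmapper/statviz",
  search: "",
  hash: "",
};
console.log(buildRedirectUrl(exampleLocation, 1));
// https://example.github.io/mindmapper/?/statviz
```

In a real 404.html this would be followed by `window.location.replace(...)`, with a matching script in the index page that decodes the query string and restores the original path through the History API.
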