A single-file, client-side tool for computing statistically rigorous CSAT scores from Intercom CSV exports — with confidence intervals, bias indicators, and per-agent filtering.
Built on the methodology described in CSAT: An Emperor With No Clothes by Ron Sielinski (Microsoft Data Science).
Raw CSAT scores are statistically naked without a confidence interval. A score of 89% based on 4 responses is not the same as 89% based on 1,800 responses — but most dashboards treat them identically. This tool makes that uncertainty visible and actionable.
Import — Drop any Intercom CSV conversation export directly into the browser. No upload, no server, no data leaves your machine.
Calculate — From the raw Conversation rating column, the tool computes:
- CSAT % — share of ratings 4 or 5 out of total responses
- Confidence interval — using Student's t-distribution on the raw 1–5 scores
- Margin of error — expressed in percentage points at your chosen confidence level
- Lower and upper bounds — visualized on a 0–100% scale
- Reliability badge — Low / Medium / High, based on interval width
- Score distribution — breakdown of each star rating (1–5)
- Verbatim remarks — customer comments surfaced alongside their score
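As a rough illustration of the CSAT % and score distribution items above, here is a minimal sketch. The function name `summarizeRatings` and the shape of its return value are hypothetical and not taken from the tool's source; it assumes the ratings have already been parsed into integers from 1 to 5.

```javascript
// Hypothetical helper (not the tool's actual code): summarize integer ratings 1-5.
function summarizeRatings(ratings) {
  const distribution = { 1: 0, 2: 0, 3: 0, 4: 0, 5: 0 };
  for (const rating of ratings) {
    distribution[rating] += 1;
  }
  const n = ratings.length;
  // CSAT % = share of ratings that are 4 or 5, out of all responses.
  const satisfied = distribution[4] + distribution[5];
  const csatPercent = n === 0 ? null : (satisfied / n) * 100;
  return { n, csatPercent, distribution };
}

const summary = summarizeRatings([5, 4, 4, 3, 5, 1, 5, 4]);
console.log(summary.csatPercent); // 75 (6 of 8 ratings are 4 or 5)
```

Returning `null` for an empty sample is a design choice of this sketch: a score of 0% would be misleading when there are no responses at all.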
Filter — Slice results by agent type (Teammate vs Chatbot) or by individual agent name. The CI recalculates in real time.
Confidence level — Toggle between 90%, 95%, and 99%.
The tool applies the Student's t confidence interval for a population mean:
CI = x̄ ± t(α/2, n−1) × (s / √n)
| Variable | Description |
|---|---|
| x̄ | Mean of raw scores (1–5 scale) |
| s | Sample standard deviation |
| n | Number of rated conversations |
| df | Degrees of freedom = n − 1 |
| t(α/2, df) | Critical value from Student's t-distribution |
| σ_x̄ | Standard error = s / √n |
Bounds are then mapped from the 1–5 scale to 0–100% for display.
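The formula and variables above can be sketched as follows. This is an illustration under stated assumptions, not the tool's actual implementation: the function name `tConfidenceInterval` is hypothetical, the critical value is passed in by the caller, and the mapping to 0–100% is assumed to be the linear rescaling (x − 1) / 4 × 100, which the text does not spell out.

```javascript
// Hypothetical sketch: Student's t confidence interval for the mean of 1-5 scores.
// tCritical is t(alpha/2, n - 1), supplied by the caller.
function tConfidenceInterval(scores, tCritical) {
  const n = scores.length;
  if (n < 2) return null; // sample standard deviation is undefined for n < 2

  const mean = scores.reduce((sum, x) => sum + x, 0) / n;
  const sumSquares = scores.reduce((sum, x) => sum + (x - mean) ** 2, 0);
  const s = Math.sqrt(sumSquares / (n - 1)); // sample standard deviation
  const standardError = s / Math.sqrt(n);
  const margin = tCritical * standardError;

  // ASSUMPTION: linear mapping of the 1-5 scale onto 0-100%, clamped to that range.
  const toPercent = (x) => Math.min(100, Math.max(0, ((x - 1) / 4) * 100));

  return {
    n,
    mean,
    standardError,
    margin,
    lowerPercent: toPercent(mean - margin),
    upperPercent: toPercent(mean + margin),
  };
}

// Ten ratings at 95% confidence: t(0.025, 9) is about 2.262.
const result = tConfidenceInterval([5, 4, 4, 3, 5, 1, 5, 4, 5, 4], 2.262);
console.log(result.mean, result.lowerPercent.toFixed(1), result.upperPercent.toFixed(1));
```

With only ten ratings the interval spans roughly 53% to 97%, which shows how little a small sample pins down the score.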
Why n = rated conversations, not total conversations?
A conversation without a rating carries zero information about satisfaction. Including unrated conversations in n would artificially deflate the standard error and produce a falsely narrow, and statistically dishonest, interval. The model measures the population of respondents, not the population of all customers.
The CI addresses sampling uncertainty — how much the score would vary across different random samples of the same size. It does not address participation bias — whether the people who rated are representative of all customers.
Response rate (rated / total conversations) is a separate health metric that should be tracked alongside the CI. Low response rates, particularly in culturally restrained markets like Germany or the Netherlands, can significantly distort absolute CSAT scores even when the CI is tight.
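For completeness, the response rate described above is a simple ratio. The sketch below uses a hypothetical function name, `responseRatePercent`, and is not taken from the tool's source.

```javascript
// Hypothetical helper: response rate = rated conversations / total conversations.
function responseRatePercent(ratedCount, totalCount) {
  if (totalCount === 0) return null; // no conversations, no meaningful rate
  return (ratedCount * 100) / totalCount;
}

console.log(responseRatePercent(180, 1200)); // 15
```

A tight interval on those 180 ratings says nothing about the other 1,020 conversations, which is why this number belongs next to the CI.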
The tool is built for Intercom's native CSAT export. To get your file:
- In Intercom, go to Reports → Customer Satisfaction
- Set your desired date range
- Scroll down to the Conversation ratings section
- Click Export CSV
That's it — drop the downloaded file directly into the tool.
The tool expects a standard Intercom conversation export with at least this column:
| Column | Required | Description |
|---|---|---|
| Conversation rating | ✅ | Integer 1–5 |
| Teammate rated | Optional | Agent name for filtering |
| Agent rated type | Optional | Teammate or Chatbot |
| Conversation rating remark | Optional | Customer verbatim |
| User name | Optional | Customer name shown with remark |
Other columns are ignored.
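A minimal sketch of how the columns above might be consumed, assuming the CSV has already been parsed into an array of row objects keyed by column header. The function name `extractRatings` and its options are hypothetical; the tool's actual parsing and filtering code may differ.

```javascript
// Hypothetical helper: pull valid 1-5 ratings out of parsed CSV rows,
// optionally filtered by agent type or agent name.
function extractRatings(rows, { agentType = null, agentName = null } = {}) {
  const ratings = [];
  for (const row of rows) {
    if (agentType && row["Agent rated type"] !== agentType) continue;
    if (agentName && row["Teammate rated"] !== agentName) continue;

    const raw = String(row["Conversation rating"] ?? "").trim();
    const rating = Number(raw);
    // Unrated conversations (empty cell) and malformed values are skipped,
    // so n counts rated conversations only.
    if (raw !== "" && Number.isInteger(rating) && rating >= 1 && rating <= 5) {
      ratings.push(rating);
    }
  }
  return ratings;
}

const rows = [
  { "Conversation rating": "5", "Agent rated type": "Teammate", "Teammate rated": "Alex" },
  { "Conversation rating": "", "Agent rated type": "Chatbot", "Teammate rated": "" },
  { "Conversation rating": "2", "Agent rated type": "Chatbot", "Teammate rated": "" },
  { "Conversation rating": "4", "Agent rated type": "Teammate", "Teammate rated": "Sam" },
];
console.log(extractRatings(rows));                            // [5, 2, 4]
console.log(extractRatings(rows, { agentType: "Teammate" })); // [5, 4]
```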
No installation, no dependencies, no build step.
- Download or clone this repo
- Open `index.html` in any modern browser
- Drop your Intercom CSV export into the upload zone
- Read the results
Or access the hosted version at: https://[your-username].github.io/[repo-name]
Once hosted on GitHub Pages, embed the tool directly in any Notion page:
- In Notion, type `/embed`
- Paste the GitHub Pages URL
- Click Embed link
The tool will render fully interactive inside Notion.
- Pure HTML/CSS/JS — single file, zero dependencies, zero network requests (fonts aside)
- Client-side only — CSV data never leaves the browser
- t-distribution computed via Abramowitz & Stegun numerical approximation (accurate to ~4 decimal places)
- Tested on Chrome, Firefox, Safari
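To give a feel for what a dependency-free t critical value can look like, here is a sketch based on the asymptotic series for the t quantile in Abramowitz & Stegun (formula 26.7.5). Whether the tool uses this particular formula is an assumption; the function name `tCriticalApprox` and the lookup table of normal quantiles are hypothetical. The series is accurate for moderate and large degrees of freedom and degrades for very small ones (roughly df below 5).

```javascript
// Two-sided standard normal quantiles for the three supported confidence levels.
const NORMAL_QUANTILE = { 90: 1.6448536, 95: 1.959964, 99: 2.5758293 };

// Hypothetical sketch: approximate t(alpha/2, df) by expanding around the
// normal quantile z in powers of 1/df. ASSUMPTION: this may not be the
// approximation the tool itself uses.
function tCriticalApprox(confidencePercent, df) {
  const z = NORMAL_QUANTILE[confidencePercent];
  if (z === undefined || df < 1) {
    throw new Error("unsupported confidence level or degrees of freedom");
  }
  const z3 = z ** 3, z5 = z ** 5, z7 = z ** 7, z9 = z ** 9;
  const g1 = (z3 + z) / 4;
  const g2 = (5 * z5 + 16 * z3 + 3 * z) / 96;
  const g3 = (3 * z7 + 19 * z5 + 17 * z3 - 15 * z) / 384;
  const g4 = (79 * z9 + 776 * z7 + 1482 * z5 - 1920 * z3 - 945 * z) / 92160;
  return z + g1 / df + g2 / df ** 2 + g3 / df ** 3 + g4 / df ** 4;
}

console.log(tCriticalApprox(95, 10).toFixed(3)); // close to the tabulated 2.228
```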
- Sielinski, R. (2021). CSAT: An Emperor With No Clothes. Microsoft Data Science at Medium.
- Harzing, A.W. (2006). Response styles in cross-national mail survey research: A 26-country study. International Journal of Cross-Cultural Management, 6(2), 243–266.
- Ipsos (2020). When Difference Doesn't Mean Different: Understanding Cultural Bias in CX Data.
MIT — do whatever you want with it.