58 changes: 29 additions & 29 deletions README.md
@@ -235,8 +235,8 @@ Output:

**Key features:**
- **Kebab-case property names**: Use standard CSS property names like `font-size`, `background-color`, etc.
- **Dangerous patterns blocked**: `url()` with external URLs, `expression()`, `javascript:` protocol in CSS values, `@import`
- **Blocked properties**: `behavior`, `-moz-binding` (known dangerous properties)
- **Safe patterns recognized**: Standard CSS properties in kebab-case format
- **Examples not recognized**: `url()` with external URLs, `expression()`, `javascript:` protocol in CSS values, `@import`
- **Type safety**: Values are strings
- **Semicolon sanitization**: Prevents multi-property injection by accepting only the first value
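
The style rules listed above could be approximated as follows. This is a minimal sketch, not Treebark's actual implementation: the names `renderStyle`, `sanitizeStyleValue`, `UNSAFE_PROPERTIES`, and `KEBAB_CASE` are hypothetical, and the real pattern checks may differ.

```javascript
// Hypothetical helpers for illustration; not part of Treebark's API.
const UNSAFE_PROPERTIES = new Set(['behavior', '-moz-binding']);
const KEBAB_CASE = /^-?[a-z]+(-[a-z]+)*$/;

function sanitizeStyleValue(value) {
  // Keep only the text before the first semicolon.
  const first = String(value).split(';')[0].trim();
  const lower = first.toLowerCase();
  if (lower.includes('expression(') || lower.includes('javascript:') || lower.includes('@import')) {
    return null;
  }
  // Accept url() only when it wraps a data: URI.
  for (const match of first.matchAll(/url\(\s*['"]?\s*([^'")\s]*)/gi)) {
    if (!match[1].toLowerCase().startsWith('data:')) return null;
  }
  return first;
}

function renderStyle(style) {
  const parts = [];
  for (const [property, value] of Object.entries(style)) {
    // Skip anything that is not kebab-case or is a known unsafe property.
    if (!KEBAB_CASE.test(property) || UNSAFE_PROPERTIES.has(property)) continue;
    const safe = sanitizeStyleValue(value);
    if (safe) parts.push(`${property}: ${safe}`);
  }
  return parts.join('; ');
}

console.log(renderStyle({ color: 'red; background: url(https://evil.example/x.png)' }));
// prints "color: red"
```

Under these assumptions, everything after the first semicolon is discarded before any pattern check runs, so an injected second declaration never reaches the output.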

@@ -1032,19 +1032,19 @@ Treebark is designed with **security as a priority**. Multiple layers of protect
### XSS Prevention

**Tag Allowlist:**
Only safe HTML tags are permitted. Dangerous tags like `<script>`, `<iframe>`, `<object>`, `<embed>`, `<style>`, and `<form>` are blocked:
Only a curated set of safe HTML tags is recognized. Tags not on the allowlist, such as `<script>`, `<iframe>`, `<object>`, `<embed>`, `<style>`, and `<form>`, are logged as errors and render nothing:

```javascript
// ❌ Blocked - logs error and renders nothing
// ❌ Not on allowlist - logs error and renders nothing
{ script: 'alert("xss")' }
{ iframe: { src: 'evil.com' } }
```
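
The allowlist check itself can be pictured as a plain set lookup. The sketch below is an assumption about the general shape, not Treebark's code: `ALLOWED_TAGS` holds only an illustrative subset, and `checkNode` is a hypothetical helper.

```javascript
// Illustrative subset only; the real allowlist is defined by Treebark.
const ALLOWED_TAGS = new Set(['div', 'span', 'p', 'ul', 'ol', 'li', 'a', 'img', 'table']);

// Hypothetical helper: returns true when the node's tag may be rendered.
function checkNode(node, logger = console) {
  const tagName = Object.keys(node)[0];
  // Exact lookup: a tag is rendered only if it appears in the set.
  if (!ALLOWED_TAGS.has(tagName)) {
    logger.error(`Tag "${tagName}" is not allowed`);
    return false;
  }
  return true;
}

console.log(checkNode({ div: 'hello' }));              // prints true
console.log(checkNode({ script: 'alert("xss")' }));    // logs an error, prints false
```

Because the lookup is an allowlist rather than a blocklist, a tag that nobody thought to list is rejected by default.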

**Attribute Allowlist:**
Only safe attributes are allowed per tag. Event handlers like `onclick`, `onload`, `onerror` are blocked:
Only safe attributes are recognized per tag. Event handlers like `onclick`, `onload`, `onerror` are not on the allowlist:

```javascript
// ❌ Blocked - logs warning and attribute is omitted
// ❌ Not on allowlist - logs warning and attribute is omitted
{ div: { onclick: 'alert(1)', $children: ['text'] } }
// Renders: <div>text</div>
```
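
A per-tag attribute filter along these lines might look like the sketch below. It is a simplified illustration under stated assumptions: `filterAttributes`, `GLOBAL_ATTRS`, and `TAG_ATTRS` are hypothetical names, the per-tag table is a partial sample, and `$`-prefixed keys such as `$children` are assumed to be handled elsewhere.

```javascript
// Hypothetical tables for illustration; the real lists live in Treebark.
const GLOBAL_ATTRS = new Set(['id', 'class', 'style', 'title', 'role']);
const GLOBAL_PREFIXES = ['aria-', 'data-'];
const TAG_ATTRS = new Map([
  ['table', new Set(['summary'])],
  ['th', new Set(['scope', 'colspan', 'rowspan'])],
  ['td', new Set(['scope', 'colspan', 'rowspan'])],
  ['blockquote', new Set(['cite'])],
]);

function filterAttributes(tag, attrs, warn = console.warn) {
  const kept = {};
  for (const [name, value] of Object.entries(attrs)) {
    if (name.startsWith('$')) continue; // assumption: $children etc. are not attributes
    const allowed =
      GLOBAL_ATTRS.has(name) ||
      GLOBAL_PREFIXES.some((prefix) => name.startsWith(prefix)) ||
      (TAG_ATTRS.get(tag)?.has(name) ?? false);
    if (allowed) {
      kept[name] = value;
    } else {
      warn(`Attribute "${name}" is not allowed on tag "${tag}"`);
    }
  }
  return kept;
}

console.log(filterAttributes('div', { onclick: 'alert(1)', class: 'box', $children: ['text'] }));
// logs a warning for onclick, then prints { class: 'box' }
```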
@@ -1059,23 +1059,23 @@ All content and attribute values are automatically HTML-escaped to prevent injec

### Style Attribute Protection

The `style` attribute uses a **structured object format** that blocks multiple attack vectors:
The `style` attribute uses a **structured object format** that only recognizes safe patterns:

**Dangerous CSS patterns blocked:**
- `url()` - External URLs blocked (data: URIs allowed)
- `expression()` - IE expression injection blocked
- `javascript:` - JavaScript protocol blocked
- `@import` - CSS imports blocked
**Examples of patterns not recognized:**
- `url()` - External URLs not recognized (data: URIs allowed)
- `expression()` - IE expression injection not recognized
- `javascript:` - JavaScript protocol not recognized
- `@import` - CSS imports not recognized

**Dangerous CSS properties blocked:**
- `behavior` - IE behavior property blocked
- `-moz-binding` - Firefox XBL binding blocked
**Examples of properties not recognized:**
- `behavior` - IE behavior property not recognized
- `-moz-binding` - Firefox XBL binding not recognized

**Semicolon injection prevented:**
Only the first CSS value before a semicolon is used, preventing multi-property injection:

```javascript
// ❌ Injection attempt blocked
// ❌ Pattern not recognized
{
div: {
style: {
@@ -1092,20 +1092,20 @@ Only the first CSS value before a semicolon is used, preventing multi-property i

The `href` and `src` attributes validate URL protocols to prevent XSS attacks:

**Safe protocols allowed:**
**Safe protocols recognized:**
- `http:`, `https:` - Standard web protocols
- `mailto:`, `tel:`, `sms:` - Communication protocols
- `ftp:`, `ftps:` - File transfer protocols
- Relative URLs (e.g., `/path`, `#anchor`, `?query`, `page.html`)

**Dangerous protocols blocked:**
- `javascript:` - JavaScript execution blocked
- `data:` - Data URIs blocked (can contain HTML/scripts)
- `vbscript:` - VBScript execution blocked
- `file:` - Local file access blocked
**Examples of protocols not recognized:**
- `javascript:` - JavaScript execution not recognized
- `data:` - Data URIs not recognized (can contain HTML/scripts)
- `vbscript:` - VBScript execution not recognized
- `file:` - Local file access not recognized

```javascript
// ❌ Blocked - logs warning and attribute is omitted
// ❌ Not on allowlist - logs warning and attribute is omitted
{ a: { href: 'javascript:alert(1)', $children: ['Click'] } }
// Renders: <a>Click</a>

@@ -1117,7 +1117,7 @@ The `href` and `src` attributes validate URL protocols to prevent XSS attacks:

### Prototype Chain Protection

Access to JavaScript prototype chain properties is blocked in template interpolation to prevent information leakage:
Access to JavaScript prototype chain properties is actively blocked in template interpolation to prevent information leakage:

**Blocked properties:**
- `constructor` - Prevents access to object constructor
@@ -1132,12 +1132,12 @@ Access to JavaScript prototype chain properties is blocked in template interpola
// Warning logged: Access to property "constructor" is blocked for security reasons
```

**Security principle:** Defense in depth with multiple layers:
1. Tag and attribute allowlists
**Security principle:** Curated output through defense in depth with multiple layers:
1. Curated tags and attributes
2. HTML escaping for all content
3. Structured style objects with pattern blocking
4. URL protocol validation
5. Prototype chain access prevention
3. Curated style patterns
4. Curated URL protocols
5. Prototype chain blocking

## Format Notes

56 changes: 28 additions & 28 deletions docs/index.md
@@ -241,8 +241,8 @@ Output:

**Key features:**
- **Kebab-case property names**: Use standard CSS property names like `font-size`, `background-color`, etc.
- **Dangerous patterns blocked**: `url()` with external URLs, `expression()`, `javascript:` protocol in CSS values, `@import`
- **Blocked properties**: `behavior`, `-moz-binding` (known dangerous properties)
- **Safe patterns recognized**: Standard CSS properties in kebab-case format
- **Examples not recognized**: `url()` with external URLs, `expression()`, `javascript:` protocol in CSS values, `@import`
- **Type safety**: Values are strings
- **Semicolon sanitization**: Prevents multi-property injection by accepting only the first value

@@ -1038,19 +1038,19 @@ Treebark is designed with **security as a priority**. Multiple layers of protect
### XSS Prevention

**Tag Allowlist:**
Only safe HTML tags are permitted. Dangerous tags like `<script>`, `<iframe>`, `<object>`, `<embed>`, `<style>`, and `<form>` are blocked:
Only a curated set of safe HTML tags is recognized. Tags not on the allowlist, such as `<script>`, `<iframe>`, `<object>`, `<embed>`, `<style>`, and `<form>`, are logged as errors and render nothing:

```javascript
// ❌ Blocked - logs error and renders nothing
// ❌ Not on allowlist - logs error and renders nothing
{ script: 'alert("xss")' }
{ iframe: { src: 'evil.com' } }
```

**Attribute Allowlist:**
Only safe attributes are allowed per tag. Event handlers like `onclick`, `onload`, `onerror` are blocked:
Only safe attributes are recognized per tag. Event handlers like `onclick`, `onload`, `onerror` are not on the allowlist:

```javascript
// ❌ Blocked - logs warning and attribute is omitted
// ❌ Not on allowlist - logs warning and attribute is omitted
{ div: { onclick: 'alert(1)', $children: ['text'] } }
// Renders: <div>text</div>
```
@@ -1065,23 +1065,23 @@ All content and attribute values are automatically HTML-escaped to prevent injec

### Style Attribute Protection

The `style` attribute uses a **structured object format** that blocks multiple attack vectors:
The `style` attribute uses a **structured object format** that only recognizes safe patterns:

**Dangerous CSS patterns blocked:**
- `url()` - External URLs blocked (data: URIs allowed)
- `expression()` - IE expression injection blocked
- `javascript:` - JavaScript protocol blocked
- `@import` - CSS imports blocked
**Examples of patterns not recognized:**
- `url()` - External URLs not recognized (data: URIs allowed)
- `expression()` - IE expression injection not recognized
- `javascript:` - JavaScript protocol not recognized
- `@import` - CSS imports not recognized

**Dangerous CSS properties blocked:**
- `behavior` - IE behavior property blocked
- `-moz-binding` - Firefox XBL binding blocked
**Examples of properties not recognized:**
- `behavior` - IE behavior property not recognized
- `-moz-binding` - Firefox XBL binding not recognized

**Semicolon injection prevented:**
Only the first CSS value before a semicolon is used, preventing multi-property injection:

```javascript
// ❌ Injection attempt blocked
// ❌ Pattern not recognized
{
div: {
style: {
@@ -1104,14 +1104,14 @@ The `href` and `src` attributes validate URL protocols to prevent XSS attacks:
- `ftp:`, `ftps:` - File transfer protocols
- Relative URLs (e.g., `/path`, `#anchor`, `?query`, `page.html`)

**Dangerous protocols blocked:**
- `javascript:` - JavaScript execution blocked
- `data:` - Data URIs blocked (can contain HTML/scripts)
- `vbscript:` - VBScript execution blocked
- `file:` - Local file access blocked
**Dangerous protocols rejected:**
- `javascript:` - JavaScript execution rejected
- `data:` - Data URIs rejected (can contain HTML/scripts)
- `vbscript:` - VBScript execution rejected
- `file:` - Local file access rejected

```javascript
// ❌ Blocked - logs warning and attribute is omitted
// ❌ Not allowed - logs warning and attribute is omitted
{ a: { href: 'javascript:alert(1)', $children: ['Click'] } }
// Renders: <a>Click</a>

@@ -1123,7 +1123,7 @@ The `href` and `src` attributes validate URL protocols to prevent XSS attacks:

### Prototype Chain Protection

Access to JavaScript prototype chain properties is blocked in template interpolation to prevent information leakage:
Access to JavaScript prototype chain properties is actively blocked in template interpolation to prevent information leakage:

**Blocked properties:**
- `constructor` - Prevents access to object constructor
@@ -1138,12 +1138,12 @@ Access to JavaScript prototype chain properties is blocked in template interpola
// Warning logged: Access to property "constructor" is blocked for security reasons
```
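
One way to picture this check is a path resolver that refuses blocked keys before touching the data. The sketch below is an assumption about the general approach, not Treebark's code: `resolvePath` and `BLOCKED_KEYS` are hypothetical names, and paths are assumed to be dot-separated.

```javascript
// Hypothetical resolver for illustration; not Treebark's implementation.
const BLOCKED_KEYS = new Set(['constructor', '__proto__', 'prototype']);

function resolvePath(data, path, warn = console.warn) {
  let current = data;
  for (const key of path.split('.')) {
    // Refuse the lookup as soon as any segment names a prototype chain property.
    if (BLOCKED_KEYS.has(key)) {
      warn(`Access to property "${key}" is blocked for security reasons`);
      return undefined;
    }
    if (current === null || current === undefined) return undefined;
    current = current[key];
  }
  return current;
}

console.log(resolvePath({ user: { name: 'Ada' } }, 'user.name'));   // prints "Ada"
console.log(resolvePath({ user: {} }, 'user.constructor.name'));    // logs a warning, prints undefined
```

Checking every path segment, rather than only the first, matters here: `user.constructor` is just as revealing as a top-level `constructor`.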

**Security principle:** Defense in depth with multiple layers:
1. Tag and attribute allowlists
**Security principle:** Curated output through defense in depth with multiple layers:
1. Curated tags and attributes
2. HTML escaping for all content
3. Structured style objects with pattern blocking
4. URL protocol validation
5. Prototype chain access prevention
3. Curated style patterns
4. Curated URL protocols
5. Prototype chain blocking
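
To make the URL protocol layer in the list above concrete, here is a minimal sketch of a protocol check. It assumes the safe protocols documented for `href` and `src` (`http:`, `https:`, `mailto:`, `tel:`, `sms:`, `ftp:`, `ftps:`) plus relative URLs; `isSafeUrl` and `SAFE_PROTOCOLS` are hypothetical names, and Treebark's real validation may differ.

```javascript
// Hypothetical validator for illustration; not Treebark's implementation.
const SAFE_PROTOCOLS = new Set(['http:', 'https:', 'mailto:', 'tel:', 'sms:', 'ftp:', 'ftps:']);

function isSafeUrl(value) {
  // Drop whitespace and control characters for scheme detection only, because
  // browsers ignore tabs and newlines inside a scheme ("java\tscript:").
  const compact = String(value).replace(/[\u0000-\u0020]+/g, '');
  const match = /^([a-zA-Z][a-zA-Z0-9+.-]*):/.exec(compact);
  if (!match) return true; // no scheme: treated as a relative URL
  return SAFE_PROTOCOLS.has(match[1].toLowerCase() + ':');
}

console.log(isSafeUrl('https://example.com'));   // prints true
console.log(isSafeUrl('/path#anchor'));          // prints true
console.log(isSafeUrl('JavaScript:alert(1)'));   // prints false
console.log(isSafeUrl('data:text/html,<b>x</b>')); // prints false
```

The check is deliberately conservative: any scheme outside the set is rejected, so an unfamiliar protocol fails closed instead of slipping through.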

## Format Notes

2 changes: 1 addition & 1 deletion nodejs/packages/markdown-it-treebark/README.md
@@ -352,7 +352,7 @@ div:

## Security

Treebark is safe by default and only allows whitelisted HTML tags and attributes. Dangerous elements like `<script>`, `<iframe>`, and event handlers are blocked.
Treebark is safe by default and produces curated output. Only allowlisted HTML tags and attributes are recognized. Elements like `<script>` and `<iframe>`, and event handler attributes, are not on the allowlist.

## Error Handling

57 changes: 31 additions & 26 deletions spec.md
@@ -170,7 +170,7 @@ div:
- `table`: `summary`
- `th`/`td`: `scope`, `colspan`, `rowspan`
- `blockquote`: `cite`
- Blocked: event handlers (`on*` attributes like `onclick`, `onload`)
- Examples of attributes not on the allowlist: event handlers (`on*` attributes like `onclick`, `onload`)
- See [Security](#14-security) section for comprehensive security details

---
@@ -340,7 +340,7 @@ JavaScript allows both `array[0]` and `array["0"]` syntax. Since the path is spl
**Special tags:**
`comment`, `if`

Blocked tags:
Tags not on the allowlist:
`script`, `iframe`, `embed`, `object`, `applet`,
`form`, `input`, `button`, `select`,
`video`, `audio`,
@@ -931,16 +931,16 @@ Treebark implements multiple layers of security to prevent XSS attacks and other

### 14.1 Tag Allowlist

Only safe HTML tags are permitted. Dangerous tags are blocked and logged as errors:
Only a curated set of safe HTML tags is recognized. Examples include `div`, `span`, `p`, `h1`-`h6`, `ul`, `ol`, `li`, `a`, `img`, `table`, and other common semantic HTML elements.

**Blocked tags:**
**Examples of tags not on the allowlist:**
- `script`, `iframe`, `object`, `embed`, `applet` - XSS vectors
- `form`, `input`, `button`, `select`, `textarea` - Form hijacking
- `style`, `link`, `meta`, `base` - Style/metadata injection
- `video`, `audio`, `canvas` - Media-based attacks
- `svg`, `math` - Vector-based attacks

**Case variations blocked:** Tag names are case-sensitive. `ScRiPt`, `IFRAME`, etc. are also blocked.
**Case sensitivity:** Tag names are case-sensitive, and only the lowercase names on the allowlist are recognized. Case variations such as `ScRiPt` or `IFRAME` are therefore not recognized either.

**Example:**
```javascript
@@ -951,11 +951,7 @@ Only safe HTML tags are permitted. Dangerous tags are blocked and logged as erro

### 14.2 Attribute Allowlist

Only safe attributes are permitted per tag. Event handlers are blocked:

**Blocked attributes:**
- `onclick`, `onload`, `onerror`, `onmouseover`, etc. - All `on*` event handlers
- Case variations: `onClick`, `ONCLICK`, etc. are also blocked
Only safe attributes are recognized per tag. The allowlist includes:

**Allowed attributes per tag:**
- Global: `id`, `class`, `style`, `title`, `aria-*`, `data-*`, `role`
@@ -965,6 +961,10 @@ Only safe attributes are permitted per tag. Event handlers are blocked:
- `th`/`td`: `scope`, `colspan`, `rowspan`
- `blockquote`: `cite`

**Examples of attributes not on the allowlist:**
- `onclick`, `onload`, `onerror`, `onmouseover`, etc. - All `on*` event handlers
- Case variations: `onClick`, `ONCLICK`, etc. are not recognized

**Example:**
```javascript
{ div: { onclick: 'alert(1)', $children: ['text'] } }
@@ -984,15 +984,20 @@ All content and attribute values are automatically HTML-escaped to prevent injec

### 14.4 Style Attribute Protection

The `style` attribute uses a structured object format that blocks multiple attack vectors:
The `style` attribute uses a structured object format that only recognizes safe patterns:

**Safe CSS patterns recognized:**
- Standard CSS properties in kebab-case format
- Color values, numeric values with units
- `data:` URIs for inline images (base64 encoded)

**Dangerous CSS patterns blocked:**
**Examples of patterns not recognized:**
- `url()` with external URLs (data: URIs allowed for inline images)
- `expression()` - IE expression injection
- `javascript:` protocol in CSS values
- `@import` - CSS imports

**Dangerous CSS properties blocked:**
**Examples of properties not recognized:**
- `behavior` - IE behavior property (can execute code)
- `-moz-binding` - Firefox XBL binding (can execute code)

@@ -1020,13 +1025,13 @@ Only the first CSS value before a semicolon is used:

The `href` and `src` attributes validate URL protocols to prevent XSS attacks:

**Safe protocols allowed:**
**Safe protocols recognized:**
- `http:`, `https:` - Standard web protocols
- `mailto:`, `tel:`, `sms:` - Communication protocols
- `ftp:`, `ftps:` - File transfer protocols
- Relative URLs: `/path`, `#anchor`, `?query`, `page.html`

**Dangerous protocols blocked:**
**Examples of protocols not recognized:**
- `javascript:` - JavaScript execution
- `data:` - Data URIs (can contain HTML/scripts)
- `vbscript:` - VBScript execution
@@ -1036,7 +1041,7 @@ The `href` and `src` attributes validate URL protocols to prevent XSS attacks:
**Example:**
```javascript
{ a: { href: 'javascript:alert(1)', $children: ['Click'] } }
// Logs warning: Attribute "href" contains blocked protocol "javascript:"
// Logs warning: Attribute "href" contains unsafe protocol "javascript:"
// Renders: <a>Click</a> (href omitted)

{ a: { href: 'https://example.com', $children: ['Safe'] } }
@@ -1045,12 +1050,12 @@ The `href` and `src` attributes validate URL protocols to prevent XSS attacks:

### 14.6 Prototype Chain Protection

Access to JavaScript prototype chain properties is blocked in template interpolation:
Access to JavaScript prototype chain properties is actively blocked in template interpolation to prevent information leakage:

**Blocked properties:**
- `constructor` - Object constructor access
- `__proto__` - Prototype chain access
- `prototype` - Prototype property access
- `constructor` - Object constructor access blocked
- `__proto__` - Prototype chain access blocked
- `prototype` - Prototype property access blocked

**Example:**
```javascript
@@ -1064,15 +1069,15 @@ Access to JavaScript prototype chain properties is blocked in template interpola

### 14.7 Defense in Depth

Treebark implements multiple overlapping security layers:
Treebark provides a curated, secure output through multiple overlapping layers:

1. **Tag allowlist** - Only safe HTML tags permitted
2. **Attribute allowlist** - Only safe attributes permitted per tag
1. **Curated tag set** - Only safe HTML tags recognized
2. **Curated attributes** - Only safe attributes recognized per tag
3. **HTML escaping** - All content and attribute values escaped
4. **Structured style objects** - Prevents CSS string injection
5. **CSS pattern blocking** - Blocks dangerous CSS patterns and properties
6. **URL protocol validation** - Blocks dangerous protocols in href/src
7. **Prototype chain blocking** - Prevents access to internal object properties
5. **Curated CSS patterns** - Only safe CSS patterns and properties recognized
6. **Curated URL protocols** - Only safe protocols recognized in href/src
7. **Prototype chain blocking** - Actively blocks access to internal object properties

This defense-in-depth approach ensures that even if one layer is bypassed, others remain to protect against attacks.