From 589da78f23b64887eedb9a4cd892dcc4864d21ca Mon Sep 17 00:00:00 2001 From: Alexander Polyankin Date: Thu, 30 Apr 2026 10:12:56 -0400 Subject: [PATCH 1/6] skill polish --- examples/v1/{field-values.json => field_values.json} | 0 examples/v1/{metadata.json => table_metadata.json} | 0 2 files changed, 0 insertions(+), 0 deletions(-) rename examples/v1/{field-values.json => field_values.json} (100%) rename examples/v1/{metadata.json => table_metadata.json} (100%) diff --git a/examples/v1/field-values.json b/examples/v1/field_values.json similarity index 100% rename from examples/v1/field-values.json rename to examples/v1/field_values.json diff --git a/examples/v1/metadata.json b/examples/v1/table_metadata.json similarity index 100% rename from examples/v1/metadata.json rename to examples/v1/table_metadata.json From bfe5a66783be3f40f6e83aa5a7882e52aaa7ddc7 Mon Sep 17 00:00:00 2001 From: Alexander Polyankin Date: Thu, 30 Apr 2026 10:13:02 -0400 Subject: [PATCH 2/6] skill polish --- .github/workflows/validate.yml | 4 +- README.md | 68 ++++++++------------------------ bin/cli.test.ts | 4 +- core-spec/v1/spec.md | 8 ++-- src/extract-field-values.test.ts | 14 +++---- src/extract-metadata.test.ts | 2 +- 6 files changed, 32 insertions(+), 68 deletions(-) diff --git a/.github/workflows/validate.yml b/.github/workflows/validate.yml index 36be3de..1314cae 100644 --- a/.github/workflows/validate.yml +++ b/.github/workflows/validate.yml @@ -18,10 +18,10 @@ jobs: - run: bun install - name: Extract metadata - run: bun run bin/cli.ts extract-metadata examples/v1/metadata.json /tmp/databases + run: bun run bin/cli.ts extract-metadata examples/v1/table_metadata.json /tmp/databases - name: Extract field values - run: bun run bin/cli.ts extract-field-values examples/v1/metadata.json examples/v1/field-values.json /tmp/databases + run: bun run bin/cli.ts extract-field-values examples/v1/table_metadata.json examples/v1/field_values.json /tmp/databases - name: Diff examples run: diff -r 
examples/v1/databases /tmp/databases diff --git a/README.md b/README.md index b191aff..fed8356 100644 --- a/README.md +++ b/README.md @@ -2,13 +2,13 @@ Metabase represents database metadata — synced databases, their tables, and their fields — as a tree of YAML files. Files are diff-friendly: numeric IDs are omitted entirely, and foreign keys use natural-key tuples like `["Sample Database", "PUBLIC", "ORDERS"]` instead of database identifiers. -This repository contains the specification, examples, and a CLI that converts the JSON returned by Metabase's `GET /api/database/metadata` endpoint into YAML. +This repository contains the specification, examples, and a CLI that converts the `table_metadata.json` downloaded from a Metabase workspace page into YAML. ## Specification The format is defined in **[core-spec/v1/spec.md](core-spec/v1/spec.md)** (v1.0.4). It covers entity keys, field types, folder structure, sampled field values, and the shape of each entity. -Reference output for the Sample Database lives in **[examples/v1/](examples/v1/)** — both the raw `metadata.json` returned by the endpoint and the extracted YAML tree. +Reference output for the Sample Database lives in **[examples/v1/](examples/v1/)** — both the raw `table_metadata.json` and the extracted YAML tree. ### Entities @@ -20,15 +20,7 @@ Reference output for the Sample Database lives in **[examples/v1/](examples/v1/) ## Obtaining metadata -Metadata is fetched on demand from a running Metabase instance via `GET /api/database/metadata`. The response is a flat JSON document with three arrays — `databases`, `tables`, and `fields` — streamed so that even warehouses with very large schemas can be exported without exhausting server memory. 
- -Authenticate with either a session token (`X-Metabase-Session`) or an API key (`X-API-Key`): - -```sh -curl "$METABASE_URL/api/database/metadata" \ - -H "X-API-Key: $METABASE_API_KEY" \ - -o metadata.json -``` +Metadata is downloaded as `table_metadata.json` from the Metabase workspace page (Workspaces → the relevant workspace → "Download table_metadata.json"). The file is a flat JSON document with three arrays — `databases`, `tables`, and `fields` — which even warehouses with very large schemas can produce without exhausting server memory. ### Extracting metadata to YAML @@ -38,23 +30,19 @@ The CLI turns that JSON into the human- and agent-friendly YAML tree described i ```sh bunx @metabase/database-metadata extract-metadata <input_file> <output_folder> ``` -- `<input_file>` — path to the `metadata.json` produced by the API. +- `<input_file>` — path to the `table_metadata.json` downloaded from the workspace page. - `<output_folder>` — destination directory. Database folders are created directly under it. ### Extracting field values -Metabase keeps a sampled list of distinct values for each field that's low-cardinality enough to enumerate (the same list that powers filter dropdowns in the UI). Fetch it and extract it alongside the metadata: +Metabase keeps a sampled list of distinct values for each field that's low-cardinality enough to enumerate (the same list that powers filter dropdowns in the UI). Download `field_values.json` from the same workspace page and extract it alongside the metadata: ```sh -curl "$METABASE_URL/api/database/field-values" \ - -H "X-API-Key: $METABASE_API_KEY" \ - -o field-values.json - bunx @metabase/database-metadata extract-field-values <input_file> <field_values_file> <output_folder> ``` -- `<input_file>` — the same `metadata.json` used by `extract-metadata`. Field values reference fields by numeric ID, which the CLI resolves to natural keys using the metadata. +- `<input_file>` — the same `table_metadata.json` used by `extract-metadata`. 
Field values reference fields by numeric ID, which the CLI resolves to natural keys using the metadata. +- `<field_values_file>` — path to the `field_values.json` downloaded from the workspace page. - `<output_folder>` — destination directory; typically the same one used for `extract-metadata`, so values files land next to the table YAMLs they belong to. One YAML file is written per field that has values. Fields with empty samples are skipped; field IDs not present in the metadata are reported as orphans and skipped. See the spec's [Field Values](core-spec/v1/spec.md#field-values) section for the on-disk shape and when agents should consult these files. @@ -75,11 +63,11 @@ The following is the **default** workflow for a project that wants to use Metaba ### 1. A `.metabase/` directory at the repo root -Create a top-level `.metabase/` directory and **add it to `.gitignore`**. This is where the raw `metadata.json` and the extracted `databases/` YAML tree live: +Create a top-level `.metabase/` directory and **add it to `.gitignore`**. This is where the raw `table_metadata.json` and the extracted `databases/` YAML tree live: ``` .metabase/ -├── metadata.json +├── table_metadata.json └── databases/ └── … ``` @@ -94,44 +82,20 @@ On a large data warehouse the metadata export can easily reach **hundreds of meg Each developer (or a CI job) fetches metadata on demand from their own Metabase instance instead. -### 3. Credentials via a gitignored `.env` file - -Check in an **`.env.template`** at the repo root with placeholders: - -```env -METABASE_URL=https://metabase.example.com -METABASE_API_KEY= -``` +### 3. Download from the workspace page and extract -Each developer copies it to `.env` (also gitignored) and fills in the real values: +Each developer downloads `table_metadata.json` (and optionally `field_values.json`) from the Metabase workspace page and drops them into `.metabase/`. Then run the extractors: ```sh -cp .env.template .env -# edit .env to set METABASE_URL and METABASE_API_KEY -``` -### 4. 
Fetch and extract on demand - -With `.env` populated, the end-to-end flow is: - -```sh -set -a; source .env; set +a - mkdir -p .metabase -curl -sf "$METABASE_URL/api/database/metadata" \ - -H "X-API-Key: $METABASE_API_KEY" \ - -o .metabase/metadata.json - -curl -sf "$METABASE_URL/api/database/field-values" \ - -H "X-API-Key: $METABASE_API_KEY" \ - -o .metabase/field-values.json +# Drop table_metadata.json (and optionally field_values.json) from the workspace page into .metabase/ rm -rf .metabase/databases -bunx @metabase/database-metadata extract-metadata .metabase/metadata.json .metabase/databases -bunx @metabase/database-metadata extract-field-values .metabase/metadata.json .metabase/field-values.json .metabase/databases +bunx @metabase/database-metadata extract-metadata .metabase/table_metadata.json .metabase/databases +bunx @metabase/database-metadata extract-field-values .metabase/table_metadata.json .metabase/field_values.json .metabase/databases ``` -After this, tools and agents should read the YAML tree under `.metabase/databases/` — not `metadata.json` or `field-values.json`, which exist only as input to the extractors. +After this, tools and agents should read the YAML tree under `.metabase/databases/` — not `table_metadata.json` or `field_values.json`, which exist only as input to the extractors. 
## Publishing to NPM @@ -145,7 +109,7 @@ The workflow requires an `NPM_RELEASE_TOKEN` secret with publish access to the ` ```sh bun install -bun bin/cli.ts extract-metadata examples/v1/metadata.json /tmp/.metabase/databases +bun bin/cli.ts extract-metadata examples/v1/table_metadata.json /tmp/.metabase/databases ``` ### Scripts diff --git a/bin/cli.test.ts b/bin/cli.test.ts index f3af7c2..bc357f2 100644 --- a/bin/cli.test.ts +++ b/bin/cli.test.ts @@ -5,8 +5,8 @@ import { join, resolve } from "path"; const REPO_ROOT = resolve(import.meta.dirname, ".."); const CLI = "bin/cli.ts"; -const EXAMPLE_INPUT = "examples/v1/metadata.json"; -const EXAMPLE_FIELD_VALUES = "examples/v1/field-values.json"; +const EXAMPLE_INPUT = "examples/v1/table_metadata.json"; +const EXAMPLE_FIELD_VALUES = "examples/v1/field_values.json"; type RunResult = { stdout: string; diff --git a/core-spec/v1/spec.md b/core-spec/v1/spec.md index 2f8e4e1..c629fb1 100644 --- a/core-spec/v1/spec.md +++ b/core-spec/v1/spec.md @@ -8,7 +8,7 @@ Metabase database metadata is a read-only snapshot of databases, tables, and fie The format is designed to be **portable** and **reviewable**: numeric IDs are omitted or replaced with human-readable natural keys (database name, `[database, schema, table]` tuples, etc.). Files can be diffed, grepped, and edited by hand. -The raw API response (`metadata.json`) is a single flat JSON document with `databases`, `tables`, and `fields` arrays, optimized for transport rather than reading. It can be arbitrarily large — tens or hundreds of megabytes on warehouses with many tables — and is not intended for direct consumption. Tools and humans should read the extracted YAML tree under `databases/` instead, where each entity lives in its own small file and foreign keys are resolved to natural-key tuples. 
+The raw `table_metadata.json` (downloaded from the Metabase workspace page) is a single flat JSON document with `databases`, `tables`, and `fields` arrays, optimized for transport rather than reading. It can be arbitrarily large — tens or hundreds of megabytes on warehouses with many tables — and is not intended for direct consumption. Tools and humans should read the extracted YAML tree under `databases/` instead, where each entity lives in its own small file and foreign keys are resolved to natural-key tuples. ## Table of Contents @@ -268,13 +268,13 @@ Field values are **sampled, not exhaustive**: Metabase caps the list (typically ### Extraction order -**Field values must be extracted *after* metadata, never before or in isolation.** The raw `field-values.json` references fields by numeric `field_id` only; resolving those IDs to the natural-key tuples used everywhere in this format requires the metadata index. The extractor takes both `metadata.json` and `field-values.json` as inputs, and the two **must come from the same Metabase instance at the same point in time** — a stale metadata file paired with a fresh values file (or vice versa) will silently drop entries as orphans whenever a field has been added, removed, or had its ID reassigned. +**Field values must be extracted *after* metadata, never before or in isolation.** The raw `field_values.json` references fields by numeric `field_id` only; resolving those IDs to the natural-key tuples used everywhere in this format requires the metadata index. The extractor takes both `table_metadata.json` and `field_values.json` as inputs, and the two **must come from the same Metabase workspace download at the same point in time** — a stale metadata file paired with a fresh values file (or vice versa) will silently drop entries as orphans whenever a field has been added, removed, or had its ID reassigned. The recommended workflow is therefore strictly sequential: -1. Fetch `metadata.json` from the Metabase instance. +1. 
Download `table_metadata.json` from the Metabase workspace page. 2. Run `extract-metadata` to write the database/table/field YAML tree. -3. Fetch `field-values.json` from the **same** instance, ideally back-to-back with step 1. +3. Download `field_values.json` from the **same** workspace, ideally back-to-back with step 1. 4. Run `extract-field-values` against the same output folder to drop per-field values files into the existing tree. Agents reading the tree can rely on this ordering: any `{table}/{field}.yaml` file is guaranteed to have a corresponding entry in the parent `{table}.yaml`'s `fields` array. diff --git a/src/extract-field-values.test.ts b/src/extract-field-values.test.ts index 4f6eec4..021a492 100644 --- a/src/extract-field-values.test.ts +++ b/src/extract-field-values.test.ts @@ -13,8 +13,8 @@ import yaml from "js-yaml"; import { extractFieldValues } from "./extract-field-values.js"; const REPO_ROOT = resolve(import.meta.dirname, ".."); -const EXAMPLE_METADATA = join(REPO_ROOT, "examples/v1/metadata.json"); -const EXAMPLE_FIELD_VALUES = join(REPO_ROOT, "examples/v1/field-values.json"); +const EXAMPLE_METADATA = join(REPO_ROOT, "examples/v1/table_metadata.json"); +const EXAMPLE_FIELD_VALUES = join(REPO_ROOT, "examples/v1/field_values.json"); describe("extractFieldValues", () => { let workdir: string; @@ -110,7 +110,7 @@ describe("extractFieldValues", () => { }, ], }; - const fieldValuesFile = join(workdir, "field-values.json"); + const fieldValuesFile = join(workdir, "field_values.json"); writeFileSync(fieldValuesFile, JSON.stringify(fieldValues)); const stats = extractFieldValues({ @@ -137,7 +137,7 @@ describe("extractFieldValues", () => { }, ], }; - const fieldValuesFile = join(workdir, "field-values.json"); + const fieldValuesFile = join(workdir, "field_values.json"); writeFileSync(fieldValuesFile, JSON.stringify(fieldValues)); const stats = extractFieldValues({ @@ -154,7 +154,7 @@ describe("extractFieldValues", () => { }); it("joins nested JSON 
field paths with dots in the filename", () => { - const metadataPath = join(workdir, "metadata.json"); + const metadataPath = join(workdir, "table_metadata.json"); writeFileSync( metadataPath, JSON.stringify({ @@ -168,7 +168,7 @@ describe("extractFieldValues", () => { }), ); - const fieldValuesPath = join(workdir, "field-values.json"); + const fieldValuesPath = join(workdir, "field_values.json"); writeFileSync( fieldValuesPath, JSON.stringify({ @@ -219,7 +219,7 @@ describe("extractFieldValues", () => { }, ], }; - const fieldValuesFile = join(workdir, "field-values.json"); + const fieldValuesFile = join(workdir, "field_values.json"); writeFileSync(fieldValuesFile, JSON.stringify(fieldValues)); extractFieldValues({ diff --git a/src/extract-metadata.test.ts b/src/extract-metadata.test.ts index d2e2d24..9fc4dba 100644 --- a/src/extract-metadata.test.ts +++ b/src/extract-metadata.test.ts @@ -13,7 +13,7 @@ import yaml from "js-yaml"; import { extractMetadata } from "./extract-metadata.js"; const REPO_ROOT = resolve(import.meta.dirname, ".."); -const EXAMPLE_INPUT = join(REPO_ROOT, "examples/v1/metadata.json"); +const EXAMPLE_INPUT = join(REPO_ROOT, "examples/v1/table_metadata.json"); describe("extractMetadata", () => { let workdir: string; From 761aca314947c133430720582451440d2a7d319c Mon Sep 17 00:00:00 2001 From: Alexander Polyankin Date: Thu, 30 Apr 2026 10:13:53 -0400 Subject: [PATCH 3/6] update version --- package.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/package.json b/package.json index 7089109..7e2264b 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "@metabase/database-metadata", - "version": "1.0.4", + "version": "1.0.5", "description": "CLI tool to extract Metabase database metadata into YAML files", "license": "SEE LICENSE IN LICENSE.txt", "repository": { From 829cb4e7ad7f4f9bd960e813420daede7ebd7630 Mon Sep 17 00:00:00 2001 From: Alexander Polyankin Date: Thu, 30 Apr 2026 10:16:37 -0400 Subject: [PATCH 4/6] 
fix --- src/{extract-metadata.test.ts => extract-table-metadata.test.ts} | 0 src/{extract-metadata.ts => extract-table-metadata.ts} | 0 2 files changed, 0 insertions(+), 0 deletions(-) rename src/{extract-metadata.test.ts => extract-table-metadata.test.ts} (100%) rename src/{extract-metadata.ts => extract-table-metadata.ts} (100%) diff --git a/src/extract-metadata.test.ts b/src/extract-table-metadata.test.ts similarity index 100% rename from src/extract-metadata.test.ts rename to src/extract-table-metadata.test.ts diff --git a/src/extract-metadata.ts b/src/extract-table-metadata.ts similarity index 100% rename from src/extract-metadata.ts rename to src/extract-table-metadata.ts From 210ebe678ab7bfaa00d5da69248e52274e60d945 Mon Sep 17 00:00:00 2001 From: Alexander Polyankin Date: Thu, 30 Apr 2026 10:16:38 -0400 Subject: [PATCH 5/6] fix --- .github/workflows/validate.yml | 2 +- README.md | 10 +++++----- bin/cli.test.ts | 6 +++--- bin/cli.ts | 8 ++++---- core-spec/v1/spec.md | 2 +- src/extract-table-metadata.test.ts | 14 +++++++------- src/extract-table-metadata.ts | 2 +- src/index.ts | 4 ++-- 8 files changed, 24 insertions(+), 24 deletions(-) diff --git a/.github/workflows/validate.yml b/.github/workflows/validate.yml index 1314cae..3517c52 100644 --- a/.github/workflows/validate.yml +++ b/.github/workflows/validate.yml @@ -18,7 +18,7 @@ jobs: - run: bun install - name: Extract metadata - run: bun run bin/cli.ts extract-metadata examples/v1/table_metadata.json /tmp/databases + run: bun run bin/cli.ts extract-table-metadata examples/v1/table_metadata.json /tmp/databases - name: Extract field values run: bun run bin/cli.ts extract-field-values examples/v1/table_metadata.json examples/v1/field_values.json /tmp/databases diff --git a/README.md b/README.md index fed8356..17d3872 100644 --- a/README.md +++ b/README.md @@ -27,7 +27,7 @@ Metadata is downloaded as `table_metadata.json` from the Metabase workspace page The CLI turns that JSON into the human- and agent-friendly 
YAML tree described in the spec: ```sh -bunx @metabase/database-metadata extract-metadata <input_file> <output_folder> +bunx @metabase/database-metadata extract-table-metadata <input_file> <output_folder> ``` - `<input_file>` — path to the `table_metadata.json` downloaded from the workspace page. @@ -41,9 +41,9 @@ Metabase keeps a sampled list of distinct values for each field that's low-cardi bunx @metabase/database-metadata extract-field-values <input_file> <field_values_file> <output_folder> ``` -- `<input_file>` — the same `table_metadata.json` used by `extract-metadata`. Field values reference fields by numeric ID, which the CLI resolves to natural keys using the metadata. +- `<input_file>` — the same `table_metadata.json` used by `extract-table-metadata`. Field values reference fields by numeric ID, which the CLI resolves to natural keys using the metadata. - `<field_values_file>` — path to the `field_values.json` downloaded from the workspace page. -- `<output_folder>` — destination directory; typically the same one used for `extract-metadata`, so values files land next to the table YAMLs they belong to. +- `<output_folder>` — destination directory; typically the same one used for `extract-table-metadata`, so values files land next to the table YAMLs they belong to. One YAML file is written per field that has values. Fields with empty samples are skipped; field IDs not present in the metadata are reported as orphans and skipped. See the spec's [Field Values](core-spec/v1/spec.md#field-values) section for the on-disk shape and when agents should consult these files. 
@@ -91,7 +91,7 @@ mkdir -p .metabase # Drop table_metadata.json (and optionally field_values.json) from the workspace page into .metabase/ rm -rf .metabase/databases -bunx @metabase/database-metadata extract-metadata .metabase/table_metadata.json .metabase/databases +bunx @metabase/database-metadata extract-table-metadata .metabase/table_metadata.json .metabase/databases bunx @metabase/database-metadata extract-field-values .metabase/table_metadata.json .metabase/field_values.json .metabase/databases ``` @@ -109,7 +109,7 @@ The workflow requires an `NPM_RELEASE_TOKEN` secret with publish access to the ` ```sh bun install -bun bin/cli.ts extract-metadata examples/v1/table_metadata.json /tmp/.metabase/databases +bun bin/cli.ts extract-table-metadata examples/v1/table_metadata.json /tmp/.metabase/databases ``` ### Scripts diff --git a/bin/cli.test.ts b/bin/cli.test.ts index bc357f2..b5a9d2a 100644 --- a/bin/cli.test.ts +++ b/bin/cli.test.ts @@ -47,7 +47,7 @@ describe("cli", () => { }); }); - describe("extract-metadata", () => { + describe("extract-table-metadata", () => { let workdir: string; beforeEach(() => { @@ -60,7 +60,7 @@ describe("cli", () => { it("extracts the bundled example into YAML files", () => { const { stdout, exitCode } = runCli([ - "extract-metadata", + "extract-table-metadata", EXAMPLE_INPUT, workdir, ]); @@ -73,7 +73,7 @@ describe("cli", () => { }); it("errors when arguments are missing", () => { - const { stderr, exitCode } = runCli(["extract-metadata"]); + const { stderr, exitCode } = runCli(["extract-table-metadata"]); expect(exitCode).toBe(1); expect(stderr).toContain( "<input_file> and <output_folder> arguments are required", ); diff --git a/bin/cli.ts b/bin/cli.ts index 9145591..d29b4d0 100644 --- a/bin/cli.ts +++ b/bin/cli.ts @@ -3,7 +3,7 @@ import { parseArgs } from "node:util"; import { extractFieldValues } from "../src/extract-field-values.js"; -import { extractMetadata } from "../src/extract-metadata.js"; +import { extractTableMetadata } from 
"../src/extract-table-metadata.js"; import { extractSpec } from "../src/extract-spec.js"; type ParsedValues = { @@ -14,7 +14,7 @@ type ParsedValues = { const HELP = `Usage: database-metadata <command> [arguments] [options] Commands: - extract-metadata <input_file> <output_folder> Extract metadata JSON into YAML files + extract-table-metadata <input_file> <output_folder> Extract metadata JSON into YAML files Writes one YAML per database + one per table with fields nested inside. @@ -50,7 +50,7 @@ function handleExtractMetadata(positionals: string[]): void { process.exit(1); } - const stats = extractMetadata({ inputFile, outputFolder }); + const stats = extractTableMetadata({ inputFile, outputFolder }); console.log( `Extracted ${stats.databases} databases, ${stats.tables} tables, ${stats.fields} fields`, ); @@ -96,7 +96,7 @@ function main(): void { } switch (command) { - case "extract-metadata": + case "extract-table-metadata": return handleExtractMetadata(positionals); case "extract-field-values": return handleExtractFieldValues(positionals); diff --git a/core-spec/v1/spec.md b/core-spec/v1/spec.md index c629fb1..b2c299f 100644 --- a/core-spec/v1/spec.md +++ b/core-spec/v1/spec.md @@ -273,7 +273,7 @@ Field values are **sampled, not exhaustive**: Metabase caps the list (typically The recommended workflow is therefore strictly sequential: 1. Download `table_metadata.json` from the Metabase workspace page. -2. Run `extract-metadata` to write the database/table/field YAML tree. +2. Run `extract-table-metadata` to write the database/table/field YAML tree. 3. Download `field_values.json` from the **same** workspace, ideally back-to-back with step 1. 4. Run `extract-field-values` against the same output folder to drop per-field values files into the existing tree. 
diff --git a/src/extract-table-metadata.test.ts b/src/extract-table-metadata.test.ts index 9fc4dba..ad03ed3 100644 --- a/src/extract-table-metadata.test.ts +++ b/src/extract-table-metadata.test.ts @@ -10,12 +10,12 @@ import { tmpdir } from "os"; import { join, resolve } from "path"; import yaml from "js-yaml"; -import { extractMetadata } from "./extract-metadata.js"; +import { extractTableMetadata } from "./extract-table-metadata.js"; const REPO_ROOT = resolve(import.meta.dirname, ".."); const EXAMPLE_INPUT = join(REPO_ROOT, "examples/v1/table_metadata.json"); -describe("extractMetadata", () => { +describe("extractTableMetadata", () => { let workdir: string; beforeEach(() => { @@ -27,7 +27,7 @@ describe("extractMetadata", () => { }); it("extracts the bundled sample database to YAML", () => { - const stats = extractMetadata({ + const stats = extractTableMetadata({ inputFile: EXAMPLE_INPUT, outputFolder: workdir, }); @@ -49,7 +49,7 @@ describe("extractMetadata", () => { }); it("strips numeric ids and uses natural-key db_id on tables", () => { - extractMetadata({ inputFile: EXAMPLE_INPUT, outputFolder: workdir }); + extractTableMetadata({ inputFile: EXAMPLE_INPUT, outputFolder: workdir }); const tablePath = join( workdir, "Sample Database", @@ -69,7 +69,7 @@ describe("extractMetadata", () => { }); it("rewrites fk_target_field_id as a natural-key tuple", () => { - extractMetadata({ inputFile: EXAMPLE_INPUT, outputFolder: workdir }); + extractTableMetadata({ inputFile: EXAMPLE_INPUT, outputFolder: workdir }); const tablePath = join( workdir, "Sample Database", @@ -103,7 +103,7 @@ describe("extractMetadata", () => { }), ); const out = join(workdir, "out"); - extractMetadata({ inputFile: input, outputFolder: out }); + extractTableMetadata({ inputFile: input, outputFolder: out }); expect( existsSync(join(out, "weird__SLASH__name", "weird__SLASH__name.yaml")), @@ -111,7 +111,7 @@ describe("extractMetadata", () => { }); it("regenerates output that matches the bundled 
examples", () => { - extractMetadata({ inputFile: EXAMPLE_INPUT, outputFolder: workdir }); + extractTableMetadata({ inputFile: EXAMPLE_INPUT, outputFolder: workdir }); const checkedIn = readFileSync( join( diff --git a/src/extract-table-metadata.ts b/src/extract-table-metadata.ts index a502ec7..526a530 100644 --- a/src/extract-table-metadata.ts +++ b/src/extract-table-metadata.ts @@ -110,7 +110,7 @@ function buildStats(metadata: RawMetadata): ExtractMetadataResult { }; } -export function extractMetadata({ +export function extractTableMetadata({ inputFile, outputFolder, }: ExtractMetadataOptions): ExtractMetadataResult { diff --git a/src/index.ts b/src/index.ts index c904ea4..dc82dcf 100644 --- a/src/index.ts +++ b/src/index.ts @@ -1,8 +1,8 @@ export { - extractMetadata, + extractTableMetadata, type ExtractMetadataOptions, type ExtractMetadataResult, -} from "./extract-metadata.js"; +} from "./extract-table-metadata.js"; export { extractFieldValues, type ExtractFieldValuesOptions, From bf60c1fece015a0b4c8f7f8c2af47180f9f7d54b Mon Sep 17 00:00:00 2001 From: Alexander Polyankin Date: Thu, 30 Apr 2026 10:37:48 -0400 Subject: [PATCH 6/6] fixes --- README.md | 24 ++++++++++++------------ core-spec/v1/spec.md | 4 ++-- 2 files changed, 14 insertions(+), 14 deletions(-) diff --git a/README.md b/README.md index 17d3872..5e9aaf1 100644 --- a/README.md +++ b/README.md @@ -61,18 +61,18 @@ Omit `--file` to write `spec.md` into the current directory. The following is the **default** workflow for a project that wants to use Metabase metadata. It is a convention, not a requirement — teams are free to organize things differently. -### 1. A `.metabase/` directory at the repo root +### 1. A `.metadata/` directory at the repo root -Create a top-level `.metabase/` directory and **add it to `.gitignore`**. This is where the raw `table_metadata.json` and the extracted `databases/` YAML tree live: +Create a top-level `.metadata/` directory and **add it to `.gitignore`**. 
This is where the raw `table_metadata.json` and the extracted `databases/` YAML tree live: ``` -.metabase/ +.metadata/ ├── table_metadata.json └── databases/ └── … ``` -### 2. Why `.metabase/` should not be committed +### 2. Why `.metadata/` should not be committed On a large data warehouse the metadata export can easily reach **hundreds of megabytes or several gigabytes**. Committing it: @@ -84,18 +84,18 @@ Each developer (or a CI job) fetches metadata on demand from their own Metabase ### 3. Download from the workspace page and extract -Each developer downloads `table_metadata.json` (and optionally `field_values.json`) from the Metabase workspace page and drops them into `.metabase/`. Then run the extractors: +Each developer downloads `table_metadata.json` (and optionally `field_values.json`) from the Metabase workspace page and drops them into `.metadata/`. Then run the extractors: ```sh -mkdir -p .metabase -# Drop table_metadata.json (and optionally field_values.json) from the workspace page into .metabase/ +mkdir -p .metadata +# Drop table_metadata.json (and optionally field_values.json) from the workspace page into .metadata/ -rm -rf .metabase/databases -bunx @metabase/database-metadata extract-table-metadata .metabase/table_metadata.json .metabase/databases -bunx @metabase/database-metadata extract-field-values .metabase/table_metadata.json .metabase/field_values.json .metabase/databases +rm -rf .metadata/databases +bunx @metabase/database-metadata extract-table-metadata .metadata/table_metadata.json .metadata/databases +bunx @metabase/database-metadata extract-field-values .metadata/table_metadata.json .metadata/field_values.json .metadata/databases ``` -After this, tools and agents should read the YAML tree under `.metabase/databases/` — not `table_metadata.json` or `field_values.json`, which exist only as input to the extractors. 
+After this, tools and agents should read the YAML tree under `.metadata/databases/` — not `table_metadata.json` or `field_values.json`, which exist only as input to the extractors. ## Publishing to NPM @@ -109,7 +109,7 @@ The workflow requires an `NPM_RELEASE_TOKEN` secret with publish access to the ` ```sh bun install -bun bin/cli.ts extract-table-metadata examples/v1/table_metadata.json /tmp/.metabase/databases +bun bin/cli.ts extract-table-metadata examples/v1/table_metadata.json /tmp/.metadata/databases ``` ### Scripts diff --git a/core-spec/v1/spec.md b/core-spec/v1/spec.md index b2c299f..68a98a6 100644 --- a/core-spec/v1/spec.md +++ b/core-spec/v1/spec.md @@ -123,10 +123,10 @@ Common semantic types, grouped by purpose: ## Folder Structure -By convention, metadata is extracted under a `.metabase/databases/` directory, with each database occupying its own folder. The exporter itself doesn't enforce this location; it writes the tree below into whatever folder the caller passes. +By convention, metadata is extracted under a `.metadata/databases/` directory, with each database occupying its own folder. The exporter itself doesn't enforce this location; it writes the tree below into whatever folder the caller passes. ``` -.metabase/ +.metadata/ └── databases/ └── {database}/ ├── {database}.yaml
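The extraction-order rules patched above all hinge on one mechanism: `field_values.json` carries only numeric `field_id`s, which `extract-field-values` resolves to natural-key tuples through an index built from the metadata, skipping empty samples and reporting unknown IDs as orphans. A minimal TypeScript sketch of that resolution step — every type, field name, and function here is illustrative, not the package's actual API:

```typescript
// Hypothetical shapes for the flat metadata document and a field-values entry.
type NaturalKey = [database: string, schema: string, table: string, field: string];
interface Database { id: number; name: string }
interface Table { id: number; name: string; schema: string; db_id: number }
interface Field { id: number; name: string; table_id: number }
interface Metadata { databases: Database[]; tables: Table[]; fields: Field[] }
interface FieldValuesEntry { field_id: number; values: unknown[] }

function resolveFieldValues(metadata: Metadata, entries: FieldValuesEntry[]) {
  const dbById = new Map<number, Database>();
  for (const d of metadata.databases) dbById.set(d.id, d);
  const tableById = new Map<number, Table>();
  for (const t of metadata.tables) tableById.set(t.id, t);

  // Index every field by numeric ID → natural-key tuple.
  const keyByFieldId = new Map<number, NaturalKey>();
  for (const f of metadata.fields) {
    const table = tableById.get(f.table_id);
    const db = table ? dbById.get(table.db_id) : undefined;
    if (table && db) keyByFieldId.set(f.id, [db.name, table.schema, table.name, f.name]);
  }

  const resolved: { key: NaturalKey; values: unknown[] }[] = [];
  const orphans: number[] = [];
  for (const entry of entries) {
    const key = keyByFieldId.get(entry.field_id);
    if (!key) orphans.push(entry.field_id); // ID not in metadata → orphan, skipped
    else if (entry.values.length > 0) resolved.push({ key, values: entry.values }); // empty samples skipped
  }
  return { resolved, orphans };
}

// Tiny worked example mirroring the Sample Database naming:
const metadata: Metadata = {
  databases: [{ id: 1, name: "Sample Database" }],
  tables: [{ id: 10, name: "ORDERS", schema: "PUBLIC", db_id: 1 }],
  fields: [
    { id: 100, name: "STATUS", table_id: 10 },
    { id: 101, name: "TOTAL", table_id: 10 },
  ],
};
const { resolved, orphans } = resolveFieldValues(metadata, [
  { field_id: 100, values: ["pending", "shipped"] },
  { field_id: 101, values: [] }, // empty sample → skipped
  { field_id: 999, values: ["stale"] }, // ID not in metadata → orphan
]);
console.log(resolved[0].key, orphans); // one resolved key for field 100, plus 999 as the only orphan
```

This also shows why a stale metadata file silently drops entries: any `field_id` missing from the index lands in `orphans` rather than raising an error.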