Ellen Ammann Dataset Manager

This project is a lightweight, local web application that helps non-technical users build a structured Knowledge Base (KB) and an Evaluation Questionnaire (QA) about Ellen Ammann. The resulting datasets are intended for use in a Retrieval-Augmented Generation (RAG) system.

Application Preview

Below is how the application should look when it is running. You can switch between the Knowledge Base and the Evaluation Questionnaire tabs.

Knowledge Base (KB) Interface

Evaluation Questionnaire (QA) Interface

Installation & Setup

Windows Quick Start

Download/Clone: Download this repository to your local machine.
Build: Double-click on build.bat. This will install all necessary Node.js dependencies.
Start: Once the build is complete, double-click on start.bat to launch the application.
You will see the message: Server is running on http://localhost:3000. Then open a web browser such as Google Chrome and paste the link http://localhost:3000 into the address bar.

Alternative Editing

While the web tool is recommended for visualising and managing the database, you can also simply edit the raw data files directly:

ellen_ammann_kb.jsonl (Knowledge Base)
ellen_ammann_eval_qa.jsonl (Evaluation Questionnaire)
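
When editing the raw files by hand, a stray comma or a record split across two lines is easy to introduce. The following is a minimal sketch of a checker that reports which lines are not valid JSON objects. The function name `lintJsonl` is hypothetical and not part of this repository; skipping blank lines is also an assumption.

```javascript
// Hypothetical helper (not part of the repository): list the lines of a
// hand-edited .jsonl text that are not valid, standalone JSON objects.
function lintJsonl(text) {
  const problems = [];
  text.split(/\r?\n/).forEach((line, index) => {
    if (line.trim() === "") return; // assumption: blank lines are ignored
    try {
      const value = JSON.parse(line);
      if (value === null || typeof value !== "object" || Array.isArray(value)) {
        problems.push({ line: index + 1, error: "not a JSON object" });
      }
    } catch (err) {
      // JSON.parse throws a SyntaxError for malformed input
      problems.push({ line: index + 1, error: err.message });
    }
  });
  return problems;
}

// Usage: the second line has a trailing comma and is reported.
console.log(lintJsonl('{"qid":"Q01"}\n{"qid":"Q02",}\n'));
```

An empty result means every non-blank line parsed as an object; it does not check that the expected fields are present.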

If you have any questions or need assistance, please feel free to reach out!

Getting Started

If you are looking to learn how to use this tool to build your dataset, please read the User Manual first! The manual covers starting the application, making edits, deleting entries, and safely backing up your progress.

Codebase Overview

The codebase is a simple full-stack application with no bundling or compilation step (build.bat only installs dependencies). It prioritizes ease of use and aesthetics without overly complex dependencies.

Tech Stack

Backend: Node.js with Express.js (server.js)
Frontend: Vanilla HTML (public/index.html), CSS (public/style.css), and JavaScript (public/app.js)
Data Storage: Local .jsonl files (JSON Lines format)

Directory Structure

server.js: The main backend server file. It exposes REST API endpoints (GET, POST, DELETE) to interact with the JSONL files.
public/: Contains the frontend assets.
- index.html: The semantic structure of the editor.
- style.css: A premium, dark-mode, glassmorphism design system.
- app.js: The client-side logic that handles fetching data, rendering lists, populating forms, and submitting changes via the fetch API.
data/: A generated folder that holds ring-buffer backups of the datasets.
.bat scripts: Helper scripts for Windows users to easily build, start, stop, and reset the application.

How it uses JSONL Data

The core of this application is its interaction with JSON Lines (.jsonl) files. A .jsonl file differs from a standard .json file: instead of being one massive JSON array, each individual line is a fully valid, standalone JSON object.

This is ideal for large datasets (such as LLM training data or KBs) because a program can read or append one line at a time without having to parse the entire file into memory or stringify it all at once.
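
As an illustration of that property, here is a minimal sketch of appending a record without reading the existing file. The helper name `appendRecord` and the file name in the usage line are hypothetical. Note that this only demonstrates the format: as described under Backend Data Handling, the application itself rewrites the whole file on each save.

```javascript
const fs = require("node:fs");

// Hypothetical helper: append one record as a single line.
// The existing file contents are never read or parsed.
function appendRecord(filePath, record) {
  // JSON.stringify without an indent argument escapes newlines inside
  // strings, so each record always occupies exactly one physical line.
  fs.appendFileSync(filePath, JSON.stringify(record) + "\n", "utf8");
}

// Usage (hypothetical file name):
// appendRecord("demo.jsonl", { qid: "Q99", question: "..." });
```

With a single JSON array, the same operation would require parsing the array, pushing the new element, and writing the whole array back.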

Backend Data Handling

Reading: When the frontend requests data (e.g., GET /api/kb), the backend Node.js server reads ellen_ammann_kb.jsonl. It splits the text file by line breaks (\n), parses each line using JSON.parse(), and sends the resulting array of objects to the frontend.
Writing/Editing: When a user saves a record, the frontend sends a JSON object to the backend (POST /api/kb). The backend reads the existing .jsonl file, checks whether the record_id already exists, and either replaces that specific object in the array (if editing) or pushes a new object (if appending). It then converts the entire array back into a newline-delimited (\n) string and overwrites the file.
Deleting: Similar to writing, the backend filters out the deleted ID and rewrites the remaining records to the file.
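
The read, write, and delete steps above can be sketched as pure functions over the file text. This is not the code in server.js; the function names are hypothetical, and the trailing newline written by `toJsonl` is an assumption.

```javascript
// Parse JSONL text into an array of objects, skipping blank lines.
function parseJsonl(text) {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

// Serialize records back into newline-delimited text.
function toJsonl(records) {
  return records.map((r) => JSON.stringify(r)).join("\n") + (records.length ? "\n" : "");
}

// Replace the record whose id matches, or append it if the id is new.
function upsertRecord(records, record, idKey = "record_id") {
  const index = records.findIndex((r) => r[idKey] === record[idKey]);
  if (index === -1) return [...records, record];
  return records.map((r, i) => (i === index ? record : r));
}

// Drop the record with the given id; all other records are kept in order.
function removeById(records, id, idKey = "record_id") {
  return records.filter((r) => r[idKey] !== id);
}
```

For the questionnaire file, the same functions would be called with "qid" as the `idKey`.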

Before any modification, the server.js script automatically generates a timestamped backup in the data/ folder.
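
One way such a timestamped, ring-buffer backup could work is sketched below. The file naming scheme, the `.bak` suffix, and the default retention count are assumptions made for illustration; the actual behaviour is defined in server.js.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: copy `sourceFile` into `backupDir` under a timestamped
// name, then delete the oldest copies so that at most `keep` remain.
function backupWithRotation(sourceFile, backupDir, keep = 10, now = new Date()) {
  fs.mkdirSync(backupDir, { recursive: true });
  const base = path.basename(sourceFile);
  // ISO timestamps sort chronologically as plain strings. ':' and '.' are
  // replaced because ':' is not allowed in Windows file names.
  const stamp = now.toISOString().replace(/[:.]/g, "-");
  const target = path.join(backupDir, `${base}.${stamp}.bak`);
  fs.copyFileSync(sourceFile, target);

  // Collect this file's backups, oldest first, and prune the surplus.
  const backups = fs
    .readdirSync(backupDir)
    .filter((name) => name.startsWith(`${base}.`) && name.endsWith(".bak"))
    .sort();
  for (const old of backups.slice(0, Math.max(0, backups.length - keep))) {
    fs.unlinkSync(path.join(backupDir, old));
  }
  return target;
}
```

Calling this before every write keeps disk usage bounded while still allowing the last few states to be restored.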

JSONL Data Structures

The system manages two distinct types of data:

1. Knowledge Base (`ellen_ammann_kb.jsonl`)

This file stores factual claims, biographical events, quotes, and summaries about Ellen Ammann.

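JSON Architecture (shown across several lines with // annotations for readability; in the actual .jsonl file each record sits on a single line and contains no comments):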

{
  "record_id": "ea_fact_0105",
  "record_type": "fact",  // e.g., fact, event, person_profile, quote
  "category": "personal_life", // e.g., personal_life, political_life
  "subject": "Ellen Ammann",
  "text": "Born in Stockholm (1870), she moved to Munich after marrying...",
  "predicate": "born", // Optional
  "object": "1870-07-01 in Stockholm", // Optional
  "time": "1870-07-01", // Optional
  "location": "Stockholm, Sweden", // Optional
  "source_ids": ["src_ndb_1953"], // Array of evidence sources
  "confidence": "High",
  "status": "Asserted", // Or "Disputed"
  "conflict_set_id": "" // Populated if status is "Disputed"
}
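
A record of this shape could be checked before saving with something like the sketch below. It treats every field not marked Optional in the example above as required, which is an assumption, and the name `validateKbRecord` is hypothetical.

```javascript
// Assumption: fields not marked "Optional" in the README example are required.
const REQUIRED_KB_FIELDS = [
  "record_id", "record_type", "category", "subject",
  "text", "source_ids", "confidence", "status",
];

// Hypothetical helper: return a list of human-readable problems (empty if none).
function validateKbRecord(record) {
  const errors = [];
  for (const field of REQUIRED_KB_FIELDS) {
    const value = record[field];
    if (value === undefined || value === null || value === "") {
      errors.push(`missing field: ${field}`);
    }
  }
  if (record.source_ids !== undefined && !Array.isArray(record.source_ids)) {
    errors.push("source_ids must be an array");
  }
  // conflict_set_id is populated when status is "Disputed"
  if (record.status === "Disputed" && !record.conflict_set_id) {
    errors.push("Disputed records need a conflict_set_id");
  }
  return errors;
}
```

Returning a list of messages rather than throwing lets a caller show all problems with a record at once.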

2. Evaluation Questionnaire (`ellen_ammann_eval_qa.jsonl`)

This file stores ground-truth questions and answers used to evaluate how well the RAG model retrieves the aforementioned KB facts.

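JSON Architecture (the // annotation is for readability only; in the actual .jsonl file each record sits on a single line and contains no comments):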

{
  "qid": "Q01",
  "question": "In what year was Ellen Ammann born?",
  "ground_truth_answer": "Ellen Ammann was born in 1870.",
  "supporting_record_ids": ["ea_fact_0105"], // Points back to the KB record_id
  "supporting_source_ids": ["src_ndb_1953"]
}
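
Because supporting_record_ids points back to KB records, it can be useful to check that every referenced ID still exists, for example after deleting KB entries. The following is a minimal sketch of such a check; `findDanglingReferences` is a hypothetical name and not part of the repository.

```javascript
// Hypothetical helper: list QA references to KB record_ids that do not exist.
function findDanglingReferences(kbRecords, qaRecords) {
  const knownIds = new Set(kbRecords.map((r) => r.record_id));
  const dangling = [];
  for (const qa of qaRecords) {
    // A missing supporting_record_ids field is treated as an empty list.
    for (const id of qa.supporting_record_ids || []) {
      if (!knownIds.has(id)) dangling.push({ qid: qa.qid, record_id: id });
    }
  }
  return dangling;
}

// Usage with in-memory records:
const kb = [{ record_id: "ea_fact_0105" }];
const qa = [{ qid: "Q01", supporting_record_ids: ["ea_fact_0105", "ea_fact_9999"] }];
console.log(findDanglingReferences(kb, qa));
```

Here only "ea_fact_9999" is reported, since "ea_fact_0105" is present in the KB array.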

