Characterize model scaling with additional training data

## Overview

Understand how model performance scales as the amount of training data increases. This will inform data collection priorities and set expectations for future training runs.

## Tasks

- [ ] Define evaluation metric(s) to track (e.g., validation loss, MAE on charge density)
- [ ] Train models on increasing subsets of available data (e.g., 10%, 25%, 50%, 75%, 100%)
- [ ] Plot learning curves as a function of dataset size
- [ ] Identify whether the model is data-limited or compute-limited at current scale
- [ ] Summarize findings and recommend next steps for data acquisition if needed
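
One way the subset-training task could be set up is sketched below. It builds nested subsets, so each larger subset contains all smaller ones and the learning curve is not confounded by resampling noise. `nested_subsets`, `scaling_sweep`, and the `train_and_evaluate` callable are hypothetical names introduced for illustration; the fractions are the example values from the task list, and the project's actual training entry point is assumed, not known.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

# Example fractions from the task list; adjust as needed.
FRACTIONS = (0.10, 0.25, 0.50, 0.75, 1.00)


def nested_subsets(
    n_samples: int,
    fractions: Sequence[float] = FRACTIONS,
    seed: int = 0,
) -> Dict[float, List[int]]:
    """Return sample indices per fraction; smaller subsets are prefixes of larger ones."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)  # one fixed permutation shared by all subsets
    return {f: order[: max(1, round(f * n_samples))] for f in sorted(fractions)}


def scaling_sweep(
    n_samples: int,
    train_and_evaluate: Callable[[List[int]], float],
    fractions: Sequence[float] = FRACTIONS,
    seed: int = 0,
) -> List[Tuple[int, float]]:
    """Train once per subset and collect (subset size, metric) pairs.

    `train_and_evaluate` is a placeholder (assumption): it should train a model
    on the given indices and return the chosen metric on a fixed validation set.
    """
    results = []
    for _, indices in nested_subsets(n_samples, fractions, seed).items():
        results.append((len(indices), train_and_evaluate(indices)))
    return results
```

The validation set should stay fixed across all runs so that the metric values are comparable. Repeating the sweep with several seeds would give an error bar for each point on the curve.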

## Acceptance Criteria

- Scaling curves produced and documented
- Clear conclusion on whether more data is expected to yield meaningful gains
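
A possible way to make the "meaningful gains" conclusion quantitative is sketched below. It assumes the metric follows a pure power law in dataset size, `error ≈ a * N**(-b)`, with no irreducible error floor; that is an assumption to check against the measured curve, not an established property of this model. `fit_power_law` and `gain_from_doubling` are hypothetical helper names.

```python
import math
from typing import Sequence, Tuple


def fit_power_law(sizes: Sequence[int], errors: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of log(error) = log(a) - b * log(N); returns (a, b)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def gain_from_doubling(b: float) -> float:
    """Relative error reduction expected from doubling N under the fitted power law."""
    return 1.0 - 2.0 ** (-b)


if __name__ == "__main__":
    # Synthetic placeholder values for illustration only, not measured results.
    sizes = [100, 250, 500, 750, 1000]
    errors = [2.0 * n ** -0.5 for n in sizes]
    a, b = fit_power_law(sizes, errors)
    print(f"a={a:.3f}, b={b:.3f}, gain from 2x data={gain_from_doubling(b):.1%}")
```

A fitted exponent near zero, or a curve that flattens at the largest subsets, would suggest that more data alone is unlikely to help. Separating data-limited from compute-limited behavior would additionally require varying the training budget at a fixed dataset size. What counts as a meaningful gain remains a project decision.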

Characterize model scaling with additional training data #87
