Autocrop

Extract individual photos from album page scans and enhance them using AI.

Got a stack of old photo albums? Snap a picture of each page, point this tool at the folder, and get back individual photos — cropped, rotated, dated, geotagged, colorized, and restored.

How it works

Album page scan ──▶ ai_parse.py ──▶ autocrop_meta.json ──▶ editor.py ──▶ crop_exif.py ──▶ ai_enhance.py
                    (Vision AI)     (bbox + metadata)      (web editor)   (crop+EXIF)      (restore)

ai_parse.py sends each page to a Vision AI model (Gemini or GPT-4o) which returns bounding boxes, orientation, dates, locations, and captions — saved to autocrop_meta.json
editor.py serves a web-based editor to visually review and adjust bounding boxes, rotation, dates, locations (with map), and captions
crop_exif.py reads the metadata file and produces cropped photos with EXIF metadata
ai_enhance.py sends each cropped photo to Gemini which restores it: removes album corners, fixes scratches, improves contrast, and colorizes B&W photos

Quick start

pip install -r requirements.txt
export GEMINI_API_KEY="your-key"

# Step 1: Analyze pages and create metadata
python ai_parse.py ./album_pages/ -p gemini --default-location "New York"

# Step 2: Review and adjust in web editor
python editor.py ./album_pages/

# Step 3: Apply metadata to produce cropped photos (editor does this via Apply button, or run manually)
python crop_exif.py ./album_pages/

# Step 4: Enhance extracted photos
python ai_enhance.py ./album_pages/cropped/ -o ./enhanced/

Or use --auto-apply for the legacy one-shot workflow:

python ai_parse.py ./album_pages/ -p gemini --auto-apply

Individual photos (no album pages)

If your photos are already scanned or photographed one by one, skip the AI detection step and go straight to the editor:

# Step 1: Review and adjust in web editor
python editor.py ./my_photos/

# Step 2: Enhance
python ai_enhance.py ./my_photos/cropped/ -o ./enhanced/

The editor will treat each image as a single full-page photo and pre-fill date, location, and caption from EXIF. You can fine-tune the crop, fix rotation, and edit metadata — then hit Apply. Originals stay untouched; results go to cropped/.

`ai_parse.py` — Detect photos using AI

Modes

Default: create-metadata — analyze pages and save results to autocrop_meta.json (no cropping):

python ai_parse.py ./album_pages/ -p gemini --default-location "Berlin"

Auto-apply — legacy one-shot mode (analyze + crop + save in one step):

python ai_parse.py ./album_pages/ -p gemini --auto-apply

Examples

# Single page
python ai_parse.py page.jpg -p gemini

# Directory of pages, 4 in parallel
python ai_parse.py ./album_pages/ -p gemini -j 4

# With fallback location for geocoding
python ai_parse.py ./album_pages/ -p gemini --default-location "Berlin"

# Restrict geocoding to specific countries
python ai_parse.py ./album_pages/ -p gemini --default-location "Karaganda" --countries "kz,ru"

Features:

Photo detection via Gemini or GPT-4o Vision API
Reads handwritten dates, locations, and captions
GPS geocoding via OpenStreetMap Nominatim
Date inheritance: undated photos get the date from the nearest dated page
Parallel page processing (-j N)

Options:

Option	Description
`--auto-apply`	Analyze + crop + save in one shot (legacy behavior)
`-o, --output DIR`	Output directory (default: `<input>/cropped`)
`-p, --provider`	`openai` or `gemini` (default: `openai`)
`-m, --model`	Vision model name
`--api-key`	API key (default: from env var)
`--default-location`	Fallback city/region for geocoding
`--countries`	Restrict geocoding to these countries (comma-separated ISO codes, e.g. `kz,ru,ua,us`)
`--no-location-spread`	Don't apply a recognized location from one photo to other photos on the same page
`-j, --jobs N`	Parallel pages (default: 4)

`crop_exif.py` — Crop photos and write EXIF

Reads autocrop_meta.json and produces cropped, rotated JPEG files with EXIF metadata (date, GPS, caption).

python crop_exif.py ./album_pages/
python crop_exif.py ./album_pages/ -o ./output/

No AI calls, no API key needed. Works with the metadata file created by ai_parse.py or editor.py.

Options:

Option	Description
`-o, --output DIR`	Output directory (default: `<input>/cropped`)

`editor.py` — Web-based metadata editor

Visual editor for reviewing and adjusting photo detection results before cropping.

python editor.py ./album_pages/
python editor.py ./album_pages/ -o ./output/ --port 8080

Works with or without autocrop_meta.json — if no metadata exists, each image is treated as a single full-page photo with date, location, and caption pre-filled from the source file's EXIF. GPS coordinates are automatically resolved to place names in the background via reverse geocoding.

When autocrop_meta.json exists but photos have empty fields, the editor also backfills date, location, and caption from the source image EXIF.

Opens a web app at http://localhost:8080 with two modes accessible via header toggle:

Cut photos — adjust bounding boxes:

Drag corners and edges to resize photo boundaries
Drag inside a box to move it
Double-click to add a new photo
Delete key to remove selected photo (at least 1 photo per page required)
First photo auto-selected on page load

Edit photos — edit photo details:

Preview cropped and rotated photos
Rotate photos left/right
Edit date and caption (changing a value suggests applying it to other photos with the same old value)
Select location on an interactive map (OpenStreetMap)
Save Draft or Apply to produce final cropped photos
Warns about unsaved changes on page close

Navigation: Google-style page selector in header, arrow keys for prev/next page.

Options:

Option	Description
`-o, --output DIR`	Output directory for cropped photos (default: `<input>/cropped`)
`--port N`	HTTP server port (default: 8080)
`--no-browser`	Don't open browser automatically

`ai_enhance.py` — Restore & colorize

Sends each photo to Gemini which restores it in one API call: removes album corner holders, fixes scratches and defects, improves contrast, and colorizes B&W photos.

# Single photo
python ai_enhance.py photo.jpg

# Directory, 3 parallel workers
python ai_enhance.py ./cropped/ -j 3 --rpm 10

# Batch mode (50% cheaper, slower)
python ai_enhance.py ./cropped/ --batch

# Pro model for higher quality
python ai_enhance.py ./cropped/ -m gemini-3-pro-image-preview

Two modes:

Real-time (default): parallel workers with rate limiting, results immediately
Batch (--batch): submits all photos as one job, 50% cheaper, waits for completion. Failed photos are automatically retried.

Options:

Option	Description
`-o, --output DIR`	Output directory (default: `<input>/enhanced`)
`-m, --model`	Gemini image model (default: `gemini-3.1-flash-image-preview`)
`--api-key`	Gemini API key (default: `GEMINI_API_KEY` env var)
`-j, --jobs N`	Parallel workers, real-time mode only (default: 3)
`--rpm N`	Max requests per minute, real-time mode only (default: 10)
`--batch`	Use Batch API (50% cheaper). `-j` and `--rpm` are ignored
`--poll N`	Batch poll interval in seconds (default: 30)

`edit_meta.py` — Fix metadata interactively (CLI)

Terminal-based metadata editor for already-cropped photos.

python edit_meta.py ./cropped/
python edit_meta.py ./enhanced/ --countries "kz,ru"

The script displays a numbered table of all photos with their metadata, then accepts commands:

 #  File                              Date        Loc  Caption
 1  198905_page1_photo_01.jpg         1989:05       ✓  Walking with mom
 2  198905_page1_photo_02.jpg         1989:05       ✓  —
 3  page2_photo_01.jpg                —             —  Think...

Enter commands: <number><D|L|C> <value>  (empty line to apply)
> 3D 1989:06
> 3L Berlin, Germany
> 3L 49.5186, 72.8238
> 1C Walking in the park
>

Command	Description
`3D 1989:06`	Set date for photo #3 (also renames the file)
`3L Berlin, Germany`	Set location by name (geocodes to GPS coordinates)
`3L 49.5186, 72.8238`	Set location by coordinates (writes GPS directly, no geocoding)
`1C Walking in the park`	Set caption

Options:

Option	Description
`--countries`	Restrict geocoding to these countries (comma-separated ISO codes)

Cost estimate

All processing happens via API calls — no local GPU needed.

ai_parse.py (Gemini Flash):

~$0.0003 per page (~3K input tokens)
100-page album: ~$0.03

ai_enhance.py (Gemini Flash Image):

~$0.04 per photo (image generation)
With --batch: ~$0.02 per photo (50% off)
200 photos: ~$4-8

Rate limits (Gemini image generation)

Tier	RPM	RPD
Free	2	50
Tier 1 (paid)	10	500
Tier 2	20	2,000

Supported providers (ai_parse)

Provider	Default Model	Env Variable
OpenAI	`gpt-4o`	`OPENAI_API_KEY`
Gemini	`gemini-3-flash-preview`	`GEMINI_API_KEY`

Any model name can be passed via -m.

Known issues

Google Photos captions: There is no way to programmatically set a photo description that Google Photos will display in its UI. We write captions as EXIF ImageDescription and XMP dc:description, but Google Photos shows them under "Other" instead of the description field. The Google Photos Library API accepts a description field, but it is only visible via the API — not in the web or mobile UI. Apple Photos displays XMP descriptions correctly.

Installation

pip install -r requirements.txt

Requires Python 3.9+. No GPU needed.

License

MIT

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Autocrop

How it works

Quick start

Individual photos (no album pages)

`ai_parse.py` — Detect photos using AI

Modes

Examples

`crop_exif.py` — Crop photos and write EXIF

`editor.py` — Web-based metadata editor

`ai_enhance.py` — Restore & colorize

`edit_meta.py` — Fix metadata interactively (CLI)

Cost estimate

Rate limits (Gemini image generation)

Supported providers (ai_parse)

Known issues

Installation

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 15 Commits
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
ai_enhance.py		ai_enhance.py
ai_parse.py		ai_parse.py
crop_exif.py		crop_exif.py
edit_meta.py		edit_meta.py
editor.py		editor.py
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

Autocrop

How it works

Quick start

Individual photos (no album pages)

ai_parse.py — Detect photos using AI

Modes

Examples

crop_exif.py — Crop photos and write EXIF

editor.py — Web-based metadata editor

ai_enhance.py — Restore & colorize

edit_meta.py — Fix metadata interactively (CLI)

Cost estimate

Rate limits (Gemini image generation)

Supported providers (ai_parse)

Known issues

Installation

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

`ai_parse.py` — Detect photos using AI

`crop_exif.py` — Crop photos and write EXIF

`editor.py` — Web-based metadata editor

`ai_enhance.py` — Restore & colorize

`edit_meta.py` — Fix metadata interactively (CLI)

Packages