Here, I have implemented cursor-based pagination properly, not the watered-down OFFSET version you see in most tutorials, but actual keyset pagination that holds up at scale.
If you've ever had a query like SELECT * FROM items OFFSET 50000 LIMIT 20 crawl to a halt on a large table, this is the fix for that.
- cursor-based pagination using
(created_at, id)as the keyset - filtering by
statusandtagvia SQL WHERE clauses - composite indexes on the columns actually used for filtering
- async SQLAlchemy 2.0 with asyncpg (no sync sessions anywhere)
- 13 tests that run against SQLite — no postgres needed for the test suite
- docker setup that actually works first try (mostly)
the easy way — Docker:
git clone https://github.com/AwkJay/fastapi-pagination-api.git
cd fastapi-pagination-api
cp .env.example .env
docker-compose up --buildOnce both containers are up, the API is at http://localhost:8000.
Interactive docs (Swagger UI) at http://localhost:8000/docs — easier than curl for manual testing.
if you already have postgres running locally on port 5432, you'll hit a port conflict. change the postgres ports in docker-compose.yml to
"5433:5432"and rundocker-compose down && docker-compose up --build.
without Docker:
pip install -r requirements.txt
# set DATABASE_URL in .env to point at your local postgres
uvicorn app.main:app --reloadpytest tests/ -vUses SQLite + aiosqlite so there's nothing to set up. All 13 pass in ~1.3s.
The tricky one was test_second_page_no_overlap — SQLite resolves timestamps to the second, so items created in a fast loop all get the same created_at. Fixed by injecting explicit spaced timestamps in the test data rather than fighting the clock.
create an item
curl -X POST "http://localhost:8000/api/v1/items" \
-H "Content-Type: application/json" \
-d '{"name": "some item", "status": "active", "tag": "python"}'first page
curl "http://localhost:8000/api/v1/items?limit=10"next page — take next_cursor from the previous response and pass it back:
curl "http://localhost:8000/api/v1/items?limit=10&cursor=eyJ0cyI6..."filter — note the quotes around the URL, & is special in bash without them:
curl "http://localhost:8000/api/v1/items?status=active&tag=python"single item
curl "http://localhost:8000/api/v1/items/{id}"health
curl "http://localhost:8000/api/v1/health"create an item
curl -X POST "http://localhost:8000/api/v1/items" \
-H "Content-Type: application/json" \
-d '{"name": "some item", "status": "active", "tag": "python"}'first page
curl "http://localhost:8000/api/v1/items?limit=10"next page — take next_cursor from the previous response and pass it back:
curl "http://localhost:8000/api/v1/items?limit=10&cursor=eyJ0cyI6..."filter — note the quotes around the URL, & is special in bash without them:
curl "http://localhost:8000/api/v1/items?status=active&tag=python"single item
curl "http://localhost:8000/api/v1/items/{id}"health
curl "http://localhost:8000/api/v1/health"OFFSET pagination tells the DB "skip the first N rows, give me the next 20." The problem is the DB still has to scan and count those N rows every time, so page 500 is 500x slower than page 1. On a table with a million rows this gets bad fast.
Cursor pagination instead encodes the position of the last item you saw — in this case the created_at timestamp and the id as a tiebreaker, packed into a base64 string. The next query becomes:
WHERE (created_at, id) < ('2026-01-15T10:30:00', '3f2504e0-...')
ORDER BY created_at DESC, id DESC
LIMIT 20The database hits the ix_item_created_id composite index and jumps straight to that position. No scanning, no counting skipped rows. Performance stays flat whether you're on page 1 or page 10,000.
The tradeoff: you can't jump to "page 47" directly. You have to walk forward. For most real use cases (infinite scroll, APIs, data pipelines) that's fine.
{
"items": [
{
"id": "3f2504e0-4f89-11d3-9a0c-0305e82c3301",
"name": "example item",
"status": "active",
"tag": "python",
"score": 4.2,
"created_at": "2026-01-15T10:30:00+00:00"
}
],
"next_cursor": "eyJ0cyI6ICIyMDI0LTAxLTE1VDEwOjMwOjAwIiwgImlkIjogIjNmMjUwNGUwIn0",
"has_more": true,
"total_count": 1500,
"limit": 10
}total_count is always the full filtered count, not the page count. It runs as a separate COUNT(*) query with the same WHERE filters applied but without the cursor — so you always know the dataset size regardless of which page you're on.
has_more uses the limit+1 trick: the query fetches one extra row. If it comes back, there's more data and that extra row gets dropped from the response. Cheaper than a second COUNT query.