prdai · prdai · Apr 12, 2026 · Mar 26, 2026 · Apr 1, 2026 · Apr 1, 2026
diff --git a/cf-region-proxy/README.md b/cf-region-proxy/README.md
@@ -0,0 +1,16 @@
+# cf-region-proxy
+
+a cloudflare workers-based regional proxy that routes requests through specific edge regions, monitors latency across all regions, and retries through alternative regions on failure.
+
+born out of thinking about how cloudflare ai gateway handles retry logic at the region level- and realizing there's no native way to do programmable per-request region routing on cloudflare.
+
+## sections
+
+the design document is broken down into these parts:
+
+- [why i thought of this idea](./why.mdx) - the origin story behind this, what made me think about it
+- [the problem](./problem.mdx) - the actual problem, backed with some context
+- [solution & requirements](./solution.mdx) - what the solution looks like and the exact requirements
+- [technical design](./technical-design.mdx) - the architecture, how the system actually works under the hood
+- [references](./references.mdx) - links and resources that were used or referenced
+- [raw inputs](./raw.mdx) - the unedited inputs that were used to put this together, for transparency
diff --git a/cf-region-proxy/problem.mdx b/cf-region-proxy/problem.mdx
@@ -0,0 +1,32 @@
+# the problem
+
+cloudflare workers are fast, globally distributed, and easy to deploy. but the region routing story is basically nonexistent at the request level.
+
+## no per-request region control for workers
+
+workers run at the nearest PoP to whoever's making the request. that's great for general latency, but it means you can't say "this request should go through eastern north america" from application code. the only tool that approximates this is placement hints- and those are static per deployment, not per request.
+
+durable objects have `locationHint` and `jurisdiction`, but those control where an object's state lives, not how requests are routed. location hints are best-effort, and the one hard control (jurisdiction) is limited to two coarse options (`eu`, `fedramp`).
+
+## the latency visibility gap
+
+there's no built-in way to know how fast your target api or service responds from each cloudflare region. cloudflare can tell you a lot about edge performance, but it doesn't expose "here's what your upstream looks like from wnam vs apac."
+
+if you're trying to route intelligently- pick the fastest region for a given target, or understand where latency is coming from- you have to measure it yourself.
+
+## retry logic is region-blind
+
+when a request fails or a region degrades, the only real option in standard workers is to retry the same request. there's no awareness of "maybe enam is slow right now, try weur." you either retry in place or you don't retry at all.
+
+for use cases like llm provider calls (where a model might be throttled or slow in one region), or compliance-sensitive routing (where traffic must stay in certain jurisdictions), or just basic resilience- this is a meaningful gap.
+
+## what's missing
+
+a lightweight, programmable layer that:
+- routes requests to specific cloudflare regions on demand
+- monitors real latency from each region to a target url
+- surfaces the fastest region automatically based on fresh measurements
+- handles retries across regions intelligently
+- respects jurisdiction constraints when they matter
+
+none of this exists as a built-in primitive on cloudflare. it has to be built.
diff --git a/cf-region-proxy/raw.mdx b/cf-region-proxy/raw.mdx
@@ -0,0 +1,131 @@
+# raw inputs
+
+these are the raw unedited inputs that were provided to put together the design document- just for transparency and an open look at how the ideas were communicated before being cleaned up.
+
+---
+
+## initial context - cloudflare region routing research
+
+the conversation started with asking about how to pick specific regions for durable objects or workers in cloudflare. here's the breakdown that came out of that:
+
+### workers
+
+- you can't directly pin a region by default
+- cloudflare workers run globally, deployed to ~300+ PoPs worldwide
+- requests execute closest to the user automatically
+- no config like `region=ap-south-1`
+- the only way to constrain is via Regional Services (Data Localization Suite) - enterprise config, not code
+- restrictions: doesn't apply to subrequests, cron, queues. not fine-grained (region groups, not exact city)
+
+### durable objects
+
+- stateful actors, so placement matters
+- default: first request decides the region, then it stays there permanently
+- 3 levels of control:
+
+#### location hint (soft control)
+
+```js
+const stub = env.MY_DO.get(id, { locationHint: "enam" });
+```
+
+available values:
+
+| code | region |
+|------|--------|
+| wnam | western north america |
+| enam | eastern north america |
+| sam | south america |
+| weur | western europe |
+| eeur | eastern europe |
+| apac | asia-pacific |
+| oc | oceania |
+| afr | africa |
+| me | middle east |
+
+- best-effort only, not guaranteed
+- only affects first creation
+- broad geographic clusters, not precise regions
+
+#### jurisdiction (hard control, but coarse)
+
+```js
+const euNamespace = env.MY_DO.jurisdiction("eu");
+```
+
+options: `"eu"` (european union), `"fedramp"` (us compliant infra)
+
+- overrides locationHint
+- very limited options
+
+#### first-request placement (real control in practice)
+
+- trigger the first request from the region you want
+- DO is created near first request and doesn't move later
+
+### placement hints for workers
+
+```toml
+[placement]
+hostname = "api.mybackend.com"
+```
+
+- cloudflare probes the backend and picks lowest-latency PoP automatically
+- global, not per-request. static per deployment.
+
+### what you cannot do
+
+- choose exact country/city
+- move a DO after creation
+- guarantee latency location perfectly
+- pin workers per request manually
+
+---
+
+## the idea - first pass
+
+so my plan is to build this thing so that we can route any specific region that the person mentions, and then also setup retry logic for other regions we want to try to and also a way to monitor the speed sort of right, in which we can reach the specific url they wanna hit like that root maybe? or their api endpoint or something like that, and based on that we can decide the fastest one, i assume? we can't replicate that entire sort of request then it would get duplicated to multiple things right or we need to ask them for the health end point i guess so that we can have a cron trigger to check the specific url and the specific like fastest thing we can get maybe??? i am just thinking... like that would be on the worker level, that's all we can really do right? placement hints is all we can do, and juristictions as well s a additional thing? that's all i guess anything else out here
+
+### corrections and clarifications from research
+
+- can't test by sending real requests to all regions - duplicates writes, triggers side effects, breaks APIs, gets rate limited
+- correct approach: never probe with real traffic
+- 3 real approaches for health checking:
+  - option A: health endpoint (`GET /health`, `GET /ping`, `HEAD /`) - cleanest
+  - option B: synthetic probes (`HEAD` requests) - no side effects, fast, safe
+  - option C: passive latency from real traffic - rolling averages
+- retry logic must separate unsafe (POST/PUT/DELETE) vs safe (GET/HEAD) requests
+- can only retry if idempotent OR backend supports idempotency keys
+
+---
+
+## the idea - refined
+
+THIS IS A CLOUDFLARE WORKERS FOR REDIRECTING THIS SHIT RIGHT, IT WILL BE BUILD ON CLOUDFLARE HONO OR IT MIGHT BE ONLY BARE BONE CLOUDFLARE CODE, AND THEN WE WILL HAVE DURABLE OBJECTS AND WE CAN SORT OF SPECIFY THE DIRECTION IT SHOULD GO RIGHT, WHAT WE ARE GONNA DO IS USE THIS EDGE NETWORK SO THAT WE CAN SEND A REQUEST FROM ANY REGION AND THEN REDIRECT IT AND SEND THAT REQUEST TO A DIFF REGION AND MAKE IT GO, TO whatever the closest data center there is, this will help with compliance and sort of retries or region locked things for example, the original idea was related to retry logic for llm call providers, so that if one region doesn't work we can go to the other region but this can be applied for any sort of requests of sorts, we would be like monitoring the calls we take, and then we will also have like a cron job with a HEAD request to see in all the regions the performance for that specific thing and based on that we will check the latency and maintain that and it will take that into account and do everything if that makes sense, so lets say we pick the one with the fastest thing like that is there like the region sort of right, and then based on that go, and we will have a trigger per minute to check that domain in all of that, we will have a cloudflare kv ttl with like 10 mins which gets refreshed when we re call that domain again, but we check what the performance is for the last min so we can check how it is, i think there is some changes to do, this is the basic idea.
+
+## origin story context
+
+the context of the idea is that i have a interview at WSO2 in the IIT Career Fair 2026, and was being interviewed by https://www.linkedin.com/in/malith-jayasinghe/ so he was asking me about cloudflare ai gateway and i was like retry logic and i didn't understand how they did it so i thought of this sort of a approach for it for the region level thing but i don't think they do that, but yeah.
+
+---
+
+## the idea - further refinement (april 2026)
+
+what this will do is forward requests to a specific region the person want's to- that's simply it sort of right- and we will have this workflow with a cron job that every min we will trigger and go through all of the urls that it went to, and that person can select a auto region so we go to that, we just use the HEAD request to the thing and then we check through that type of a thing- understood? so yeah that's kind of the idea so that auto we have a cron trigger in the bg that will go through the stuff and send HEAD requests and see what responds faster and then it goes to that specific region from the last time period sort of, that is on Cloudflare KV, other than that i don't think there is a lot more for us to do per say- its very simple but its nice, and also they have a choice on top of the regions that cloudflare supports which there was like 6 i think they can specifically pick eu or us as well like juristications wise sort of a situation, web search as well and reverify https://developers.cloudflare.com/durable-objects/reference/data-location/#supported-locations-1
+
+---
+
+Parameter	Location
+wnam	Western North America
+enam	Eastern North America
+sam	South America 2
+weur	Western Europe
+eeur	Eastern Europe
+apac	Asia-Pacific
+oc	Oceania
+afr	Africa 2
+me	Middle East 2
+
+Parameter	Location
+eu	The European Union
+fedramp	FedRAMP-compliant data centers
diff --git a/cf-region-proxy/references.mdx b/cf-region-proxy/references.mdx
@@ -0,0 +1,12 @@
+# references
+
+things that were used or referenced while putting this together.
+
+- [cloudflare workers overview](https://developers.cloudflare.com/workers/) - the runtime this whole thing runs on
+- [durable objects - data location](https://developers.cloudflare.com/durable-objects/reference/data-location/) - supported location hints and jurisdictions for durable objects, the core mechanism for region routing
+- [durable objects - location hints](https://developers.cloudflare.com/durable-objects/reference/data-location/#provide-a-location-hint) - how to pass `locationHint` when getting a do stub
+- [durable objects - jurisdictions](https://developers.cloudflare.com/durable-objects/reference/data-location/#restrict-durable-objects-to-a-jurisdiction) - `eu` and `fedramp` jurisdiction controls
+- [cloudflare kv](https://developers.cloudflare.com/kv/) - the key-value store used for latency state
+- [cloudflare cron triggers](https://developers.cloudflare.com/workers/configuration/cron-triggers/) - how to schedule the background latency monitor
+- [cloudflare workers - smart placement](https://developers.cloudflare.com/workers/configuration/smart-placement/) - placement hints at the deployment level (different from per-request, included for contrast)
+- [cloudflare ai gateway](https://developers.cloudflare.com/ai-gateway/) - the product that sparked the original idea about retry logic at the region level
diff --git a/cf-region-proxy/solution.mdx b/cf-region-proxy/solution.mdx
@@ -0,0 +1,79 @@
+# solution & requirements
+
+a cloudflare workers-based regional proxy that routes requests to specific regions, monitors latency via a background cron, and automatically selects the fastest region using fresh measurements stored in kv.
+
+## the concept
+
+the proxy sits in front of any upstream url. when a request comes in, you either:
+
+1. **specify a region** - the request gets routed through that cloudflare region via a durable object created with a location hint (best-effort) or a jurisdiction (hard constraint)
+2. **use auto mode** - the proxy picks the fastest region based on the latest latency data from the background cron
+
+latency data is stored in cloudflare kv with a 10-minute ttl, refreshed each time the cron runs. the cron fires every minute and sends `HEAD` requests to the target url from each supported region- safe, low-overhead, no side effects.
+
+## region selection
+
+### supported regions
+
+cloudflare durable objects support the following location hints:
+
+| code | location |
+|------|----------|
+| wnam | western north america |
+| enam | eastern north america |
+| sam | south america |
+| weur | western europe |
+| eeur | eastern europe |
+| apac | asia-pacific |
+| oc | oceania |
+| afr | africa |
+| me | middle east |
+
+### jurisdiction constraints
+
+on top of region hints, users can also specify a jurisdiction for hard compliance requirements:
+
+| code | jurisdiction |
+|------|--------------|
+| eu | european union |
+| fedramp | fedramp-compliant data centers |
+
+jurisdiction overrides region hint when set. these are coarse controls but useful for compliance scenarios.
+
+## latency monitoring
+
+the cron job runs every minute. for each registered target url, it calls one durable object per region (placed with a location hint when first created, then reused) and has it send a `HEAD` request to the url. response time is recorded and written to kv.
+
+why `HEAD`? it's:
+- safe (no side effects, no request body)
+- fast (just headers, no response body to parse)
+- widely supported
+
+each target url gets a kv entry holding the latest latency per region plus the timestamp of that measurement. the auto-routing logic reads this entry and picks the region with the lowest latency from the most recent measurement window.
+
+## retry logic
+
+when a request fails, the proxy can retry through a different region. retry behavior:
+
+- **safe methods (GET, HEAD)**: always retryable, cycle through regions by ascending latency
+- **unsafe methods (POST, PUT, DELETE)**: only retried if the client passes an idempotency key header. without one, a failed request is not retried, to avoid duplicate side effects. (PUT and DELETE are idempotent by the http spec, but the proxy treats them conservatively.)
+
+## what this is not
+
+this is not a cdn. it's not a load balancer in the traditional sense. it's a programmable routing layer that uses cloudflare's edge network to get requests into the right region. the upstream still handles the actual compute.
+
+## requirements
+
+### must have
+- request routing to a specified cloudflare region (via durable object location hint)
+- jurisdiction support (`eu`, `fedramp`) as an override
+- auto-routing mode that picks the fastest region from kv latency data
+- background cron (every 1 minute) that sends `HEAD` requests to registered urls from each region and writes results to kv
+- kv ttl of 10 minutes for latency records, refreshed on each cron run
+- retry across regions on failure for requests that are safe to repeat (GET, HEAD)
+
+### nice to have
+- passive latency tracking from real traffic (rolling average alongside cron data)
+- per-request idempotency key passthrough for safe retries on unsafe methods
+- simple dashboard or log to see current region latency state
+- configurable cron interval and kv ttl
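
a rough sketch of the auto mode and retry rules described in solution.mdx, written as plain typescript with no cloudflare bindings. everything named here (`LatencyRecord`, `rankRegions`, `planAttempts`, the `maxAttempts` default of 3) is an assumption made up for illustration, not something the design doc defines. it only shows one way the "pick the fastest fresh region, then retry by ascending latency" logic could fit together.

```typescript
// all names here are hypothetical; the design doc does not define them.
type Region = "wnam" | "enam" | "sam" | "weur" | "eeur" | "apac" | "oc" | "afr" | "me";

// assumed shape of the kv value stored per target url.
interface LatencyRecord {
  measuredAt: number; // epoch ms of the cron run that wrote the record
  latencyMs: Partial<Record<Region, number>>; // latest HEAD round trip per region
}

// mirrors the 10-minute kv ttl from the requirements.
const TEN_MINUTES_MS = 10 * 60 * 1000;

// regions sorted fastest-first; empty when there is no record or it is stale.
function rankRegions(
  record: LatencyRecord | null,
  nowMs: number,
  maxAgeMs: number = TEN_MINUTES_MS
): Region[] {
  if (record === null || nowMs - record.measuredAt > maxAgeMs) return [];
  const latencies = record.latencyMs;
  // keep only regions that actually have a measurement.
  const measured = (Object.keys(latencies) as Region[]).filter(
    (region) => typeof latencies[region] === "number"
  );
  return measured.sort(
    (a, b) => (latencies[a] as number) - (latencies[b] as number)
  );
}

// ordered regions to try for one request. GET/HEAD (or any request that
// carries an idempotency key) may walk down the ranking; everything else
// gets a single attempt so side effects are never duplicated.
function planAttempts(
  method: string,
  hasIdempotencyKey: boolean,
  ranked: Region[],
  maxAttempts: number = 3 // assumed cap, not from the design doc
): Region[] {
  const verb = method.toUpperCase();
  const retryable = verb === "GET" || verb === "HEAD" || hasIdempotencyKey;
  return ranked.slice(0, retryable ? maxAttempts : 1);
}
```

when `rankRegions` returns an empty list (no kv entry yet, or the entry is older than the ttl), the caller needs some fallback, for example a fixed default region. that choice is left open here because the design doc does not specify one.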