diff --git a/docs/guides/proxy_management.mdx b/docs/guides/proxy_management.mdx
index a271a063fb..89e25469f0 100644
--- a/docs/guides/proxy_management.mdx
+++ b/docs/guides/proxy_management.mdx
@@ -83,35 +83,17 @@ Your crawlers will now use the selected proxies for all connections.
### IP Rotation and session management
+Every call to
+
`proxyConfiguration.newUrl()`
-allows you to pass a `sessionId` parameter. It will then be used to create a
-`sessionId`-`proxyUrl` pair, and subsequent `newUrl()` calls with the same
-`sessionId` will always return the same `proxyUrl`. This is extremely useful in
-scraping, because you want to create the impression of a real user. See the
-[session management guide](../guides/session-management) and
-`SessionPool` class
-for more information on how keeping a real session helps you avoid blocking.
-
-When no `sessionId` is provided, your proxy URLs are rotated round-robin, whereas Apify Proxy manages their rotation using black magic to get the best performance.
-
-
+returns an independent proxy URL. For Apify Proxy that URL embeds a fresh random
+session id, so consecutive calls resolve to different IP addresses; for custom
+`proxyUrls` the URLs are rotated round-robin.
-
-
-```javascript
-const proxyConfiguration = await Actor.createProxyConfiguration({
- /* opts */
-});
-const sessionPool = await SessionPool.open({
- /* opts */
-});
-const session = await sessionPool.getSession();
-const proxyUrl = proxyConfiguration.newUrl(session.id);
-```
-
-
+Session continuity (using the same IP across multiple requests, e.g. to keep a logged-in session alive) is handled one level up by Crawlee's `SessionPool`: once a `Session` is paired with a proxy URL, the crawler reuses that pairing for subsequent requests tied to the same session. See the
+[session management guide](../guides/session-management) for more details.
```javascript
const proxyConfiguration = await Actor.createProxyConfiguration({
@@ -125,8 +107,6 @@ const crawler = new PuppeteerCrawler({
});
```
-
-
## Apify Proxy vs. Your own proxies
The `ProxyConfiguration` class covers both Apify Proxy and custom proxy URLs so that
diff --git a/docs/upgrading/upgrading_v4.md b/docs/upgrading/upgrading_v4.md
new file mode 100644
index 0000000000..ff4269d1c1
--- /dev/null
+++ b/docs/upgrading/upgrading_v4.md
@@ -0,0 +1,88 @@
+---
+id: upgrading-to-v4
+title: Upgrading to v4
+---
+
+This page summarizes the breaking changes between Apify SDK v3 and v4. Apify SDK v4 adopts the redesigned Crawlee v4 interfaces (`Configuration`, `EventManager`, `StorageClient`, `ProxyConfiguration`), so most of the changes here track the corresponding Crawlee v4 changes.
+
+## Configuration
+
+The `Configuration` class no longer exposes `.get(key)` / `.set(key, value)`. Configuration values are resolved eagerly at construction time and exposed as plain typed properties.
+
+Before (v3):
+
+```ts
+import { Configuration } from 'apify';
+
+const config = Configuration.getGlobalConfig();
+const token = config.get('token');
+config.set('token', 'new-token');
+```
+
+After (v4):
+
+```ts
+import { Configuration } from 'apify';
+
+// Construct with overrides — Configuration is immutable.
+const config = new Configuration({ token: 'new-token' });
+const token = config.token;
+```
+
+Resolution order (highest to lowest priority): constructor options → environment variables → `crawlee.json` → schema defaults.
+
+Empty-string environment variables are treated as unset (they fall through to the schema default) rather than being coerced to `0` / `''` / `false`. For example, `ACTOR_MAX_TOTAL_CHARGE_USD=""` now resolves to `undefined` instead of `0`.
+
+## ProxyConfiguration: `newUrl()` / `newProxyInfo()` no longer take `sessionId`
+
+The `sessionId` parameter has been removed from both `ProxyConfiguration.newUrl()` and `ProxyConfiguration.newProxyInfo()`. Each call now returns an independent URL; for Apify Proxy the SDK mints a fresh random session id internally for every URL it hands out, so consecutive calls resolve to different IPs.
+
+Before (v3):
+
+```ts
+const proxyConfiguration = await Actor.createProxyConfiguration({
+ groups: ['RESIDENTIAL'],
+});
+
+// Sticky pairing: same sessionId → same proxy URL → same IP.
+const url1 = await proxyConfiguration.newUrl('mySession');
+const url2 = await proxyConfiguration.newUrl('mySession'); // === url1
+```
+
+After (v4):
+
+```ts
+const proxyConfiguration = await Actor.createProxyConfiguration({
+ groups: ['RESIDENTIAL'],
+});
+
+// Every call returns an independent URL with its own session id.
+const url1 = await proxyConfiguration.newUrl();
+const url2 = await proxyConfiguration.newUrl(); // !== url1
+```
+
+Session continuity (reusing the same IP across multiple requests) is now handled one level up by Crawlee's `SessionPool`: a `Session` stores the proxy URL it was paired with and the crawler reuses that URL for subsequent requests bound to the same session. When using `CheerioCrawler`, `PlaywrightCrawler`, etc. with `useSessionPool: true`, this is automatic — no code changes are required on the consumer side.
+
+`ProxyInfo` no longer carries a `sessionId` field. If you used it for logging or analytics, parse the `session-` segment out of `proxyInfo.username` instead (it is included for Apify Proxy URLs).
+
+The `tieredProxyUrls` and `tieredProxyConfig` options on `ProxyConfigurationOptions` were dropped in Crawlee v4 ([apify/crawlee#3599](https://github.com/apify/crawlee/pull/3599)) and the SDK no longer threads them through. Migrate to named sessions via `SessionPool` if you relied on tiered rotation.
+
+## EventManager
+
+`PlatformEventManager` now extends Crawlee v4's `EventManager` and integrates with the new service locator. Use `Configuration.getGlobalConfig()` (or pass a `Configuration` instance explicitly) when constructing it directly — the constructor no longer accepts a `config` override via the `override` keyword pattern because Crawlee's base class manages the configuration through `serviceLocator` instead of a `config` field.
+
+If you only interact with events through `Actor.on()` / `Actor.off()` / `Actor.events`, no code changes are needed.
+
+## StorageClient
+
+The SDK's storage layer was adapted to the new Crawlee v4 `StorageClient` interface. The Apify platform client is wrapped via an internal `ApifyStorageClient` adapter that implements `createDatasetClient`, `createKeyValueStoreClient`, and `createRequestQueueClient`.
+
+`KeyValueStore.getPublicUrl()` is now asynchronous (it signs URLs server-side when running on the Apify platform). Update call sites accordingly:
+
+```ts
+// v3
+const url = store.getPublicUrl('myKey');
+
+// v4
+const url = await store.getPublicUrl('myKey');
+```
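Note on the Configuration section of the upgrade guide above: the resolution order it describes (constructor options, then environment variables, then `crawlee.json`, then schema defaults) can be pictured with a small layered lookup. This is only a sketch and not the SDK's implementation. The `ConfigLayers` shape and the `resolveSetting` helper are hypothetical names, the sketch assumes an empty-string environment variable is skipped and the lookup continues with the remaining layers, and it returns the raw environment string without any type coercion.

```typescript
// Illustrative stand-in for the layered lookup; NOT the Apify SDK implementation.
interface ConfigLayers {
    constructorOptions: Record<string, unknown>;
    env: Record<string, string | undefined>;
    fileConfig: Record<string, unknown>; // stands in for crawlee.json
    schemaDefaults: Record<string, unknown>;
}

// Hypothetical helper: returns the first defined value, highest priority first.
function resolveSetting(key: string, envVar: string, layers: ConfigLayers): unknown {
    const fromOptions = layers.constructorOptions[key];
    if (fromOptions !== undefined) return fromOptions;

    const fromEnv = layers.env[envVar];
    // Assumption: an empty string counts as unset, so it is skipped
    // instead of being coerced to 0, '' or false.
    if (fromEnv !== undefined && fromEnv !== '') return fromEnv;

    const fromFile = layers.fileConfig[key];
    if (fromFile !== undefined) return fromFile;

    // May be undefined when the schema declares no default.
    return layers.schemaDefaults[key];
}
```

With no constructor option, no file entry and no schema default, an empty `ACTOR_MAX_TOTAL_CHARGE_USD` resolves to `undefined` in this sketch, which matches the example given in the guide.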
diff --git a/packages/apify/src/proxy_configuration.ts b/packages/apify/src/proxy_configuration.ts
index 231cfa6db0..e3033d07f8 100644
--- a/packages/apify/src/proxy_configuration.ts
+++ b/packages/apify/src/proxy_configuration.ts
@@ -1,8 +1,6 @@
-import type {
- ProxyConfigurationOptions as CoreProxyConfigurationOptions,
- ProxyInfo as CoreProxyInfo,
-} from '@crawlee/core';
+import type { ProxyConfigurationOptions as CoreProxyConfigurationOptions } from '@crawlee/core';
import { ProxyConfiguration as CoreProxyConfiguration } from '@crawlee/core';
+import type { ProxyInfo as CoreProxyInfo } from '@crawlee/types';
import { gotScraping } from 'got-scraping';
import ow from 'ow';
@@ -12,12 +10,17 @@ import { cryptoRandomObjectId } from '@apify/utilities';
import { Actor } from './actor.js';
import { Configuration } from './configuration.js';
-// https://docs.apify.com/proxy/datacenter-proxy#username-parameters
-const MAX_SESSION_ID_LENGTH = 50;
const CHECK_ACCESS_REQUEST_TIMEOUT_MILLIS = 4_000;
const CHECK_ACCESS_MAX_ATTEMPTS = 2;
const COUNTRY_CODE_REGEX = /^[A-Z]{2}$/;
+// Apify Proxy session identifier embedded in the proxy username — opaque to
+// users; a fresh one is minted for every URL the SDK hands out so that the
+// returned proxy URLs are independent.
+const SESSION_ID_LENGTH = 12;
+
+type NewUrlOptions = Parameters<CoreProxyConfiguration['newUrl']>[0];
+
export interface ProxyConfigurationOptions
extends CoreProxyConfigurationOptions {
/**
@@ -56,15 +59,6 @@ export interface ProxyConfigurationOptions
* configurate the proxy by UI input schema. You should use the `countryCode` option in your crawler code.
*/
apifyProxyCountry?: string;
-
- /**
- * Multiple different ProxyConfigurationOptions stratified into tiers. Crawlee crawlers will switch between those tiers
- * based on the blocked request statistics.
- */
- tieredProxyConfig?: Omit<
- ProxyConfigurationOptions,
- keyof CoreProxyConfigurationOptions | 'tieredProxyConfig'
- >[];
}
/**
@@ -91,9 +85,6 @@ export interface ProxyConfigurationOptions
* requestHandler({ proxyInfo }) {
* // Getting used proxy URL
* const proxyUrl = proxyInfo.url;
- *
- * // Getting ID of used Session
- * const sessionIdentifier = proxyInfo.sessionId;
* }
* })
*
@@ -104,7 +95,7 @@ export interface ProxyInfo extends CoreProxyInfo {
* An array of proxy groups to be used by the [Apify Proxy](https://docs.apify.com/proxy).
* If not provided, the proxy will select the groups automatically.
*/
- groups: string[];
+ groups?: string[];
/**
* If set and relevant proxies are available in your Apify account, all proxied requests will
@@ -193,10 +184,6 @@ export class ProxyConfiguration extends CoreProxyConfiguration {
apifyProxyCountry:
ow.optional.string.matches(COUNTRY_CODE_REGEX),
password: ow.optional.string,
- tieredProxyUrls: ow.optional.array.ofType(
- ow.array.ofType(ow.string),
- ),
- tieredProxyConfig: ow.optional.array.ofType(ow.object),
}),
);
@@ -206,19 +193,8 @@ export class ProxyConfiguration extends CoreProxyConfiguration {
countryCode,
apifyProxyCountry,
password = config.proxyPassword,
- tieredProxyConfig,
- tieredProxyUrls,
} = options;
- this.tieredProxyUrls ??= tieredProxyUrls;
-
- if (tieredProxyConfig) {
- this.tieredProxyUrls = this._generateTieredProxyUrls(
- tieredProxyConfig,
- options,
- );
- }
-
const groupsToUse = groups.length ? groups : apifyProxyGroups;
const countryCodeToUse = countryCode || apifyProxyCountry;
const hostname = config.proxyHostname;
@@ -241,7 +217,7 @@ export class ProxyConfiguration extends CoreProxyConfiguration {
this.port = port;
this.usesApifyProxy = !this.proxyUrls && !this.newUrlFunction;
- if (proxyUrls && proxyUrls.some((url) => url.includes('apify.com'))) {
+ if (proxyUrls && proxyUrls.some((url) => url?.includes('apify.com'))) {
this.log.warning(
'Some Apify proxy features may work incorrectly. Please consider setting up Apify properties instead of `proxyUrls`.\n' +
'See https://sdk.apify.com/docs/guides/proxy-management#apify-proxy-configuration',
@@ -287,143 +263,65 @@ export class ProxyConfiguration extends CoreProxyConfiguration {
}
/**
- * This function creates a new {@apilink ProxyInfo} info object.
- * It is used by CheerioCrawler and PuppeteerCrawler to generate proxy URLs and also to allow the user to inspect
- * the currently used proxy via the requestHandler parameter `proxyInfo`.
- * Use it if you want to work with a rich representation of a proxy URL.
- * If you need the URL string only, use {@apilink ProxyConfiguration.newUrl}.
- * @param [sessionId]
- * Represents the identifier of user {@apilink Session} that can be managed by the {@apilink SessionPool} or
- * you can use the Apify Proxy [Session](https://docs.apify.com/proxy#sessions) identifier.
- * When the provided sessionId is a number, it's converted to a string. Property sessionId of
- * {@apilink ProxyInfo} is always returned as a type string.
- *
- * All the HTTP requests going through the proxy with the same session identifier
- * will use the same target proxy server (i.e. the same IP address).
- * The identifier must not be longer than 50 characters and include only the following: `0-9`, `a-z`, `A-Z`, `"."`, `"_"` and `"~"`.
- * @return Represents information about used proxy and its configuration.
+ * Returns a new {@apilink ProxyInfo} object with a fresh proxy URL. Each call mints an
+ * independent URL; for Apify Proxy a random session id is embedded so consecutive
+ * calls resolve to different IPs.
*/
override async newProxyInfo(
- sessionId?: string | number,
- options?: Parameters<CoreProxyConfiguration['newProxyInfo']>[1],
+ options?: NewUrlOptions,
): Promise<ProxyInfo | undefined> {
- if (typeof sessionId === 'number') sessionId = `${sessionId}`;
- ow(
- sessionId,
- ow.optional.string
- .maxLength(MAX_SESSION_ID_LENGTH)
- .matches(APIFY_PROXY_VALUE_REGEX),
- );
-
- const proxyInfo = await super.newProxyInfo(sessionId, options);
- if (!proxyInfo) return proxyInfo;
-
- const { groups, countryCode, password, port, hostname } = (
- this.usesApifyProxy ? this : new URL(proxyInfo.url)
- ) as ProxyConfiguration;
-
- return {
- ...proxyInfo,
- sessionId,
- groups,
- countryCode,
- // this.password is not encoded, but the password from the URL will be, we need to normalize
- password: this.usesApifyProxy
- ? (password ?? '')
- : decodeURIComponent(password!),
- hostname,
- port: port!,
+ const url = await this.newUrl(options);
+ if (!url) return undefined;
+
+ const parsed = new URL(url);
+ const result: ProxyInfo = {
+ url,
+ username: decodeURIComponent(parsed.username),
+ password: decodeURIComponent(parsed.password),
+ hostname: parsed.hostname,
+ port: parsed.port,
};
+ if (this.usesApifyProxy) {
+ result.groups = this.groups;
+ if (this.countryCode !== undefined)
+ result.countryCode = this.countryCode;
+ }
+ return result;
}
/**
- * Returns a new proxy URL based on provided configuration options and the `sessionId` parameter.
- * @param [sessionId]
- * Represents the identifier of user {@apilink Session} that can be managed by the {@apilink SessionPool} or
- * you can use the Apify Proxy [Session](https://docs.apify.com/proxy#sessions) identifier.
- * When the provided sessionId is a number, it's converted to a string.
- *
- * All the HTTP requests going through the proxy with the same session identifier
- * will use the same target proxy server (i.e. the same IP address).
- * The identifier must not be longer than 50 characters and include only the following: `0-9`, `a-z`, `A-Z`, `"."`, `"_"` and `"~"`.
- * @return A string with a proxy URL, including authentication credentials and port number.
- * For example, `http://bob:password123@proxy.example.com:8000`
+ * Returns a new proxy URL. For Apify Proxy, each call generates a URL with a fresh
+ * random session id, so consecutive calls return independent URLs. For custom
+ * `proxyUrls`, the URLs are rotated round-robin.
*/
override async newUrl(
- sessionId?: string | number,
- options?: Parameters<CoreProxyConfiguration['newUrl']>[1],
+ options?: NewUrlOptions,
): Promise<string | undefined> {
- if (typeof sessionId === 'number') sessionId = `${sessionId}`;
- ow(
- sessionId,
- ow.optional.string
- .maxLength(MAX_SESSION_ID_LENGTH)
- .matches(APIFY_PROXY_VALUE_REGEX),
- );
- if (this.newUrlFunction) {
- return (
- (await this._callNewUrlFunction(sessionId, {
- request: options?.request,
- })) ?? undefined
- );
- }
- if (this.proxyUrls) {
- return this._handleCustomUrl(sessionId);
- }
-
- if (this.tieredProxyUrls) {
- return (
- this._handleTieredUrl(
- sessionId ?? cryptoRandomObjectId(6),
- options,
- ).proxyUrl ?? undefined
- );
+ if (this.newUrlFunction || this.proxyUrls) {
+ return super.newUrl(options);
}
-
- return this.composeDefaultUrl(sessionId);
- }
-
- protected _generateTieredProxyUrls(
- tieredProxyConfig: NonNullable<
- ProxyConfigurationOptions['tieredProxyConfig']
- >,
- globalOptions: ProxyConfigurationOptions,
- ) {
- return tieredProxyConfig.map((config) => [
- new ProxyConfiguration({
- ...globalOptions,
- ...config,
- tieredProxyConfig: undefined,
- }).composeDefaultUrl(),
- ]);
+ return this.composeDefaultUrl(cryptoRandomObjectId(SESSION_ID_LENGTH));
}
/**
* Returns proxy username.
*/
- protected _getUsername(sessionId?: string): string {
- let username;
+ protected _getUsername(sessionId: string): string {
const { groups, countryCode } = this;
const parts: string[] = [];
if (groups && groups.length) {
parts.push(`groups-${groups.join('+')}`);
}
- if (sessionId) {
- parts.push(`session-${sessionId}`);
- }
+ parts.push(`session-${sessionId}`);
if (countryCode) {
parts.push(`country-${countryCode}`);
}
- username = parts.join(',');
-
- if (parts.length === 0) username = 'auto';
-
- return username;
+ return parts.join(',');
}
- protected composeDefaultUrl(sessionId?: string): string {
+ protected composeDefaultUrl(sessionId: string): string {
const username = this._getUsername(sessionId);
const url = new URL(`http://${this.hostname}:${this.port}`);
url.username = `${username}`;
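Note on the `newUrl()` change above: since every call now mints an independent URL, code that called `newUrl('mySession')` directly (outside a crawler with a `SessionPool`) loses the sticky pairing. A minimal sketch of rebuilding that pairing on the caller's side follows. `StickyProxyUrls` and `NewUrlFn` are hypothetical names and are not part of the Apify SDK or Crawlee. The only assumption about the SDK is the one visible in this diff, namely that `newUrl()` takes no session argument and resolves to a string or `undefined`.

```typescript
// Hypothetical helper; NOT part of the Apify SDK or Crawlee.
type NewUrlFn = () => Promise<string | undefined>;

class StickyProxyUrls {
    private readonly urls = new Map<string, string>();
    private readonly newUrl: NewUrlFn;

    constructor(newUrl: NewUrlFn) {
        this.newUrl = newUrl;
    }

    // Returns the URL previously paired with `sessionKey`, or mints and stores a new one.
    // Note: concurrent first calls for the same key may each mint a URL; the last write wins.
    async get(sessionKey: string): Promise<string | undefined> {
        const cached = this.urls.get(sessionKey);
        if (cached !== undefined) return cached;

        const url = await this.newUrl();
        if (url !== undefined) this.urls.set(sessionKey, url);
        return url;
    }

    // Drops the pairing, so the next get() for this key mints a fresh URL.
    retire(sessionKey: string): void {
        this.urls.delete(sessionKey);
    }
}
```

It could be wired up as `new StickyProxyUrls(() => proxyConfiguration.newUrl())`. Inside a crawler that uses a session pool this should not be needed, because the session already keeps its proxy URL.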
diff --git a/test/apify/proxy_configuration.test.ts b/test/apify/proxy_configuration.test.ts
index 8c61a63177..718d9db1bb 100644
--- a/test/apify/proxy_configuration.test.ts
+++ b/test/apify/proxy_configuration.test.ts
@@ -1,25 +1,26 @@
import { Actor, ProxyConfiguration } from 'apify';
import { UserClient } from 'apify-client';
-import { type Dictionary, Request, sleep } from 'crawlee';
+import { type Dictionary } from 'crawlee';
import { gotScraping } from 'got-scraping';
import { APIFY_ENV_VARS, LOCAL_APIFY_ENV_VARS } from '@apify/consts';
+import { resetGlobalState } from '../resetGlobalState.js';
+
const groups = ['GROUP1', 'GROUP2'];
const hostname = LOCAL_APIFY_ENV_VARS[APIFY_ENV_VARS.PROXY_HOSTNAME];
const port = Number(LOCAL_APIFY_ENV_VARS[APIFY_ENV_VARS.PROXY_PORT]);
const password = 'test12345';
const countryCode = 'CZ';
-const sessionId = 538909250932;
const basicOpts = {
groups,
countryCode,
password,
};
-const basicOptsProxyUrl =
- 'http://groups-GROUP1+GROUP2,session-538909250932,country-CZ:test12345@proxy.apify.com:8000';
-const proxyUrlNoSession =
- 'http://groups-GROUP1+GROUP2,country-CZ:test12345@proxy.apify.com:8000';
+// Apify Proxy URLs always carry a fresh random `session-XXXX` segment; tests
+// match against this pattern rather than a hard-coded session id.
+const apifyProxyUrlPattern =
+ /^http:\/\/groups-GROUP1\+GROUP2,session-[A-Za-z0-9]+,country-CZ:test12345@proxy\.apify\.com:8000$/;
vitest.mock('got-scraping', async () => {
return {
@@ -54,48 +55,45 @@ describe('ProxyConfiguration', () => {
expect(proxyConfiguration.port).toBe(port);
});
- test('newUrl() should return proxy URL', async () => {
+ test('newUrl() returns an Apify Proxy URL with a random session id', async () => {
const proxyConfiguration = new ProxyConfiguration(basicOpts);
- expect(await proxyConfiguration.newUrl(sessionId)).toBe(
- basicOptsProxyUrl,
- );
+ const url1 = await proxyConfiguration.newUrl();
+ const url2 = await proxyConfiguration.newUrl();
+
+ expect(url1).toMatch(apifyProxyUrlPattern);
+ expect(url2).toMatch(apifyProxyUrlPattern);
+ // Consecutive calls must produce independent URLs.
+ expect(url1).not.toBe(url2);
});
- test('newProxyInfo() should return ProxyInfo object', async () => {
+ test('newProxyInfo() returns a ProxyInfo object with a fresh URL', async () => {
const proxyConfiguration = new ProxyConfiguration(basicOpts);
- const url = basicOptsProxyUrl;
- const proxyInfo = {
- sessionId: `${sessionId}`,
- url,
- groups,
- countryCode,
- password,
- hostname,
- port,
- username: 'groups-GROUP1+GROUP2,session-538909250932,country-CZ',
- };
- expect(await proxyConfiguration.newProxyInfo(sessionId)).toEqual(
- proxyInfo,
+ const info = await proxyConfiguration.newProxyInfo();
+ expect(info).toBeDefined();
+ expect(info!.url).toMatch(apifyProxyUrlPattern);
+ expect(info!.groups).toEqual(groups);
+ expect(info!.countryCode).toBe(countryCode);
+ expect(info!.password).toBe(password);
+ expect(info!.hostname).toBe(hostname);
+ expect(info!.port).toBe(String(port));
+ expect(info!.username).toMatch(
+ /^groups-GROUP1\+GROUP2,session-[A-Za-z0-9]+,country-CZ$/,
);
});
- test('newProxyInfo() works with special characters', async () => {
+ test('newProxyInfo() works with custom proxyUrls and special characters', async () => {
const url = 'http://user%40name:pass%40word@proxy.com:1111';
const proxyConfiguration = new ProxyConfiguration({ proxyUrls: [url] });
- const proxyInfo = {
- sessionId: `${sessionId}`,
+ expect(await proxyConfiguration.newProxyInfo()).toEqual({
url,
username: 'user@name',
password: 'pass@word',
hostname: 'proxy.com',
port: '1111',
- };
- expect(await proxyConfiguration.newProxyInfo(sessionId)).toEqual(
- proxyInfo,
- );
+ });
});
test('actor UI input schema should work', () => {
@@ -168,37 +166,6 @@ describe('ProxyConfiguration', () => {
expect(() => new ProxyConfiguration({ countryCode: 1111 })).toThrow();
});
- test('newUrl() should throw on invalid session argument', async () => {
- const proxyConfiguration = new ProxyConfiguration();
- await Promise.all([
- expect(async () =>
- proxyConfiguration.newUrl('a-b'),
- ).rejects.toThrow(),
- expect(proxyConfiguration.newUrl('a$b')).rejects.toThrow(),
- // @ts-expect-error invalid input
- expect(proxyConfiguration.newUrl({})).rejects.toThrow(),
- // @ts-expect-error invalid input
- expect(proxyConfiguration.newUrl(new Date())).rejects.toThrow(),
- expect(
- proxyConfiguration.newUrl(Array(51).fill('x').join('')),
- ).rejects.toThrow(),
-
- expect(proxyConfiguration.newUrl('a_b')).resolves.not.toThrow(),
- expect(
- proxyConfiguration.newUrl('0.34252352'),
- ).resolves.not.toThrow(),
- expect(proxyConfiguration.newUrl('aaa~BBB')).resolves.not.toThrow(),
- expect(proxyConfiguration.newUrl('a_1_b')).resolves.not.toThrow(),
- expect(proxyConfiguration.newUrl('a_2')).resolves.not.toThrow(),
- expect(proxyConfiguration.newUrl('a')).resolves.not.toThrow(),
- expect(proxyConfiguration.newUrl('1')).resolves.not.toThrow(),
- expect(proxyConfiguration.newUrl(123456)).resolves.not.toThrow(),
- expect(
- proxyConfiguration.newUrl(Array(50).fill('x').join('')),
- ).resolves.not.toThrow(),
- ]);
- });
-
test('should throw on invalid newUrlFunction', async () => {
const newUrlFunction = () => {
return 'http://proxy.com:1111*invalid_url';
@@ -243,7 +210,6 @@ describe('ProxyConfiguration', () => {
'http://proxy.com:4444',
);
- // TODO enable strictNullChecks in tests
// through newProxyInfo()
expect((await proxyConfiguration.newProxyInfo())?.url).toEqual(
'http://proxy.com:3333',
@@ -256,46 +222,6 @@ describe('ProxyConfiguration', () => {
);
});
- test('async newUrlFunction should work correctly', async () => {
- const customUrls = [
- 'http://proxy.com:1111',
- 'http://proxy.com:2222',
- 'http://proxy.com:3333',
- 'http://proxy.com:4444',
- 'http://proxy.com:5555',
- 'http://proxy.com:6666',
- ];
- const newUrlFunction = async () => {
- await sleep(5);
- return customUrls.pop() ?? null;
- };
- const proxyConfiguration = new ProxyConfiguration({
- newUrlFunction,
- });
-
- // through newUrl()
- expect(await proxyConfiguration.newUrl()).toEqual(
- 'http://proxy.com:6666',
- );
- expect(await proxyConfiguration.newUrl()).toEqual(
- 'http://proxy.com:5555',
- );
- expect(await proxyConfiguration.newUrl()).toEqual(
- 'http://proxy.com:4444',
- );
-
- // through newProxyInfo()
- expect((await proxyConfiguration.newProxyInfo())!.url).toEqual(
- 'http://proxy.com:3333',
- );
- expect((await proxyConfiguration.newProxyInfo())!.url).toEqual(
- 'http://proxy.com:2222',
- );
- expect((await proxyConfiguration.newProxyInfo())!.url).toEqual(
- 'http://proxy.com:1111',
- );
- });
-
describe('With proxyUrls options', () => {
test('should rotate custom URLs correctly', async () => {
const proxyConfiguration = new ProxyConfiguration({
@@ -347,62 +273,6 @@ describe('ProxyConfiguration', () => {
);
});
- test('should rotate custom URLs with sessions correctly', async () => {
- const sessions = [
- 'sesssion_01',
- 'sesssion_02',
- 'sesssion_03',
- 'sesssion_04',
- 'sesssion_05',
- 'sesssion_06',
- ];
- const proxyConfiguration = new ProxyConfiguration({
- proxyUrls: [
- 'http://proxy.com:1111',
- 'http://proxy.com:2222',
- 'http://proxy.com:3333',
- ],
- });
-
- // @ts-expect-error TODO private property?
- const { proxyUrls } = proxyConfiguration;
- // should use same proxy URL
- expect(await proxyConfiguration.newUrl(sessions[0])).toEqual(
- proxyUrls![0],
- );
- expect(await proxyConfiguration.newUrl(sessions[0])).toEqual(
- proxyUrls![0],
- );
- expect(await proxyConfiguration.newUrl(sessions[0])).toEqual(
- proxyUrls![0],
- );
-
- // should rotate different proxies
- expect(await proxyConfiguration.newUrl(sessions[1])).toEqual(
- proxyUrls![1],
- );
- expect(await proxyConfiguration.newUrl(sessions[2])).toEqual(
- proxyUrls![2],
- );
- expect(await proxyConfiguration.newUrl(sessions[3])).toEqual(
- proxyUrls![0],
- );
- expect(await proxyConfiguration.newUrl(sessions[4])).toEqual(
- proxyUrls![1],
- );
- expect(await proxyConfiguration.newUrl(sessions[5])).toEqual(
- proxyUrls![2],
- );
-
- // should remember already used session
- expect(await proxyConfiguration.newUrl(sessions[1])).toEqual(
- proxyUrls![1],
- );
- expect(await proxyConfiguration.newUrl(sessions[3])).toEqual(
- proxyUrls![0],
- );
- });
-
test('should throw cannot combine custom proxies with Apify Proxy', async () => {
const proxyUrls = [
'http://proxy.com:1111',
@@ -485,81 +355,17 @@ describe('ProxyConfiguration', () => {
}
});
});
-
- describe('With tieredProxyUrls', () => {
- test('proxy configuration accepts the tiered urls (Crawlee style)', async () => {
- const proxyConfiguration = new ProxyConfiguration({
- tieredProxyUrls: [
- ['http://proxy.com:1111'],
- ['http://proxy.com:2222'],
- ['http://proxy.com:3333'],
- ['http://proxy.com:4444'],
- ],
- });
-
- // through newUrl()
- expect(
- await proxyConfiguration.newUrl('abc', {
- request: new Request({ url: 'http://example.com' }) as any,
- }),
- ).toEqual('http://proxy.com:1111');
-
- // through newProxyInfo()
- expect(
- (await proxyConfiguration.newProxyInfo('abc', {
- request: new Request({
- url: 'http://example.com',
- }) as any,
- }))!.url,
- ).toEqual('http://proxy.com:1111');
- });
-
- test('shorthand tieredProxyConfig gets correctly expanded', async () => {
- const proxyConfiguration = new ProxyConfiguration({
- password: 'password',
- countryCode: 'DE',
- tieredProxyConfig: [
- {
- groups: ['GROUP1'],
- countryCode: 'CZ',
- },
- {
- groups: ['GROUP2'],
- countryCode: 'US',
- },
- {
- groups: ['GROUP3', 'GROUP4'],
- },
- {
- groups: ['GROUP3', 'GROUP4'],
- countryCode: undefined,
- },
- ],
- });
-
- // eslint-disable-next-line dot-notation
- expect(proxyConfiguration['tieredProxyUrls']).toEqual([
- [
- 'http://groups-GROUP1,country-CZ:password@proxy.apify.com:8000',
- ],
- [
- 'http://groups-GROUP2,country-US:password@proxy.apify.com:8000',
- ],
- [
- 'http://groups-GROUP3+GROUP4,country-DE:password@proxy.apify.com:8000',
- ],
- ['http://groups-GROUP3+GROUP4:password@proxy.apify.com:8000'],
- ]);
- });
- });
});
describe('Actor.createProxyConfiguration()', () => {
const userData = { proxy: { password } };
+ beforeEach(() => {
+ resetGlobalState();
+ });
+
test('should work with all options', async () => {
const status = { connected: true };
- const proxyUrl = proxyUrlNoSession;
const url = 'http://proxy.apify.com/?format=json';
gotScrapingSpy.mockResolvedValueOnce({ body: status } as any);
@@ -580,7 +386,7 @@ describe('Actor.createProxyConfiguration()', () => {
expect(gotScrapingSpy).toBeCalledWith({
url,
- proxyUrl,
+ proxyUrl: expect.stringMatching(apifyProxyUrlPattern),
timeout: { request: 4000 },
responseType: 'json',
});
@@ -704,7 +510,11 @@ describe('Actor.createProxyConfiguration()', () => {
await Actor.createProxyConfiguration();
expect(gotScrapingSpy).toBeCalledWith({
url: `${process.env.APIFY_PROXY_STATUS_URL}/?format=json`,
- proxyUrl: `http://auto:${password}@${process.env.APIFY_PROXY_HOSTNAME}:8000`,
+ proxyUrl: expect.stringMatching(
+ new RegExp(
+ `^http://session-[A-Za-z0-9]+:${password}@${process.env.APIFY_PROXY_HOSTNAME}:8000$`,
+ ),
+ ),
responseType: 'json',
timeout: {
request: 4000,
@@ -713,71 +523,4 @@ describe('Actor.createProxyConfiguration()', () => {
gotScrapingSpy.mockRestore();
});
-
- describe('With tieredProxyUrls', () => {
- test('proxy configuration accepts the tiered urls (Crawlee style)', async () => {
- const proxyConfiguration = await Actor.createProxyConfiguration({
- tieredProxyUrls: [
- ['http://proxy.com:1111'],
- ['http://proxy.com:2222'],
- ['http://proxy.com:3333'],
- ['http://proxy.com:4444'],
- ],
- });
-
- // through newUrl()
- expect(
- await proxyConfiguration!.newUrl('abc', {
- request: new Request({ url: 'http://example.com' }) as any,
- }),
- ).toEqual('http://proxy.com:1111');
-
- // through newProxyInfo()
- expect(
- (await proxyConfiguration!.newProxyInfo('abc', {
- request: new Request({
- url: 'http://example.com',
- }) as any,
- }))!.url,
- ).toEqual('http://proxy.com:1111');
- });
-
- test('shorthand tieredProxyConfig gets correctly expanded', async () => {
- const proxyConfiguration = await Actor.createProxyConfiguration({
- password: 'password',
- countryCode: 'DE',
- tieredProxyConfig: [
- {
- groups: ['GROUP1'],
- countryCode: 'CZ',
- },
- {
- groups: ['GROUP2'],
- countryCode: 'US',
- },
- {
- groups: ['GROUP3', 'GROUP4'],
- },
- {
- groups: ['GROUP3', 'GROUP4'],
- countryCode: undefined,
- },
- ],
- });
-
- // eslint-disable-next-line dot-notation
- expect(proxyConfiguration!['tieredProxyUrls']).toEqual([
- [
- 'http://groups-GROUP1,country-CZ:password@proxy.apify.com:8000',
- ],
- [
- 'http://groups-GROUP2,country-US:password@proxy.apify.com:8000',
- ],
- [
- 'http://groups-GROUP3+GROUP4,country-DE:password@proxy.apify.com:8000',
- ],
- ['http://groups-GROUP3+GROUP4:password@proxy.apify.com:8000'],
- ]);
- });
- });
});
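Note on the removed `ProxyInfo.sessionId` field: the upgrade guide in this change suggests parsing the `session-` segment out of `proxyInfo.username`. A minimal sketch of that parsing follows, assuming the comma-separated username layout asserted by the tests above (`groups-GROUP1+GROUP2,session-<id>,country-CZ`). `extractSessionId` is a hypothetical helper name and is not an SDK export.

```typescript
// Hypothetical helper; NOT exported by the Apify SDK.
// Assumes the username is a comma-separated list of `key-value` parts,
// as in `groups-GROUP1+GROUP2,session-abc123,country-CZ`.
function extractSessionId(username: string): string | undefined {
    const prefix = 'session-';
    for (const part of username.split(',')) {
        if (part.startsWith(prefix)) return part.slice(prefix.length);
    }
    // Custom proxy URLs carry no session segment.
    return undefined;
}
```

For logging, this would be called as `extractSessionId(proxyInfo.username)`. It returns `undefined` for usernames that come from custom `proxyUrls`.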