
Comments

[MNT] Mock live server calls in tests_classification_task.py#1677

Open
omkar-334 wants to merge 5 commits into openml:main from omkar-334:test-class

Conversation

@omkar-334

Used the create_request_response function from openml.testing to mock the API responses.

Copilot AI review requested due to automatic review settings February 20, 2026 16:11

Copilot AI left a comment


Pull request overview

This PR replaces live OpenML server calls in the classification task tests with mocked requests.Session.get responses, aligning the tests with the unit-test goal in issue #1649.

Changes:

  • Patch requests.Session.get in tests/test_tasks/test_classification_task.py and serve deterministic responses from local fixtures.
  • Add XML fixtures for task 119 and dataset 20 metadata (description + features).
  • Add an ARFF fixture for the diabetes dataset used by get_X_and_y().

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Summary per file:

  • tests/test_tasks/test_classification_task.py: Replaces live HTTP calls with requests.Session.get mocks backed by local fixture files.
  • tests/files/mock_responses/tasks/task_description_119.xml: Fixture for the task/119 response.
  • tests/files/mock_responses/datasets/data_description_20.xml: Fixture for the data/20 dataset description response (includes the download URL).
  • tests/files/mock_responses/datasets/data_features_20.xml: Fixture for the data/features/20 response (includes the nominal target labels).
  • tests/files/mock_responses/datasets/diabetes.arff: Fixture for the dataset download used by OpenMLDataset.get_data().
Comments suppressed due to low confidence (1)

tests/test_tasks/test_classification_task.py:22

  • _TASK_RESPONSES is constructed at import time, which reads multiple files (including a ~780-line ARFF) during test collection. This adds unnecessary IO overhead even when these tests are deselected/skipped. Consider building the responses inside each test (or via a fixture/helper that reads lazily) so file IO only happens when the test runs.
_MOCK_DIR = Path(__file__).parent.parent / "files" / "mock_responses"

_TASK_RESPONSES = [
    create_request_response(status_code=200, content_filepath=_MOCK_DIR / "tasks" / "task_description_119.xml"),
    create_request_response(status_code=200, content_filepath=_MOCK_DIR / "datasets" / "data_description_20.xml"),
    create_request_response(status_code=200, content_filepath=_MOCK_DIR / "datasets" / "data_features_20.xml"),
    create_request_response(status_code=200, content_filepath=_MOCK_DIR / "datasets" / "diabetes.arff"),
]
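The lazy alternative the reviewer suggests could be sketched as follows. To keep the snippet self-contained, a minimal `_FakeResponse` stands in for the `requests.Response` that `openml.testing.create_request_response` would build; the helper name `task_responses` and its `mock_dir` parameter are illustrative.

```python
from pathlib import Path


class _FakeResponse:
    """Stand-in for the requests.Response that openml.testing's
    create_request_response would build; kept minimal for this sketch."""

    def __init__(self, status_code: int, content: bytes):
        self.status_code = status_code
        self.content = content


def _build_response(status_code: int, content_filepath: Path) -> _FakeResponse:
    # The fixture file is read here, at call time, not at module import time.
    return _FakeResponse(status_code, content_filepath.read_bytes())


def task_responses(mock_dir: Path) -> list:
    """Build the four mocked responses on demand: test collection does no
    file IO, and every call returns fresh objects."""
    fixtures = [
        ("tasks", "task_description_119.xml"),
        ("datasets", "data_description_20.xml"),
        ("datasets", "data_features_20.xml"),
        ("datasets", "diabetes.arff"),
    ]
    return [
        _build_response(200, mock_dir / subdir / name) for subdir, name in fixtures
    ]
```

Each test would then call `task_responses(_MOCK_DIR)` in its body (or via a pytest fixture) instead of referencing a module-level list.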


Comment on lines 25 to 55
@@ -18,22 +31,25 @@ def setUp(self, n_levels: int = 1):
         self.task_type = TaskType.SUPERVISED_CLASSIFICATION
         self.estimation_procedure = 5
 
-    @pytest.mark.uses_test_server()
-    def test_download_task(self):
+    @mock.patch.object(requests.Session, "get")
+    def test_download_task(self, mock_get):
+        mock_get.side_effect = _TASK_RESPONSES[:3]
         task = super().test_download_task()
         assert task.task_id == self.task_id
         assert task.task_type_id == TaskType.SUPERVISED_CLASSIFICATION
         assert task.dataset_id == 20
         assert task.estimation_procedure_id == self.estimation_procedure
 
-    @pytest.mark.uses_test_server()
-    def test_class_labels(self):
+    @mock.patch.object(requests.Session, "get")
+    def test_class_labels(self, mock_get):
+        mock_get.side_effect = _TASK_RESPONSES[:3]
         task = get_task(self.task_id)
         assert task.class_labels == ["tested_negative", "tested_positive"]
 
 
-@pytest.mark.uses_test_server()
-def test_get_X_and_Y():
+@mock.patch.object(requests.Session, "get")
+def test_get_X_and_Y(mock_get):
+    mock_get.side_effect = _TASK_RESPONSES
     task = get_task(119)
     X, Y = task.get_X_and_y()
     assert X.shape == (768, 8)

Copilot AI Feb 20, 2026


OpenMLClassificationTaskTest still inherits OpenMLSupervisedTaskTest.test_get_X_and_Y (marked uses_test_server) because it is not overridden in the class. That inherited test will still execute and perform live HTTP calls, so this PR doesn't fully remove server dependencies for this module. Consider overriding test_get_X_and_Y as a method on OpenMLClassificationTaskTest (with the same request mocking) and removing/renaming the module-level test_get_X_and_Y to avoid the inherited live-call test from being collected.
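The override pattern this comment proposes can be sketched with stand-in classes. The class names and the `_fetch_task` placeholder below are illustrative, not the real OpenML test classes; the point is only that defining the method on the subclass shadows the inherited live-call version.

```python
from unittest import mock


class SupervisedTaskTestBase:
    """Stand-in for OpenMLSupervisedTaskTest; in the real suite the
    inherited test_get_X_and_Y performs live HTTP calls."""

    def _fetch_task(self):
        # Placeholder for the live get_task() call.
        raise ConnectionError("live server call")

    def test_get_X_and_Y(self):
        return self._fetch_task()


class ClassificationTaskTest(SupervisedTaskTestBase):
    # Overriding the inherited test means the live-call version in the base
    # class is shadowed, so the runner collects only the mocked variant.
    def test_get_X_and_Y(self):
        with mock.patch.object(
            ClassificationTaskTest, "_fetch_task", return_value="mocked task"
        ):
            return super().test_get_X_and_Y()
```

In the PR this would mean moving the module-level `test_get_X_and_Y` into `OpenMLClassificationTaskTest` as a method with the same `@mock.patch.object(requests.Session, "get")` decorator.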

Copilot AI review requested due to automatic review settings February 20, 2026 16:36

Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (5)

tests/test_tasks/test_classification_task.py:46

  • Setting mock_get.side_effect to a positional list makes the test depend on the exact number/order of HTTP GETs performed by get_task(). This is brittle: an extra request (or caching behavior change) can shift which payload is returned. Consider using a side_effect function keyed by the requested URL and failing explicitly on unexpected URLs.
    @mock.patch.object(requests.Session, "get")
    def test_class_labels(self, mock_get):
        mock_get.side_effect = _TASK_RESPONSES[:3]
        task = get_task(self.task_id)
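A URL-keyed side_effect of the kind suggested here could look like the following sketch. The fragment keys callers would pass are placeholders, not confirmed OpenML endpoint paths.

```python
def make_url_router(responses_by_fragment):
    """Build a side_effect that serves the mocked response whose key appears
    in the requested URL, and fails loudly on any unexpected request."""

    def side_effect(url, *args, **kwargs):
        for fragment, response in responses_by_fragment.items():
            if fragment in url:
                return response
        raise AssertionError(f"unexpected request URL: {url}")

    return side_effect
```

A test would wire it up with `mock_get.side_effect = make_url_router({...})` in place of the positional list, making the mock independent of request order and count.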

tests/test_tasks/test_classification_task.py:52

  • Setting mock_get.side_effect to a positional list makes the test depend on the exact number/order of HTTP GETs performed by get_task() / data download. This is brittle: an extra request (or caching behavior change) can shift which payload is returned. Consider using a side_effect function keyed by the requested URL and failing explicitly on unexpected URLs.
    @mock.patch.object(requests.Session, "get")
    def test_get_X_and_Y(self, mock_get):
        mock_get.side_effect = _TASK_RESPONSES
        task = get_task(self.task_id)

tests/test_tasks/test_classification_task.py:21

  • The mocked requests.Response objects in _TASK_RESPONSES are created once at import time and then reused across tests. Since requests.Response is mutable (e.g., .encoding can be set when .text is accessed), this can make tests order-dependent. Consider generating fresh Response objects per test (e.g., keep filepaths and build responses inside the mock side_effect) to avoid shared state.
_TASK_RESPONSES = [
    create_request_response(status_code=200, content_filepath=_MOCK_DIR / "tasks" / "task_description_119.xml"),
    create_request_response(status_code=200, content_filepath=_MOCK_DIR / "datasets" / "data_description_20.xml"),
    create_request_response(status_code=200, content_filepath=_MOCK_DIR / "datasets" / "data_features_20.xml"),
    create_request_response(status_code=200, content_filepath=_MOCK_DIR / "datasets" / "diabetes.arff"),

tests/test_tasks/test_classification_task.py:22

  • OpenML’s API layer logs a warning when the response lacks Content-Encoding: gzip (see openml._api_calls.__check_response). These mocked responses have empty headers, so the test run will emit warnings and diverge from real server behavior. Consider setting at least Content-Encoding (and optionally encoding='utf-8') on the mocked Response objects.
    create_request_response(status_code=200, content_filepath=_MOCK_DIR / "tasks" / "task_description_119.xml"),
    create_request_response(status_code=200, content_filepath=_MOCK_DIR / "datasets" / "data_description_20.xml"),
    create_request_response(status_code=200, content_filepath=_MOCK_DIR / "datasets" / "data_features_20.xml"),
    create_request_response(status_code=200, content_filepath=_MOCK_DIR / "datasets" / "diabetes.arff"),
]

tests/test_tasks/test_classification_task.py:37

  • Setting mock_get.side_effect to a positional list makes the test depend on the exact number/order of HTTP GETs performed by get_task(). This is brittle: an extra request (or caching behavior change) can shift which payload is returned. Consider using a side_effect function keyed by the requested URL and failing explicitly on unexpected URLs.
    @mock.patch.object(requests.Session, "get")
    def test_download_task(self, mock_get):
        mock_get.side_effect = _TASK_RESPONSES[:3]
        task = super().test_download_task()


Copilot AI review requested due to automatic review settings February 20, 2026 18:51

Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (3)

tests/test_tasks/test_classification_task.py:52

  • Using a list for mock_get.side_effect makes the tests depend on the exact order and count of HTTP GETs performed by get_task() / get_X_and_y(). This is brittle (an extra metadata request or a reordering will raise StopIteration). A more robust approach is a side_effect function that returns the right fixture based on the requested URL (and can raise with a clear message for unexpected URLs).
    @mock.patch.object(requests.Session, "get")
    def test_download_task(self, mock_get):
        mock_get.side_effect = _TASK_RESPONSES[:3]
        task = super().test_download_task()
        assert task.task_id == self.task_id
        assert task.task_type_id == TaskType.SUPERVISED_CLASSIFICATION
        assert task.dataset_id == 20
        assert task.estimation_procedure_id == self.estimation_procedure

    @mock.patch.object(requests.Session, "get")
    def test_class_labels(self, mock_get):
        mock_get.side_effect = _TASK_RESPONSES[:3]
        task = get_task(self.task_id)
        assert task.class_labels == ["tested_negative", "tested_positive"]

    @mock.patch.object(requests.Session, "get")
    def test_get_X_and_Y(self, mock_get):
        mock_get.side_effect = _TASK_RESPONSES
        task = get_task(self.task_id)

tests/test_tasks/test_classification_task.py:22

  • _TASK_RESPONSES is built at import time, which triggers file I/O (reading the XML/ARFF fixtures) during test collection. This introduces import-time side effects and makes the module fail to import if the fixtures are missing/mislocated, before the test runner can report a normal test failure. Consider generating these requests.Response objects inside each test (or via a small helper/fixture) so the file reads happen at test execution time and you get fresh responses per test run.
_MOCK_DIR = Path(__file__).parent.parent / "files" / "mock_responses"

_TASK_RESPONSES = [
    create_request_response(status_code=200, content_filepath=_MOCK_DIR / "tasks" / "task_description_119.xml"),
    create_request_response(status_code=200, content_filepath=_MOCK_DIR / "datasets" / "data_description_20.xml"),
    create_request_response(status_code=200, content_filepath=_MOCK_DIR / "datasets" / "data_features_20.xml"),
    create_request_response(status_code=200, content_filepath=_MOCK_DIR / "datasets" / "diabetes.arff"),
]

tests/test_tasks/test_classification_task.py:22

  • The mocked requests.Response objects created by create_request_response don’t set headers (notably Content-Encoding) or an explicit encoding. In openml._api_calls.__check_response this will emit a warning for every mocked request (“Received uncompressed content...”), making test logs noisy. Setting the response headers/encoding in the mocks (or extending the helper to do so) would avoid these warnings and better match real responses.
_TASK_RESPONSES = [
    create_request_response(status_code=200, content_filepath=_MOCK_DIR / "tasks" / "task_description_119.xml"),
    create_request_response(status_code=200, content_filepath=_MOCK_DIR / "datasets" / "data_description_20.xml"),
    create_request_response(status_code=200, content_filepath=_MOCK_DIR / "datasets" / "data_features_20.xml"),
    create_request_response(status_code=200, content_filepath=_MOCK_DIR / "datasets" / "diabetes.arff"),
]




Development

Successfully merging this pull request may close these issues.

[MNT] Replace Live Server Calls with Mocks in Unit Tests

1 participant