
Fix AsdfFile pickling for instances without file descriptors #2038

Open
sydduckworth wants to merge 14 commits into
asdf-format:main from
sydduckworth:fix-pickle-serialization

Conversation

@sydduckworth
Contributor

@sydduckworth sydduckworth commented May 4, 2026

Description

This PR fixes two bugs that prevented AsdfFile instances from being pickled, which partially addresses #1782.
As a result, an AsdfFile can now be pickled as long as it does not hold a file descriptor. In practice, this means instances that were created from an in-memory dict.

  • Fixed a bug in which AsdfObject instances were failing to deserialize because of their conflicting dict and UserDict base classes. Added a manual __reduce__ override which resolves the problem.
  • Updated ValidatorManager to not return local callables defined inside a method, which can't be pickled.
    • ValidatorManager now returns a new JsonSchemaValidators callable class
    • Significantly simplified ValidatorManager implementation.
  • Added a test case to _tests/test_asdf.py to verify the limited pickling that is now supported.
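
The two fixes listed above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `make_local_check`, `LimitCheck`, `ReducedNode`, and `round_trips` are hypothetical names invented for illustration, and the sketch only assumes the general pattern described in the bullets (a module-level callable class replacing a local function, and a `__reduce__` override on a class with both `UserDict` and `dict` as bases).

```python
import collections
import pickle


def make_local_check(limit):
    # A function defined inside another function is a local object; pickle
    # serializes functions by qualified name and cannot look this one up.
    def check(value):
        return value <= limit

    return check


class LimitCheck:
    """Hypothetical module-level callable class carrying the same state."""

    def __init__(self, limit):
        self.limit = limit

    def __call__(self, value):
        return value <= self.limit


class ReducedNode(collections.UserDict, dict):
    """Hypothetical mapping with both UserDict and dict as bases."""

    def __reduce__(self):
        # Rebuild through the constructor so UserDict.__init__ creates
        # self.data before any items are assigned. Extra instance
        # attributes are not preserved by this simplified version.
        return (type(self), (dict(self.data),))


def round_trips(obj):
    """Return True if obj survives pickle.dumps followed by pickle.loads."""
    try:
        pickle.loads(pickle.dumps(obj))
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


if __name__ == "__main__":
    print("local function:", round_trips(make_local_check(3)))
    print("callable class:", round_trips(LimitCheck(3)))
    print("mapping with __reduce__:", round_trips(ReducedNode({"a": 1})))
```

The callable class keeps its configuration as instance attributes, so pickle can store it as a reference to an importable class plus plain state. The `__reduce__` override sidesteps the default reconstruction path for dict subclasses, in which items may be assigned before the instance state is restored.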

@sydduckworth sydduckworth requested a review from a team as a code owner May 4, 2026 19:21
@sydduckworth sydduckworth requested a review from braingram May 4, 2026 19:24
Comment thread asdf/extension/_manager.py
Comment thread asdf/extension/_validator.py
Comment thread asdf/schema.py
Comment thread asdf/extension/_manager.py Outdated
@sydduckworth
Contributor Author

sydduckworth commented May 5, 2026

@braingram I've reverted the public API for ValidatorManager.

@sydduckworth sydduckworth requested a review from braingram May 13, 2026 17:06
@braingram
Contributor

I think we'd benefit from including a wider set of possible AsdfFile instances for initial pickling support.

Testing this PR: if validate is called on an AsdfFile (created from an in-memory tree) prior to pickling, the pickling fails with:

import asdf, pickle, numpy as np
af = asdf.AsdfFile({"arr": np.arange(42)})
pickle.loads(pickle.dumps(af))  # passes
af.validate()
pickle.loads(pickle.dumps(af))  # fails
TypeError: cannot pickle 'weakref.ReferenceType' object

The same error appears if the array storage type is set on "arr" prior to pickling.

af.set_array_storage(af["arr"], "inline")
pickle.loads(pickle.dumps(af))  # fails
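
For context, this class of error can be reproduced without asdf. The sketch below is only an illustration of the general mechanism, not a diagnosis: where an AsdfFile actually holds a weak reference after validate or set_array_storage is not stated in this thread, and `Owner`, `Node`, and `PicklableNode` are hypothetical names. It shows an object that stores a `weakref.ref` failing to pickle, and one possible workaround that drops the reference in `__getstate__`.

```python
import pickle
import weakref


class Owner:
    """Hypothetical object that other objects point back to."""


class Node:
    """Hypothetical object that caches a weak reference to its owner."""

    def __init__(self, owner, payload):
        self.payload = payload
        # weakref.ref objects cannot be pickled, so this attribute makes
        # the default pickling of the whole instance fail with TypeError.
        self._owner_ref = weakref.ref(owner)


class PicklableNode(Node):
    """Same state, but the weak reference is dropped when pickling."""

    def __getstate__(self):
        state = self.__dict__.copy()
        # The restored copy has no owner link; it would need to be
        # re-attached by whoever unpickles it.
        state["_owner_ref"] = None
        return state


if __name__ == "__main__":
    owner = Owner()
    try:
        pickle.dumps(Node(owner, 42))
    except TypeError as exc:
        print("plain node failed:", exc)
    copy = pickle.loads(pickle.dumps(PicklableNode(owner, 42)))
    print("picklable node payload:", copy.payload)
```

Dropping the reference is a design choice with a cost: any cached state that depended on the owner is lost in the round trip, so whether this is acceptable depends on what the reference is used for.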

As part of this, we should address which operations cause pickling to fail, and why. What are the "typical" AsdfFile instances we want to support? Probably at least a tree created from in-memory objects, but also one read from a file, and one read and then modified.

I think we shouldn't claim pickling support until we support whatever we decide are the "typical" cases.

@sydduckworth
Contributor Author

Is there a different changelog entry you would prefer for a PR that specifically fixes the two bugs addressed here but that does not fully enable pickling?
