This proposal is a milder variant of #599. Instead of completely eliminating ProcessEdgesWork, we propose splitting it into two traits/structs:
- Tracer: Provides the trace_object method and the enqueue method.
  - Alternatively, provide only trace_object, but have it take an ObjectQueue as a parameter, like most trace_object-like methods in the various spaces.
  - Eliminate ProcessEdgesWorkTracer. Just use Tracer.
  - It can be a sub-trait of the current ObjectTracer, which is exposed via the API and provides only the trace_object method.
- ProcessSlots: Provides a Vec<Slot> so that the slots can be processed in a plan-specific way.
The point is, one trait is for trace_object, and the other is for slots.
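To make the split concrete, here is a minimal sketch of what the two halves could look like. It uses toy stand-ins rather than the real mmtk-core types: ObjectReference, Slot, MarkingTracer, SlotPacket, slots_mut and demo_trace_slots are all hypothetical, and the signatures (the "trace_object takes an ObjectQueue" variant) are assumptions, not the proposed API.

```rust
use std::collections::HashSet;

// Toy stand-ins (assumptions); the real mmtk-core types are richer.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ObjectReference(pub usize);

/// A slot holds an optional reference; `None` models a null slot.
pub struct Slot(pub Option<ObjectReference>);

impl Slot {
    pub fn load(&self) -> Option<ObjectReference> {
        self.0
    }
    pub fn store(&mut self, object: ObjectReference) {
        self.0 = Some(object);
    }
}

pub trait ObjectQueue {
    fn enqueue(&mut self, object: ObjectReference);
}

impl ObjectQueue for Vec<ObjectReference> {
    fn enqueue(&mut self, object: ObjectReference) {
        self.push(object);
    }
}

/// Half 1: only knows how to trace an object. No slots involved.
pub trait Tracer {
    fn trace_object<Q: ObjectQueue>(
        &mut self,
        queue: &mut Q,
        object: ObjectReference,
    ) -> ObjectReference;
}

/// Half 2: owns slots. The default behaviour traces each slot, but a plan
/// may override `process_slots` to do something else entirely.
pub trait ProcessSlots {
    fn slots_mut(&mut self) -> &mut Vec<Slot>;

    fn process_slots<T: Tracer, Q: ObjectQueue>(&mut self, tracer: &mut T, queue: &mut Q) {
        for slot in self.slots_mut().iter_mut() {
            if let Some(object) = slot.load() {
                let new_object = tracer.trace_object(queue, object);
                // Write back only if the object was forwarded.
                if new_object != object {
                    slot.store(new_object);
                }
            }
        }
    }
}

/// A non-moving toy tracer: marks on first visit and enqueues for scanning.
pub struct MarkingTracer {
    pub marked: HashSet<ObjectReference>,
}

impl Tracer for MarkingTracer {
    fn trace_object<Q: ObjectQueue>(
        &mut self,
        queue: &mut Q,
        object: ObjectReference,
    ) -> ObjectReference {
        if self.marked.insert(object) {
            queue.enqueue(object);
        }
        object
    }
}

pub struct SlotPacket {
    pub slots: Vec<Slot>,
}

impl ProcessSlots for SlotPacket {
    fn slots_mut(&mut self) -> &mut Vec<Slot> {
        &mut self.slots
    }
}

/// Traces four slots (one null, one duplicate) and returns the queue.
pub fn demo_trace_slots() -> Vec<ObjectReference> {
    let mut tracer = MarkingTracer { marked: HashSet::new() };
    let mut packet = SlotPacket {
        slots: vec![
            Slot(Some(ObjectReference(1))),
            Slot(None),
            Slot(Some(ObjectReference(1))),
            Slot(Some(ObjectReference(2))),
        ],
    };
    let mut queue: Vec<ObjectReference> = Vec::new();
    packet.process_slots(&mut tracer, &mut queue);
    queue
}
```

In this sketch the tracer can also be called directly on an ObjectReference (for example, a root node) without constructing any slot-carrying packet, which is the decoupling the proposal is after.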
Motivation
ProcessEdgesWork is everywhere, and has been for a long time.
We have known for a long time that ProcessEdgesWork is everywhere; it was previously even exposed through the VMBinding API. It is the central part of tracing, and it is the de facto provider of the per-plan trace_object method. We even have things like ProcessEdgesWorkTracer, which wraps an empty (no slots) ProcessEdgesWork instance just to use its trace_object method. This shows that the trace_object method
- is central to a GC algorithm (in particular, it is the core of a trace, i.e. a transitive closure), but
- has no direct connection with slots, because we can call it on an ObjectReference directly.
concurrent_marking_work::ProcessRootSlots is not using trace_object
The concurrent_marking_work::ProcessRootSlots struct doesn't meaningfully implement trace_object; the method body is just unreachable!():
fn trace_object(&mut self, _object: ObjectReference) -> ObjectReference {
    unreachable!()
}
Surprisingly, we have tested ConcurrentImmix on OpenJDK, which never uses pinning or transitive pinning, and it works. concurrent_marking_work::ProcessRootSlots overrides process_slots (which it is not supposed to do), and the override
- reads the slots, like a conventional ProcessEdgesWork, and
- stuffs the object references into ConcurrentTraceObjects work packets, unlike the default implementation in ProcessEdgesWork, which just calls trace_object.
We can see that what ConcurrentImmix wanted was not a ProcessEdgesWork, but simply a work packet that visits slots.
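The "just visit slots" behaviour can be sketched without any trace_object at all. The following is a toy model, not the ConcurrentImmix code: ObjectReference, Slot, TraceObjectsPacket (a stand-in for ConcurrentTraceObjects), visit_root_slots and the capacity parameter are all hypothetical names introduced for illustration.

```rust
// Toy stand-ins (assumptions), not the real mmtk-core types.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ObjectReference(pub usize);

/// A slot holds an optional reference; `None` models a null slot.
pub struct Slot(pub Option<ObjectReference>);

impl Slot {
    pub fn load(&self) -> Option<ObjectReference> {
        self.0
    }
}

/// Toy analogue of a ConcurrentTraceObjects work packet: a batch of
/// objects that will be traced later, possibly by another worker.
#[derive(Debug, PartialEq, Eq)]
pub struct TraceObjectsPacket {
    pub objects: Vec<ObjectReference>,
}

/// Visits root slots without tracing: loads each non-null slot and hands
/// the referents to new packets of at most `capacity` objects each.
pub fn visit_root_slots(slots: &[Slot], capacity: usize) -> Vec<TraceObjectsPacket> {
    // `chunks` panics on a chunk size of 0, so clamp to at least 1.
    let capacity = capacity.max(1);
    let objects: Vec<ObjectReference> = slots.iter().filter_map(Slot::load).collect();
    objects
        .chunks(capacity)
        .map(|chunk| TraceObjectsPacket { objects: chunk.to_vec() })
        .collect()
}
```

Nothing here needs the rest of the ProcessEdgesWork interface, which is why a dedicated slot-processing packet would be a closer fit than an overridden process_slots.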
The new pull request #1454 adds trace_object to concurrent_marking_work::ProcessRootSlots because it is used for handling "root nodes" (root edges represented as the ObjectReference of the target objects). The ProcessRootNodes work packet calls ProcessRootSlots::trace_object with the intention of tracing the node, but in the PR, trace_object and create_scan_work instead add the nodes to a ConcurrentTraceObjects work packet and schedule it. In other words, our current architecture abuses ProcessRootSlots to process "root nodes", while what it actually wants is
- something that provides trace_object, and
- some plan-specific work packets that process root nodes (in this case ConcurrentTraceObjects).
The current plan-agnostic ProcessRootNodes work packet is not customizable for ConcurrentImmix, but it should be overridable.
More use cases in LXR
In LXR, LXRStopTheWorldProcessEdges implements both
- ProcessEdgesWork, which leaves trace_object empty but has full_gc_trace_object, and
- ObjectQueue, which has enqueue.
It then calls the space's trace_object-like method with self as the queue:
self.lxr.immix_space.rc_trace_object(
    /* queue: */ self,
    object,
    // more arguments go here...
)
This shows that a work packet that traces objects probably also wants to handle the "enqueued" objects, where "enqueue" does not necessarily mean literally putting the object into a queue data structure. It only means having some way to handle the object, be it scanning it immediately or stuffing it into a queue or another work packet.
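The "packet is its own queue" pattern can be sketched as follows. This is a toy model under stated assumptions, not LXR code: ToySpace, TracePacket, handled and demo_self_as_queue are hypothetical, and ToySpace::trace_object merely stands in for a space-level method such as rc_trace_object.

```rust
use std::cell::RefCell;
use std::collections::HashSet;

// Toy stand-in (assumption), not the real mmtk-core type.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ObjectReference(pub usize);

pub trait ObjectQueue {
    fn enqueue(&mut self, object: ObjectReference);
}

/// Toy space whose trace method takes the queue as a parameter, in the
/// style of the trace_object-like methods of the spaces.
pub struct ToySpace {
    marked: RefCell<HashSet<ObjectReference>>,
}

impl ToySpace {
    pub fn new() -> Self {
        ToySpace { marked: RefCell::new(HashSet::new()) }
    }

    pub fn trace_object<Q: ObjectQueue>(
        &self,
        queue: &mut Q,
        object: ObjectReference,
    ) -> ObjectReference {
        // Enqueue only on the first visit.
        if self.marked.borrow_mut().insert(object) {
            queue.enqueue(object);
        }
        object
    }
}

/// A packet that both traces and decides what "enqueue" means.
pub struct TracePacket<'a> {
    space: &'a ToySpace,
    /// Objects handled immediately; a real packet might scan them here or
    /// forward them to another work packet instead.
    pub handled: Vec<ObjectReference>,
}

impl ObjectQueue for TracePacket<'_> {
    fn enqueue(&mut self, object: ObjectReference) {
        self.handled.push(object);
    }
}

impl<'a> TracePacket<'a> {
    pub fn new(space: &'a ToySpace) -> Self {
        TracePacket { space, handled: Vec::new() }
    }

    pub fn trace(&mut self, object: ObjectReference) -> ObjectReference {
        // Copy the shared reference out first so that `self` can then be
        // passed mutably as the queue.
        let space = self.space;
        space.trace_object(/* queue: */ self, object)
    }
}

/// Traces 7, 7 and 9 and returns what the packet handled.
pub fn demo_self_as_queue() -> Vec<ObjectReference> {
    let space = ToySpace::new();
    let mut packet = TracePacket::new(&space);
    for address in [7, 7, 9] {
        packet.trace(ObjectReference(address));
    }
    packet.handled
}
```

Because the space only sees an ObjectQueue, swapping the enqueue body (buffer, scan immediately, or spawn a packet) does not require touching the tracing code.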
Details
The GCWorkContext trait is still the central trait for plans to customize the work packet types used by a trace (each plan can have multiple traces, such as nursery GC vs mature GC, or concurrent GC vs STW GC). It will provide a concrete Tracer implementation so that other work packets (such as ScanObjects or the prospective ProcessSlots work packet) can depend on that Tracer implementation (instead of a ProcessEdgesWork implementation) to call trace_object.
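One way this could look is an associated type on the context, sketched below. This is a deliberately simplified model and not the existing trait: the associated type name TracerType, the two toy tracers, the NURSERY_LIMIT constant, and the shape of ScanObjects are all assumptions made for illustration.

```rust
use std::collections::HashSet;
use std::marker::PhantomData;

// Toy stand-in (assumption), not the real mmtk-core type.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ObjectReference(pub usize);

pub trait Tracer: Default {
    fn trace_object(
        &mut self,
        queue: &mut Vec<ObjectReference>,
        object: ObjectReference,
    ) -> ObjectReference;
}

/// Hypothetical, pared-down context: each trace names its Tracer type.
pub trait GCWorkContext {
    type TracerType: Tracer;
}

/// Traces every object it sees.
#[derive(Default)]
pub struct FullTracer {
    marked: HashSet<ObjectReference>,
}

impl Tracer for FullTracer {
    fn trace_object(
        &mut self,
        queue: &mut Vec<ObjectReference>,
        object: ObjectReference,
    ) -> ObjectReference {
        if self.marked.insert(object) {
            queue.push(object);
        }
        object
    }
}

/// Toy rule: addresses at or above this limit count as mature objects.
const NURSERY_LIMIT: usize = 100;

/// Traces nursery objects only and ignores mature ones.
#[derive(Default)]
pub struct NurseryTracer {
    marked: HashSet<ObjectReference>,
}

impl Tracer for NurseryTracer {
    fn trace_object(
        &mut self,
        queue: &mut Vec<ObjectReference>,
        object: ObjectReference,
    ) -> ObjectReference {
        if object.0 < NURSERY_LIMIT && self.marked.insert(object) {
            queue.push(object);
        }
        object
    }
}

pub struct FullGC;
impl GCWorkContext for FullGC {
    type TracerType = FullTracer;
}

pub struct NurseryGC;
impl GCWorkContext for NurseryGC {
    type TracerType = NurseryTracer;
}

/// A plan-agnostic packet that depends only on the context's Tracer.
pub struct ScanObjects<C: GCWorkContext> {
    pub children: Vec<ObjectReference>,
    _context: PhantomData<C>,
}

impl<C: GCWorkContext> ScanObjects<C> {
    pub fn new(children: Vec<ObjectReference>) -> Self {
        ScanObjects { children, _context: PhantomData }
    }

    /// Traces every child and returns the newly enqueued objects.
    pub fn do_work(&self) -> Vec<ObjectReference> {
        let mut tracer = C::TracerType::default();
        let mut queue = Vec::new();
        for &child in &self.children {
            tracer.trace_object(&mut queue, child);
        }
        queue
    }
}
```

The same ScanObjects code then serves both traces; only the context type parameter changes, for example ScanObjects::<NurseryGC>::new(children) versus ScanObjects::<FullGC>::new(children).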