
Document how to use a component reading to create a new source #131

Draft
nvaytet wants to merge 3 commits into main from reading-source-reuse

Conversation


@nvaytet nvaytet commented Mar 5, 2026

We add a section to the docs showing how to use a component reading to create a more efficient source.

We histogram the neutrons at the detector in wavelength and birth_time, smooth the histogram, and use it as a new distribution to sample from.


This can speed up calculations.
We should consider making the final function into a classmethod of the Source: Source.from_reading(...).

@nvaytet nvaytet requested a review from jokasimr March 5, 2026 14:27
@nvaytet nvaytet marked this pull request as draft March 5, 2026 15:05

jokasimr commented Mar 5, 2026

Interesting! Could give a big speedup in some cases.

However, I'd like to suggest a modification. Currently, the probability of sampling neutrons from a point $(\lambda, t, t^o)$, where $t$ is the time in the detector and $t^o$ is the birth time, will change with the new source from what it was using the original source. It might be close to the same, and maybe it does not matter in practice. But we can actually make them be the same, with a little bit more work.

Suppose we have a source distribution $q_0$ and we have identified a region $(\lambda, t^o) \in M$ such that outside $M$ the probability that the neutron passes the chopper cascade is essentially zero. Then we can define a new source probability density

$$q(\lambda, t^o) = \mathbb{1}_M(\lambda, t^o) \, \frac{q_0(\lambda, t^o)}{\int_M q_0(\lambda', t'^o) \, d\lambda' \, dt'^o}$$

such that the probability density of $\lambda, t^o$ given that the neutron passed the chopper cascade is the same for the source $q_0$ as it is for $q$.

Intuitively, with the new source we only increase the probability of sampling in the region where we actually need to sample, but in that region, the ratio of the probabilities of sampling two points does not change.
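This invariance can be checked numerically with a tiny discrete example (a made-up 3-cell grid, plain Python, no scipp; all values are illustrative):

```python
# Toy discrete source density q0 over three (lambda, birth_time) cells.
q0 = {
    ('l1', 't1'): 0.2,
    ('l1', 't2'): 0.5,
    ('l2', 't1'): 0.3,
}
# Region M of cells that can pass the chopper cascade.
M = {('l1', 't1'), ('l1', 't2')}

# q = indicator(M) * q0 / (sum of q0 over M), the discrete analogue
# of the formula above.
norm = sum(q0[cell] for cell in M)
q = {cell: (q0[cell] / norm if cell in M else 0.0) for cell in q0}

# The ratio of probabilities of two cells inside M is unchanged:
r_before = q0[('l1', 't1')] / q0[('l1', 't2')]
r_after = q[('l1', 't1')] / q[('l1', 't2')]
# r_before and r_after agree (up to float rounding), while q puts zero
# weight outside M and sums to 1.
```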

After applying the gaussian kernel to $p$ I'd say we have identified $M$. It's the region where $p>0$ (or perhaps where $p>\varepsilon$ for some small $\varepsilon$ because the gaussian kernel might introduce some nonzero weight everywhere, I'm not sure).

I'm imagining it would look something like this:

def source_from_reading(
    reading,
    original_source,
    neutrons: int,
    time_smooth_kernel: sc.Variable | None = None,
    wavelength_smooth_kernel: sc.Variable | None = None,
):
    from scipp.scipy.ndimage import gaussian_filter

    # Histogram the reading onto the original source's bin edges, so that
    # p and original_source end up with the same shape.
    p = reading.data.hist(
        wavelength=original_source.coords['wavelength'],
        birth_time=original_source.coords['birth_time'],
    )
    # Default smoothing widths: twice the bin width of the original source.
    if time_smooth_kernel is None:
        t = original_source.coords['birth_time']
        time_smooth_kernel = 2 * (t['birth_time', 1] - t['birth_time', 0])
    if wavelength_smooth_kernel is None:
        w = original_source.coords['wavelength']
        wavelength_smooth_kernel = 2 * (w['wavelength', 1] - w['wavelength', 0])
    p = gaussian_filter(
        p,
        sigma={
            'birth_time': time_smooth_kernel,
            'wavelength': wavelength_smooth_kernel,
        },
    )
    # Mask the original source to where p is (essentially) nonzero and
    # renormalize; this is the masked q from the formula above.
    q = original_source * (p > 1e-8) / (original_source * (p > 1e-8)).sum()
    return tof.Source.from_distribution(neutrons=neutrons, p=q)


nvaytet commented Mar 6, 2026

Thanks for looking into this 👍

So you are saying that you'd rather apply a mask on the original source. I like the idea.
However, as it's written now

q = original_source * (p > 1e-8) / (original_source * (p > 1e-8)).sum()

would require p to have the same shape as original_source.

Basically it would be really nice if we could come up with a procedure where the mask we put on the source does not depend on either the initial binning for the histogram, nor the width of the gaussian kernel.
My intuition is that we can probably achieve the former but not the latter (i.e. the gaussian smoothing will always have an effect).

Currently, the ess source is sampled from a data array with a defined shape, so we could use that original shape to define our bins for histogramming?


jokasimr commented Mar 6, 2026

So you are saying that you'd rather apply a mask on the original source.

Yes that's it. Because then the distribution in the detector will be unchanged, assuming we sampled a reasonable number of points to figure out what region of the source makes it to the detector.

As written now, would require p to have the same shape as original_source.
It would be really nice if we could come up with a procedure where the mask we put on the source does not depend on either the initial binning for the histogram, nor the width of the gaussian kernel.

I don't understand why we want the shape of the new source to be independent of the original source.
Why does that matter?

If we want the new source to be finer than the original source, then we can first up-sample the original source (for example by using scipy.ndimage.zoom, or scipp.rebin) to the desired shape, and then do the same thing.

Making the new source coarser than the original source does not seem useful, but I might be missing something.
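To make the up-sampling step concrete, here is a dependency-free sketch of integer-factor nearest-neighbour up-sampling, standing in for scipy.ndimage.zoom with order=0 (the function name and the mass-conserving normalization are illustrative assumptions):

```python
def upsample_nearest(grid, factor):
    """Up-sample a 2D probability grid by repeating each cell `factor`
    times along each axis, dividing by factor**2 so the total mass of
    the distribution is conserved.
    """
    out = []
    for row in grid:
        fine_row = [v / factor**2 for v in row for _ in range(factor)]
        out.extend(list(fine_row) for _ in range(factor))
    return out

coarse = [[0.6, 0.4]]                # a 1x2 toy source distribution
fine = upsample_nearest(coarse, 2)   # 2x4 grid, total mass still 1.0
```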


nvaytet commented Mar 6, 2026

I meant that to do an operation such as q = original_source * p, Scipp requires the DataArrays p and original_source to have the same shape.

In my current version, p is defined by some binning I randomly chose when looking at the detector events.

I had missed that in the code you proposed, you suggested using the bin edges from the original source for histogramming, which ensures consistency.

Base automatically changed from inelastic-sample to main March 11, 2026 09:10

nvaytet commented Mar 16, 2026

@jokasimr I had an alternative idea for masking the original source distribution.

I am wondering if instead of creating a new source, this couldn't be something we always make available in the Model. Maybe via an optimize=True kwarg?
It should be possible to figure out relatively quickly if some combinations of initial wavelength and birth time are not viable, from the rays that get blocked?

Could we make something where we try with a fraction of the neutrons requested by the user, and every time a neutron gets blocked by a component, we lower the probability distribution in the (lambda, birth_time) cell where that neutron originated by a small factor (maybe something related to how many neutrons were requested, to avoid having that value as yet another free parameter)?

So the cells that produce many blocked neutrons would gradually see their probability fall to zero, but we don't want to mask a cell entirely if only one neutron from there was blocked.

We could use chunks of neutrons of size [30%, 25%, 20%, 15%, 10%] so that we learn from a large chunk to begin with, and then keep refining as we go along. But I expect that the last 3 chunks would contain very few blocked neutrons? (according to Peter, this is called 'adaptive importance sampling').
The old behaviour would be recovered by sending 100% in the first chunk.
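A rough dependency-free sketch of this chunked down-weighting (the grid, the `blocked` predicate, the penalty factor, and the chunk fractions are all illustrative assumptions, not a tested scheme):

```python
import random

def adaptive_source(p, blocked, total,
                    fractions=(0.30, 0.25, 0.20, 0.15, 0.10),
                    penalty=0.5, rng=None):
    """Chunked adaptive down-weighting of a flattened (lambda, birth_time)
    source grid.

    blocked(i) -> True if a neutron born in cell i is stopped; this stands
    in for tracing the ray through the chopper cascade.  `penalty` and
    `fractions` are free parameters of this sketch.
    """
    rng = rng or random.Random(0)
    p = list(p)
    for frac in fractions:
        n = int(total * frac)
        # Draw a chunk of neutrons from the current distribution.
        for i in rng.choices(range(len(p)), weights=p, k=n):
            if blocked(i):
                p[i] *= penalty  # gradually suppress cells that block
        s = sum(p)
        p = [x / s for x in p]   # renormalize between chunks
    return p

# Toy run: cells 2 and 3 of a 4-cell source always block; their
# probability collapses and later chunks rarely sample them.
p_new = adaptive_source([0.25] * 4, blocked=lambda i: i >= 2, total=2000)
```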

Finally, this could be an even bigger advantage for pulse-skipping instruments. Instead of making a probability distribution for just one pulse, we concatenate it over the number of pulses and sample from that. The gradual masking would mean that the probability inside every other pulse drops drastically as all neutrons there get blocked, so we don't waste compute sampling neutrons from those blocked pulses.

One caveat is that sampling this way means we will no longer necessarily have exactly the same number of neutrons inside each pulse (I haven't found a good way to do so at least). It would be close but not exact, as it is now. If we cannot fix this, we would have to return the results as a DataGroup with one entry per pulse, instead of a 2D DataArray with a pulse dimension as we do now. That would maybe be quite annoying (although it could actually make some things like plotting easier).


YooSunYoung commented Mar 16, 2026

One caveat is that sampling this way means that we will no longer necessarily have exactly the same number of neutrons inside each pulse (I haven't found a good way to do so at least).

We could just group/bin the events per pulse and return the binned data, no? Then the number of events doesn't have to be the same for each pulse.

I am wondering if instead of creating a new source, this couldn't be something we always make available in the Model. Maybe via a optimize=True kwarg?

How about we specify the target of the optimization? Like, we only sample the events that survive at a certain point.

Something like run(optimize_for='detector') would only sample the events from the (wavelength, birth_time) range of the source where the events survive at the detector, and run(optimize_for='sample') at the sample.

Then we won't need a general sampling optimization like importance sampling, since you can easily switch between sampling optimization points.


nvaytet commented Mar 16, 2026

We could just group/bin the events per pulse and return the binned data, no? Then the number of events doesn't have to be the same for each pulse.

That's an option for sure. But then users would have to know a bit about binned data to work with it, which may be more of a learning curve.

How about we specify the target of the optimization? Like, we only sample the events that survive at a certain point.
Something like run(optimize_for='detector') would only sample the events from the (wavelength, birth_time) range of the source where the events survive at the detector, and run(optimize_for='sample') at the sample.

Yes, optimizing up to a component is definitely something we want to have, like in the case of T-rex 👍

@jokasimr

@bingli621 also brought up that it might be easy to make mistakes using this optimization, for example if the user runs with optimize_for='detector' but then looks at the data in a monitor just after the first chopper. The result they see will then not be correct.

Is there some way we can avoid that? For example, if selecting optimize_for=something, should we remove the detectors/monitors that come before "something"?


jokasimr commented Mar 16, 2026

The adaptive approach sounds really nice. But it is a bit complicated.

Running with 30% of all neutrons in the first test will limit the speedup we can achieve to ~3x.
One alternative if we want to make it faster: We can optimize the source for each chopper in sequence, starting with the one closest to the source and moving toward the target component.

Something like:

  1. Optimize the source distribution for the first chopper. Use 100_000 neutrons; that is small enough to run quickly, and probably large enough to give us a good idea of what region of the source passes the chopper.

  2. Using the new source distribution, optimize the source again for the second chopper. Use another 100_000 neutrons. And so on, until you reach the component that the user requested.

  3. Then run the full number of neutrons using the final optimized source.
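The sequence above could be sketched like this, with `trace_to` standing in for a hypothetical helper that runs n neutrons through one chopper and reports which source cells survived (not a real tof API; all names are illustrative):

```python
def optimize_source(p, choppers, trace_to, n_probe=100_000):
    """Sequentially restrict a flattened (lambda, birth_time) source grid
    to the cells that pass each chopper, renormalizing after each step.

    trace_to(p, chopper, n) -> set of surviving cell indices (hypothetical).
    """
    for chopper in choppers:  # ordered from closest to the source outward
        alive = trace_to(p, chopper, n_probe)
        p = [x if i in alive else 0.0 for i, x in enumerate(p)]
        s = sum(p)
        p = [x / s for x in p]
    return p

# Toy usage with a fake tracer: chopper 'c1' passes cells {0, 1, 2},
# chopper 'c2' passes cells {1, 2}.
passes = {'c1': {0, 1, 2}, 'c2': {1, 2}}
q = optimize_source([0.25] * 4, ['c1', 'c2'],
                    trace_to=lambda p, c, n: passes[c])
# q concentrates on cells 1 and 2
```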

But that would of course be yet a bit more complicated.


nvaytet commented Mar 16, 2026

Is there some way we can avoid that? For example, if selecting optimize_for=something, should we remove the detectors/monitors that come before "something"?

I don't see a good way to avoid this. Removing detectors would be possible, but you can just as easily request a Chopper reading and that would also be 'wrong' (or not properly flooded with neutrons).

I think we basically need to make it clear what the optimization does, and assume that users are intelligent people who understand that if you optimize for the detector, things might not be accurate for a monitor placed much earlier along the beam path.


nvaytet commented Mar 16, 2026

Running with 30% of all neutrons in the first test will limit the speedup we can achieve to ~3x.

30% was just a guess, I have not tested it. I suggested it to Claude who thought that it was a good idea, but then I asked what about using a small chunk first (e.g. 5%) and it thought it also sounded like a good idea 🙄 (if we added the notion of 'confidence' when masking parts of the original source). So it all remains to be tested.

It could be that instead of decreasing chunks we could go with increasing chunks, from 10% up to 30%?
