Description
I'm trying to compute a spectrum image (SI) of silicon on the GPU, but the following uses an absurd amount of disk space:
trajectory = Loader(dump,timestep=dt,atom_mapping=types).load() # LOAD TRAJECTORY
# SET UP PROBE POSITION SCAN
probe_xs = np.linspace(0,8*a,16*8,endpoint=False)
probe_ys = np.linspace(0,4*b,16*4,endpoint=False)
calculator=MultisliceCalculator()
calculator.setup(trajectory,aperture=30,voltage_eV=100e3,sampling=.1,slice_thickness=.5,min_dk=1/15,probe_xs=probe_xs,probe_ys=probe_ys,use_memmap=True)
exitwaves = calculator.run()
tacaw = TACAWData(exitwaves,chunkFFT=True)
...
looping per probe so i can assemble the SI later:
for x in probe_xs:
    for y in probe_ys:
        calculator=MultisliceCalculator()
        pp = [[x,y]]
        calculator.setup(trajectory,aperture=30,voltage_eV=100e3,sampling=.1,slice_thickness=.5,min_dk=1/15,probe_positions=pp)
        exitwaves = calculator.run()
        ...
is predicted to take 20 minutes to get through all 500 timesteps once
whereas looping through columns of probe positions (also so i can assemble the SI later):
for x in probe_xs:
    calculator=MultisliceCalculator()
    pp = [ [x,y] for y in probe_ys ]
    calculator.setup(trajectory,aperture=30,voltage_eV=100e3,sampling=.1,slice_thickness=.5,min_dk=1/15,probe_positions=pp)
    exitwaves = calculator.run()
    ...
is taking 2 minutes per 500 timesteps.
How on earth does this make sense? Using more probes isn't just "faster because I can do them simultaneously", it's faster in absolute terms. According to this, I should always include extra bogus probe positions, even if I don't need them, just so it processes my single probe position faster!
My guess is that there's some sort of disk or CPU/GPU speed limit and the two cases hit different bottlenecks? But even if that were the case, we're writing more to disk in the latter case and transferring roughly the same information between CPU and GPU in both cases. I'm bewildered.
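One way to narrow down which bottleneck applies would be to time the setup and run stages separately for each probe count. Below is a minimal sketch of such a timer. `time_stages`, `fake_setup`, and `fake_run` are hypothetical names introduced only for illustration, and the stand-in workloads would need to be swapped for the real `calculator.setup(...)` and `calculator.run()` calls. It also assumes each call blocks until its work is done; if GPU work is launched asynchronously, the timed callable would need to include a synchronization step for the numbers to mean anything.

```python
import time


def time_stages(stages):
    """Run (label, zero-argument callable) pairs in order; return {label: seconds}."""
    timings = {}
    for label, fn in stages:
        start = time.perf_counter()
        fn()
        timings[label] = time.perf_counter() - start
    return timings


# Stand-in workloads (hypothetical). Replace them with the real calls, e.g.
#   ("setup", lambda: calculator.setup(trajectory, ..., probe_positions=pp))
#   ("run",   lambda: calculator.run())
def fake_setup():
    sum(range(10_000))


def fake_run():
    sum(range(100_000))


if __name__ == "__main__":
    # Compare the single-probe case against one column of 64 probes.
    for n_probes in (1, 64):
        t = time_stages([("setup", fake_setup), ("run", fake_run)])
        print(n_probes, {label: round(sec, 4) for label, sec in t.items()})
```

If setup dominates in the single-probe case, the cost is probably per-call overhead rather than per-probe work; if run dominates, the difference is more likely inside the propagation itself.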