GK sheath BC: adding a mu dependence to parallel cut velocity#960
Conversation
|
in the description above you say |
manauref
left a comment
There was a problem hiding this comment.
Looks mostly benign. I have two higher level questions:
- Have you measured whether there are any performance changes in say TCV IWL 3x2v sims with these new kernels?
- Hare you thought about whether there's any difference in vcut_fact being discretized with project_on_basis or with eval_on_nodes? i.e. is there any benefit/disadvantage to vcut_fact being continuous along mu?
…(` and comment out the print statement in the unit test for bc sheath
…ction for clarity
…is one. I cherry picked the code and the ctest runs fine on GPU now.
You can find below two performance comparison tests using the TCV 3x2v setup. In short, I see 1% cost increase on a pessimistic 1CPU scenario, and no changes in production like TCV. on 1 CPU for the TCV 3x2v regression testWe use an adapted resolution (18x8x4x8x8) to create a worst case scenario (note: the shift function is halved to make it terminate). There is a slight cost increase of the BC evaluation in this scenario ~1%. main runMu-dependent vcut fact1 A100 GPU with coarse production TCV 3x2vHere is a test with 100 steps of the coarse production like tcv setup (resolution 24x16x12x12x8) on one A100 GPU. mainthis branch |
So we do not require to be continuous along the mu direction so I don't see why we would u se |
|
The last commit was tested and it is valgrind clean using (note that it takes a few minutes to run this) and 2 GPUs compute-sanitizer clean (except Nvidia communicator errors that are not related to this PR) using |
|
I have a structural question for how the vcut indexing working. The description of the algorithm looks like alpha is implicitly a function of (x,y) too (or you define a per-configuration-space alpha coefficient). How does this alpha projection fold into this additional indexing per (x,y) cell? |
In the code alpha = alpha(x,y,mu), we have a gkyl array with this "weird" Does this answer your question @JunoRavin ? |
|
It looks like this particular PR is a minimal working example of vcut, because my guess is that alpha(x,y,mu) is provided by GYRAZE in production? Like you don't make alpha within the bc_sheath_gyrokinetic. You have helper functions for acquiring the pointer to the vcut array, but you make the ranges for indexing vcut within this function? It's hard to trace the program flow for where alpha is defined because it looks like bc_sheath_gyrokinetic is managing some of the memory, but the computation of vcut must be external to bc_sheath_gyrokinetic? There's a question about why the vcut allocation and range initialization is handled by bc_sheath_gyrokinetic if the manipulation of this memory is external to this object (like you could treat vcut like how we treat auxiliary arrays for the DG gyrokinetic equation object; you could just pass a pointer to the vcut array and the vcut range to bc_sheath_gyrokinetic at t=0, and then let some other updater update vcut in the same way you have a different updater updater Apar/phi/etc.) |
|
in retrospect i think this should merge into the aisheath PR, not main |
Ok i get it better now. Indeed it is not needed to allocate a vcut array at all and instead just pass a function to evaluate vcut(x,y,mu) returning a double. The vcut array idea started because we wanted to maybe update the vcut curve every n step to avoid expensive computations. I will remove this array later in the aisheath branch. |
In this PR, we implement the ability to apply a conducting sheath boundary condition (BC) with a$\mu$ -dependent parallel velocity cut (see DR #957 for more details). This PR also includes and improves the sheath BC unit test presented in PR #956. The kernels are generated with the branch of this Gkylcas PR
Design choice
The main design decision is to encode the$\mu$ dependence in a non-dimensional factor $\alpha(\mu)$ such that,
where$v_{\parallel \text{cut} 0}$ is the currently implemented cutting parallel velocity. This allows to preserve the existing code structure, leaving the cut velocity evaluation at the kernel level.
Implementation details
The main changes are mainly centered around the sheath BC updater
gkyl_bc_sheath_gyrokinetic. A new DG array is added to the updater structure,to store the$\alpha(\mu)$ factor as a 2D $(v_\parallel,\mu)$ GK hybrid DG array. The use of a 2D representation simplifies the kernel generation by making the nodal evaluation more straightforward.
The updater creator,
gkyl_bc_sheath_gyrokinetic_new, setsalpha_mu = 1by default, which yields the current state of the sheath BC implementation.One can manipulate
alpha_muwith the new functiongkyl_bc_sheath_gyrokinetic_set_alpha_muto copy an existing array intoalpha_mu.Demonstration
We run$\mu$ -dependent parallel cut velocity as
gyrokinetic/unit/ctest_bc_twistshift.cwhich sets up Maxwellian distribution functions in1x2v,2x2v, and3x2vdomains. This test also sets awhich adds an exponential decaying term to the cutting velocity. This function is projected on a DG GK hybrid 2v basis:
where the
grid_velis the same as the one that define thebc_sheathupdater and thebasis_velis obtained using a newbc_sheathinterfaceWe then copy our
alpha_muarray to thebc_sheathupdater usingWe run and plot the resulting distribution functions using this script: bc_sheath_test.sh, which produces figures like:
Note that we varied the Maxwellian parallel flow depending on the dimensionality, specifically$u_\parallel=0$ for $u_\parallel=v_{th}$ for $u_\parallel=-v_{th}$ for
1x2v(bottom row),2x2v(middle row), and3x2v(top row).