We need to improve performance at small numbers of antennas. This issue tracks that work. The strategy will likely require a separate kernel for the diagonal blocks, distinct from the one used for the off-diagonal blocks.
The implementation will live in the diagonal branch.
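
To illustrate why the diagonal blocks might merit their own kernel, here is a minimal plain-Python sketch (not the project's GPU code). It assumes the correlation matrix is Hermitian, that only its lower triangle is computed, and that the triangle is tiled into square blocks of a hypothetical size `T`. The function names `diag_block`, `offdiag_block`, `correlate`, and `diagonal_fraction` are made up for this example.

```python
def offdiag_block(x, bi, bj, T, out):
    """Off-diagonal tile: all T*T (row, col) pairs are unique baselines."""
    for i in range(bi * T, (bi + 1) * T):
        for j in range(bj * T, (bj + 1) * T):
            out[(i, j)] = x[i] * x[j].conjugate()


def diag_block(x, b, T, out):
    """Diagonal tile: only the lower triangle (j <= i) is unique,
    so a full T*T kernel would waste close to half of its work here."""
    for i in range(b * T, (b + 1) * T):
        for j in range(b * T, i + 1):
            out[(i, j)] = x[i] * x[j].conjugate()


def correlate(x, T):
    """Lower triangle of the outer product x x^H, computed tile by tile."""
    n = len(x)
    assert n % T == 0, "sketch assumes the antenna count is a multiple of T"
    nb = n // T
    out = {}
    for bi in range(nb):
        diag_block(x, bi, T, out)             # one diagonal tile per block row
        for bj in range(bi):
            offdiag_block(x, bi, bj, T, out)  # tiles strictly below the diagonal
    return out


def diagonal_fraction(n, T):
    """Fraction of all lower-triangle tiles that sit on the diagonal."""
    nb = n // T
    return nb / (nb * (nb + 1) // 2)


if __name__ == "__main__":
    for n in (8, 32, 256):
        print(n, diagonal_fraction(n, 4))
```

With `nb = n / T` block rows there are `nb` diagonal tiles out of `nb * (nb + 1) / 2` in total, so the diagonal share is `2 / (nb + 1)`. That share is large when the antenna count is small and shrinks as it grows, which is one plausible reason a kernel specialized for the diagonal tiles would help most in the small-antenna regime.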