Feature/parallel ranks qfree metrics #2

Open
0rkhann wants to merge 2 commits into AxelRolov:main from 0rkhann:feature/parallel-ranks-qfree-metrics
0rkhann (Contributor) commented Mar 27, 2026

Summary

  • Add BLAS-based Tanimoto similarity (tanimoto_similarity_matrix_blas) — 10-50x faster than element-wise numba for large matrices
  • Add parallel rank computation (compute_ranks_parallel) — argsort + scatter via numba prange, returns int32
  • Add Q-matrix-free co-ranking measures (coranking_measures_from_ranks) — computes QNN, LCMC, AUC, Qlocal, Qglobal, T, C directly from rank matrices without materializing the N×N Q matrix
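The "argsort + scatter" idea in the second bullet can be illustrated with a minimal NumPy sketch. This is not the PR's code: the function names below are hypothetical, and the plain Python loop stands in for the numba `prange` loop mentioned in the summary. A single argsort gives the sorted order of each row, and writing positions back through that order yields the ranks, so the second argsort of the double-argsort approach is not needed.

```python
import numpy as np

def ranks_double_argsort(distances):
    # Baseline: the argsort of an argsort is the rank of each element in its row.
    order = np.argsort(distances, axis=1, kind="stable")
    return np.argsort(order, axis=1, kind="stable").astype(np.int32)

def ranks_argsort_scatter(distances):
    # One argsort per row, then scatter the positions 0..n_cols-1 back
    # to the columns they came from.
    n_rows, n_cols = distances.shape
    order = np.argsort(distances, axis=1, kind="stable")
    ranks = np.empty((n_rows, n_cols), dtype=np.int32)
    positions = np.arange(n_cols, dtype=np.int32)
    for i in range(n_rows):  # rows are independent, so this loop can run in parallel
        ranks[i, order[i]] = positions
    return ranks

# Simulated data: pairwise Euclidean distances between random points.
rng = np.random.default_rng(0)
points = rng.normal(size=(6, 3))
distances = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

same = np.array_equal(ranks_argsort_scatter(distances),
                      ranks_double_argsort(distances))
print("scatter ranks match double argsort:", same)
```

Because the rows are processed independently, the loop body is the natural unit to distribute across threads.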

0rkhann added 2 commits March 27, 2026 20:19
…ures

Add three new public functions to the scoring module:

- compute_ranks_parallel(distances): computes rank matrix via parallel
  row-wise argsort + scatter (replaces double-argsort, returns int32)

- coranking_measures_from_ranks(ranks_high, ranks_low, k_neighbors):
  computes QNN, LCMC, AUC, kmax, Qlocal, Qglobal, T(k), C(k) directly
  from rank matrices without materialising the N×N co-ranking matrix Q,
  reducing peak memory from roughly 3N² to 2N² stored entries

- _coranking_qnn_histogram, _coranking_trust_cont: numba helpers

All existing API is unchanged. Verified numerically identical results
against DRScorer.coranking_matrix + DRScorer.coranking_measures.
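One way to see why Q is not needed for QNN: QNN(K) counts the pairs whose high- and low-dimensional ranks are both at most K, which is the same as counting pairs whose larger rank is at most K. The sketch below is an assumption about the general technique, not the PR's implementation. All function names are hypothetical, it assumes each point has rank 0 with respect to itself (no duplicate points), and for brevity it builds an N×N temporary that a row-wise kernel would avoid.

```python
import numpy as np

def rank_matrix(distances):
    # Row-wise ranks; with distinct points the diagonal (self) gets rank 0.
    return np.argsort(np.argsort(distances, axis=1), axis=1)

def qnn_from_ranks(ranks_high, ranks_low):
    # Histogram of max(rank_high, rank_low) over all pairs i != j,
    # then a cumulative sum: no co-ranking matrix is built.
    n = ranks_high.shape[0]
    off_diag = ~np.eye(n, dtype=bool)
    worst = np.maximum(ranks_high, ranks_low)[off_diag]  # values in 1..n-1
    hist = np.bincount(worst, minlength=n)
    pairs_within_k = np.cumsum(hist)[1:]  # entry K-1: pairs with both ranks <= K
    k = np.arange(1, n)
    return pairs_within_k / (k * n)

def qnn_from_q_matrix(ranks_high, ranks_low):
    # Reference: materialise Q and sum its upper-left K x K blocks.
    n = ranks_high.shape[0]
    off_diag = ~np.eye(n, dtype=bool)
    q = np.zeros((n - 1, n - 1), dtype=np.int64)
    np.add.at(q, (ranks_high[off_diag] - 1, ranks_low[off_diag] - 1), 1)
    blocks = np.array([q[:K, :K].sum() for K in range(1, n)])
    return blocks / (np.arange(1, n) * n)

# Simulated data: 5-D points and a crude 2-D "embedding" (first two coordinates).
rng = np.random.default_rng(1)
high = rng.normal(size=(8, 5))
low = high[:, :2]
dist = lambda x: np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
ranks_high, ranks_low = rank_matrix(dist(high)), rank_matrix(dist(low))

qnn = qnn_from_ranks(ranks_high, ranks_low)
lcmc = qnn - np.arange(1, len(qnn) + 1) / len(qnn)  # LCMC(K) = QNN(K) - K/(N-1)
print(np.allclose(qnn, qnn_from_q_matrix(ranks_high, ranks_low)))
```

A comparison of this kind, run on simulated data, is also a convenient shape for a regression test between a Q-based and a Q-free code path.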

Add tanimoto_similarity_matrix_blas() as a drop-in replacement for
tanimoto_int_similarity_matrix_numba(). Delegates the O(N*M*D) dot
product to BLAS SGEMM, then applies the Tanimoto formula in-place
via a small numba kernel (_apply_tanimoto_formula).

Returns identical float32 output (max diff < 1e-5). Speedup grows
with matrix size as BLAS benefits from cache blocking and SIMD.
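The split described above (matrix product first, formula second) can be sketched in plain NumPy. This is a hedged illustration rather than the PR's code: `tanimoto_matmul` and `tanimoto_elementwise` are hypothetical names, the dot-product form T(a, b) = a·b / (|a|² + |b|² − a·b) is assumed to be the formula in use, and returning 0 for two all-zero vectors is an arbitrary choice made here. In typical NumPy builds a float32 `@` product is dispatched to the BLAS SGEMM routine.

```python
import numpy as np

def tanimoto_matmul(a, b):
    a32 = np.asarray(a, dtype=np.float32)
    b32 = np.asarray(b, dtype=np.float32)
    dots = a32 @ b32.T                          # (N, M) dot products in one matmul
    sq_a = np.einsum("ij,ij->i", a32, a32)      # squared norms of rows of a
    sq_b = np.einsum("ij,ij->i", b32, b32)      # squared norms of rows of b
    denom = sq_a[:, None] + sq_b[None, :] - dots
    out = np.zeros_like(dots)
    # Divide only where the denominator is positive; other entries stay 0.
    np.divide(dots, denom, out=out, where=denom > 0)
    return out

def tanimoto_elementwise(a, b):
    # Slow float64 reference: one pair of rows at a time.
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    out = np.zeros((a.shape[0], b.shape[0]))
    for i in range(a.shape[0]):
        for j in range(b.shape[0]):
            dot = a[i] @ b[j]
            denom = a[i] @ a[i] + b[j] @ b[j] - dot
            out[i, j] = dot / denom if denom > 0 else 0.0
    return out

# Simulated integer count vectors.
rng = np.random.default_rng(2)
fp_a = rng.integers(0, 5, size=(5, 16))
fp_b = rng.integers(0, 5, size=(4, 16))

fast = tanimoto_matmul(fp_a, fp_b)
max_diff = np.max(np.abs(fast - tanimoto_elementwise(fp_a, fp_b)))
print(fast.dtype, max_diff < 1e-5)
```

The sketch allocates a separate denominator array for readability; an in-place kernel like the one the commit message describes would avoid that extra N×M buffer.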
AxelRolov (Owner) commented

I'm ready to merge this, but I would like to confirm that the new functions give the same results as the old ones. Could you please add tests on simulated data that compare the results of the old and new versions of the code?
