large int factorization by s-celles · Pull Request #173 · JuliaMath/Primes.jl

s-celles · 2026-03-11T12:45:34Z

Adds efficient large integer factorization via a polyalgorithm combining:

Perfect power detection: checks if n = k^d before expensive methods
ECM (Elliptic Curve Method): Montgomery curves with Suyama parametrization and batched GCD; effective when one factor is much smaller
MPQS (Multiple Polynomial Quadratic Sieve): Self-Initializing QS (SIQS) with Gray code polynomial switching; handles balanced semiprimes
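
The perfect power check in the first step can be sketched as follows. This is only a minimal illustration of the idea, not the code in this PR (a later commit switches to `IntegerMathUtils.ispower`); the names `iroot_sketch` and `perfect_power_sketch` are hypothetical.

```julia
# Hypothetical helper: largest r with r^d <= n, found by binary search.
function iroot_sketch(n::BigInt, d::Int)
    bits = ndigits(n, base=2)
    lo = big(0)
    hi = big(1) << cld(bits, d)   # hi^d >= 2^bits > n
    while hi - lo > 1
        mid = (lo + hi) >> 1
        if mid^d <= n
            lo = mid
        else
            hi = mid
        end
    end
    return lo
end

# Returns (k, d) with n == k^d and d >= 2 as large as possible, else nothing.
function perfect_power_sketch(n::BigInt)
    n < 4 && return nothing
    for d in ndigits(n, base=2):-1:2
        k = iroot_sketch(n, d)
        k > 1 && k^d == n && return (k, d)
    end
    return nothing
end
```

Trying exponents from largest to smallest means the first hit gives the smallest base, so a caller could recurse on `k` alone instead of on `n`.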

Performance optimizations

In-place GMP arithmetic to minimize BigInt allocations
~~unsafe_store!/unsafe_load in sieve inner loops to bypass bounds checking~~
Interleaved two-root sieve writes for memory-level parallelism
Double Large Prime (DLP) variation with Pollard rho splitting for composite remainders
Factored-form a mod p computation avoiding GMP calls in SIQS polynomial setup
Guided trial factoring using sieve positions to skip non-dividing primes
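
As an illustration of the in-place GMP arithmetic item above, here is a minimal sketch that reuses one `BigInt` buffer across iterations. It assumes the internal `Base.GMP.MPZ` module, which is not public API and may change between Julia versions; the function name `square_add_mod!` is hypothetical and is not taken from this PR.

```julia
# Assumption: Base.GMP.MPZ is an internal (non-public) module of Julia Base.
const MPZ = Base.GMP.MPZ

# Iterate x <- (x^2 + c) mod n, overwriting x instead of allocating new BigInts.
# Only pass a BigInt that nothing else holds a reference to, since it is mutated.
function square_add_mod!(x::BigInt, c::BigInt, n::BigInt, iters::Int)
    for _ in 1:iters
        MPZ.mul!(x, x, x)      # x = x * x
        MPZ.add!(x, x, c)      # x = x + c
        MPZ.fdiv_r!(x, x, n)   # x = mod(x, n)
    end
    return x
end

x = big(2)
square_add_mod!(x, big(1), big(8051), 3)   # x is updated in place
```

The allocating equivalent, `x = mod(x^2 + c, n)`, creates fresh `BigInt` objects on every iteration, which is the cost the in-place form is meant to avoid.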

Benchmarks (Apple Silicon, Julia 1.12)

30-digit semiprime: ~0.01s
50-digit semiprime: ~0.5s
60-digit balanced semiprime: ~10s

Closes #159

Test plan

Existing test suite passes
New tests for perfect power check, ECM, MPQS, polyalgorithm dispatch
60-digit balanced semiprime factorization test (the target of #159, Factorization of "large" numbers)
Review parameter table tuning for digit ranges 30–76

Tools being used: GitHub Spec Kit + Claude Opus 4.6
Methodology: Spec Driven Development with AI assistance

…ation

Add efficient large integer factorization using a polyalgorithm that combines perfect power detection, ECM (Elliptic Curve Method), and MPQS (Multiple Polynomial Quadratic Sieve with Self-Initialization).

Key features:
- ECM with Montgomery curves, Suyama parametrization, and batched GCD
- SIQS with Gray code polynomial switching and incremental root updates
- Double Large Prime variation with Pollard rho splitting
- In-place GMP arithmetic to minimize BigInt allocations
- Allocation-free sieve using unsafe_store!/unsafe_load

codecov · 2026-03-11T12:47:59Z

Codecov Report

❌ Patch coverage is 86.63172% with 102 lines in your changes missing coverage. Please review.
✅ Project coverage is 89.54%. Comparing base (20a92a0) to head (f86b0b6).
⚠️ Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
src/mpqs.jl	84.57%	97 Missing ⚠️
src/ecm.jl	96.39%	4 Missing ⚠️
src/Primes.jl	95.65%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #173      +/-   ##
==========================================
- Coverage   93.08%   89.54%   -3.55%     
==========================================
  Files           2        4       +2     
  Lines         463     1224     +761     
==========================================
+ Hits          431     1096     +665     
- Misses         32      128      +96

oscardssmith · 2026-03-11T13:23:45Z

Can you separate the ECM from the quadratic sieve? The ECM code is a lot smaller and looks like it's in better shape, so I would like to review/merge that first.

oscardssmith · 2026-03-11T15:14:18Z

src/ecm.jl

+"""
+struct MontgomeryCurvePoint


The fact that this is GMP only is somewhat unfortunate. Ideally this code would work for BitIntegers.jl also... I'm willing to accept it though since BigInt is probably what most users are using in practice.

oscardssmith · 2026-03-11T15:56:10Z

src/Primes.jl

-    should_widen = T <: BigInt || widemul(n - 1, n - 1) ≤ typemax(n)
-    p = should_widen ? pollardfactor(n) : pollardfactor(widen(n))
+    # For large cofactors, use polyalgorithm dispatch (ECM → MPQS)
+    if n > big"100000000000000000000"  # > 10^20


Is ECM slower than Pollard rho for smaller numbers? That seems unexpected. Also, can you delete the polyalgorithm.jl file and move that code into here?

Oh, this is likely related to the ECM implementation being BigInt-only.
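
For context on the removed `widemul(n - 1, n - 1) ≤ typemax(n)` check in the hunk above: it asks whether squaring a residue mod n can overflow the fixed-width type. A minimal sketch of that idea follows; the helper names `square_fits` and `sqmod` are hypothetical and are not Primes.jl functions.

```julia
# True when (n-1)^2 fits in T, so residues mod n can be squared without overflow.
square_fits(n::T) where {T<:Union{Int32,Int64}} =
    widemul(n - 1, n - 1) <= typemax(T)

# Overflow-safe modular squaring: form the product in the next wider type,
# reduce mod n, then convert back to T.
sqmod(x::T, n::T) where {T<:Union{Int32,Int64}} = T(mod(widemul(x, x), n))

n = Int64(3037000501)
x = n - 1
x * x          # wraps around in Int64 arithmetic
sqmod(x, n)    # computed via Int128, so no wraparound
```

For `Int64` the boundary sits just above 3.037e9: below it plain `Int64` multiplication is safe, above it the product has to be formed in `Int128`.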

s-celles · 2026-03-11T17:50:46Z

I don't think I can do better for code coverage.

Here is what my AI agent wrote:

⏺ The remaining uncovered lines in mpqs.jl are mostly:
  - Edge cases in trial factoring (L351-370) — the s1 == 0 brute force path                                                  
  - DLP chaining (L881-912) — deep nesting in double large prime combination                                                 
  - Error paths (L1131, L1146) — failure modes                                                                             
  - Rare branches (L694, L725, L1042, L1104) — sentinel values, CRT sign flip                                                
  - Extract factor fallbacks (L585-596) — x + y path in factor extraction                                                  
                                                                                                                           
  Many of these are inherently hard to cover deterministically (they depend on random polynomial selection hitting specific
  number-theoretic edge cases). The important uncovered code is the _extract_factor fallback path and the _gf2_eliminate → x+y path.

  The remaining ~47 uncovered lines in mpqs.jl are mostly stochastic paths (DLP chaining, rare CRT sign flips) and error
  paths that require specific number-theoretic conditions difficult to trigger deterministically.

Any opinion?

oscardssmith · 2026-03-11T19:10:10Z

IMO the coverage is less important than code complexity. As such, I would prefer if this was split into separate PRs for ECM vs MPQS.

I would also like for these methods (at least ECM) to not force BigInt since for smaller numbers, Int128 or Int256 (from BitIntegers) can be a lot faster.
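
To illustrate what not forcing BigInt could look like, here is a minimal sketch of a factoring routine written against a generic integer type, so the same code runs on Int64, Int128, or BigInt. It uses Pollard rho rather than ECM to stay short, and `rho_sketch` is a hypothetical name, not Primes.jl's `pollardfactor`.

```julia
# Hypothetical type-generic Pollard rho with Floyd cycle detection.
function rho_sketch(n::T) where {T<:Integer}
    c = one(T)
    # x -> x^2 + c (mod n); widemul keeps the square from overflowing T.
    f(x) = T(mod(widemul(x, x) + c, n))
    x = y = T(2)
    d = one(T)
    while d == 1
        x = f(x)        # tortoise: one step
        y = f(f(y))     # hare: two steps
        d = gcd(abs(x - y), n)
    end
    return d == n ? nothing : d   # nothing signals a failed run
end

rho_sketch(Int64(8051))    # fixed-width arithmetic
rho_sketch(big(8051))      # same code path with BigInt
```

Because the type parameter `T` is carried through, the fixed-width calls never touch GMP except where `widemul` on `Int128` has to widen to `BigInt`.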

s-celles · 2026-03-11T19:25:22Z

Ok I will try to tackle that tomorrow.

oscardssmith reviewed Mar 11, 2026: src/polyalgorithm.jl (outdated, resolved)

oscardssmith reviewed Mar 11, 2026: src/mpqs.jl (resolved)

oscardssmith reviewed Mar 11, 2026: src/mpqs.jl (outdated, resolved)

oscardssmith reviewed Mar 11, 2026: src/mpqs.jl (outdated, resolved)

oscardssmith reviewed Mar 11, 2026: src/mpqs.jl (outdated, resolved)

s-celles marked this pull request as draft March 11, 2026 13:52

s-celles added 3 commits March 11, 2026 15:37

cleanup: remove useless optim unsafe_store! / pointer...

a70428b

cleanup: use IntegerMathUtils.ispower

cf41020

cleanup: using pollardfactor instead of _pollard_rho_small

dd73d80

oscardssmith reviewed Mar 11, 2026: src/polyalgorithm.jl (outdated, resolved)

s-celles added 2 commits March 11, 2026 16:36

refactor(polyalgorithm): use bit-length instead of decimal digits for thresholds

33d809a

cleanup: remove useless optim unsafe_store! / pointer... (2)

f0edbae

oscardssmith reviewed Mar 11, 2026

s-celles added 3 commits March 11, 2026 18:02

cleanup: move _find_factor

e3cf32d

cleanup: move _find_factor for Julia 1.6

7626ae5

test: improve coverage

f86b0b6

s-celles mentioned this pull request Mar 12, 2026

feat: Elliptic Curve Method (ECM) for integer factorization #174

Open

s-celles wants to merge 9 commits into JuliaMath:main from s-celles:002-large-int-factorization
