In my previous post, I showed how changing one bit of a semiprime (i.e. the product of two primes) creates an integer that can be factored much faster. I started writing that post using Python with SymPy, but moved to Mathematica because factoring took too long.
SymPy vs Mathematica
When I’m working in Python, SymPy lets me stay in Python. I’ll often use SymPy for a task that Mathematica could do better just so I can stay in one environment. But sometimes efficiency is a problem.
SymPy is written in pure Python, for better and for worse. When it comes to factoring large integers, it’s for worse. I tried factoring a 140-bit integer with SymPy, and killed the process after over an hour. Mathematica factored the same integer in 1/3 of a second.
Mathematica vs PARI/GP
The previous post factors 200-bit semiprimes. The first example, N = pq where
p = 1078376712338123201911958185123
q = 1126171711601272883728179081277
took 99.94 seconds to factor using Mathematica. A random sample of 13 products of 100-bit primes and they took an average of 99.1 seconds to factor.
Using PARI/GP, factoring the value of N above took 11.4 seconds to factor. I then generated a sample of 10 products of 100-bit primes and on average they took 10.4 seconds to factor using PARI/GP.
So in these examples, Mathematica is several orders of magnitude faster than SymPy, and PARI/GP is one order of magnitude faster than Mathematica.
It could be that the PARI/GP algorithms are relatively better at factoring semiprimes. To compare the efficiency of PARI/GP and Mathematica on non-semiprimes, I repeated the exercise in the previous post, flipping each bit of N one at a time and factoring.
This took 240.3 seconds with PARI/GP. The same code in Mathematica took 994.5 seconds. So in this example, PARI/GP is about 4 times faster where as for semiprimes it was 10 times faster.
Python and PARI
There is a Python interface to PARI called cypari2
. It should offer the convenience of working in Python with the efficiency of PARI. Unfortunately, the installation failed on my computer. I think SageMath interfaces Python to PARI but I haven’t tried it.
“So in these examples, Mathematica is several orders of magnitude faster than Mathematica, and PARI/GP is one order of magnitude faster than PARI/GP.”
Do you mean Mathematica is faster than Python, and PARI/GP is faster than Mathematica?
Yes. Thanks.
SymPy on PyPy? Pure-Python rips on PyPy.
“several hours” is not necessarily “several magnitudes” more than 100 seconds. It must be at least 3 hours to be about 2 orders of magnitude, and it would have to be about 30 hours to be 3 orders of magnitude more than 100 seconds.
But yes, one should clearly prefer PARI/gp for this. Insofar more as it’s even free and you can even run it without any installation in the online “WebApp” on their web site (although I’d recomment installing it on your computer for larger factorization jobs).
In case someone is looking for a faster Python method for the factorization of larger semiprimes: https://github.com/sbaresearch/smoothsubsumsearch
It can factor the 200-bit example in the post above in around 230 seconds. While this is slower than the solutions in Mathematica or PARI/GP, it is around 5 to 7 times faster than the implementation of the Self-Initializing Quadratic Sieve in sympy. The algorithm is not based on sieving, but on a novel approach described in this paper: https://arxiv.org/abs/2301.10529
I might implement it in Mathematica and PARI/GP in the future to directly compare it to these solutions.