Re^3: Algorithm for cancelling common factors between two lists of multiplicands
by Limbic~Region (Chancellor) on Aug 10, 2005 at 12:30 UTC

BrowserUk,
I misinterpreted "Other reductions possible" to read "Further reductions possible" given the example and where the commentary appeared. Admittedly I haven't read the entire thread and was under the incorrect impression for any given input there was only 1 acceptable output. I was also under the incorrect impression that you were not interested in prime factorization but what would be the result if the problem of "cancelling out" had been attacked by hand.
I believe the approach I presented could be made more efficient though I doubt it would be able to compete with a good prime factorization algorithm. I am familiar with several, some may have even been used in this thread, such as the quadratic sieve  but I don't currently have the time to write a perl implementation. IIRC numbers smaller than a certain size are best done with 1 algorithm and larger than a different number with a different algorithm with the numbers falling in the middle being a toss up.
If I do come up with anything I will be sure to let you know. Thanks for the fun problem.
 [reply] 
Re^3: Algorithm for cancelling common factors between two lists of multiplicands
by sk (Curate) on Aug 10, 2005 at 07:56 UTC

BrowserUk,
I would be curious to see the results if you ever benchmark the GCD approach with the prime factor approach on the problem size you were mentioning
The GCD Approach is O(m*n)*O(GCDalgo) but I am not sure how the complexity reduces for the prime factorization approach.
You need to calculate the prime factor each of the product value right? i.e. if you have to do 100*101*102*103*104/57*58*59 then you need factor for each of the values correct?
The GCD approach will loop (5 * 3)*O(GCDalgo) times. But prime factor approach needs to calc factors for 8 values and that could potentially loop a lot more per value (max upto the value) and the complexity will be more than the GCD approach.
I ran the GCD approach and took about 9 minutes (not a formal benchmark) just to cancel out terms. Don't have any math package to do big multiplications/divisions. I even tried to reduce the inner loop by switching @a and @b but did not see a huge impact
Also, i am confused whether the Haskell code actually computed the example (40000+!) because the execution speed was just unbelievable. Did it loose any precision when you tested it? I guess the approach was similar in canceling out terms. It makes me wonder why Perl canceling out takes so much longer than Haskell implementation
will be happy to hear on how your final implementation look.
Thanks,
SK  [reply] [d/l] 

[11:20:43.54] C:\ghc\ghc6.4\code>FET < ex1.dat
+8.070604647867604097576877675243109941729476950993498765160880e7030
[11:20:46.34] C:\ghc\ghc6.4\code>
Using Math::Pari (in 26 ms):
P:\test>MPFET.pl
8.070604647867604097576877668E7030
1 trial of _default ( 24.817ms total), 24.817ms/trial
Using Math::BigInt (Yes. That 4 hrs 38 minutes!)
P:\test>MBFFET.pl
8.070604647867604097576877675243109941729e7030
1 trial of _default ( 16,699s total), 16,699s/trial
The whole point of the cancelling code is that you do not have to calculate the huge factorials in infinite precison (slow) math, because you reduce the factorials to products of sets of much smaller, prime factors and then cancel most of those.
Examine what is said, not who speaks  Silence betokens consent  Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco.  Rule 1 has a caveat!  Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.
 [reply] [d/l] [select] 

Thanks for the update
I cannot read Haskell so I was not sure how the factoring was done. Going through the example it seems like tmoertel was just factoring like terms. If the code does further factoring (school days division) then I am surprised that it can loop through that fast!
If you take a look at my scratchpad I tried ikegami's logic on gcd. I added a condition to check for values == 1 to save an extra GCD. I also did not have the Math module he was using so picked up another one. This one takes about 9 minutes. It would be interesting to see what will happen if we apply BigInt from here onwards!
I am also curious to see the runtime of the prime factor approach. I shall try hv's code to see how fast it runs. If I remember correctly Euclidean algo for GCD is O(n+log n) and prime factorization is NPcomplete. For small values, prime fact is a breeze but the complexity does not change so i shall let you know when i get around testing the runtime for it!
cheers
SK
 [reply] 



 [reply] 


The last two digits of the Math::Pari implementation's result seem to be off:
+8.070604647867604097576877675243109941729476950993498765160880e7030
8.070604647867604097576877675243109941729e7030
8.070604647867604097576877668E7030 < Math::Pari
^^
Can you increase the implementation's precision? Or is that the limit?
Cheers, Tom
 [reply] [d/l] 



 [reply] [d/l] [select] 
Re^3: Algorithm for cancelling common factors between two lists of multiplicands
by QM (Parson) on Aug 10, 2005 at 17:51 UTC

Coming up with an efficient, pure perl implementation of prime factorisation (limiting myself to integers 0 .. 2**32 ) is my current fun challenge :)
You might have a look at Math::Big::Factors, which I found useful, and is pure Perl. (Note that the docs have a typo, and mention factor_wheel where the function is actually named factors_wheel.)
QM

Quantum Mechanics: The dreams stuff is made of
 [reply] [d/l] [select] 

Without actually trying it, Math::Big::Factors says:
For very small numbers the computation of the wheel of order 7 may actually take longer than the factorization, but anything that has more than 10 digits will usually benefit from order 7.
As I'm limiting myself to 2**32, it probably won't help?
Examine what is said, not who speaks  Silence betokens consent  Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco.  Rule 1 has a caveat!  Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.
 [reply] 

Rate 7 6 5
7 34.1/s  95% 100%
6 627/s 1739%  94%
5 9770/s 28548% 1458% 
and for factors_wheel of numbers near 32 bits, using an order 7 wheel:
s/iter 4294967291 4294967293 4294967295
4294967291 4.45  61% 71%
4294967293 1.74 156%  25%
4294967295 1.31 241% 33% 
Note that 4294967291 is prime, 4294967293 has 2 large factors, and 4294967295 has 5 factors, the largest of which is 2**16+1.
For 4294967291, and wheel orders 57:
s/iter 7 5 6
7 4.44  16% 21%
5 3.71 19%  6%
6 3.49 27% 6% 
In all fairness, I should probably mention that I've customized my own version of Math::Big::Factors to Memoize results where possible, and to reduce the calls to Math::BigInt::new() for constants.
Note that factors_wheel creates a new wheel every time (what a shame), instead of requiring a wheel reference be passed in. Caching wheel goes a little way in correcting this, without changing the interface.
You might also consider avoiding the use of Math::BigInt where possible, as you don't need numbers that big.
Now, back to the question at hand. For numbers near 2**32, factoring a prime number seems to take 150 times longer than creating the order 7 wheel. Creating the order 6 wheel is considerably faster, and so is the factoring based on that wheel.
I'm not sure where the breakpoint is for order 7 wheels. A casual search hints that it is very large.
QM

Quantum Mechanics: The dreams stuff is made of
 [reply] [d/l] [select] 
