Goldbach's conjecture algorithm
Clash Royale CLAN TAG#URR8PPP
$begingroup$
I did a Goldbach's Conjecture exercise and got it to work. It's pretty slow though and I was wondering how I could optimize it.
number = int(input("Enter your number >> "))
print("nCalculating...")
if number % 2 == 0: #Only even numbers
primes = primenums(number) #returns all prime numbers <= input number
addend1 = primes[0]
addend2 = primes[0]
while addend1 + addend2 != number:
if primes[-1] == addend2:
addend2 = primes[primes.index(addend1) + 1]
addend1 = primes[primes.index(addend1) + 1]
else:
addend2 = primes[primes.index(addend2) + 1]
Right now, up to 10,000 the algorithm is pretty fast, but at 100,000 it takes about 3 seconds to finish. Is that just how it is or could I make it faster?
python performance
$endgroup$
|
show 4 more comments
$begingroup$
I did a Goldbach's Conjecture exercise and got it to work. It's pretty slow though and I was wondering how I could optimize it.
number = int(input("Enter your number >> "))
print("nCalculating...")
if number % 2 == 0: #Only even numbers
primes = primenums(number) #returns all prime numbers <= input number
addend1 = primes[0]
addend2 = primes[0]
while addend1 + addend2 != number:
if primes[-1] == addend2:
addend2 = primes[primes.index(addend1) + 1]
addend1 = primes[primes.index(addend1) + 1]
else:
addend2 = primes[primes.index(addend2) + 1]
Right now, up to 10,000 the algorithm is pretty fast, but at 100,000 it takes about 3 seconds to finish. Is that just how it is or could I make it faster?
python performance
$endgroup$
7
$begingroup$
Have you profiled this? How isprimenums
implemented?
$endgroup$
– Graipher
Feb 13 at 8:23
1
$begingroup$
Profiled? What do you mean?
$endgroup$
– Kevin
Feb 13 at 13:46
6
$begingroup$
@EricLippert Don't worry about language/compiler/interpreter optimization before you do mathematical/complexity optimization first. Switching languages can only improve performance by a small factor (1-10). Changing the algorithm can give speedups of 1-infinity.
$endgroup$
– Vortico
Feb 13 at 18:10
2
$begingroup$
@Vortico are you seriously lecturing Eric Lippert on performance gains and optimization?
$endgroup$
– MikeTheLiar
Feb 13 at 19:57
8
$begingroup$
@MikeTheLiar I know he knows, but someone reading his comment may be misled. It's bad advice to recommend to a programming beginner to change the tool he's learning before even pointing out the issue with his algorithm. It's discouraging, doesn't solve the problem at hand (because maybe his goal is to simply learn Python, and remember this is a code review community), and doesn't teach how to think on one's own. It's not "The first problem" with his code as he said. It's the 26th or maybe 126th problem he should worry about while playing around with Goldbach's conjecture.
$endgroup$
– Vortico
Feb 13 at 20:31
|
show 4 more comments
$begingroup$
I did a Goldbach's Conjecture exercise and got it to work. It's pretty slow though and I was wondering how I could optimize it.
number = int(input("Enter your number >> "))
print("nCalculating...")
if number % 2 == 0: #Only even numbers
primes = primenums(number) #returns all prime numbers <= input number
addend1 = primes[0]
addend2 = primes[0]
while addend1 + addend2 != number:
if primes[-1] == addend2:
addend2 = primes[primes.index(addend1) + 1]
addend1 = primes[primes.index(addend1) + 1]
else:
addend2 = primes[primes.index(addend2) + 1]
Right now, up to 10,000 the algorithm is pretty fast, but at 100,000 it takes about 3 seconds to finish. Is that just how it is or could I make it faster?
python performance
$endgroup$
I did a Goldbach's Conjecture exercise and got it to work. It's pretty slow though and I was wondering how I could optimize it.
number = int(input("Enter your number >> "))
print("nCalculating...")
if number % 2 == 0: #Only even numbers
primes = primenums(number) #returns all prime numbers <= input number
addend1 = primes[0]
addend2 = primes[0]
while addend1 + addend2 != number:
if primes[-1] == addend2:
addend2 = primes[primes.index(addend1) + 1]
addend1 = primes[primes.index(addend1) + 1]
else:
addend2 = primes[primes.index(addend2) + 1]
Right now, up to 10,000 the algorithm is pretty fast, but at 100,000 it takes about 3 seconds to finish. Is that just how it is or could I make it faster?
python performance
python performance
edited Feb 14 at 4:02
Jamal♦
30.4k11120227
30.4k11120227
asked Feb 13 at 8:17
KevinKevin
10617
10617
7
$begingroup$
Have you profiled this? How isprimenums
implemented?
$endgroup$
– Graipher
Feb 13 at 8:23
1
$begingroup$
Profiled? What do you mean?
$endgroup$
– Kevin
Feb 13 at 13:46
6
$begingroup$
@EricLippert Don't worry about language/compiler/interpreter optimization before you do mathematical/complexity optimization first. Switching languages can only improve performance by a small factor (1-10). Changing the algorithm can give speedups of 1-infinity.
$endgroup$
– Vortico
Feb 13 at 18:10
2
$begingroup$
@Vortico are you seriously lecturing Eric Lippert on performance gains and optimization?
$endgroup$
– MikeTheLiar
Feb 13 at 19:57
8
$begingroup$
@MikeTheLiar I know he knows, but someone reading his comment may be misled. It's bad advice to recommend to a programming beginner to change the tool he's learning before even pointing out the issue with his algorithm. It's discouraging, doesn't solve the problem at hand (because maybe his goal is to simply learn Python, and remember this is a code review community), and doesn't teach how to think on one's own. It's not "The first problem" with his code as he said. It's the 26th or maybe 126th problem he should worry about while playing around with Goldbach's conjecture.
$endgroup$
– Vortico
Feb 13 at 20:31
|
show 4 more comments
7
$begingroup$
Have you profiled this? How isprimenums
implemented?
$endgroup$
– Graipher
Feb 13 at 8:23
1
$begingroup$
Profiled? What do you mean?
$endgroup$
– Kevin
Feb 13 at 13:46
6
$begingroup$
@EricLippert Don't worry about language/compiler/interpreter optimization before you do mathematical/complexity optimization first. Switching languages can only improve performance by a small factor (1-10). Changing the algorithm can give speedups of 1-infinity.
$endgroup$
– Vortico
Feb 13 at 18:10
2
$begingroup$
@Vortico are you seriously lecturing Eric Lippert on performance gains and optimization?
$endgroup$
– MikeTheLiar
Feb 13 at 19:57
8
$begingroup$
@MikeTheLiar I know he knows, but someone reading his comment may be misled. It's bad advice to recommend to a programming beginner to change the tool he's learning before even pointing out the issue with his algorithm. It's discouraging, doesn't solve the problem at hand (because maybe his goal is to simply learn Python, and remember this is a code review community), and doesn't teach how to think on one's own. It's not "The first problem" with his code as he said. It's the 26th or maybe 126th problem he should worry about while playing around with Goldbach's conjecture.
$endgroup$
– Vortico
Feb 13 at 20:31
7
7
$begingroup$
Have you profiled this? How is
primenums
implemented?$endgroup$
– Graipher
Feb 13 at 8:23
$begingroup$
Have you profiled this? How is
primenums
implemented?$endgroup$
– Graipher
Feb 13 at 8:23
1
1
$begingroup$
Profiled? What do you mean?
$endgroup$
– Kevin
Feb 13 at 13:46
$begingroup$
Profiled? What do you mean?
$endgroup$
– Kevin
Feb 13 at 13:46
6
6
$begingroup$
@EricLippert Don't worry about language/compiler/interpreter optimization before you do mathematical/complexity optimization first. Switching languages can only improve performance by a small factor (1-10). Changing the algorithm can give speedups of 1-infinity.
$endgroup$
– Vortico
Feb 13 at 18:10
$begingroup$
@EricLippert Don't worry about language/compiler/interpreter optimization before you do mathematical/complexity optimization first. Switching languages can only improve performance by a small factor (1-10). Changing the algorithm can give speedups of 1-infinity.
$endgroup$
– Vortico
Feb 13 at 18:10
2
2
$begingroup$
@Vortico are you seriously lecturing Eric Lippert on performance gains and optimization?
$endgroup$
– MikeTheLiar
Feb 13 at 19:57
$begingroup$
@Vortico are you seriously lecturing Eric Lippert on performance gains and optimization?
$endgroup$
– MikeTheLiar
Feb 13 at 19:57
8
8
$begingroup$
@MikeTheLiar I know he knows, but someone reading his comment may be misled. It's bad advice to recommend to a programming beginner to change the tool he's learning before even pointing out the issue with his algorithm. It's discouraging, doesn't solve the problem at hand (because maybe his goal is to simply learn Python, and remember this is a code review community), and doesn't teach how to think on one's own. It's not "The first problem" with his code as he said. It's the 26th or maybe 126th problem he should worry about while playing around with Goldbach's conjecture.
$endgroup$
– Vortico
Feb 13 at 20:31
$begingroup$
@MikeTheLiar I know he knows, but someone reading his comment may be misled. It's bad advice to recommend to a programming beginner to change the tool he's learning before even pointing out the issue with his algorithm. It's discouraging, doesn't solve the problem at hand (because maybe his goal is to simply learn Python, and remember this is a code review community), and doesn't teach how to think on one's own. It's not "The first problem" with his code as he said. It's the 26th or maybe 126th problem he should worry about while playing around with Goldbach's conjecture.
$endgroup$
– Vortico
Feb 13 at 20:31
|
show 4 more comments
3 Answers
3
active
oldest
votes
$begingroup$
One thing that makes your code slow is your repeated calls to primes.index
. Each of these calls needs to linearly scan primes
until it finds the value you are looking for, making this $mathcalO(n)$. This is especially bad in the first if
case since you do this twice with exactly the same argument. Instead just keep the index around in addition to the number:
def goldbach_keep_index(number):
if number % 2 == 1: #Only even numbers
raise ValueError("Goldbach conjecture is only defined for even numbers")
primes = primenums(number) #returns all prime numbers <= input number
i, j = 0, 0
addend1 = primes[i]
addend2 = primes[j]
while addend1 + addend2 != number:
if addend2 == primes[-1]:
i = j = i + 1
addend2 = primes[i]
addend1 = primes[i]
else:
j += 1
addend2 = primes[j]
return addend1, addend2
Note that I made this a function, so you can reuse it.
Your main calling code could look like this:
if __name__ == "__main__":
number = int(input("Enter your number >> "))
p1, p2 = goldbach_keep_index(number)
print(f"p1 + p2 = number")
However, what you are really doing is taking all combinations of prime numbers, without repeating yourself. So just use itertools.combinations_with_replacement
:
from itertools import combinations_with_replacement
def goldbach_combinations(number):
if number % 2 == 0:
raise ValueError("Goldbach conjecture is only defined for even numbers")
primes = primenums(number)
for addend1, addend2 in combinations_with_replacement(primes, 2):
if addend1 + addend2 == number:
return addend1, addend2
raise Exception(f"Found a counter-example to the Goldbach conjecture: number")
In this case primes
does not even need to be a list
, it can just be a generator.
If you instead make primes
a set
, you can easily implement the idea suggested by @Josiah in their answer by just checking if number - p
is in primes
:
def goldbach_set(number):
if number % 2 == 0: #Only even numbers
raise ValueError("Goldbach conjecture is only defined for even numbers")
primes = set(primenums(number)) #returns all prime numbers <= input number
for p in primes:
k = number - p
if k in primes:
return p, k
raise Exception(f"Found a counter-example to the Goldbach conjecture: number")
And now some timing comparisons, where goldbach
is your code put into a function:
def goldbach(number):
if number % 2 == 0: #Only even numbers
raise ValueError("Goldbach conjecture is only defined for even numbers")
primes = primenums(number) #returns all prime numbers <= input number
addend1 = primes[0]
addend2 = primes[0]
while addend1 + addend2 != number:
if primes[-1] == addend2:
addend2 = primes[primes.index(addend1) + 1]
addend1 = primes[primes.index(addend1) + 1]
else:
addend2 = primes[primes.index(addend2) + 1]
return addend1, addend2
And primenums
is a simple sieve:
def prime_sieve(limit):
prime = [True] * limit
prime[0] = prime[1] = False
for i, is_prime in enumerate(prime):
if is_prime:
yield i
for n in range(i * i, limit, i):
prime[n] = False
def primenums(limit):
return list(prime_sieve(limit))
And when pulling the generation of the primes outside of the function (and calculating them up to the largest number in the plot):
Of course the first three functions are vastly slower than the set
because they need to go through more combinations now. However, all functions are then constant time, so the increase in time comes solely from the fact that you have to consider more primes.
$endgroup$
$begingroup$
I'm intrigued by the last graph. There is an obvious oscillating pattern where there's a sudden drop in run time even when the size of the input increases. I wonder what's going in there
$endgroup$
– DeepSpace
Feb 14 at 12:06
1
$begingroup$
I wondered that. My guess is fixed probe numbers are used, and some happen to have a larger small addend needed to satisfy the conjecture, so they take more loop iterations.
$endgroup$
– Josiah
Feb 14 at 20:35
$begingroup$
@Josiah: That is indeed what I did (x = np.logspace(1, 5, dtype=int); x[x % 2 == 1] += 1
). And while larger numbers indeed tend to have more pairs of prime numbers that sum to them, there is quite a lot of variation: en.wikipedia.org/wiki/Goldbach%27s_conjecture#/media/… One might be able to argue that this rise is slightly visible above 10^4 in my plot for the first functions, but since the OP's code took too long I could no go up much higher in numbers...
$endgroup$
– Graipher
Feb 15 at 8:05
add a comment |
$begingroup$
As suggested by Graipher in the comment, you're probably spending a lot of time generating a list of primes. That would be worth profiling, but I can't speculate as to how it might be improved without at least seeing the code.
There are two things that stand out to me as very clear performance sinks.
The first, and easiest to change, is index
. If you call index
on a list, python has to check every element in the list one at a time to see whether it's the thing you're after. If you have millions of primes, every one of those index calls is secretly a hidden loop that runs millions of times.
The good news is you can resolve this easily enough: just keep another variable which stores where in the list of primes you're up to. Then you don't have to search for your place again every time you go round the loop, because you have a bookmark to it.
Second, and slightly more subtle, is the way that this is looping. You are looking for two numbers that add up to a target. If your target is 100, you might first say "what if it's 3?" and then try all primes with 3. Then you say "what if it's 5?" and try all the primes with 5. And then you do the same with 7, and so on. This means your trying all the primes with all the primes.
Instead, if you are checking whether the first prime is 3, you know that the second prime has to be 97. So all you have to do is check whether 97 is prime. If so, it's a match, if not, you can move on to checking 5 and 95, and then 7 and 93, and so on. In this way your algorithm checks each prime against one other number instead of each prime against all the others, and should be massively faster.
$endgroup$
$begingroup$
Your second suggestion is great indeed. I will implement it that way! Thanks
$endgroup$
– Kevin
Feb 13 at 14:03
$begingroup$
You can make this even more efficient by keeping track of "near misses" to your target which eliminate other integers. Suppose you were testing 102. You start by looking for 3 + some prime. 99 isn't a prime, but the nearest prime >= 99 is 101, so if you organize the search carefully, you can remember that you just found a prime pair for 104 = 3 + 99 while looking for a pair for 102, and the next even number you need to test is not 104 but 106 (unless you already found a pair for that as well. of course).
$endgroup$
– alephzero
Feb 13 at 17:10
$begingroup$
I would default to using aset
as in @Graipher's implementation, which is very fast at discovering whether something is present, but does not tell you what the nearest miss was.
$endgroup$
– Josiah
Feb 13 at 20:18
$begingroup$
If I had to use a regularlist
, and if the list was sorted, then there is indeed a clever trick available. The idea is that as the small prime being tested grows, the number it needs to be added to shrinks. That means that you can keep two bookmark variables pointing at the list of primes, one at the start and moving forwards, and one at the end and moving backwards. As long as sum is too small, you move the first bookmark along. If the sum is too big, you bring the second bookmark backwards. This avoids having to scan the whole list each time. However, it's quite fiddly to get right.
$endgroup$
– Josiah
Feb 13 at 20:25
add a comment |
$begingroup$
Let's take n = 1,000,000,000 for example and see what your code does.
It calculates all primes from 1 to 1 billion. That takes a while, but it gives you an array of all primes in sorted order.
It then calculates 2 + 2, 2 + 3, 2 + 5, 2 + 7, 2 + 999,999,xxx to check if one of these numbers equals 1,000,000,000. But obviously when addend1 = 2, addend2 has to be 999,999,998 to add to one billion, so you are checking tens of millions of sums unnecessarily.
It then calculates 3 + 3, 3 + 5, 3 + 7 etc., and again, addend2 would have to be 999,999,997 million. And then again for addend1 = 5, 7, 11 etc. So you are checking a huge number of sums needlessly.
Start with addend1 = first prime, addend2 = last prime in your array. If their sum is too small, replace addend1 with the next prime to make the sum larger by the smallest possible amound. If their sum is too large, replace addend2 with the previous prime. If the sum is right, you have found a solution. And once you reached addend1 > addend2, you know there is no solution. This will be very quick, since usually you don't need to check too many values for addend1.
That makes the search a lot quicker, but doesn't help with the sieve trying to find all the primes. You usually don't need all the primes, only a very small number. In the example with n = 1,000,000,000 you probably find two primes adding up to a billion with addend1 ≤ 1000 and addend2 ≥ 999,999,000.
So here's what you do: To find all primes say in the range 999,999,000 to 1,000,000,000, you need to check if these numbers are divisible by any prime up to the square root of 1,000,000,000 which is a bit less than 32,000. So first you use a sieve to find all numbers up to 32,000. Then you create a sieve that finds all primes from 999,999,000 to 1,000,000,000. You run your search algorithm until it tries to examine primes that are not in this range. This likely doesn't happen, but if it happens, you create another sieve for the numbers 999,998,000 to 999,999,000 and so on. So instead of maybe 50 million primes, you look only for maybe 50 or 100. Again, that makes it a lot faster.
$endgroup$
$begingroup$
I didnt learn about sieves yet, gotta look into it
$endgroup$
– Kevin
Feb 14 at 5:52
$begingroup$
It might be worth testing the primality of the larger number as you hit it, rather than sieving a whole range.
$endgroup$
– Josiah
Feb 14 at 7:54
add a comment |
Your Answer
StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
);
);
, "mathjax-editing");
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "196"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f213357%2fgoldbachs-conjecture-algorithm%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
3 Answers
3
active
oldest
votes
3 Answers
3
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
One thing that makes your code slow is your repeated calls to primes.index
. Each of these calls needs to linearly scan primes
until it finds the value you are looking for, making this $mathcalO(n)$. This is especially bad in the first if
case since you do this twice with exactly the same argument. Instead just keep the index around in addition to the number:
def goldbach_keep_index(number):
if number % 2 == 1: #Only even numbers
raise ValueError("Goldbach conjecture is only defined for even numbers")
primes = primenums(number) #returns all prime numbers <= input number
i, j = 0, 0
addend1 = primes[i]
addend2 = primes[j]
while addend1 + addend2 != number:
if addend2 == primes[-1]:
i = j = i + 1
addend2 = primes[i]
addend1 = primes[i]
else:
j += 1
addend2 = primes[j]
return addend1, addend2
Note that I made this a function, so you can reuse it.
Your main calling code could look like this:
if __name__ == "__main__":
number = int(input("Enter your number >> "))
p1, p2 = goldbach_keep_index(number)
print(f"p1 + p2 = number")
However, what you are really doing is taking all combinations of prime numbers, without repeating yourself. So just use itertools.combinations_with_replacement
:
from itertools import combinations_with_replacement
def goldbach_combinations(number):
if number % 2 == 0:
raise ValueError("Goldbach conjecture is only defined for even numbers")
primes = primenums(number)
for addend1, addend2 in combinations_with_replacement(primes, 2):
if addend1 + addend2 == number:
return addend1, addend2
raise Exception(f"Found a counter-example to the Goldbach conjecture: number")
In this case primes
does not even need to be a list
, it can just be a generator.
If you instead make primes
a set
, you can easily implement the idea suggested by @Josiah in their answer by just checking if number - p
is in primes
:
def goldbach_set(number):
if number % 2 == 0: #Only even numbers
raise ValueError("Goldbach conjecture is only defined for even numbers")
primes = set(primenums(number)) #returns all prime numbers <= input number
for p in primes:
k = number - p
if k in primes:
return p, k
raise Exception(f"Found a counter-example to the Goldbach conjecture: number")
And now some timing comparisons, where goldbach
is your code put into a function:
def goldbach(number):
if number % 2 == 0: #Only even numbers
raise ValueError("Goldbach conjecture is only defined for even numbers")
primes = primenums(number) #returns all prime numbers <= input number
addend1 = primes[0]
addend2 = primes[0]
while addend1 + addend2 != number:
if primes[-1] == addend2:
addend2 = primes[primes.index(addend1) + 1]
addend1 = primes[primes.index(addend1) + 1]
else:
addend2 = primes[primes.index(addend2) + 1]
return addend1, addend2
And primenums
is a simple sieve:
def prime_sieve(limit):
prime = [True] * limit
prime[0] = prime[1] = False
for i, is_prime in enumerate(prime):
if is_prime:
yield i
for n in range(i * i, limit, i):
prime[n] = False
def primenums(limit):
return list(prime_sieve(limit))
And when pulling the generation of the primes outside of the function (and calculating them up to the largest number in the plot):
Of course the first three functions are vastly slower than the set
because they need to go through more combinations now. However, all functions are then constant time, so the increase in time comes solely from the fact that you have to consider more primes.
$endgroup$
$begingroup$
I'm intrigued by the last graph. There is an obvious oscillating pattern where there's a sudden drop in run time even when the size of the input increases. I wonder what's going in there
$endgroup$
– DeepSpace
Feb 14 at 12:06
1
$begingroup$
I wondered that. My guess is fixed probe numbers are used, and some happen to have a larger small addend needed to satisfy the conjecture, so they take more loop iterations.
$endgroup$
– Josiah
Feb 14 at 20:35
$begingroup$
@Josiah: That is indeed what I did (x = np.logspace(1, 5, dtype=int); x[x % 2 == 1] += 1
). And while larger numbers indeed tend to have more pairs of prime numbers that sum to them, there is quite a lot of variation: en.wikipedia.org/wiki/Goldbach%27s_conjecture#/media/… One might be able to argue that this rise is slightly visible above 10^4 in my plot for the first functions, but since the OP's code took too long I could no go up much higher in numbers...
$endgroup$
– Graipher
Feb 15 at 8:05
add a comment |
$begingroup$
One thing that makes your code slow is your repeated calls to primes.index
. Each of these calls needs to linearly scan primes
until it finds the value you are looking for, making this $mathcalO(n)$. This is especially bad in the first if
case since you do this twice with exactly the same argument. Instead just keep the index around in addition to the number:
def goldbach_keep_index(number):
if number % 2 == 1: #Only even numbers
raise ValueError("Goldbach conjecture is only defined for even numbers")
primes = primenums(number) #returns all prime numbers <= input number
i, j = 0, 0
addend1 = primes[i]
addend2 = primes[j]
while addend1 + addend2 != number:
if addend2 == primes[-1]:
i = j = i + 1
addend2 = primes[i]
addend1 = primes[i]
else:
j += 1
addend2 = primes[j]
return addend1, addend2
Note that I made this a function, so you can reuse it.
Your main calling code could look like this:
if __name__ == "__main__":
number = int(input("Enter your number >> "))
p1, p2 = goldbach_keep_index(number)
print(f"p1 + p2 = number")
However, what you are really doing is taking all combinations of prime numbers, without repeating yourself. So just use itertools.combinations_with_replacement
:
from itertools import combinations_with_replacement
def goldbach_combinations(number):
if number % 2 == 0:
raise ValueError("Goldbach conjecture is only defined for even numbers")
primes = primenums(number)
for addend1, addend2 in combinations_with_replacement(primes, 2):
if addend1 + addend2 == number:
return addend1, addend2
raise Exception(f"Found a counter-example to the Goldbach conjecture: number")
In this case primes
does not even need to be a list
, it can just be a generator.
If you instead make primes
a set
, you can easily implement the idea suggested by @Josiah in their answer by just checking if number - p
is in primes
:
def goldbach_set(number):
if number % 2 == 0: #Only even numbers
raise ValueError("Goldbach conjecture is only defined for even numbers")
primes = set(primenums(number)) #returns all prime numbers <= input number
for p in primes:
k = number - p
if k in primes:
return p, k
raise Exception(f"Found a counter-example to the Goldbach conjecture: number")
And now some timing comparisons, where goldbach
is your code put into a function:
def goldbach(number):
if number % 2 == 0: #Only even numbers
raise ValueError("Goldbach conjecture is only defined for even numbers")
primes = primenums(number) #returns all prime numbers <= input number
addend1 = primes[0]
addend2 = primes[0]
while addend1 + addend2 != number:
if primes[-1] == addend2:
addend2 = primes[primes.index(addend1) + 1]
addend1 = primes[primes.index(addend1) + 1]
else:
addend2 = primes[primes.index(addend2) + 1]
return addend1, addend2
And primenums
is a simple sieve:
def prime_sieve(limit):
prime = [True] * limit
prime[0] = prime[1] = False
for i, is_prime in enumerate(prime):
if is_prime:
yield i
for n in range(i * i, limit, i):
prime[n] = False
def primenums(limit):
return list(prime_sieve(limit))
And when pulling the generation of the primes outside of the function (and calculating them up to the largest number in the plot):
Of course the first three functions are vastly slower than the set
because they need to go through more combinations now. However, all functions are then constant time, so the increase in time comes solely from the fact that you have to consider more primes.
$endgroup$
$begingroup$
I'm intrigued by the last graph. There is an obvious oscillating pattern where there's a sudden drop in run time even when the size of the input increases. I wonder what's going in there
$endgroup$
– DeepSpace
Feb 14 at 12:06
1
$begingroup$
I wondered that. My guess is fixed probe numbers are used, and some happen to have a larger small addend needed to satisfy the conjecture, so they take more loop iterations.
$endgroup$
– Josiah
Feb 14 at 20:35
$begingroup$
@Josiah: That is indeed what I did (x = np.logspace(1, 5, dtype=int); x[x % 2 == 1] += 1
). And while larger numbers indeed tend to have more pairs of prime numbers that sum to them, there is quite a lot of variation: en.wikipedia.org/wiki/Goldbach%27s_conjecture#/media/… One might be able to argue that this rise is slightly visible above 10^4 in my plot for the first functions, but since the OP's code took too long I could no go up much higher in numbers...
$endgroup$
– Graipher
Feb 15 at 8:05
add a comment |
$begingroup$
One thing that makes your code slow is your repeated calls to primes.index
. Each of these calls needs to linearly scan primes
until it finds the value you are looking for, making this $mathcalO(n)$. This is especially bad in the first if
case since you do this twice with exactly the same argument. Instead just keep the index around in addition to the number:
def goldbach_keep_index(number):
if number % 2 == 1: #Only even numbers
raise ValueError("Goldbach conjecture is only defined for even numbers")
primes = primenums(number) #returns all prime numbers <= input number
i, j = 0, 0
addend1 = primes[i]
addend2 = primes[j]
while addend1 + addend2 != number:
if addend2 == primes[-1]:
i = j = i + 1
addend2 = primes[i]
addend1 = primes[i]
else:
j += 1
addend2 = primes[j]
return addend1, addend2
Note that I made this a function, so you can reuse it.
Your main calling code could look like this:
if __name__ == "__main__":
number = int(input("Enter your number >> "))
p1, p2 = goldbach_keep_index(number)
print(f"p1 + p2 = number")
However, what you are really doing is taking all combinations of prime numbers, without repeating yourself. So just use itertools.combinations_with_replacement
:
from itertools import combinations_with_replacement
def goldbach_combinations(number):
if number % 2 == 0:
raise ValueError("Goldbach conjecture is only defined for even numbers")
primes = primenums(number)
for addend1, addend2 in combinations_with_replacement(primes, 2):
if addend1 + addend2 == number:
return addend1, addend2
raise Exception(f"Found a counter-example to the Goldbach conjecture: number")
In this case primes
does not even need to be a list
, it can just be a generator.
If you instead make primes
a set
, you can easily implement the idea suggested by @Josiah in their answer by just checking if number - p
is in primes
:
def goldbach_set(number):
if number % 2 == 0: #Only even numbers
raise ValueError("Goldbach conjecture is only defined for even numbers")
primes = set(primenums(number)) #returns all prime numbers <= input number
for p in primes:
k = number - p
if k in primes:
return p, k
raise Exception(f"Found a counter-example to the Goldbach conjecture: number")
And now some timing comparisons, where goldbach
is your code put into a function:
def goldbach(number):
if number % 2 == 0: #Only even numbers
raise ValueError("Goldbach conjecture is only defined for even numbers")
primes = primenums(number) #returns all prime numbers <= input number
addend1 = primes[0]
addend2 = primes[0]
while addend1 + addend2 != number:
if primes[-1] == addend2:
addend2 = primes[primes.index(addend1) + 1]
addend1 = primes[primes.index(addend1) + 1]
else:
addend2 = primes[primes.index(addend2) + 1]
return addend1, addend2
And primenums
is a simple sieve:
def prime_sieve(limit):
prime = [True] * limit
prime[0] = prime[1] = False
for i, is_prime in enumerate(prime):
if is_prime:
yield i
for n in range(i * i, limit, i):
prime[n] = False
def primenums(limit):
return list(prime_sieve(limit))
And when pulling the generation of the primes outside of the function (and calculating them up to the largest number in the plot):
Of course the first three functions are vastly slower than the set
because they need to go through more combinations now. However, all functions are then constant time, so the increase in time comes solely from the fact that you have to consider more primes.
$endgroup$
One thing that makes your code slow is your repeated calls to primes.index
. Each of these calls needs to linearly scan primes
until it finds the value you are looking for, making this $mathcalO(n)$. This is especially bad in the first if
case since you do this twice with exactly the same argument. Instead just keep the index around in addition to the number:
def goldbach_keep_index(number):
if number % 2 == 1: #Only even numbers
raise ValueError("Goldbach conjecture is only defined for even numbers")
primes = primenums(number) #returns all prime numbers <= input number
i, j = 0, 0
addend1 = primes[i]
addend2 = primes[j]
while addend1 + addend2 != number:
if addend2 == primes[-1]:
i = j = i + 1
addend2 = primes[i]
addend1 = primes[i]
else:
j += 1
addend2 = primes[j]
return addend1, addend2
Note that I made this a function, so you can reuse it.
Your main calling code could look like this:
if __name__ == "__main__":
number = int(input("Enter your number >> "))
p1, p2 = goldbach_keep_index(number)
print(f"p1 + p2 = number")
However, what you are really doing is taking all combinations of prime numbers, without repeating yourself. So just use itertools.combinations_with_replacement
:
from itertools import combinations_with_replacement
def goldbach_combinations(number):
if number % 2 == 0:
raise ValueError("Goldbach conjecture is only defined for even numbers")
primes = primenums(number)
for addend1, addend2 in combinations_with_replacement(primes, 2):
if addend1 + addend2 == number:
return addend1, addend2
raise Exception(f"Found a counter-example to the Goldbach conjecture: number")
In this case primes
does not even need to be a list
, it can just be a generator.
If you instead make primes
a set
, you can easily implement the idea suggested by @Josiah in their answer by just checking if number - p
is in primes
:
def goldbach_set(number):
if number % 2 == 0: #Only even numbers
raise ValueError("Goldbach conjecture is only defined for even numbers")
primes = set(primenums(number)) #returns all prime numbers <= input number
for p in primes:
k = number - p
if k in primes:
return p, k
raise Exception(f"Found a counter-example to the Goldbach conjecture: number")
And now some timing comparisons, where goldbach
is your code put into a function:
def goldbach(number):
if number % 2 == 0: #Only even numbers
raise ValueError("Goldbach conjecture is only defined for even numbers")
primes = primenums(number) #returns all prime numbers <= input number
addend1 = primes[0]
addend2 = primes[0]
while addend1 + addend2 != number:
if primes[-1] == addend2:
addend2 = primes[primes.index(addend1) + 1]
addend1 = primes[primes.index(addend1) + 1]
else:
addend2 = primes[primes.index(addend2) + 1]
return addend1, addend2
And primenums
is a simple sieve:
def prime_sieve(limit):
prime = [True] * limit
prime[0] = prime[1] = False
for i, is_prime in enumerate(prime):
if is_prime:
yield i
for n in range(i * i, limit, i):
prime[n] = False
def primenums(limit):
return list(prime_sieve(limit))
And when pulling the generation of the primes outside of the function (and calculating them up to the largest number in the plot):
Of course the first three functions are vastly slower than the set
because they need to go through more combinations now. However, all functions are then constant time, so the increase in time comes solely from the fact that you have to consider more primes.
edited Feb 14 at 9:45
answered Feb 13 at 8:37
GraipherGraipher
25.4k53687
25.4k53687
$begingroup$
I'm intrigued by the last graph. There is an obvious oscillating pattern where there's a sudden drop in run time even when the size of the input increases. I wonder what's going in there
$endgroup$
– DeepSpace
Feb 14 at 12:06
1
$begingroup$
I wondered that. My guess is fixed probe numbers are used, and some happen to have a larger small addend needed to satisfy the conjecture, so they take more loop iterations.
$endgroup$
– Josiah
Feb 14 at 20:35
$begingroup$
@Josiah: That is indeed what I did (x = np.logspace(1, 5, dtype=int); x[x % 2 == 1] += 1
). And while larger numbers indeed tend to have more pairs of prime numbers that sum to them, there is quite a lot of variation: en.wikipedia.org/wiki/Goldbach%27s_conjecture#/media/… One might be able to argue that this rise is slightly visible above 10^4 in my plot for the first functions, but since the OP's code took too long I could no go up much higher in numbers...
$endgroup$
– Graipher
Feb 15 at 8:05
add a comment |
$begingroup$
I'm intrigued by the last graph. There is an obvious oscillating pattern where there's a sudden drop in run time even when the size of the input increases. I wonder what's going in there
$endgroup$
– DeepSpace
Feb 14 at 12:06
1
$begingroup$
I wondered that. My guess is fixed probe numbers are used, and some happen to have a larger small addend needed to satisfy the conjecture, so they take more loop iterations.
$endgroup$
– Josiah
Feb 14 at 20:35
$begingroup$
@Josiah: That is indeed what I did (x = np.logspace(1, 5, dtype=int); x[x % 2 == 1] += 1
). And while larger numbers indeed tend to have more pairs of prime numbers that sum to them, there is quite a lot of variation: en.wikipedia.org/wiki/Goldbach%27s_conjecture#/media/… One might be able to argue that this rise is slightly visible above 10^4 in my plot for the first functions, but since the OP's code took too long I could no go up much higher in numbers...
$endgroup$
– Graipher
Feb 15 at 8:05
$begingroup$
I'm intrigued by the last graph. There is an obvious oscillating pattern where there's a sudden drop in run time even when the size of the input increases. I wonder what's going in there
$endgroup$
– DeepSpace
Feb 14 at 12:06
$begingroup$
I'm intrigued by the last graph. There is an obvious oscillating pattern where there's a sudden drop in run time even when the size of the input increases. I wonder what's going in there
$endgroup$
– DeepSpace
Feb 14 at 12:06
1
1
$begingroup$
I wondered that. My guess is fixed probe numbers are used, and some happen to have a larger small addend needed to satisfy the conjecture, so they take more loop iterations.
$endgroup$
– Josiah
Feb 14 at 20:35
$begingroup$
I wondered that. My guess is fixed probe numbers are used, and some happen to have a larger small addend needed to satisfy the conjecture, so they take more loop iterations.
$endgroup$
– Josiah
Feb 14 at 20:35
$begingroup$
@Josiah: That is indeed what I did (
x = np.logspace(1, 5, dtype=int); x[x % 2 == 1] += 1
). And while larger numbers indeed tend to have more pairs of prime numbers that sum to them, there is quite a lot of variation: en.wikipedia.org/wiki/Goldbach%27s_conjecture#/media/… One might be able to argue that this rise is slightly visible above 10^4 in my plot for the first functions, but since the OP's code took too long I could no go up much higher in numbers...$endgroup$
– Graipher
Feb 15 at 8:05
$begingroup$
@Josiah: That is indeed what I did (
x = np.logspace(1, 5, dtype=int); x[x % 2 == 1] += 1
). And while larger numbers indeed tend to have more pairs of prime numbers that sum to them, there is quite a lot of variation: en.wikipedia.org/wiki/Goldbach%27s_conjecture#/media/… One might be able to argue that this rise is slightly visible above 10^4 in my plot for the first functions, but since the OP's code took too long I could no go up much higher in numbers...$endgroup$
– Graipher
Feb 15 at 8:05
add a comment |
$begingroup$
As suggested by Graipher in the comment, you're probably spending a lot of time generating a list of primes. That would be worth profiling, but I can't speculate as to how it might be improved without at least seeing the code.
There are two things that stand out to me as very clear performance sinks.
The first, and easiest to change, is index
. If you call index
on a list, python has to check every element in the list one at a time to see whether it's the thing you're after. If you have millions of primes, every one of those index calls is secretly a hidden loop that runs millions of times.
The good news is you can resolve this easily enough: just keep another variable which stores where in the list of primes you're up to. Then you don't have to search for your place again every time you go round the loop, because you have a bookmark to it.
Second, and slightly more subtle, is the way that this is looping. You are looking for two numbers that add up to a target. If your target is 100, you might first say "what if it's 3?" and then try all primes with 3. Then you say "what if it's 5?" and try all the primes with 5. And then you do the same with 7, and so on. This means your trying all the primes with all the primes.
Instead, if you are checking whether the first prime is 3, you know that the second prime has to be 97. So all you have to do is check whether 97 is prime. If so, it's a match, if not, you can move on to checking 5 and 95, and then 7 and 93, and so on. In this way your algorithm checks each prime against one other number instead of each prime against all the others, and should be massively faster.
$endgroup$
$begingroup$
Your second suggestion is great indeed. I will implement it that way! Thanks
$endgroup$
– Kevin
Feb 13 at 14:03
$begingroup$
You can make this even more efficient by keeping track of "near misses" to your target which eliminate other integers. Suppose you were testing 102. You start by looking for 3 + some prime. 99 isn't a prime, but the nearest prime >= 99 is 101, so if you organize the search carefully, you can remember that you just found a prime pair for 104 = 3 + 99 while looking for a pair for 102, and the next even number you need to test is not 104 but 106 (unless you already found a pair for that as well. of course).
$endgroup$
– alephzero
Feb 13 at 17:10
$begingroup$
I would default to using aset
as in @Graipher's implementation, which is very fast at discovering whether something is present, but does not tell you what the nearest miss was.
$endgroup$
– Josiah
Feb 13 at 20:18
$begingroup$
If I had to use a regularlist
, and if the list was sorted, then there is indeed a clever trick available. The idea is that as the small prime being tested grows, the number it needs to be added to shrinks. That means that you can keep two bookmark variables pointing at the list of primes, one at the start and moving forwards, and one at the end and moving backwards. As long as sum is too small, you move the first bookmark along. If the sum is too big, you bring the second bookmark backwards. This avoids having to scan the whole list each time. However, it's quite fiddly to get right.
$endgroup$
– Josiah
Feb 13 at 20:25
add a comment |
$begingroup$
As suggested by Graipher in the comment, you're probably spending a lot of time generating a list of primes. That would be worth profiling, but I can't speculate as to how it might be improved without at least seeing the code.
There are two things that stand out to me as very clear performance sinks.
The first, and easiest to change, is index
. If you call index
on a list, python has to check every element in the list one at a time to see whether it's the thing you're after. If you have millions of primes, every one of those index calls is secretly a hidden loop that runs millions of times.
The good news is you can resolve this easily enough: just keep another variable which stores where in the list of primes you're up to. Then you don't have to search for your place again every time you go round the loop, because you have a bookmark to it.
Second, and slightly more subtle, is the way that this is looping. You are looking for two numbers that add up to a target. If your target is 100, you might first say "what if it's 3?" and then try all primes with 3. Then you say "what if it's 5?" and try all the primes with 5. And then you do the same with 7, and so on. This means your trying all the primes with all the primes.
Instead, if you are checking whether the first prime is 3, you know that the second prime has to be 97. So all you have to do is check whether 97 is prime. If so, it's a match, if not, you can move on to checking 5 and 95, and then 7 and 93, and so on. In this way your algorithm checks each prime against one other number instead of each prime against all the others, and should be massively faster.
$endgroup$
$begingroup$
Your second suggestion is great indeed. I will implement it that way! Thanks
$endgroup$
– Kevin
Feb 13 at 14:03
$begingroup$
You can make this even more efficient by keeping track of "near misses" to your target which eliminate other integers. Suppose you were testing 102. You start by looking for 3 + some prime. 99 isn't a prime, but the nearest prime >= 99 is 101, so if you organize the search carefully, you can remember that you just found a prime pair for 104 = 3 + 99 while looking for a pair for 102, and the next even number you need to test is not 104 but 106 (unless you already found a pair for that as well. of course).
$endgroup$
– alephzero
Feb 13 at 17:10
$begingroup$
I would default to using aset
as in @Graipher's implementation, which is very fast at discovering whether something is present, but does not tell you what the nearest miss was.
$endgroup$
– Josiah
Feb 13 at 20:18
$begingroup$
If I had to use a regularlist
, and if the list was sorted, then there is indeed a clever trick available. The idea is that as the small prime being tested grows, the number it needs to be added to shrinks. That means that you can keep two bookmark variables pointing at the list of primes, one at the start and moving forwards, and one at the end and moving backwards. As long as sum is too small, you move the first bookmark along. If the sum is too big, you bring the second bookmark backwards. This avoids having to scan the whole list each time. However, it's quite fiddly to get right.
$endgroup$
– Josiah
Feb 13 at 20:25
add a comment |
$begingroup$
As suggested by Graipher in the comment, you're probably spending a lot of time generating a list of primes. That would be worth profiling, but I can't speculate as to how it might be improved without at least seeing the code.
There are two things that stand out to me as very clear performance sinks.
The first, and easiest to change, is index
. If you call index
on a list, python has to check every element in the list one at a time to see whether it's the thing you're after. If you have millions of primes, every one of those index calls is secretly a hidden loop that runs millions of times.
The good news is you can resolve this easily enough: just keep another variable which stores where in the list of primes you're up to. Then you don't have to search for your place again every time you go round the loop, because you have a bookmark to it.
Second, and slightly more subtle, is the way that this is looping. You are looking for two numbers that add up to a target. If your target is 100, you might first say "what if it's 3?" and then try all primes with 3. Then you say "what if it's 5?" and try all the primes with 5. And then you do the same with 7, and so on. This means your trying all the primes with all the primes.
Instead, if you are checking whether the first prime is 3, you know that the second prime has to be 97. So all you have to do is check whether 97 is prime. If so, it's a match, if not, you can move on to checking 5 and 95, and then 7 and 93, and so on. In this way your algorithm checks each prime against one other number instead of each prime against all the others, and should be massively faster.
$endgroup$
As suggested by Graipher in the comment, you're probably spending a lot of time generating a list of primes. That would be worth profiling, but I can't speculate as to how it might be improved without at least seeing the code.
There are two things that stand out to me as very clear performance sinks.
The first, and easiest to change, is index
. If you call index
on a list, python has to check every element in the list one at a time to see whether it's the thing you're after. If you have millions of primes, every one of those index calls is secretly a hidden loop that runs millions of times.
The good news is you can resolve this easily enough: just keep another variable which stores where in the list of primes you're up to. Then you don't have to search for your place again every time you go round the loop, because you have a bookmark to it.
Second, and slightly more subtle, is the way that this is looping. You are looking for two numbers that add up to a target. If your target is 100, you might first say "what if it's 3?" and then try all primes with 3. Then you say "what if it's 5?" and try all the primes with 5. And then you do the same with 7, and so on. This means your trying all the primes with all the primes.
Instead, if you are checking whether the first prime is 3, you know that the second prime has to be 97. So all you have to do is check whether 97 is prime. If so, it's a match, if not, you can move on to checking 5 and 95, and then 7 and 93, and so on. In this way your algorithm checks each prime against one other number instead of each prime against all the others, and should be massively faster.
answered Feb 13 at 8:44
JosiahJosiah
3,742428
3,742428
$begingroup$
Your second suggestion is great indeed. I will implement it that way! Thanks
$endgroup$
– Kevin
Feb 13 at 14:03
$begingroup$
You can make this even more efficient by keeping track of "near misses" to your target which eliminate other integers. Suppose you were testing 102. You start by looking for 3 + some prime. 99 isn't a prime, but the nearest prime >= 99 is 101, so if you organize the search carefully, you can remember that you just found a prime pair for 104 = 3 + 99 while looking for a pair for 102, and the next even number you need to test is not 104 but 106 (unless you already found a pair for that as well. of course).
$endgroup$
– alephzero
Feb 13 at 17:10
$begingroup$
I would default to using aset
as in @Graipher's implementation, which is very fast at discovering whether something is present, but does not tell you what the nearest miss was.
$endgroup$
– Josiah
Feb 13 at 20:18
$begingroup$
If I had to use a regularlist
, and if the list was sorted, then there is indeed a clever trick available. The idea is that as the small prime being tested grows, the number it needs to be added to shrinks. That means that you can keep two bookmark variables pointing at the list of primes, one at the start and moving forwards, and one at the end and moving backwards. As long as sum is too small, you move the first bookmark along. If the sum is too big, you bring the second bookmark backwards. This avoids having to scan the whole list each time. However, it's quite fiddly to get right.
$endgroup$
– Josiah
Feb 13 at 20:25
add a comment |
$begingroup$
Your second suggestion is great indeed. I will implement it that way! Thanks
$endgroup$
– Kevin
Feb 13 at 14:03
$begingroup$
You can make this even more efficient by keeping track of "near misses" to your target which eliminate other integers. Suppose you were testing 102. You start by looking for 3 + some prime. 99 isn't a prime, but the nearest prime >= 99 is 101, so if you organize the search carefully, you can remember that you just found a prime pair for 104 = 3 + 99 while looking for a pair for 102, and the next even number you need to test is not 104 but 106 (unless you already found a pair for that as well. of course).
$endgroup$
– alephzero
Feb 13 at 17:10
$begingroup$
I would default to using aset
as in @Graipher's implementation, which is very fast at discovering whether something is present, but does not tell you what the nearest miss was.
$endgroup$
– Josiah
Feb 13 at 20:18
$begingroup$
If I had to use a regularlist
, and if the list was sorted, then there is indeed a clever trick available. The idea is that as the small prime being tested grows, the number it needs to be added to shrinks. That means that you can keep two bookmark variables pointing at the list of primes, one at the start and moving forwards, and one at the end and moving backwards. As long as sum is too small, you move the first bookmark along. If the sum is too big, you bring the second bookmark backwards. This avoids having to scan the whole list each time. However, it's quite fiddly to get right.
$endgroup$
– Josiah
Feb 13 at 20:25
$begingroup$
Your second suggestion is great indeed. I will implement it that way! Thanks
$endgroup$
– Kevin
Feb 13 at 14:03
$begingroup$
Your second suggestion is great indeed. I will implement it that way! Thanks
$endgroup$
– Kevin
Feb 13 at 14:03
$begingroup$
You can make this even more efficient by keeping track of "near misses" to your target which eliminate other integers. Suppose you were testing 102. You start by looking for 3 + some prime. 99 isn't a prime, but the nearest prime >= 99 is 101, so if you organize the search carefully, you can remember that you just found a prime pair for 104 = 3 + 99 while looking for a pair for 102, and the next even number you need to test is not 104 but 106 (unless you already found a pair for that as well. of course).
$endgroup$
– alephzero
Feb 13 at 17:10
$begingroup$
You can make this even more efficient by keeping track of "near misses" to your target which eliminate other integers. Suppose you were testing 102. You start by looking for 3 + some prime. 99 isn't a prime, but the nearest prime >= 99 is 101, so if you organize the search carefully, you can remember that you just found a prime pair for 104 = 3 + 99 while looking for a pair for 102, and the next even number you need to test is not 104 but 106 (unless you already found a pair for that as well. of course).
$endgroup$
– alephzero
Feb 13 at 17:10
$begingroup$
I would default to using a
set
as in @Graipher's implementation, which is very fast at discovering whether something is present, but does not tell you what the nearest miss was.$endgroup$
– Josiah
Feb 13 at 20:18
$begingroup$
I would default to using a
set
as in @Graipher's implementation, which is very fast at discovering whether something is present, but does not tell you what the nearest miss was.$endgroup$
– Josiah
Feb 13 at 20:18
$begingroup$
If I had to use a regular
list
, and if the list was sorted, then there is indeed a clever trick available. The idea is that as the small prime being tested grows, the number it needs to be added to shrinks. That means that you can keep two bookmark variables pointing at the list of primes, one at the start and moving forwards, and one at the end and moving backwards. As long as sum is too small, you move the first bookmark along. If the sum is too big, you bring the second bookmark backwards. This avoids having to scan the whole list each time. However, it's quite fiddly to get right.$endgroup$
– Josiah
Feb 13 at 20:25
$begingroup$
If I had to use a regular
list
, and if the list was sorted, then there is indeed a clever trick available. The idea is that as the small prime being tested grows, the number it needs to be added to shrinks. That means that you can keep two bookmark variables pointing at the list of primes, one at the start and moving forwards, and one at the end and moving backwards. As long as sum is too small, you move the first bookmark along. If the sum is too big, you bring the second bookmark backwards. This avoids having to scan the whole list each time. However, it's quite fiddly to get right.$endgroup$
– Josiah
Feb 13 at 20:25
add a comment |
$begingroup$
Let's take n = 1,000,000,000 for example and see what your code does.
It calculates all primes from 1 to 1 billion. That takes a while, but it gives you an array of all primes in sorted order.
It then calculates 2 + 2, 2 + 3, 2 + 5, 2 + 7, 2 + 999,999,xxx to check if one of these numbers equals 1,000,000,000. But obviously when addend1 = 2, addend2 has to be 999,999,998 to add to one billion, so you are checking tens of millions of sums unnecessarily.
It then calculates 3 + 3, 3 + 5, 3 + 7 etc., and again, addend2 would have to be 999,999,997 million. And then again for addend1 = 5, 7, 11 etc. So you are checking a huge number of sums needlessly.
Start with addend1 = first prime, addend2 = last prime in your array. If their sum is too small, replace addend1 with the next prime to make the sum larger by the smallest possible amound. If their sum is too large, replace addend2 with the previous prime. If the sum is right, you have found a solution. And once you reached addend1 > addend2, you know there is no solution. This will be very quick, since usually you don't need to check too many values for addend1.
That makes the search a lot quicker, but doesn't help with the sieve trying to find all the primes. You usually don't need all the primes, only a very small number. In the example with n = 1,000,000,000 you probably find two primes adding up to a billion with addend1 ≤ 1000 and addend2 ≥ 999,999,000.
So here's what you do: To find all primes say in the range 999,999,000 to 1,000,000,000, you need to check if these numbers are divisible by any prime up to the square root of 1,000,000,000 which is a bit less than 32,000. So first you use a sieve to find all numbers up to 32,000. Then you create a sieve that finds all primes from 999,999,000 to 1,000,000,000. You run your search algorithm until it tries to examine primes that are not in this range. This likely doesn't happen, but if it happens, you create another sieve for the numbers 999,998,000 to 999,999,000 and so on. So instead of maybe 50 million primes, you look only for maybe 50 or 100. Again, that makes it a lot faster.
$endgroup$
$begingroup$
I didnt learn about sieves yet, gotta look into it
$endgroup$
– Kevin
Feb 14 at 5:52
$begingroup$
It might be worth testing the primality of the larger number as you hit it, rather than sieving a whole range.
$endgroup$
– Josiah
Feb 14 at 7:54
add a comment |
$begingroup$
Let's take n = 1,000,000,000 for example and see what your code does.
It calculates all primes from 1 to 1 billion. That takes a while, but it gives you an array of all primes in sorted order.
It then calculates 2 + 2, 2 + 3, 2 + 5, 2 + 7, 2 + 999,999,xxx to check if one of these numbers equals 1,000,000,000. But obviously when addend1 = 2, addend2 has to be 999,999,998 to add to one billion, so you are checking tens of millions of sums unnecessarily.
It then calculates 3 + 3, 3 + 5, 3 + 7 etc., and again, addend2 would have to be 999,999,997 million. And then again for addend1 = 5, 7, 11 etc. So you are checking a huge number of sums needlessly.
Start with addend1 = first prime, addend2 = last prime in your array. If their sum is too small, replace addend1 with the next prime to make the sum larger by the smallest possible amound. If their sum is too large, replace addend2 with the previous prime. If the sum is right, you have found a solution. And once you reached addend1 > addend2, you know there is no solution. This will be very quick, since usually you don't need to check too many values for addend1.
That makes the search a lot quicker, but doesn't help with the sieve trying to find all the primes. You usually don't need all the primes, only a very small number. In the example with n = 1,000,000,000 you probably find two primes adding up to a billion with addend1 ≤ 1000 and addend2 ≥ 999,999,000.
So here's what you do: To find all primes say in the range 999,999,000 to 1,000,000,000, you need to check if these numbers are divisible by any prime up to the square root of 1,000,000,000 which is a bit less than 32,000. So first you use a sieve to find all numbers up to 32,000. Then you create a sieve that finds all primes from 999,999,000 to 1,000,000,000. You run your search algorithm until it tries to examine primes that are not in this range. This likely doesn't happen, but if it happens, you create another sieve for the numbers 999,998,000 to 999,999,000 and so on. So instead of maybe 50 million primes, you look only for maybe 50 or 100. Again, that makes it a lot faster.
$endgroup$
$begingroup$
I didnt learn about sieves yet, gotta look into it
$endgroup$
– Kevin
Feb 14 at 5:52
$begingroup$
It might be worth testing the primality of the larger number as you hit it, rather than sieving a whole range.
$endgroup$
– Josiah
Feb 14 at 7:54
add a comment |
$begingroup$
Let's take n = 1,000,000,000 for example and see what your code does.
It calculates all primes from 1 to 1 billion. That takes a while, but it gives you an array of all primes in sorted order.
It then calculates 2 + 2, 2 + 3, 2 + 5, 2 + 7, 2 + 999,999,xxx to check if one of these numbers equals 1,000,000,000. But obviously when addend1 = 2, addend2 has to be 999,999,998 to add to one billion, so you are checking tens of millions of sums unnecessarily.
It then calculates 3 + 3, 3 + 5, 3 + 7 etc., and again, addend2 would have to be 999,999,997 million. And then again for addend1 = 5, 7, 11 etc. So you are checking a huge number of sums needlessly.
Start with addend1 = first prime, addend2 = last prime in your array. If their sum is too small, replace addend1 with the next prime to make the sum larger by the smallest possible amound. If their sum is too large, replace addend2 with the previous prime. If the sum is right, you have found a solution. And once you reached addend1 > addend2, you know there is no solution. This will be very quick, since usually you don't need to check too many values for addend1.
That makes the search a lot quicker, but doesn't help with the sieve trying to find all the primes. You usually don't need all the primes, only a very small number. In the example with n = 1,000,000,000 you probably find two primes adding up to a billion with addend1 ≤ 1000 and addend2 ≥ 999,999,000.
So here's what you do: To find all primes say in the range 999,999,000 to 1,000,000,000, you need to check if these numbers are divisible by any prime up to the square root of 1,000,000,000 which is a bit less than 32,000. So first you use a sieve to find all numbers up to 32,000. Then you create a sieve that finds all primes from 999,999,000 to 1,000,000,000. You run your search algorithm until it tries to examine primes that are not in this range. This likely doesn't happen, but if it happens, you create another sieve for the numbers 999,998,000 to 999,999,000 and so on. So instead of maybe 50 million primes, you look only for maybe 50 or 100. Again, that makes it a lot faster.
$endgroup$
Let's take n = 1,000,000,000 for example and see what your code does.
It calculates all primes from 1 to 1 billion. That takes a while, but it gives you an array of all primes in sorted order.
It then calculates 2 + 2, 2 + 3, 2 + 5, 2 + 7, 2 + 999,999,xxx to check if one of these numbers equals 1,000,000,000. But obviously when addend1 = 2, addend2 has to be 999,999,998 to add to one billion, so you are checking tens of millions of sums unnecessarily.
It then calculates 3 + 3, 3 + 5, 3 + 7 etc., and again, addend2 would have to be 999,999,997 million. And then again for addend1 = 5, 7, 11 etc. So you are checking a huge number of sums needlessly.
Start with addend1 = first prime, addend2 = last prime in your array. If their sum is too small, replace addend1 with the next prime to make the sum larger by the smallest possible amound. If their sum is too large, replace addend2 with the previous prime. If the sum is right, you have found a solution. And once you reached addend1 > addend2, you know there is no solution. This will be very quick, since usually you don't need to check too many values for addend1.
That makes the search a lot quicker, but doesn't help with the sieve trying to find all the primes. You usually don't need all the primes, only a very small number. In the example with n = 1,000,000,000 you probably find two primes adding up to a billion with addend1 ≤ 1000 and addend2 ≥ 999,999,000.
So here's what you do: To find all primes say in the range 999,999,000 to 1,000,000,000, you need to check if these numbers are divisible by any prime up to the square root of 1,000,000,000 which is a bit less than 32,000. So first you use a sieve to find all numbers up to 32,000. Then you create a sieve that finds all primes from 999,999,000 to 1,000,000,000. You run your search algorithm until it tries to examine primes that are not in this range. This likely doesn't happen, but if it happens, you create another sieve for the numbers 999,998,000 to 999,999,000 and so on. So instead of maybe 50 million primes, you look only for maybe 50 or 100. Again, that makes it a lot faster.
answered Feb 13 at 23:57
gnasher729gnasher729
2,142812
2,142812
$begingroup$
I didnt learn about sieves yet, gotta look into it
$endgroup$
– Kevin
Feb 14 at 5:52
$begingroup$
It might be worth testing the primality of the larger number as you hit it, rather than sieving a whole range.
$endgroup$
– Josiah
Feb 14 at 7:54
add a comment |
$begingroup$
I didnt learn about sieves yet, gotta look into it
$endgroup$
– Kevin
Feb 14 at 5:52
$begingroup$
It might be worth testing the primality of the larger number as you hit it, rather than sieving a whole range.
$endgroup$
– Josiah
Feb 14 at 7:54
$begingroup$
I didnt learn about sieves yet, gotta look into it
$endgroup$
– Kevin
Feb 14 at 5:52
$begingroup$
I didnt learn about sieves yet, gotta look into it
$endgroup$
– Kevin
Feb 14 at 5:52
$begingroup$
It might be worth testing the primality of the larger number as you hit it, rather than sieving a whole range.
$endgroup$
– Josiah
Feb 14 at 7:54
$begingroup$
It might be worth testing the primality of the larger number as you hit it, rather than sieving a whole range.
$endgroup$
– Josiah
Feb 14 at 7:54
add a comment |
Thanks for contributing an answer to Code Review Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f213357%2fgoldbachs-conjecture-algorithm%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
7
$begingroup$
Have you profiled this? How is
primenums
implemented?$endgroup$
– Graipher
Feb 13 at 8:23
1
$begingroup$
Profiled? What do you mean?
$endgroup$
– Kevin
Feb 13 at 13:46
6
$begingroup$
@EricLippert Don't worry about language/compiler/interpreter optimization before you do mathematical/complexity optimization first. Switching languages can only improve performance by a small factor (1-10). Changing the algorithm can give speedups of 1-infinity.
$endgroup$
– Vortico
Feb 13 at 18:10
2
$begingroup$
@Vortico are you seriously lecturing Eric Lippert on performance gains and optimization?
$endgroup$
– MikeTheLiar
Feb 13 at 19:57
8
$begingroup$
@MikeTheLiar I know he knows, but someone reading his comment may be misled. It's bad advice to recommend to a programming beginner to change the tool he's learning before even pointing out the issue with his algorithm. It's discouraging, doesn't solve the problem at hand (because maybe his goal is to simply learn Python, and remember this is a code review community), and doesn't teach how to think on one's own. It's not "The first problem" with his code as he said. It's the 26th or maybe 126th problem he should worry about while playing around with Goldbach's conjecture.
$endgroup$
– Vortico
Feb 13 at 20:31