Collatz 3n + 1 conjecture possibly solved

Gerhard Opfer has posted a paper that claims to resolve the famous Collatz conjecture.

Start with a positive number n and repeatedly apply these simple rules:

  1. If n = 1, stop.
  2. If n is even, divide n by 2.
  3. If n is odd, multiply n by 3 and add 1.

In 1937, Lothar Collatz asked whether this procedure always stops for every positive starting value of n. If Gerhard Opfer is correct, we can finally say that indeed it always stops.

Update: It appears there’s a flaw in the proof. See discussion here. Perhaps the gap can be filled in, or perhaps an idea in the paper can be of use somewhere else.

Update (September 10. 2019): As of this date, the full Collatz conjecture remains unsolved, but Terence Tao has just posted a paper chipping away at the problem.

Related post: Easy to guess, hard to prove

36 thoughts on “Collatz 3n + 1 conjecture possibly solved

  1. Wow!

    Have you tried to read the paper yet? It looks like it’s a little bit above my head. But nowadays I can ask around for help understanding something like this.

    I’m sad in a way. It was pretty cool to have a problem kids like playing around with and to know it had no solution.

  2. I haven’t read that kind of math for 5+ years so like Sue I can’t get through the paper… but it seems that it’s reasonable to expect it to stop if, in the long run, the 3n+1 operation mostly gives you numbers that can be divided twice in a row. Since you’re dividing by 4 then, it goes down faster than it goes up. In fact if that happens just over 75% of the time I think it would eventually end.

    Any odd number can be expressed at 2x+1, so if you multiply that by 3 and add 1 you end up with 6x + 4. The 4 can be divided by 4, so you need to know if the 6x can be divided by 4 most of the time – in other words if 3x is even. The series of odd numbers alternates between 2*+1 and 2* + 1 so half the x values will be even.

    That doesn’t seem sufficient, so I think you would need to either prove that it is or show that doing 3n+1 a few times tends to lead to more even numbers (maybe you do it 10 times and then you can divide by 2 20 times, or you do it 100 times than divide by 2 200 times). If it does work it’s probably by a very small margin.

    Alternately you could take the sequence of numbers that go straight to 1 because they’re the product of even numbers only, and then show that multiplying by 3n+1 enough times will eventually get you to one of those numbers. Then the question is how many times you need to do 3n+1 to hit a product of evens.

    In another 70 years I think I’ll have this :)

  3. Come now, “Collatz 3n + 1 conjecture solved”? A bit sensationalist, don’t you think? =) The paper has not even undergone peer review yet. That said, however, it certainly does look like a serious attempt — by a student of Lothar Collatz, no less! Let’s hope that it’s correct (and that I can understand it)!

  4. First error I caught was “The Collatz conjecture states, that all Collatz sequences terminate after finitely many steps with the cycle 4, 2, 1, . . .”

    Tut; n=2.

  5. @Richard “In fact if that happens just over 75% of the time I think it would eventually end.”
    I don’t think your reasoning is correct (well, it depends what you mean by “end”). I agree that the sequence tends to decrease, but you can reach a cycle. Actually one can generalize the sequences to rational numbers, and even though you can do the same reasoning, there are an infinite number of cycles.
    About 20 years ago, I tried a probabilistic approach: find the probability of getting a cycle of length c (with the operations (3n+1)/2 for n odd and n/2 for n even), and got a relation with the partial quotients and convergents of the continued fraction of log(3)/log(2). The goal was not to be rigorous, but to try to intuitively explain what we can observe. My memory is a bit weak, because I thought I obtained all the cycles (with any integer, i.e. possibly negative) with convergents, but after looking at I can see a cycle with negative integers that doesn’t correspond to a convergent.
    IIRC, I obtained the same formula as the one on Wikipedia (with different notation) about cycles of length c with b odd numbers: (3^{b-1} 2^{k_0} + … + 3^0 2^{k_{b-1}}) / (2^c – 3^b). What I was saying that |2^c – 3^b| should be “small” for some magnitude of b and c to increase the chance of getting an integer, hence the relation with the continued fraction of log(3)/log(2). Something I had noted is that the partial quotients of even position (i.e. the last partial quotient for the cases where 2^c – 3^b > 0) for positions 4 to 10 are larger than those of odd position, which is a bit bad luck…
    Actually, I now think this is much more complex and convergents don’t necessarily matter as they are too scarce. A naive probability is binomial(c,b) / (c * |2^c – 3^b|), the sign of the integers of the sequence being the sign of 2^c – 3^b. But I this is not sufficient, because the sum from b = 0 to b=c-1 doesn’t decrease quickly enough.

  6. @Scott: that’s not an error! The sequence starting with 2 is 2, 1, 4, 2, 1, 4, 2, 1, … as you can easily check, so it does fall into that cycle.

  7. So why don’t we just prove it and be done with it? Pick a number, any number. If it’s even and a power of 2, then repeated division by 2 (that pesky n/2 if n is even statement ) will converge to 1. If it’s odd, then 3n+1 will make it even. As my old math instructors would say “It’s intuitively obvious to even the most casual observer” , but you can prove it if you must. So ok, dividing that result by 2 will either end with another odd number eventually, or it will go to 1. When will it go to 1? When it’s a power of 2. That means we should find all of the numbers that make 3n+1 a power or 2. A little fiddling in Excel and you find that a starting set is (1,5,21,341, 1365, …) . You don’t really need 1 since it is, after all the solution to the whole mess, but I included it so you can see the really cool pattern that evolves.

    Thus 3*1+1 = 4 =2**2, 3*5+1 = 16, 3*21 +1 = 64 = 2**6, 3*341 + 1 = 1024 = 2**10 just to show a couple of examples. By the way, * means “times ” or multiplied by, ** means “exponent” or “to the power of” .

    The really cool pattern (as promised) is this:
    1 = 2**0 + 0, 5 = 2**2 + 1, 21 = 2**4 +5, 85 = 2**6+21, 341 = 2**8 + 85, 1365 = 2**10 + 341 and so on. As you can see (by looking at the exponents), for any even number (0, 2, 4, …) we can construct an odd number which, when plugged in to 3n+1, is a power of 2 and thus converges to 1. With the formula 3n+1 we can construct an even number from any odd number (1,3,5,7,9…). Since these 2 sets combined form the natural numbers, I say it’s QED.

  8. Of course “the really cool pattern” above cannot proove the Collatz Conjecture.
    If it was so easy, we could as well replace the term 3n+1 by simply +11, which also creates an even number from any odd number. The pattern then becomes even cooler, since any power of 2 – not only {2^n where n is even} – minus 11 is an odd number. But our resulting new conjecture, that any number leads to 1, is clearly false. q.e.d

  9. Guys,

    I am not a math student , but if I look at positive numbers only why is it so difficult to prove it ? Can a simple non mathematical explanation not suffice to prove the

    Lets assume the conjuncture is incorrect .

    So we have a repeating sequence which ends in suppose number “n”
    since n is the last number in the sequence before repetition starts n has to be odd and the first number in sequence would be 3n+1.

    The repeating sequence would proceed as follows

    3n+1 , (3n+1)/2,(3n+1)/(2^2)… this would continue till we reach
    a point where
    2^x*n – 3n =1
    n(2^x – 3) = 1
    This is only possible if n=1 and (2^x-3)=1

    Since n=1 is the only possibility the conjecture is true.

    Would appreciate any feedback.

  10. Ravi,

    The “multiply by 3 and add 1” might have to kick in again before you get back to n.

    Your argument shows that 1 is the only possibility if the “multiply by 3 and add 1” only happens once.

    Maybe you can use it for some inductive argument, but I doubt it.

  11. Ok, a bit of a variant, but… I’m a teacher and not a high-level mathematician and just learned about this via the MathPickle site which uses the conjecture within the context of a story (Icarus and Daedalus)… long story short, the goal is “not” to get to 1. Icarus’s problem is the Collatz conjecture. Daedalus’s is identical with one exception – triple and SUBTRACT one. With that change, there are numbers that will “loop”. I want to learn more about this 3n-1 conjecture. Can anyone point me in that direction? Love Collatz conjecture but curious about why the Daedalus variant WILL work… sometimes.

  12. Dr. Pedro E. Colla

    Any number positive integer number between [2^k]+1 and [2^(k+1)]-1 would require never less than k steps to converge to 1 using the Collatz algorithm. Thus if lim[k->infinite] {2^k}=infinite therefore the number of steps would be approaching at least infinite, the series will diverge and the conjecture is wrong.
    Just a reasoning that is got to be wrong somewhere, ¿any help to identify where?. Regards,

  13. Razvan Marinovici

    The sequence is completely linked to the starting number`s bit pattern. It’s not random. It depends on the flow of bits of the starting number. It’s not the number of 1`s or 0`s it is the actual position of the bits.

    Also the series is made from a lot of “restarts”, at each ceil(log2(n)) condensed steps, the series will “restart”, the original number is consumed, bit pattern exhausted.
    5 – Reaches 2 after 3 steps and goes from there
    7 – Reaches 26 after 3 steps and starts fresh, 26 -> after 5 steps reaches 8
    17 – Restarts at 5 after 4 steps
    23(11011) – Restarts at 20 after 5 steps
    27(11101) – Restarts at 71 after 5 steps, so 71 is an fresh start, a new series (end of series for 27).

    It’s easy to find/calculate an infinity of numbers that go through X = (1,2) mod 3. The first number will be at ceil(log2(n)) condensed steps away from X. No number will have a series ending in X=0 mod 3 (not sure about this one)

    For example some numbers that go through 71
    221 -> 2^8
    14601557 -> 2^24
    14601781 -> 2^24
    470383381 -> 2^29
    4136169948971763906787 -> 2^72 will reach 71 after 72 steps
    8278656945066165491029 -> 2^73 will reach 71 after 73 steps

    5041018883571751 -53 steps-> 4267784006233 -42 steps-> 91354821257 -37 steps-> 6952931999 -33 steps-> 2*470383381 -30 steps-> 71

    All above numbers will have a different path to 71, however from that point onward it is a new series, common for all.

    If we look at the complete series it might be difficult to find a pattern, if we look at one piece at a time it might become easier. It’s even difficult to look at the bits and see something.

    Because it is a bit pattern, it is a binary tree, so it resumes to walking through the tree and finding a leaf that leads to a cycle, the process of walking through the tree gives us the (a)-> goes to ->(b). I think it might be easier than calculating all numbers and checking if they eventually cycle.

    We can try and construct a number. We can definitely go further away than 2^60.

  14. Deb, your variant is equivalent to considering the 3n+1 rule applied to negative numbers. See Wikipedia for details.

  15. After reading through some of the paper and finding MATLAB algorithms as evidence for sections of the proof I was a little disappointed. The abstract also leaves me wondering what holomorphic mappings to the complex unit disk with two linear functional equations has to do with number theory. It seems to me that some complex variable research may have been applicable to the Collatz conjecture and they attempted to apply it as a solution.

    Converging series seems to be the key to this problem. The issue is that they need to simultaneously be built through induction while also showing any number in the natural numbers is contained in one of these series through induction. Which the paper seems to be attempting to address through the use of the linear functional equations and the complex disk.

    The elephant in the room is the countable infinity of the natural numbers, and the finite arbitrary sequences that resolve no matter which starting value is chosen.

    In essence you need to build out those sequences but arbitrarily, for example the sequence:

    (Thinking of n=5)
    {n, 3n+1, (3n+1)/2, (3n+1)/4, (3n+1)/8, (3n+1)/16}

    You need infinitely many of these built out somehow through induction, and then through a separate induction process you need to show any arbitrary starting value will result in a sequence that contains one of those sequences as a subsequence (another piece that the paper also attempts to address).

    Logically this is very tricky as you need to work in two contrary directions simultaneously to achieve a proof, which has always proven difficult in mathematics ;-) You need exposure to graduate level mathematics regularly to learn the logical told for these conundrums. I’m assuming the use of the complex disk with the linear functional equations is one such method.

    These problems certainly are fun, this has got me remembering real analysis and Cauchy sequences. I guess I should brush up on some mathematics from time to time, I sure miss it from completing my undergrad.

  16. Hi guys, I am not a mathematician so can someone please explain what is so hard to prove about this conjecture? I can graph the numbers in my head and reason out using it why it will converge. I apologise, as I said I am not from mathematics field. My knowledge is very limited for this.

  17. We can make some trivially provable assertions.

    Any odd number multiplied by 3 (or any other odd number) will be odd.

    Any odd number plus one will always be even.

    This simplifies the problem a bit.

    Assuming any number we start off with thats even we’ll just immediately divide by two, and any odd number we start off with we’ll multiply by 3 and add 1 to make even, the problem becomes:

    When does an even number regress to an odd number as opposed to another even number?

    Regression from even to odd ALWAYS follows from a number which has one even factor and one odd factor, i.e. any odd number times two.

    It then plainly follows this must be true for squares, cubes, squares of cubes, cubes of cubes, and so on.

    We then may say, any number with 2 as a factor and no other odd numbers or primes, will regress to 1, because division of a number where one of the factors is odd, will lead to an odd number, which naturally by definition isn’t divisible by 2.

    Ergo odds always regress to evens. Evens with mixed factors (odd and even) must always regress to odd, and evens with only even factors will always regress to 1.

    This isn’t a proof by no means but just some observations.

  18. Another aspect is to change the order of operations. 3(x+1),x/2
    For x = 5
    { 5 18 9 30 15 48 24 12 6 3 } So { 12 6 3 } cycle.
    But back on (3x)+1,x/2
    I use my own personal notation. I am an armature mathematician.
    A(x)+Y,x/2 For all Y that is a power of three, from observation, there is only one cycle. We can easily find that cycle when x equals y hence 3(1)+1 { 4 2 1 } , 3(3)+3 { 12 6 3 } and so on.
    For all other Y there are more than one cycle yet, there is always a cycle of the same; like 3(5)+5,x/2 such that { 5 20 10 5 }. This is not the only cycle for Y = 5.
    I wrote a program that searched for these cycles. I remember one example that had more than 20 numbers in its cycle.
    More can be done on this. I was mainly interested in binary encoding schemes.
    Cycles are nothing new in mathematics. Also I discovered that in physics there are many such that relate to particles. Everything is spinning.
    I believe that the premise of the Colllatz conjecture is misleading. There is a spectrum of cycles. Each unique value for Y offers a different number structure and a universe of the qualities of prime numbers. All Y being a power of three, again, is one cycle for all n-bit. However if we look at the parity language generated where an odd event is represented by 1 and an even by 0 then all values for Y generate the same language structure.
    Rule: no two consecutive odd events exist in stream.
    So: 0 or 1, { 00 10 01 }, { 000, 001, 010, 100, 101, } … Fibonacci sequence
    I believe that a universe of cycles is the true domain to study not just 3x+1
    As for a proof of any cycle? We just might have to examine the spectrum to find an answer.
    As an aside: Since I so loved all the the ilk of 3x+y,x/2 and binary encoders I went on to discover a dynamic mathematical object.
    The paper is most likely not well written, should be simplified and expanded but, I had to wing it not having any training beyond college algebra.
    Google Introduction to Dynamic Unary Encoding.

  19. I almost forgot.

    This is a dynamic equation. It consists of two functions 3x+y and x/2
    Naturally in the world of x/2 the number needs to be even inorder to produce an integer however, the data set generated by 3x+1 can process odd and even.
    For 3x+1,x/2 Collatz conjecture these two data sets share the same element only on the event of two x/2 events in the stream of events after an odd event.

  20. Lol auto-correct armature mathematician. You can correct that to amateur and delete this post.
    Too funny.

  21. I remembered another (attractor) value for 3x+5n that cycles and that is 19
    I would have to dig for that program to generate the data again.

    19 : 3(19)+5 : { 19 62 31 98 49 152 76 38 19 }
    8 elements in that cycle.

  22. Ernst you are bloody wonderful!

    The advice to look up unary coding has been a veritable cornucopia of insights into related topics. For example, if you jump from the article on unary coding to self-synchronizing code, you come across this passage:

    “In coding theory, especially in telecommunications, a self-synchronizing code[1] is a uniquely decodable code in which the symbol stream formed by a portion of one code word, or by the overlapped portion of any two adjacent code words, is not a valid code word. Put another way, a set of strings (called “code words”) over an alphabet is called a self-synchronizing code if for each string obtained by concatenating two code words, the substring starting at the second symbol and ending at the second-last symbol does not contain any code word as substring

    Whats fascinating about this is that many mechanisms within neural circuitry have been emulated using locality sensitive hashing, which reduces to the substring overlap problem in unary coding. This is important because the problem with spiking neural networks is that synchronization is not a given. Neural nets in the biological world are massive, parallel, and asynchronous systems. Worse, post-synaptic summation (done on the dendrite) is not the only mechanism for triggering a downstream neuron to fire–a much more limited set of strong pulses over strong connections and/or in a sufficiently short time frame, can trigger a response. In short, what can occur is that time encoded signals may just arrive while a cell is firing from other signals,
    become fragmented, or only partially arrive before the threshold for firing is met. Network protocols on the internet, such as TCP/IP handle packet fragmentation by simply discarding the packet, but theres no known method for ‘dropping a fragmented packet’ in the brain.
    Finding evidence LSH occurs in biological neural networks would go a long way toward demonstrating that neural communications are not just population encoded between ensembles, but also entropy encoded using unary coding–which would solve the fragmentation problem in biological networks.

    Interestingly, unary coding leads to golomb coding, used as a form of lossless compression. Studies of spike frequencies, as in spiking neural network models, have found that spikes of all frequencies have a distribution and density closely resembling, of all things, a geometric distribution.

    These insights lead on to things such as rice coding, golomb-rice coding which cheaply approximates huffman encoding without needing to transfer the encoding table, and even linear predictive coding, used in audio processing and digital speech compression/reproduction, a very simple but very efficient and effective “reproduction of the parameters of speech at a low bit rate” as wikipedia says, and which other sources corroborate.

    I could continue but I’ll leave it at that. You have my thanks and appreciation for your post!

  23. Thank you James.
    I thought your grasp of the Collatz issue was clear minded suggesting a curiosity driving investigation.
    Lost for daze in Wikipedia is not so bad IMO. Been there done that James :)
    Stream decoding, been there done that. I spent a decade trying to breach the limits of Physics. Information follows the laws of physics. Hence a song quote “I fought the law and the law won” by the Bobby Fuller Four.

  24. So my fellow Conjecture friends, we are perhaps confounded by the misstep of stopping on the value one. We fail to recognize the separate qualities of the cycle of integers which is separate from the “steps” in 3x+y,x/2 that do not cycle.
    So for all integers we can define separate sets of those values that do not cycle and those that do.
    Lothar Collatz and those honorable others who promoted this dynamic equation are not wrong yet, are we ready to see a different perspective of provable cycle vrs non-cycle?

  25. The Collatz conjecture can be restated using the lower 2 bits
    rather than as an even or odd number.

    Starting with 63,728,127
    Bits Action ~ Scale to 49,521,089 to 1
    00 => N = N / 4 1/4 22, 11.7% 43, 14.5%
    01 => N = (3 * N + 1) / 4 3/4+ 45, 23.9% 69, 23.3%
    10 => N = (3 * {N / 2} + 1) / 2 3/4+ 50, 26.5% 80, 27.0%
    11 => N = ({N * 3 + 1) / 2} * 3 + 1) / 2 9/4+ 71, 37.7% 104, 35.1%

    Put this way it is not surprising that it always converges.
    However, the number of times each case is applied differs.
    Percentages and actual counts for N = 63,728,127 are shown above.

    Multiplying out the applied scales:
    Down to 49,521,089 Scale > .25^22 * .75^45 * .75^50 * 2.25^71 = 0.77707
    Down to 1 Scale > .25^43 * .75^69 * .75^80 * 2.25^104 = 0.13260e-7

    A recursive proof only requires reducing the result below the starting number. A formal proof would require bounding the number of cases and the actual scaling values; which is of course the real work.

  26. I posted a paper about a revised Collatz series:

    A revised definition of the Collatz conjecture is used to understand its behavior. Two thirds of the values in the original series are skipped by considering values with: N % 3 = 2

    A graph of these transitions form a directed acyclic tree. For the tree to be complete it can have no disjoint branches and all branches converge to the root. Such a branch would be infinitely long and the transactions would produce ever increasing values.

    An argument is made that this branch could not exist. The distribution of transactions in the branch would need to be substantially skewed from the pool of possible transactions.

    The key observation is that values in the sequence maintain enough entropy so that skew cannot be sustained over an infinite path. A proof of the conjecture would need to account for this skew.

Comments are closed.