Proving Collatz is "Baker-Hard" (self-contained)

mathkook · June 19, 2025, 12:05pm

(Now that I understand Tao’s argument about Collatz being “Baker-hard”, I can re-explain it in this self-contained post.)

Tao’s Proposition 6 says (roughly), “If you find a case of 2^k - 3^x = n for some small n<x, then I can construct a Collatz counter-example cycle.”

The contrapositive is, “If there are no Collatz cycles (Weak Collatz Conjecture is true), then 2^k - 3^x > x for all (k, x).”

The upshot is, “Anyone who proves the Weak Collatz Conjecture will have proved, as a side effect, the quite hard result 2^k - 3^x > x, which lies in a difficult part of mathematics pioneered by Alan Baker in the 1960s.”

Or, “If you aren’t up to proving things like 2^k - 3^x > x, you aren’t going to solve Collatz.”

Since Proposition 6 is sobering news for a kook, I thought it’s worth understanding.

v = parity vector, such as 11010. This is some series of “up” (1) and “down” (0) moves in a Collatz trajectory.

k = length of v

x = count of 1 s in v

Every parity vector corresponds to some cycle whose first member is

\cfrac{\beta(v)}{2^k-3^x}

where \beta(v) is an interesting sum of 2^i 3^j products, the formula for which is critical to understanding Collatz, but has an unaesthetic form. A concrete example:

\beta(11010) = 2^0 3^2 + 2^1 3^1 + 2^3 3^0 = 23

For every 1 in v, we multiply 2^i (i is the 0-based position of the 1) with 3^j (j is x-m-1 for the first 1, moving down to 0 for the last 1).

One of the biggest problems with Collatz is that these \beta values seem kind of random.

In the case of 11010, the first member of its cycle is

\cfrac{23}{2^5-3^3} = \cfrac{23}{5}

which is not an integer.

It may be surprising that a Collatz cycle can have members that are fractions, but if you consider \frac{23}{5} as odd because 23 is odd, and consider 11010 as a sequence of operations where 1 stands for (3n+1)/2 and 0 stands for n/2, then \frac{23}{5} will return to itself after 5 steps. (Consider a fraction odd if its numerator is odd.)

The Weak Collatz Conjecture posits that no parity vector with k>2 corresponds to a cycle with integer members.

Now suppose I tell you I’ve found k and x such that 2^k - 3^x equals some n < x.

For example:

I say 2^k - 3^x = 2. You can say “not possible”, because 2^k - 3^x is odd.

I say 2^k - 3^x = 3. No, because 2^k - 3^x is throdd.

I say 2^k - 3^x = 5. According to mod-5 tables for 2^a and 3^b, we know 2^k - 3^x isn’t even divisible by 5 unless (at a minimum) k and x are both odd, or both even. If I admit to you, “Actually k is odd and x is even.” Then you say, “Again, not possible”. (Note that I don’t reveal my k and x, and won’t).

2^k - 3^x = 7. Okay, now you’re in a difficult spot. I tell you there’s an infinitude of (k, x) such that 2^k - 3^x is divisible by 7… so who’s to say it never equals 7?

Here, you change tack. Instead of saying “Not possible”, you say “Congratulations, if you are right, then you have also found a Collatz counter-example cycle!”

For the rest, I’m am going to demonstrate using the concrete (false) assertion that 2^{13} - 3^8 = 7, but the argument applies to any 2^k - 3^x = n where n<x.

There are 99 distinct cycles of length 8 with 5 odds, because there are \binom{k}{x} parity vectors with k per cycle .

Considering cycle member

\cfrac{\beta(v)}{2^k - 3^x}

we have 99 “chances” (different \beta values) for this fraction to be an integer. If I’ve claimed 2^{13}-3^8 = 7, then there’s an overwhelming “chance” that at least one of those 99 cycles has integer members. Already, it’s suspicious that 7 is so low.

But \beta values are not, in fact, random. So I can’t knock down your assertion with this probabilistic argument.

Tao’s Proposition 6 actually proves that at least one v of length 13 and weight 8 has a \beta(v) that is divisible by 7. (And more generally for any length k, weight x, and denominator n<x, though I’m going to stick with this concrete example for a while.)

So let’s make the proof sketch.

We start with a particularly-chosen v of length k and weight x, namely

v = 1101101011010

This is the v that distributes the 1s “as evenly as possible” among the 0s. It’s known as the upper Christoffel word, or the parity vector for the Collatz high loop.

The main property we care about is that at least half of the 1s are followed by a 0. This is because the 1-count is at most (2/3)k; otherwise 2^k - 3^x < 0.

So we have at least x/2 places where substring 10 appears in v.

These give us x/2 independent “knobs”, each of which we can decide to locally flip (or not flip), changing some 10 to 01.

Flipping no knobs keeps \beta(1101101011010) = 14501 \equiv 4 \not\equiv 0 \mod 7. So v can’t claim to be a Collatz cycle.

Flipping the first 10 changes v to v^\prime = 1011101011010, which changes \beta(v) by - 3^6 2^1 + 3^6 2^2, which is + 3^6 2^1. Notably, this “plus up” value is coprime to (odd, throdd) 2^k - 3^x.

The plus-up values for all five “knobs” are +1458, +1296, +1728, +1536, and +2048.

Generally speaking, moving from v to v^\prime via an individual flip will not result in \beta(v^\prime) \equiv 0 \mod 7 (though unfortunately for my concrete example, the fourth flip happens to do that, because \beta(v) + 1728 \equiv 0 \mod 7).

But we have at our disposal 32 combinations of flips. Amazingly (or “amusingly”, as Tao puts it), it turns out that one of these combinations is guaranteed to give \beta(v^\prime) \equiv 0 \mod 7, meaning v^\prime will correspond to a Collatz counter-example cycle, thus proving Proposition 6.

The key is that each of the plus-ups is coprime to \beta(v).

The ancient Greeks knew that ax + by = 1 always has a solution in (x, y) if constants a and b are coprime. Somewhere down the line, one of those x multiplies is going to wind up adjacent to one of those y multiples.

In a similar way, 14501 + 1458 y_1 + 1296 y_2 + 1728 y_3 + 1536 y_4 + 2048 y_5 \equiv 0 \mod 7 has a solution in binary y_i (1 = flip, 0 = no flip), provided that:

each coefficient is coprime to the first constant
there are enough terms

It turns out that our x/2 available flips are enough.

For better or worse, I’m going to leave it there. I’d be happy to explain ax + by = 1, but expanding “a large enough subset of a finite cyclic group has sum-sets that cover the whole group” into math-kook language isn’t on today’s bingo card.

Intuitively, 7 is so small enough that we can hit it with some combination of y_i. If it were much larger, we might not be able to.

The last thing I want to show is that the other members of the v^\prime cycle are also integers. To get each successive member, we keep rotating v^\prime to the left. If v^\prime starts with 0, then for its left-rotation w, we have

\beta(w) \equiv \beta(v^\prime) 2^{-1} \equiv 0

so \beta(w) \equiv 0. On the other hand, if v^\prime starts with 1, then

\beta(w) \equiv (\beta(v^\prime) - 3^{x-1}) \cdot 3 \cdot 2^{-1} + 2^{k-1}
\equiv - 3^x 2^{-1} + 2^{k-1}
\equiv 2(2^k - 3^x) \equiv 0

And we can continue to rotate by setting v^\prime to w. For this to work, we do need 2^k - 3^x to be divisible by n, though we don’t need 2^k - 3^x = n. In other words, if I claim 2^k - 3^x = 5 for some odd k and even x, you should say “stop right there” rather than attempting to construct a Collatz counter-example cycle.

Final conclusion:

The non-trivial separation of powers of 2 and 3 requires advanced math and was difficult to prove even for the Fields Medal-winning mathematicians (like 1960s Alan Baker).
If I claim 2^k - 3^x = n for some k, x, and n<x, then you can show me a Collatz counter-example cycle.
Therefore, a proof of the Collatz Conjecture will generate, as a by-product, some Fields Medal-worthy knowledge about number theory.

Of course, that “Fields Medal-worthy” knowledge has already been known since the 1960s, so you won’t get a Fields Medal. But, fortunately, this provides an alternate plan: instead of ignoring Baker’s results and winding up having to replicate them (or something even more difficult and powerful), you can use them.

mathkook · June 19, 2025, 4:54pm

Picking y_i = \{ 1, 1, 0, 0, 0\} solves the equation. The resulting vector has \beta(1011011011010) = 17255 = 2465 \cdot 7 \equiv 0 \mod 7, and so provides a putative Collatz counter-example (supposing 2^{13} - 3^8 = 7 weren’t a lie).

But the question is whether it always works to choose the Christoffel word (high cycle parity vector) as a base for the flipping … meaning that some choice of y_i is guaranteed to do the job. Like, a problem would arise if lots of the plus-ups were congruent to 0 \mod 7. As usual, the part of the proof that I glossed over may hide difficulties with my understanding

oros · June 20, 2025, 7:42pm

I find your exposition of Tao’s argument very clear and convincing!

To achieve the proof, all we need is to show that the set of differences \beta(v_i)-\beta(v_0) taken modulo 7 covers all congruence classes from 1 to 6, where v_0 is the upper Christoffel word associated to (k,x) and where the vectors v_i are arbitrary permutations of v_0. I think we can safely restrict the parameter space by setting k=\left\lceil \frac{\log 3}{\log 2} \, x \right\rceil, at least when x is not too small.

Now, here is a sketch of proof that works pretty well:

Take the upper (i.e., with intercept 1) Sturmian word of slope \frac{\log 2}{\log 3}, which is starting as follows: (110)^3(10)(110)^2(10)\ldots
Observe that, for x sufficiently large, the first 17 digits of v_0 are exactly the above ones (empirically, for x > 65 or so).
Consider the vectors v_i obtained as in your post by flipping the i^{\rm th} occurence of 10 in v_0 for i=1,\ldots,6.
Express the differences d_i = \beta(v_i)-\beta(v_0) as a function of x and compute the congruence class
d_1=2 \cdot 3^{x-2} \equiv 3^x \pmod{7}
d_2=2^{4} \cdot 3^{x-4} \equiv 4 \cdot 3^x \pmod{7}
d_3=2^{7} \cdot 3^{x-6} \equiv 2 \cdot 3^x \pmod{7}
d_4=2^{9} \cdot 3^{x-7} \equiv 5 \cdot 3^x \pmod{7}
d_5=2^{12} \cdot 3^{x-9} \equiv 6 \cdot 3^x \pmod{7}
d_6=2^{15} \cdot 3^{x-11} \equiv 3 \cdot 3^x \pmod{7}
To conclude, observe that 3^x is invertible in \mathbb{Z/7Z}, so that d_1, \ldots , d_6 covers the congruence classes 1, \ldots, 6 whatever the value of 3^x \pmod{7}.

oros · June 21, 2025, 11:00am

One should also investigate whether 2^k-3^x = 7 for some x \leq 65. By a straightforward computation, we find that the only solution is 2^4-3^2 = 7 if we assume x \geq 1 (otherwise, we also 2^3-3^0 = 7).

As a result, the Collatz conjecture implies the (“Baker-hard” ?) statement that the equation 2^k-3^x = 7 has exactly two solutions (k=4, x=2 and k=3, x=0).

However, we should be aware that solving equations of the form 2^k-3^x = n for small n is not always “Baker-hard”. In fact, they are very easy to solve when n is not congruent to 5 or 7 modulo 8. The reason is that the powers 3^x are either congruent to 1 or 3 (mod 8).

By the way, it is not clear to me why you rejected the equation 2^k-3^x = 5, which has the solution 2^5 - 3^3 = 5.

mathkook · June 21, 2025, 12:00pm

Good point: the parity vector I constructed for n=5 requires more than three 1s to work (or more accurately, it can’t produce a vector with three ones and two zeroes), so the construction doesn’t go through for small x=3, k=5.

Well maybe more importantly, we’re only “refuting” the case where someone claims the equation for some n < x. In this case, x=3 is smaller than n=5.

mathkook · June 21, 2025, 12:16pm

2^k - 3^x < n for all n < x doesn’t seem to need the “for sufficiently large x” caveat that Baker’s better bound has. I don’t know if it was proved before Baker’s theorem, but Tao says the known proof requires Baker-style techniques.

mathkook · June 21, 2025, 12:39pm

@oros I like your method so much, maybe we can go for a general proof?

We are trying to convert the assertion 2^k - 3^x = n (for some n<x) into a Collatz cycle.

We start with some parity sequence v and flip various instances of 10 to 01 until we reach some \beta(v^\prime) \equiv 0 \mod n.

It’s a great strategy to prove that the differences produced by individual flips cover all residues. Then one of the flips will combine with \beta(v) to make \beta(v^\prime) \equiv 0 \mod n.

And ensuring distinct residues by spacing the flips in an arithmetic progression – yeah! This is great, especially the part where 3^{x-2m} \equiv 3^x. And we can rely on 3 being coprime with n.

One problem is that n might be close to x (our original assertion only required n<x), which raises two questions:

having just a bit over x/2 flips can’t cover all the residues of n
those flips won’t be in an arithmetic progression in the Sturmian word base v

For point 2, let’s trying dropping the Sturmian word, and use this simpler construction as a base v instead:

(110)^{\lfloor x/2 \rfloor}(0|1)0^*

Now we get the arithmetic progression back. It’s funny that the (110)^* idea was part of my hand solution for n=5.

In this case, I don’t think we need the k = \lceil x \log_2(3) \rceil restriction anymore. More 1s will result in a negative cycle, and if we have fewer ones, we just replace some 110 with 010, preserving the flip locations. (But thank you for reminding me to start a topic on whether k = \lceil x \log_2(3) \rceil is ok, which I will do!)

For point 1, I think > x/2 flips should actually suffice if the residues are all distinct, because in that case, the sum of the differences produced by 2 flips (suitably chosen) should provide the right residue to force \beta(v^\prime) \equiv 0 \mod n.

Suppose \beta(v) \equiv c, and the flip-difference mods are \{d_1, d_2, \ldots d_m\}. Consider also a parallel list \{-c-d_1, -c-d_2, \ldots -c-d_m\}.

Since m > x/2 > n/2, there will be at least one overlap in the values of the two lists. Its location in the first list gives us one flip, and its location in the second list gives us the other flip. For example, if d_1 = - c - d_2, then \beta(v^\prime) = c + d_1 + d_2 = c + (- c - d_2) + d_2 \equiv 0 \mod n.

It seems surprising that two flips on a simple, regular parity sequence would be enough to turn the equation into a Collatz counter-example … interested in any comments …

oros · June 21, 2025, 1:01pm

Ok, so there is more work required to address all cases where n<x, as in Tao’s argument.

Actually, I just checked back the post of Tao which only mentions the constraint n \ll x for the absence of solution, which probably means n \leq c\,x for some constant c, as this notation is commonly used in the context of number theory to denote a Big-O upperbound (since Vinogradov).

So, we need to show that there are finitely many solutions for which n<x, then it will be easy to adjust the constant c and get the desired result.

mathkook · June 21, 2025, 1:12pm

I saw that too… I wonder if Tao was just being conservative there, or if his proof requires sufficiently large x to go through. Maybe another proof doesn’t need it.

FWIW, I don’t think there are any solutions to 2^k-3^x=n where n<x.

I mean, I don’t think there are any known solutions at small x, like there are in the case of Ellison’s bound, which requires x > 17. I think Rhin’s bound also has no exceptions, but it is crazy conservative at low x.

mathkook · June 21, 2025, 3:57pm

Not only do we not have to worry about the usual “small x” cases like 2^5-3^3 = 5, since n=5 isn’t less than x=3 … we don’t even have to worry about the usual suspect, 2^1 - 3^0 = 1, since n=1 isn’t less than x=0. Same for 2^2-3^1=1.

I took a look at Tao’s proof sketch, and I think maybe he is starting with a randomly-selected parity vector as base. There he says he expects \gg x flip locations. Which looks like shorthand for "for large enough x, you’re guaranteed there’s some parity vector (out of the zillions) with the property that there are at least (say) \frac{1}{2} x potential flip locations.

I can’t get the rest of his geometry … but it seems simpler to construct an adequate base vector than to prove that one must exist …

oros · June 21, 2025, 4:24pm

As for the equation 2^k-3^x = 7, it is indeed not “Baker-hard” as it was completely solved around 1936 by Herschfeld in The equation 2^x-3^y=d. The proof that (4,2) is the unique (non-zero) solution is clearly elementary and surprisingly simple. So I slightly corrected my previous post.

mathkook · June 21, 2025, 4:32pm

It is such a freaking cool history. Herschfeld made a series of little proofs concluding

2^k - 3^x > a whenever x > c

covering values of a up to 100.

He just couldn’t generalize to 2^k - 3^x > f(x), because … not elementary!

mathkook · June 21, 2025, 6:11pm

I don’t think I answered this well before. In my original post, I should have said, “If I claim to have found 2^k-3^x = 5 for some x > 5, you can refute my claim … just go directly to the cycle construction step, and certainly don’t worry about any cases where x \leq 5.”

The reason I moved on to 7 (and skipped 5) is way too subtle! The final step of the construction is to take the parity sequence whose first member is an integer … then rotate it to find the other integers in the cycle. This only works if 2^k - 3^x is divisible by n … which it surely is, if I claim it equals n! But for the concrete k and x I’d already picked for my example to come, 2^k - 3^x was not in fact divisible by 5, so the very last part of the concrete demo wouldn’t have been pretty.

In a general proof, this problem shouldn’t arise.

mathkook · June 21, 2025, 6:58pm

Oh, maybe the Sturmian word was the right idea after all, to get the residues to be distinct, instead of (110)^* 0^* ?? I can’t quite visualize the general proof that the residues will be distinct.

Either way, we won’t have all the residues covered, but more than half of them, so two flips instead of one should work.

mathkook · June 23, 2025, 3:04pm

Dangit, it’s proving pretty hard to find a general construction to turn any claim like 2^160 - 3^100 = 65 into a Collatz counter-example cycle. I’ll keep at it.

oros · June 23, 2025, 4:53pm

Yes, a large number of cases may be tough to handle if you haven’t enough flipping choices.

Very likely, it is sufficient to only consider the cases given by a constraint of the form n < c\,x with c a positive constant that can be freely adjusted so as to ease the finding of a general construction. If n is much smaller than x, that should be easier to prove the existence of a cycle of length k.

soitgoes · June 28, 2025, 11:56pm

Ellison’s bound can be improved to 2.93^x when x > 200. I just wrote a little program to get that number and reproduce the 2.56^x bound as well. I’d guess someone could just follow the work in his paper to formally prove this limit.

mathkook · July 1, 2025, 10:47am

@soitgoes Very interesting! Makes me curious if the 2.93 is an empirical curve-fit, versus, say, the 2.56 which is proved in Ellison’s paper.

soitgoes · July 2, 2025, 6:33pm

Here is the program I used. Since Baker showed it diverges after any negative results for a base. As a bound it doesn’t seem tight enough to be of use for Collatz so it is harder. It shows where the 2.56 comes from. It’s just a coincidence that it matches: 2^8 / 10^2

# Print out values for different bases for Baker’s bound: 2^K - 3^L > 2.5^L
# If a negative value is printed then the bound is invalid for that length.
#
import math

for L in range( 5, 646 ):

Base = 2.93 # L > 200
Base = 2.56 # Ellison: L > 17
Base = 2.5 # Baker: L > 17

print( L, Base, float(2 ** K - Base ** L) - float(3 ** L) )

oros · July 4, 2025, 6:30pm

And here is an attempt to get closer to a general proof that any solution to the equation 2^k -3^x = m with 0<m< c x implies the existence of a non-trivial Collatz cycle. We leave the positive constant c \leq 1 undetermined for now …

Thus, we assume the existence of such m \leq cx. It is not difficult to see that m is odd and is not a multiple of 3.

The case k > \frac{\log 3}{\log 2} \, x + 1 is easy to rule out since it implies
2^k > 2^{\frac{\log 3}{\log 2} \, x + 1} = 2 \cdot 3^x > 3^x + c x.

For the case k = \left\lceil \frac{\log 3}{\log 2} \, x \right\rceil, we call v_0 the vector matching the first k digits of the upper Sturmian word of slope \frac{\log 2}{\log 3} starting as
(110)^3(10)(110)^2(10)…

The number of 1’s in v_0 is equal to x. Since there are no two consecutive 0’s in v_0, the number of occurrences of 10 is equal to the number z of 0’s:
z = k-x > \left( \frac{\log 3}{\log 2} - 1\right) x > 0.58 x.

Next, we define the vectors v_i obtained by flipping the i^{\rm th} occurence of 10 in v_0 for i=1,\ldots,z.

Express the differences d_i = \beta(v_i)-\beta(v_0) under the form
d_i = 2^{a_i} \cdot 3^{x-1-b_i}
where
1=a_1 < a_2 < \ldots < a_z < k
1=b_1 < b_2 < \ldots < b_z < x
Empirically, we also have a_i = \left\lfloor \frac{\log 3}{\log 2} \, b_i \right\rfloor.

We want to show that all non-zero congruence classes modulo m are covered by the sequence of d_i values, which is only possible if m < z . So we set c = \frac{\log 3}{\log 2} - 1 \simeq 0.58.

Observe that d_{i+1} - d_i = \pm 2^{a_i} \, 3^{x-1-b_{i+1}} is never divisible by m (unless m=1), so that d_i and d_{i+1} are not in the same class for any i.

More generally, for i<j, we have d_j - d_i = 2^{a_i} \, 3^{x-1-b_j} \left( 2^{a_j-a_i} - 3^{b_j-b_i} \right) which is divisible by m iff \delta_{i,j} = 2^{a_j-a_i} - 3^{b_j-b_i} is divisible by m. The question is now whether the numbers of the form \delta_{i,j} can be divisible by m = 2^k-3^x.

Here we can use the fact that whenever 2^k-3^x divides 2^A-3^B, then it also divides 2^{k-nA}-3^{x-nB} for any n as long as the exponents are non-negative. If we repeat this process, we should be able to rule out this case. However, it is far from straightforward to finalize this proof and I leave it as a challenge…