Which Collatz cycle shapes correspond to integer cycles?

mathkook · June 21, 2025, 4:12pm

Think of a cycle shape as a series of ups and downs, such as up-down-up-up-down-up-down (1011010). There’s only one number that returns to itself after those operations. In the case of 1011010, it’s

\cfrac{2^0 3^3 + 2^2 3^2 + 2^3 3^1 + 2^5 3^0}{2^7 - 3^4} = \cfrac{119}{47}.

which is not an integer. If you’re uncomfortable with applying the Collatz rule to fractions, just use the numerator for the odd/even test, then see if you can take \cfrac{119}{47} back to itself, while confirming that the shape you follow corresponds to 1011010. The first step would be (3 \cfrac{119}{47} + 1)/2 = \cfrac{249}{47}.

The only shape known to return an integer to itself is: up-down (10). If you start with 1 and go “up-down” you go 1 \rightarrow 2 \rightarrow 1. Actually, the shapes 01, (10)^*, and (01)^* also return 2, 1, and 2 back to themselves, respectively.

If we could rule out all the other shapes, we’d have proved the Weak Collatz Conjecture (no cycles).

Here are all the cycles of length 8 with 5 ups (odds), along with the members drawn in:

I think it’s interesting to look at the individual numbers in the chart. Usually, we look at concrete numbers to double-check our abstract assertions or sweeping generalizations, but occasionally it’s the other way around – studying the numbers can reveal a pattern.

Some interesting things to observe:

All members of (8, 5) cycles fall in a range between 16 and 130.
No (8, 5) cycles consist of integers. That is, none of the numerators are ever multiples of 13. If you think of the numerators as random (and heck, I don’t blame you!), you will attribute “no (8, 5) cycles” to bad luck. I mean, come on (8, 5), you had seven chances to make some numerator that’s a multiple of 13 ! (Note: 7 chances, not 56 chances; easy math says, “If one member of a cycle is an integer, they all are.”)
Two cycles seem distinguished: (1) the outer ring, called the circuit, and (2) the inner ring, called the high cycle. The circuit has the “highest of all cycles’ high members” and the “lowest of the low members.” The high cycle has the “highest of the low members” and the “lowest of the high members.”
The high cycle has two members that we can strategically subtract from each other: \cfrac{485}{13} - \cfrac{421}{13} = \cfrac{2^6}{13}, which can’t be an integer, since 13 is odd. If the high cycle consisted of integers, this pair’s difference would also be an integer … which it isn’t … so it doesn’t.
It turns out every high cycle of length k has two members whose difference is \cfrac{2^{k-2}}{2^k - 3^x}. So no high cycle (on any other chart) is a Collatz counter-example.
Other cycles also have pairs of members with this property, but it always seems to be an “accident.”
Consider the first two members of the circuit, \cfrac{211}{13} and \cfrac{323}{13}. Their difference is \cfrac{112}{13} = \cfrac{2^4 \cdot 7}{13}, which is an integer only if \cfrac{7}{13} also is. But it isn’t, so this circuit doesn’t consist of integers.
If you try this for larger circuits, the numerator 2^{k-x}-1 (in the above case, 7) always seems smaller than the denominator 2^k - 3^x (in the above case, 13). But proving this currently requires Baker’s heavy artillery showing the denominator grows almost as fast as 3^x … that is, no near collisions between 2^k and 3^x are possible.
What about taking three members of a cycle, say a, b, and c, and showing that a+b-c is of the non-integral form \cfrac{2^i 3^j}{2^k - 3^x} ? That would rule the cycle out. Or n members, or all members? I’ve studied the heck out of this. When I look at increasingly big cycles, I always seem to find one that resists this sort of reasoning.
One thing circuits and high cycles have in common is that members of the “subtraction pair” always have very similar parity vectors. For example:

Circuit a: 1111{\color{red}1000}
Circuit b: 1111{\color{red}0001}

High cycle a: 101101{\color{red}10}
High cycle b: 101101\color{red}{01}

In both cases, they share a long prefix. I figured out that if the shared prefix is half the length (or more), then Baker’s theorem is strong enough to rule the cycle out. (I haven’t written this up … should I? Is it known?) Anyway, I like this result, because it addresses all kinds of other circuits, such as regular ones like (110)^* 0^*:

Regular a: 110110110110{\color{red}11000000}
Regular b: 110110110110{\color{red}00000110}

It’s sobering, though, that “almost no” cycle shapes are self-similar enough. A randomly-selected parity sequence will have no two rotations with a shared prefix longer (roughly speaking) than \log_2(k), much lower than the needed k/2.
It’s also possible to add and subtract members across different cycles, looking for patterns and conclusions. I’ve had limited success with that as well, but who knows?
See anything else in the chart?

Failix · June 21, 2025, 11:41pm

Is it intentional not to include negative cycles? Or the zero cycle?

cycle[0]: 0/(2^1-3^0) = 0/1 = 0
cycle[1]: 2^0*3^0/(2^1-3^1) = -1
cycle[110]: (2^0*3^1+2^1*3^0)/(2^3-3^2) = 5/(-1) = -5
cycle[11110111000]: (3^6+2*3^5+4*3^4+8*3^3+32*9+64*3+128)/(2^11-3^7) = 2363/(-139) = -17
(and it seems like that’s all)

If one cycle is circuit-ish (meaning it’s almost a circuit except for a few bugs, like 111111111100000010) then the difference between two consecutive terms is relatively simple.
For example, the difference between the generators of 111111111100000010 and 111111111000000101 is (2^9*3^1+2^16*3^0-2^15*3^1-2^17*3^0)/(2^18-3^11) = -162304/84997
The numerator is going to have 2(1+B) terms, with B the number of “bugs”

mathkook · June 22, 2025, 8:03am

Great! This looks like a good visual explanation for near-circuits not having integer members.

It’s a good point about 11110111000. We could draw a similar chart for the 30 cycles of length 11 with 7 ups/odds. Magically, one of those cycles would be composed of integers, albeit negative ones.

Failix · June 22, 2025, 1:45pm

Btw, the sign only depends on the length of the cycle and the number of odds: your numerator is a sum of positive integers and your numerator is 2^k-3^x.
2^11-3^7 = -139: you can only have negatives cycles.

oros · June 22, 2025, 5:24pm

Definitely yes!

Never heard of it before you mention this (in a video).

mathkook · June 24, 2025, 9:52am

Sure! Here’s the argument I have.

Claim: Any k-length parity vector with two rotations that share a prefix longer than k/2 cannot induce a Collatz counter-example cycle.

Corollaries: This rules out circuits, high cycles, regular cycles, and others.

Proof:

Suppose some Collatz cycle is induced by a k-length parity vector u with x 1s.

Now suppose condition A holds:

Parity vector u has two rotations v and w that share a prefix of length greater than k/2.

Let v and w correspond to cycle members \cfrac{\beta(v)}{2^k - 3^x} and \cfrac{\beta(w)}{2^k - 3^x}.

Assume wlog the latter is greater.

If those two members are integers, then so is their difference:

\cfrac{\beta(w) - \beta(v)}{2^k - 3^x}

We going to show, by way of contradiction, that this difference is not (in fact) an integer, so … any cycle where condition A holds cannot be a Collatz counter-example.

Here are examples of w and v:

w = ab = 101011{\color{red}01011}
v = ac = 101011{\color{red}10101}

Let k^\prime = |b| = |c| < k/2, and let x^\prime be the number of 1s in b (or equivalently, c).

To compute \beta(w) - \beta(v), terms from the shared prefix cancel out, so

\beta(w) - \beta(v) = 2^{k-k^\prime} (\beta(b) - \beta(c))

We can get rid of the power of two, because

\cfrac{\beta(w) - \beta(v)}{2^k - 3^x}

is only an integer when

\cfrac{\beta(b) - \beta(c)}{2^k - 3^x}

is an integer.

Now we’ll show the denominator outstrips the numerator, so the last expression can’t, in fact, be an integer.

How big can the numerator \beta(b) - \beta(c) get? In the maximum scenario, b will push all the 1s to the right, and a will push them to the left. With a little effort, we discover that the difference is totally maximized in the case where x^\prime = k^\prime - 1:

b = 0 1^{x^\prime} = 0 1^{k^\prime - 1}
c = 1^{x^\prime} 0 = 1^{k^\prime - 1} 0

\beta(b) = 2 (3^{k^\prime-1} - 2^{k^\prime-1})
\beta(c) = 3^{k^\prime-1} - 2^{k^\prime-1}

Now we have:

\beta(b) - \beta(c)
< 3^{k^\prime-1} - 2^{k^\prime-1}
< 3^{k^\prime}
< 3^{k/2}
< 1.74^k

For the denominator, we have Ellison’s theorem, applicable when k > 27:

2^k - 3^x
> \cfrac{2^k}{e^{k/10}}
> 1.8^k

Thus, the denominator barely outstrips the numerator, so the main result obtains.

You can see that a shared prefix of k/2 is close to the breaking point where Ellison’s theorem won’t suffice. A better result should be possible using Rhin’s theorem.

However, most parity vectors have self-overlap on the order of \log_2 k rather than some constant proportion of k.

Interested in any comments / corrections / errors

oros · June 24, 2025, 9:33pm

The proof is rather simple to follow and looks good to me. Well done! Everything is elementary except the final part with Ellison’s result. I recall the reference for the interested reader : On a Theorem of S. Sivasankaranarayana Pillai.

I noticed a few details here and there that could be corrected without altering the validity of the proof:

replace a with c before “will push them to the left”;
replace “<” with “=” after \beta(b) - \beta(c).

FWIW, the final result can be slightly improved as follows:
3^{k^\prime} <1.8^k is equivalent to k^\prime < k \frac{\log 1.8}{\log 3} \simeq 0.535\, k.
Thus, the length of the shared prefix need only be at least 0.465 \,k (less than k/2).

I made the calculation with Rhin"s theorem and found an asymptotic threshold close to 0.37\,k, which is substantially better! The breaking point occurs for k^\prime \simeq x \simeq k \log_3 2, which is probably the best we can do with this approach.

oros · June 24, 2025, 10:41pm

It appears that we can do better if we further assume that the 1’s are well-balanced accross v and w, which is the most frequent case for arbitrary vectors (i.e., the one for which the entropy is maximized). The idea is to avoid the worst-case scenario x^{\prime} = k^{\prime}-1 and assume that x^{\prime} \simeq k^{\prime} \log_3 2.

With such an assumption, we are led to the constraint (3/2)^{ k^{\prime} \log_3 2}\,2^{ k^{\prime} } < \alpha^k where \alpha = 1.8 if we use Ellison, otherwise take \alpha = 2 with Rhin. After some quick and dirty calculation, I find that the length of the prefix should exceed 0.38 \,k with Ellison, or 0.27\,k with Rhin (not that far from k/4), if I’m not mistaken…

jeff317 · August 21, 2025, 3:47am

For the center cycle to have a minimal range, does the set of all down linked path segments have to be single steps?

mathkook · August 21, 2025, 6:07am

@jeff317 Yes, if the cycle length k is ceiling(x log 3), where x is the number of up-moves, the center cycle is a concatenation of “up-up-down” and “up-down” segments. For the center cycle of length 65 with 41 up-moves (1) and 24 down-moves (0), it’s 11011011010110110101101101101011011010110110110101101101011011010. If x is very small relatively to k, then there will necessarily be stretches of zeroes.