Parity sequences and cycles

@Failix posted here about how a divergent trajectory’s parity sequence can’t be periodic.

While a cyclic trajectory’s parity sequence can (and must) be periodic …

So which periodic parity sequences induce integer cycles?

Every parity sequence (when repeated) induces some rational cycle, eg,

1110: -\frac{19}{11}, -\frac{23}{11}, -\frac{29}{11}, -\frac{38}{11}, -\frac{19}{11}, \ldots

Here are the parity sequences (up to a rotation) known to induce integer cycles:

0: 0, 0, 0, \ldots
1: -1, -1, -1, \ldots
10: 1, 2, 1, 2, \ldots
110: -5, -7, -10, -5, \ldots
11110111000: -17, -25, \ldots, -17, \ldots

(There’s a general formula for taking a parity sequence and identifying the first member of its induced cycle. It’s gnarly, involving summing products of powers of ascending powers of 2 with descending powers of 3. An example might be better than the formula … for the case of 11110111000, it goes like this: \cfrac{2^0 3^6 + 2^1 3^5 + 2^2 3^4 + 2^3 3^3 + 2^5 3^2 + 2^6 3^1 + 2^ 7 3^0}{2^{11} - 3^7} = -17.)

The Collatz conjecture says “there are no more such parity sequences.”

One approach to the conjecture is start ruling out classes of parity sequences.

Let k be the length of the sequence, and x be the weight (# of 1s).

Right away, we can rule out any sequence with 2^k < 3^x, or k < x \log_2(3) (\approx 1.58x), because it’ll be a negative cycle.

For positive cycles, we can also provably rule out (sufficiently-long) circuits and other regular-shaped parity sequences:

1^* 0^*
(110)^* 0^*
(1110)^* 0^*

The proof method is to identify a closed-form representation of the lowest members of cycles corresponding to parity sequence class, then show that they can’t be an integers. For example, for the circuit, that is \cfrac{3^x-2^x}{2^k-3^x}.

The circuit is the (k,x)-cycle whose lowest member is the smallest of all the (k,x)-cycles.

The high loop (for want of a better term) is the one whose lowest member is largest of all the (k,x)-cycles. The high loop’s parity sequence turns out to be the one that spreads the 1s as evenly as possible. It’s a bit irregular. For example, the (8,5) high loop has parity sequence

11011010.

But we can also provably rule out all high loops.

Since the smallest member of a (k,x) high loop seems to have no closed form, the proof works by demonstrating that the difference between two certain members of a high loop can’t be an integer. (Thus, the loop can’t consist of integers.)

The idea is to rotate the parity sequence to identify two loops members, eg:

101101\underline{01}
101101\underline{10}

Since the sequences overlap so much, the difference of the loop members works out to something simple: \cfrac{2^{k-1}-2^{k-2}}{2^k-3^x} = \cfrac{2^{k-2}}{2^k-3^x}, which is not an integer, because the denominator has an odd factor.

This same idea can be extended back to the cases ruled-out earlier. For example, we can rotate a circuit like this:

1111\underline{0001}
1111\underline{1000}

and the members’ difference is \cfrac{2^{k-1}-2^{k-x-1}}{2^k-3^x} = \cfrac{2^{k-x-1}(2^x-1)}{2^k-3^x}, which is only an integer when \cfrac{2^x-1}{2^k-3^x} is. By heavy-artillery Ellison’s theorem, this expression’s value is always less than one, because 2^k-3^x grows faster than 2.56^x.

So we can rule out any parity sequence with two rotations that are self-similar enough, ie, share a long-enough prefix p … where “enough” works out to about |p| > k/2. (Otherwise, Ellison’s theorem doesn’t provide a strong enough constraint.)

Now, most parity sequences don’t have this property. For the vast majority, the maximum similarity of two rotations is around \log k.

So that’s Part One of this story … what’s Part Two?

Possible directions:

  1. Which other classes of parity sequences can be ruled out, ie, can’t correspond to integer cycles?

  2. If we can rule out parity sequences A and B, can we somehow extend the ruling-out to sequence C (derived from A and B)? And keep extending until all parity sequences are covered?

Any thoughts?

Very nice!

The thing I am looking into is parity sequences that already are a solution to another tricot: for example in:
2k → k
2k+1 → 3k+4

all the sequences of 3n+1 still work fine:
0: 0, 0, 0…
1: -5,-5…
10: 5,10,5…
110: -25, -35, -50
11110111000: -85…
(notice that’s only multiples of 5)

but we get new sequences:
100: 1,4,2…
11100: 19,31,49,76,38,19
and those ones can’t correspond to integers in 3n+1

1 Like

There’s a really interesting paper from Rozier and Terracol that sits adjacent to this question.

In a nutshell: They identify a set of orbits - or more properly “orbit sequences” - that break the “Terras rule” about descending below the original starting value n, then rising back up. They track a type of parity ledger and show that they’ve only found a tiny number of these types of orbits up to a very large search space AND that for Collatz to be true there needs to be an infinite number of these kind of orbit sequences.

It’s a pretty neat application of classic heuristics (and an unexpected rebuttal to one of the “grandfather” ideas behind the conjecture) and adds a next-step that reframes the central questions. I dont’ know that any of the items involved in moving their work to general contradiction are easier than Collatz itself, but they’re certainly a different kind of analytic path.

It took me a bit to fully parse how they do their counting (when are you allowed to stop and call a sequence a sequence) so pay close attention to that bit if you give the paper a read.

1 Like

Super cool result!

Let me see if I can roughly capture it.

Starting with something like 38 and its k=3, x=2 initial parity vector:

38, 19, 29, \ldots 44
011

Terras knows that \cfrac{3^2}{2^3} > 1 reliably predicts that 44 > 38. With more than \approx 63\% odds, you’ll stay above yourself.

Reverse-wise, starting with something like 25 and its initial k=5, x=3 parity vector:

25, 38, 19, 29, 44 \rightarrow 22
10110

Terras observes that \cfrac{3^3}{2^5} < 1 also fairly reliably predicts that 22 < 25.

If that always happened, there would be no integer cycles, because \cfrac{3^x}{2^k} is never going to hit 1 exactly, so after any k steps, start number n is going to reach an m where either m>n or m<n. (That seems right to me, though it seems simpler than what’s out there in the literature, so it might be a gloss.)

However, R&T point out many exceptions to Terras’ observation, such as

7, 11, 17, 26, 13, 20, 10, 5 \rightarrow 8
11101001

where \cfrac{3^5}{2^8} < 1 but somewhat unexpectedly 8 > 7.

Their paper identifies 593 such exceptions involving start numbers n < 4614 and none others with n < 10^{19}. Beyond that, who knows. But they can say that if there are no more Terras-exception parity sequences beyond that, then there are no (non-trivial) Collatz integer cycles.

1 Like

Pretty much. The paper defines a term C_{j}(n)=3^{q}/2^{j}. That term only depends on the parity vector.

If you take any starting number n and run j collatz steps, you will end up with q steps that were multiply-by-3-and-add-1. (odd steps). After making those j steps, the new value is (3^q/2^j)*n+“some non-negative extra amount” (and C_{j}(n)=3^{q}/2^{j})

If C_{j}(n)\ge1 then the ending term of the parity vector is \ge n all the time
If C_{j}(n)<1 then things can go either way. Usually the ending term of the parity vector is \le n but every once in a while it’s bigger.

That’s a surprise. That little “some non-negative extra amount” can sometimes be just enough to push the result back above the start. They tried to find as many of these surprising segments as possible, and only found a tiny number, even though they searched a big space. That’s not proof that more don’t exist, but if they don’t…then Collatz is true.

One point: A non-trivial cycle must have C_{j}(n)<1 and a really carefully balanced positive remainder. Cycles aren’t ruled out by 3^{x}/2^{k}\neq 1

2 Likes

Okay, perfect – yes, I thought my simple explanation wasn’t quite right. That non-negative extra amount has such an annoying form that I tried to put it out of my mind :slight_smile: But right, that’s the crux of it.

These paradoxical sequences are very interesting. I see that they define paradoxical sequences of all lengths as distinct. So if there were a non-trivial cycle starting from n of period p (T^p(n) = n), then that would lead infinitely many paradoxical sequences C_{kp}(n) < 1 for all k \ge 1. So if there are only finitely many paradoxical sequences that precludes non-trivial cycles.

It’s not yet clear to me why a divergent trajectory requires infinite paradoxical sequences (haven’t read that far in the paper yet). It feels like this would depend upon proving that all parity sequences are sufficiently balanced (such that for all n, for sufficiently large j: C_j(n) < 1), but I’m guessing they found a better path :slight_smile:

Sequence n:1,3,5,7,9,11,13,15,17
Sequence s: 1,2,3,4,5, 6, 7, 8, 9

Note that i have given the odd numbers a sequential position, note that as in the example 7,11,17 is represented by s starting at 4 rising 4,6,9. This pattern of odd numbers moving sequentialy ×1.5 can be proven to hold in all odd numbers in even s positions, also demonstrating that for all n in odd s positions the next odd will be less than the original n. Is this of any interest or relavance?

@Fumblingnumbers pretty cool indeed – you might be interested in a similar idea posted by @PeliKinesis, A different way of looking at the Collatz function

Just checked this out, its cool to see what people derive from this problem, i tracked the movements of odd to odd numbers and created a flow chart which revealed a clear pattern, the same pattern @PeliKinesis system produces. Just in a different format. I would never have seen this 1 -1 2 -2 way of looking at it.im currently working on bijections between 3 sets, even set, odd set and sequential set. The equations that join the domain to the codomains in these bijections reveal some interesting structural implications.not sure how far this method goes though.

Ah, in order to prove for divergent sequences they show that for all divergent n, there are an infinite number of sequences C_b(2^k n) < 1 but T^b(2^k n) > 2^k n. In other words, they don’t depend on the parity sequence of n being balanced, they just add enough even transitions to the beginning to push C_b(2^k n) < 1 while keeping T^b increasing.

1 Like

Even nicer result. If I understand right,

If n is divergent, then for whatever k we pick,

T^b(2^k n) > 2^k n is true when b = k +l, where l is the number of steps required by n to reach 2^k n (via x odds),

meaning

C_b(2^k n) = \frac{3^x}{2^{k+l}},

which is less than 1 for any k > x \log_23 - l that we might pick, giving rise to an infinity of paradoxical sequences … so if there isn’t an infinity of paradoxical sequences, then there aren’t any divergent numbers.

1 Like

I’m trying to figure this out, but it feels a little circular. So for any divergent n and any k, we choose the minimum l such that T^{k+l}(2^k n) = T^l(n) > 2^k n. Such an l must exist b/c n is divergent. That path contains x odd numbers. Thus

C_{k+l}(2^k n) = \frac{3^x}{2^{k+l}}

But how do we know that k+l > x \log_2(3). If we choose a bigger k that might also increase the l and x, so it’s not clear to me how to make sure this inequality holds. From reading the paper it seems that there is a bit of effort they do to bound the “error term” E_j(n) = T^j(n) - C_j(n) n

Oh, you are right – it’s not l, but rather l_k.

Hello everyone,

Love questions 1. and 2.

Just wanted to advertise that my Python library coreli has built-in features to play with parity vectors (= parity sequences):

>>> import coreli
>>> pv = coreli.ParityVector([1,1,1,0])
>>> pv.cyclic_rational()
-19/11

If you are interested in the p-adic expansions of these rationals:

>>> import coreli
>>> pv = coreli.ParityVector([1,1,1,0])
>>> r = pv.cyclic_rational()
>>> Z2 = coreli.Padic(2)
>>> Z3 = coreli.Padic(3)
>>> Z6 = coreli.Padic(6)
>>> Z2.from_rational(r).rational_periodic_representation()
'(1000101110)* 0111'
>>> Z3.from_rational(r).rational_periodic_representation()
'(22011)* 1'
>>> Z6.from_rational(r).rational_periodic_representation()
'(2421031345)* 1'

:slight_smile:

Back to Rozier & Terracol 25, I think I get it now. They show that every divergent trajectory (should there be any!) creates paradoxical sequences.

Along the divergent trajectory, pick an n that never goes below itself (there must be an infinity of such n).

Then pick a certain pair (a, b) from the infinite set of pairs that make \cfrac{3^a}{2^b} closer and closer to 1. Like (12, 19), (41, 65), etc.

For any x just shy of 1, there’s guaranteed to be an (a, b) pair such that

x < \cfrac{3^a}{2^b} < 1

I don’t think you need the heavy artillery of Baker for this, just the 9mm Dirichlet.

So for our n, there’s guaranteed to be an (a, b) pair such that

1 - \cfrac{1}{4n} < \cfrac{3^a}{2^b} < 1

The latter inequality takes care of half the paradoxical requirement.

Now take the initial parity sequence of n, up until it sees a odds (length = j) … and prepend it with k zeroes such that b = j + k.

That gives the initial parity sequence of start number m = 2^k n.

The last thing is to prove the rest of the “paradox” by showing that after b steps, m has reached not less than m. In other words, the accumulation of all those little plus-ones is enough to overcome the “expectation” from the ratio of odds in m 's parity sequence.

Main claim:

T^b(m) \geq m.

Proof:

T^b(m) = \cfrac{3^a}{2^b} m + \cfrac{\beta(00.10111)}{2^b}

(Sorry for the \beta notation – that was my original way of summing the 2^i 3^j products. Here, 10111 pretends to be the initial parity sequence of n, and 00 pretends to be the prepended zeroes.)

T^b(m) = \cfrac{3^a}{2^b} m + 2^k \cfrac{\beta(10111)}{2^b}

= \cfrac{3^a}{2^b} m + \cfrac{\beta(10111)}{2^j}

> \cfrac{3^a}{2^b} m + \cfrac{\beta(11110)}{2^j}

Here, they move the 1s to the left – they’re saying, “even if the trajectory of n is all odds followed by all evens, which would yield a minimal increase from the little +1s.” And since the 1^*0^* trajectory has a nice, simplified form:

= \cfrac{3^a}{2^b} m + \cfrac{3^a-2^a}{2^j}

> (1 - \cfrac{1}{4n} )m + \cfrac{3^a-2^a}{2^j}

= m - \cfrac{m}{4n} + \cfrac{3^a-2^a}{2^j}

= m - \cfrac{2^k n}{2^2 n} + \cfrac{3^a-2^a}{2^j}

= m - 2^{k-2} + \cfrac{3^a-2^a}{2^j}

= m - 2^{k-2} + \cfrac{3^a}{2^j} - \cfrac{2^a}{2^j}

= m - 2^{k-2} + \cfrac{3^a}{2^b} \cfrac{2^b}{2^j} - \cfrac{2^a}{2^j}

> m - 2^{k-2} + (1 - \cfrac{1}{4n}) \cfrac{2^b}{2^j} - \cfrac{2^a}{2^j}

> m - 2^{k-2} + (1 - \cfrac{1}{4}) \cfrac{2^b}{2^j} - \cfrac{2^a}{2^j}

> m - 2^{b-j-2} + \cfrac{\cfrac{3}{4} {2^b} - {2^a}}{2^j}

T^b(m) > m + \cfrac{2^{b-1} - 2^a}{2^j}

Supposing we appended at least one zero (b > a), the second term is non-negative, so

T^b(m) \geq m.

I was wondering where the 1-\cfrac{1}{4n} comes from.

At first, it seems magically chosen to make all the +1s make m “unexpectedly” exceed itself after b steps.

But I guess considering all possibilities of the form 1-\cfrac{1}{cn},

the “shortfall” from the main term is going to be - \cfrac{2^k}{c} ,

while the “plus up” from the error term is (1 - \cfrac{1}{c} - \cfrac{1}{2}) 2^k,

so m will exceed itself after b steps if

(1 - \cfrac{1}{c} - \cfrac{1}{2}) 2^k > \cfrac{2^k}{c}

which happens for any c \geq 4, since the 2^k parts cancel out.

Other interesting points …

  1. Slack parts of the inequality derivation include:
    a. pushing all the 1s to the left in n's trajectory, and
    b. \cfrac{1}{4n} < \cfrac{1}{4}

  2. What’s required of the divergent trajectory seems not too demanding, i.e., some number n that stays above itself for long enough for a relevant (a, b) pair to be found. Of course, a divergent trajectory guarantees this, but do other non-divergent start numbers accomplish this as well? Information-theory-wise, it’s a rare n that could do the job, but it seems like maybe one would have to exist.

  3. The terms “remainder” and “error term” are interesting. E_b(m) does behave like an error term, such as used in prime-counting functions, where you have a similar balancing act to perform (@stargazer07817 's “carefully balanced positive remainder”), though it seems a little different, as it’s not making up for dropping a floor or ceiling function in the main term. Also, it’s interesting that the 1 - \cfrac{1}{4n} gets used in both the main term and the error term.

Re: Point 2 above … it looks like n would have to stay above itself for b \approx 2.8n steps for the proof to work. A divergent trajectory always has such an n, but it’s unknown whether (or not) any convergent trajectory can hold out that long.

So this is all way over my head, but if your simply looking for a starting n that stays above itself for a particular length of time (or number of collatz iterations) take 6(2^y-1)+4 this number will always continously increase ie (n÷2)×3+1÷2=nk>n.it will increase for y+1 iterations before converging on the first odd nk that goes below itself.hope this helps

@Fumblingnumbers Absolutely, right on. Every 2^y stays above itself for at least y steps (plus a few more, since it can’t teleport instantly back down to 2^y). In R&T’s proof, they might be looking for a number around 2^y that stays above itself for about 8^y steps, instead of y steps. Nobody’s found such a number, but nobody’s ruled it out either.