Questions about initial trajectories

A lot of 3n+1 work is about beginnings of trajectories, rather than middles or ends, because beginnings are more predictable.

Terras showed there’s a one-to-one mapping between numbers 0 to 2^n-1 and the parity sequences of length n.

Also, if j \equiv m \mod 2^n, then j and m have the same n-length initial parity sequence.

What more is known about these initial parity sequences?

0 always has parity sequence 0^n.

2^n-1 always has parity sequence 1^n.

Can we say anything about j's parity sequence if j is not 0 or 2^k-1?

For n=4, j written in reverse-binary predicts its parity sequence over half the time. Is this an accident?

j parity rev(bin(j))
0 0000 0000 ←
1 1010 1000
2 0101 0100
3 1100 1100
4 0010 0010 ←
5 1000 1010
6 0110 0110 ←
7 1110 1110 ←
8 0001 0001 ←
9 1011 1001
10 0100 0101
11 1101 1101 ←
12 0011 0011 ←
13 1001 1011
14 0111 0111 ←
15 1111 1111 ←

Are there any partial patterns that hold for arbitrary n? Any progress at all since Terras and Lagarias raised this question?

After n steps, 0 winds up at 0.

After n steps, 2^n-1 winds up at 3^n-1.

T^n(j) = \cfrac{3^x}{2^n} j + E_j(n), where x counts odds.

What’s the expected (mean) E_j(n) for 1 \leq j \leq 2^n? Rozier and Terracol [2025] answer this in Lemma 2.3 of their paradoxical paper as n/4 (independent of j!)

R&T also mention that much more remains to be discovered about the distribution of E_j(n) values, besides the mean. This seems like a crux of the Collatz conjecture; those sums of 2^i 3^j seem to have unpredictable properties (such as their divisibility by 2^n - 3^x) … but maybe more can be said about them in the aggregate.

For a number j chosen between 0 and 2^n-1, what’s the expected value (mean) of the whole wind-up point T^n(j), as function of j and n? I put this at j + n/4 … which is counter-intuitive … because “almost all” numbers go below themselves after n steps, not above themselves … but the rare high-flying outliers really throw off the mean.

The median wind-up point is more like j (\frac{\sqrt{3}}{2})^n + n/4.

Anything relevant on any of these questions? Or any other questions/answers about initial trajectories?

The run length in a run can be determined by the low bits in the seed. The run length is the number of odd steps until the sequence goes below the seed. The low order seed bits determine initial steps. If you mask off the low seed bits then certain values always produce runs of the same length.

Here a leading hash sign denotes a hexadecimal number. The vertical bar delimits a list of values. The and bitwise operator is: /\

 IF Seed /\ 3 = #1  then Length = 1
 IF Seed /\ #f = #3 then Length = 2
 IF Seed /\ #1f = #b |#17 then Length = 3
 IF Seed /\ #7f = #7 | #f | #3b then Length = 4
 IF Seed /\ #ff = #27 | #4f | #5f | #7b | #af | #c7 | #db then Length = 5

This process blows up exponentially. The list of constant values for a Length of 25 grows to 820_236_724 values with a 40 bit mask.

1 Like

If I understand well, you (@mathkook) try to compare the parity vector of j of length n (written from left to right) with the reversed binary representation of j on n bits. In fact, it is common to view the parity sequence as a 2-adic integer (which is simply an integer in base 2 with infinitely many digits) whose digits are most often written from right to left (the least significant digit is on the right end). With this convention, the reverse operation becomes superfluous. Like in Lagarias’ paper, we can denote by Q(j) the 2-adic integer corresponding to the parity sequence of j (i.e., its k^{th} digit gives the parity of the k^{th} iterate). Then it is known that Q(j) \equiv j \pmod{4} for any j, so that the first 2 bits are always the same. What is even more striking is that, if you consider the inverse Q^{-1} of Q, then you have Q^{-1}(j) \equiv Q(j) \pmod{16}. The first 4 bits are the same, and it is tempted to assume that Q^{-1} = Q. Unfortunately, this is not true anymore if we look at the fifth bit. Things are getting increasingly complicated as n increases!

Regarding the mean of T^n(j) for 0 \leq j \leq 2^n-1, I’m not aware of a known formula for this particular mean. But there is an abundant literature dealing with the distribution of the sum-of-digits functions . For example, if you denote by C_j(n) the coefficient \frac{3^x}{2^n}, it is striking that the mean of C_j(n) is equal to 1. It does not depend on n. And this remains true for k \leq j < k+2^n for any choice of k and n. Hence, your estimation of j+ n/4 seems relevant. More formally, when replacing j by its mean, we get the estimation (2^n-1)/2+n/4. But once again, things are more complicated. Empirically, the mean is higher. The main reason is simply that the mean of a product is not equal to the product of means. That would be too simple! Integers whose binary representation has a lot of 1’s are more likely to have odd iterates. In other words, the binary representation of an integer is rather strongly correlated with its parity vector, as was suggested by your first observation.

Another question might be to determine the variance of E_j(n) as a function n, whereas the variance for C_j(n) is probably already known…

1 Like

Some interesting info about E_j(n) PARTIAL SUMS OF THE 3X+1 ACCUMULATION TERM

1 Like