My paper “Binary expression of ancestors in the Collatz graph” gives a systematic way (using regular expressions) to describe the set of Collatz-ancestors of x such that there are k odd terms on the way to x (not including x).
It’s a mess ![]()
My library coreli used to compute them for you (using an old version, e.g. commit hash 8a12ea40)
Examples
5-type
The ancestors of 5 that see 3 odd terms (other than 5) on the way are described by the following regex:
(001011110110100001)*001011110110100(000111)*00(01)*01(0)*
| (001011110110100001)*00101111011010000(100011)*100(01)*01(0)*
| (001011110110100001)*001011110(110001)*1100(01)*01(0)*
| (001011110110100001)*00101(111000)*11100(01)*01(0)*
| (001011110110100001)*001(011100)*011100(01)*01(0)*
| (001011110110100001)*00101111011(001110)*0(01)*01(0)*
| (001011110110100001)*001011110110100(000111)*0001(10)*1(0)*
| (001011110110100001)*00101111011010000(100011)*10001(10)*1(0)*
| (001011110110100001)*001011110(110001)*110001(10)*1(0)*
| (001011110110100001)*00101(111000)*1(10)*1(0)*
| (001011110110100001)*001(011100)*01(10)*1(0)*
| (001011110110100001)*00101111011(001110)*001(10)*1(0)*
For instance, taking from the top branch we can construct 001011110110100 00 0101 01 corresponding to integer 1553429 of which Collatz sequence until reaching 5 is (odd integers marked with *):
* 1553429
2330144
1165072
582536
291268
145634
* 72817
109226
* 54613
81920
40960
20480
10240
5120
2560
1280
640
320
160
80
40
20
10
* 5
16-type
The regex for 16 and 3 odd integers is:
(100101111011010000)*1001011110110100(000111)*00(01)*01(0)*
| (100101111011010000)*100101111011010000(100011)*100(01)*01(0)*
| (100101111011010000)*1001011110(110001)*1100(01)*01(0)*
| (100101111011010000)*100101(111000)*11100(01)*01(0)*
| (100101111011010000)*1001(011100)*011100(01)*01(0)*
| (100101111011010000)*100101111011(001110)*0(01)*01(0)*
| (100101111011010000)*1001011110110100(000111)*0001(10)*1(0)*
| (100101111011010000)*100101111011010000(100011)*10001(10)*1(0)*
| (100101111011010000)*1001011110(110001)*110001(10)*1(0)*
| (100101111011010000)*100101(111000)*1(10)*1(0)*
| (100101111011010000)*1001(011100)*01(10)*1(0)*
| (100101111011010000)*100101111011(001110)*001(10)*1(0)*
For instance, taking an integer from the second branch you can construct:
100101111011010000 100101111011010000 100011 100011 100 01 01 01 0
Corresponding to integer 170803185867525674 of which Collatz sequence until seeing 16 is (odd integers marked *):
170803185867525674
* 85401592933762837
128102389400644256
64051194700322128
32025597350161064
16012798675080532
8006399337540266
* 4003199668770133
6004799503155200
3002399751577600
1501199875788800
750599937894400
375299968947200
187649984473600
93824992236800
46912496118400
23456248059200
11728124029600
5864062014800
2932031007400
1466015503700
733007751850
* 366503875925
549755813888
274877906944
137438953472
68719476736
34359738368
17179869184
8589934592
4294967296
2147483648
1073741824
536870912
268435456
134217728
67108864
33554432
16777216
8388608
4194304
2097152
1048576
524288
262144
131072
65536
32768
16384
8192
4096
2048
1024
512
256
128
64
32
16
A few points:
-
The number of branches of the regex is the same for all x (not multiple of 3) when you fix k, namely 2^{(k-1)} 3^{(k-1)(k-2)/2}, here with k=2 we get 2^2 \times 3 = 12 branches (See Theorem 1 there is an off-by-one error in the statement..), which is the number of branches we got in practice for 5 and 16 (separated by
|above). -
The size (i.e. number of characters) of the regex is also roughly the same at fixed k (when x is not a multiple of 3)
Coming back to your question:
Do the “5-type” and “16-type” sets both have positive density, or are almost all “5-type”?
The structure of the predecessor set of 5 and 16 is the same: infinite union of the regular expressions for each value of k, and each of these regex have the same structure. Hence, without being an expert about density arguments but given this 1-1 correspondence and knowing the structure of these regexes, I’m pretty sure it implies the densities are both positive.