Speaking of trying to predict total stopping time with better-than-baseline accuracy …
Have Busy Beaver folks tried to predict halt/no-halt directly from the description of a 5-state, 2-symbol TM? I’m sure there are many “giveaway” features of TMs that halt. Some of these probably surfaced in the process of creating the BB(5) proof.
Can an ML model automatically locate those features, given 88 million TM descriptions each labeled with class 0 (halt) or 1 (no-halt)?
It’s possible to make a traditional 90% train, 10% test setup … or alternatively, don’t divide the data, but rather see who can induce the shortest program that prints the entire dataset verbatim, along the lines of the Hutter Prize for text.
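For the train/test variant, here is a minimal Python sketch of what I have in mind. It assumes each machine is written as five underscore-separated rows of two 3-character transitions (write symbol, move direction, next state), with a halting or undefined transition marked by a non-A..E next state such as `Z` or `-`. That text format, the example machine string, and the helper names `encode` and `split` are my own assumptions for illustration, not something taken from an actual dataset.

```python
import random

STATES = "ABCDE"  # assumption: states are named A..E


def encode(tm: str) -> list[int]:
    """Flatten one machine into 10 transitions x (write, move, next) = 30 integers."""
    features = []
    for row in tm.split("_"):
        for i in range(0, len(row), 3):
            write, move, nxt = row[i:i + 3]
            if nxt in STATES:
                features += [int(write), 1 if move == "R" else 0, STATES.index(nxt)]
            else:
                # Halting/undefined transition (e.g. "1RZ" or "---"):
                # use the out-of-range state index 5 as a marker.
                features += [0, 0, len(STATES)]
    return features


def split(records: list, test_fraction: float = 0.1, seed: int = 0):
    """Shuffle reproducibly and hold out `test_fraction` of the records for testing."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # Hypothetical record: (machine description, label), with 0 = halt, 1 = no-halt.
    example = ("1RB1LC_1RC1RB_1RD0LE_1LA1LD_1RZ0LA", 0)
    records = [example] * 10
    train, test = split(records)
    print(len(train), len(test), encode(example[0])[:6])
```

The integer vectors can be fed to any off-the-shelf classifier. For the "shortest program" variant no split is needed, and the score would just be the size of the program that reproduces all the labels.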
(BTW, is there a pointer to such data in a single downloadable txt file?)