
Brainfuck to SUBLEQ Compiler
Programming the One Instruction Set Computer (OISC) with Brainfuck and its Derivatives
Brainfuck Language
Brainfuck is an "esolang" - an esoteric language, with significance in computer science, but little going for it in terms of its ease of use (hence its name). It resembles obfuscated code more than it does a normal programming language.
Here is an example Brainfuck program to output "Hello World!": ++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.
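For readers new to the language, the eight commands can be pinned down with a minimal reference interpreter. This is a sketch in Python, not part of the compiler described on this page. The 30,000-cell tape, 8-bit wrapping cells, and "return 0 at end of input" behaviour are common conventions assumed here, and the names `run_brainfuck` and `HELLO_WORLD` are illustrative only.

```python
HELLO_WORLD = (
    "++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]"
    ">>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++."
)


def run_brainfuck(program: str, input_text: str = "") -> str:
    """Interpret a Brainfuck program and return everything it prints."""
    # Pre-match brackets with a stack so each loop jump is a single lookup.
    jump = {}
    stack = []
    for pos, ch in enumerate(program):
        if ch == "[":
            stack.append(pos)
        elif ch == "]":
            start = stack.pop()
            jump[start], jump[pos] = pos, start

    tape = [0] * 30000          # assumed tape size
    ptr = 0                     # data pointer
    pc = 0                      # program counter
    pending_input = iter(input_text)
    out = []
    while pc < len(program):
        ch = program[pc]
        if ch == ">":
            ptr += 1
        elif ch == "<":
            ptr -= 1
        elif ch == "+":
            tape[ptr] = (tape[ptr] + 1) % 256   # assumed 8-bit wrapping cells
        elif ch == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif ch == ".":
            out.append(chr(tape[ptr]))
        elif ch == ",":
            # Assumed convention: reading past the end of input stores 0.
            tape[ptr] = ord(next(pending_input, "\0")) % 256
        elif ch == "[" and tape[ptr] == 0:
            pc = jump[pc]       # skip the loop body
        elif ch == "]" and tape[ptr] != 0:
            pc = jump[pc]       # go round the loop again
        pc += 1                 # any other character is a comment
    return "".join(out)


print(run_brainfuck(HELLO_WORLD), end="")
```

Passing the input as a string mirrors the hard-coded input array approach: the `,` command simply consumes the next character of that string.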
So what better language to implement as a "high-level" method of programming the One Instruction Set Computer (OISC)?
This page presents possibly the world's first Brainfuck to SUBLEQ compiler (or transpiler, depending on your point of view).
Originally, thought was given to implementing a CPU that natively runs Brainfuck, but that would be a major project to undertake - probably more complex than the OISC. As I don't have the 100s of hours to spend on that, it is left as an exercise for the reader.
Technical Goals
- Write a compiler that takes a Brainfuck program and converts it to SUBLEQ to run on the OISC and its emulator
- Make use of the synthetic assembly language I developed for the OISC
- Test it by running Brainfuck programs (especially the Collatz Conjecture code written by Daniel B Cristofani)
- Revel in the humour of running the Collatz Conjecture code in such an obscure way, given it is the downfall of many mathematicians (and their sanity)...
- Allow "trivial substitution" Brainfuck programs to be compiled, including the teenage-boy snort-inducing PenisScript, and my personal favourite: "Brainfuck, but every + is replaced with the Bee Movie script"
- Develop two new trivial substitutions to save ink (and paper) by being made of non-visible characters (inspired by Whitespace and !!Fuck): Ghostfuck and Blankfuck.
Compiler Operation
The compiler firstly converts the Brainfuck code into synthetic OISC assembler. A few extra addressing modes for the synthetic assembler commands needed to be written to provide the required functionality.
This transpilation to synthetic assembler is a single-pass parsing process, using a stack to track loop boundaries. Some optimisations are used: for example, a run of contiguous identical +, -, > or < characters is compressed into a single instruction rather than one instruction per character. Multiple DEC (decrement) commands become a single SUB (subtract) command; multiple INCs become a single ADD.
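The single pass, the loop stack, and the run folding can be sketched as follows. This is an illustration in Python, not the author's compiler: only INC, DEC, ADD and SUB are named on this page, so the pointer, jump and I/O mnemonics (`PTR_INC`, `PTR_ADD`, `JZ`, `JMP`, `IN`, `OUT`) and the label format are hypothetical stand-ins for the real synthetic assembler.

```python
def transpile(program: str) -> list[str]:
    """Single pass over Brainfuck source; returns pseudo-assembler lines."""
    single = {"+": "INC", "-": "DEC", ">": "PTR_INC", "<": "PTR_DEC"}
    folded = {"+": "ADD", "-": "SUB", ">": "PTR_ADD", "<": "PTR_SUB"}
    out = []
    open_loops = []     # stack of label numbers for loops not yet closed
    next_label = 0
    i = 0
    while i < len(program):
        ch = program[i]
        if ch in single:
            # Measure the run of identical characters starting here.
            run = 1
            while i + run < len(program) and program[i + run] == ch:
                run += 1
            out.append(single[ch] if run == 1 else f"{folded[ch]} {run}")
            i += run
            continue
        if ch == "[":
            open_loops.append(next_label)
            out.append(f"loop{next_label}: JZ end{next_label}")
            next_label += 1
        elif ch == "]":
            if not open_loops:
                raise SyntaxError(f"unmatched ']' at offset {i}")
            n = open_loops.pop()
            out.append(f"JMP loop{n}")
            out.append(f"end{n}:")
        elif ch == ".":
            out.append("OUT")
        elif ch == ",":
            out.append("IN")
        i += 1          # anything else is a comment character
    if open_loops:
        raise SyntaxError("unmatched '['")
    return out


for line in transpile("+++[->+<]"):
    print(line)
```

In this sketch a run is only folded when the characters are strictly adjacent; a comment character in the middle of a run splits it in two.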
Inputs are hard-coded into an input character array (string) in the synthetic assembler. For example, the input numbers to test for the Collatz Conjecture code are ASCII-represented integers separated by newline characters - in this case four small numbers (10, 33, 10, and 1000), then a rather large one (in the hundreds of decillions): .inchararray[] "10\n33\n10\n1000\n104389573485768345104389573485768345\n";
The synthetic assembler is compiled further into SUBLEQ commands, which is business-as-usual for programming the OISC.
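To make that last step concrete: a SUBLEQ instruction is three words a, b, c meaning "subtract the value at a from the value at b, and jump to c if the result is less than or equal to zero". The Python sketch below shows how an ADD can be expressed in three such instructions using a scratch cell. The memory layout, the scratch cell `Z`, and halting on a negative jump address are assumptions made for this example; they are not taken from the OISC hardware or its emulator.

```python
def run_subleq(memory, pc=0, max_steps=10_000):
    """Run SUBLEQ code until the program counter leaves memory."""
    mem = list(memory)
    steps = 0
    while 0 <= pc and pc + 2 < len(mem) and steps < max_steps:
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]
        # Branch if the result is <= 0, otherwise fall through.
        pc = c if mem[b] <= 0 else pc + 3
        steps += 1
    return mem


def add_program(a_value, b_value):
    """Memory image computing B = B + A (assumes A >= 0 and a positive sum)."""
    A, B, Z = 9, 10, 11         # data addresses chosen for this sketch
    return [
        A, Z, 3,                # Z = Z - A   (Z now holds -A)
        Z, B, 6,                # B = B - Z   (B now holds B + A)
        Z, Z, -1,               # Z = 0, then jump to -1 to halt (assumed)
        a_value, b_value, 0,    # cells A, B and the scratch cell Z
    ]


result = run_subleq(add_program(7, 5))
print(result[10])
```

Every synthetic instruction has to be built out of patterns like this one, which is why the SUBLEQ output is so much larger than the Brainfuck source.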
Note that even relatively small Brainfuck programs can exceed the 8 KiWords ROM limit of the hardware (which contains the program), in which case the programs must be executed on the OISC/SUBLEQ emulator.
Example Code: Collatz Conjecture

The above program processes inputted numbers to determine, for each, the length of the cycle before it reaches 1.
The Collatz Conjecture ensnares mathematicians and laypeople by being one of the famously unsolved problems that anyone can understand, but no-one can prove. Here's how it works:
1. Start: Choose any positive integer (N).
2. Even Rule: If N is even, divide it by 2.
3. Odd Rule: If N is odd, multiply it by 3 and add 1 (this is why it's also known as the "3N+1 problem").
4. Repeat: Apply either the even or odd rule to the result, and repeat this process with the new number.
5. Conjecture: The Collatz Conjecture states that no matter what starting number you choose, this process will always eventually reach the number 1.
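The rules above translate directly into a few lines of Python, which is handy for cross-checking the output of the compiled Brainfuck program. This sketch assumes "length" means the number of rule applications needed to reach 1; the function names are illustrative.

```python
def collatz_steps(n: int) -> int:
    """Count how many rule applications it takes for n to reach 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


def process_input(text: str) -> list[int]:
    """Handle newline-separated decimal numbers, one result per number."""
    return [collatz_steps(int(line)) for line in text.split()]


print(process_input("10\n33\n10\n1000\n"))
```

Python integers have arbitrary precision, so the same function also accepts very large inputs such as the 36-digit test number.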
The compilation steps:
1. Convert Brainfuck code to synthetic OISC assembler
An excerpt is given below. Remember, there are no actual hardware registers or pointers (or even an accumulator) in the OISC.
Click here for a PDF of the full synthetic assembly language listing.

2. Compile the synthetic assembler to SUBLEQ
An excerpt is given below.

3. Run on the OISC hardware or emulator
Below is an example input set of numbers to check, along with the (correct) sequence lengths output by the program.

Brainfuck: Trivial Substitutions

Brainfuck "Trivial Substitutions" are where each of the eight Brainfuck tokens is translated into another token. These can be small tokens (e.g. 1 character), even in a different alphabet (e.g. "K-on Fuck" or "脑子爆掉", the latter being Chinese for "Brain Exploded"), or very large tokens, such as with "Brainfuck, but every + is replaced with the Bee Movie script" (self-explanatory) or "Redundant", where every standard Brainfuck token is prefixed with the first 5000 digits of pi.
Above is the "Hello World!" Brainfuck program using the PenisScript substitution. Hilarious.
Below is part of the same program using the Broccosprout substitution.
A maximal-munch algorithm is used when decoding a substitution, but certain definitions (such as Pikalang) are ambiguous, such that successful decoding cannot be guaranteed. The C# code checks for ambiguous definitions.
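Maximal munch means that at each position the decoder takes the longest token that matches. The sketch below shows the idea in Python rather than the C# used for the real tool, along with one simple ambiguity check: flagging any token that can also be read as a sequence of other tokens. The three-entry table is a toy example invented for illustration, not a real esolang definition, and the check shown here may differ from the one the C# code performs (it does not catch every way a definition can be ambiguous).

```python
def decode(text: str, table: dict[str, str]) -> str:
    """Translate substituted source back to Brainfuck, longest token first."""
    tokens = sorted((t for t in table if t), key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for tok in tokens:
            if text.startswith(tok, i):
                out.append(table[tok])
                i += len(tok)
                break
        else:
            i += 1      # no token starts here: treat as a comment character
    return "".join(out)


def ambiguous_tokens(table: dict[str, str]) -> list[str]:
    """Tokens that can also be spelled as a sequence of other tokens."""
    tokens = {t for t in table if t}

    def can_split(rest: str, whole: str) -> bool:
        if rest == "":
            return True
        return any(
            rest.startswith(t) and can_split(rest[len(t):], whole)
            for t in tokens
            if t != whole
        )

    return sorted(t for t in tokens if can_split(t, t))


# Toy definition: "pipi" collides with two consecutive "pi" tokens.
toy = {"pi": "+", "pipi": ">", "ka": "-"}
print(decode("pipika", toy))
print(ambiguous_tokens(toy))
```

With this toy table, encoding either `>-` or `++-` produces `pipika`, but the decoder always munches the longer `pipi`, so the second program cannot be recovered. That is the kind of definition a runtime warning is useful for.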

Generation (Encoding & Decoding)
The C# code (excerpt below) was written so the definitions of the trivial substitutions could be taken verbatim from the relevant esolangs.org page. Warnings are given at runtime if there is potential ambiguity in the substitution definitions.

My Novel Trivial Brainfuck Substitutions: Ghostfuck & Blankfuck

As part of being a good environmental citizen, I propose the above trivial Brainfuck substitutions to save on both ink and paper. If only the printed versions of a program are distributed, it also acts as an unbreakable encryption algorithm, although unfortunately not at all decryptable (being blank pages).
Blankfuck uses tab and space characters. Below is an example of the "Hello World!" program.
Ghostfuck comprises tab and newline characters, meaning more paper will come out of the printer if printed. Careful consideration of the encoding substitution, and font size, can determine the number of blank pages ejected from the printer - which can be reused or recycled of course.
