In a serial computer, you have a 1-bit ALU, say a full adder that generates a sum and a carry. Each clock cycle you read two bits and feed them into the adder, and then you write back the sum. You hold the carry in a flip-flop to use in the next clock cycle. It's just like doing a binary addition with pencil and paper, one bit at a time.
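To make the one-bit-per-cycle idea concrete, here is a minimal Python sketch of bit-serial addition. It is only a software model, not any particular machine's design; the names `full_adder` and `serial_add` and the default `width` of 8 are made up for illustration.

```python
def full_adder(a, b, carry_in):
    """1-bit full adder: returns (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def serial_add(x, y, width=8):
    """Add two unsigned integers one bit per 'clock cycle', lowest bit first.

    Returns (result, final_carry), where result is truncated to `width` bits.
    """
    carry = 0   # models the carry flip-flop, cleared before the first cycle
    result = 0
    for cycle in range(width):
        a = (x >> cycle) & 1          # read one bit of each operand
        b = (y >> cycle) & 1
        sum_bit, carry = full_adder(a, b, carry)
        result |= sum_bit << cycle    # write the sum bit back
    return result, carry
```

For example, `serial_add(5, 3)` returns `(8, 0)`, and `serial_add(200, 100)` returns `(44, 1)`: the 8-bit result wraps around and the leftover carry is what remains in the flip-flop.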
Note that a serial computer needs to start with the lowest bit, since the carry propagates from low bits to high bits, which explains why x86 is little-endian. It goes back to the Datapoint 2200, a desktop computer made from TTL chips and running serially. The Intel 8008 processor was a copy of the Datapoint 2200 (as was the Texas Instruments TMX 1795). Although the 8008 was parallel, it copied the little-endian architecture of the Datapoint 2200.
I've often wondered if serial computers could have a useful role again. At very high clock speeds and wide data paths, you hear about trouble controlling signal skews. In contrast, imagine a serial computer clocking data around at 8 GHz vs. an 8-bit computer clocking data at 1 GHz. You have to deal with faster speeds, but no skew, and it seems like a 1-bit ALU might be simpler (and faster) than a 64-bit one.
Hmm, I see - how do opcodes and jumps work, then? Do you also read them bit-by-bit and reconfigure the ALU / code paths? Is addressing also single-bit?
Here's a 16-bit bit-serial computer I made and tested on an FPGA: https://github.com/howerj/bit-serial. If you look at `bit.c`, it looks like an ordinary 16-bit accumulator-based virtual machine with a few odd instructions that make more sense once you know how a bit-serial CPU works; there's nothing special about it. However, the VHDL in `bit.vhd` shows how all those instructions are processed in a bit-serial fashion, how data is fetched and stored in shift registers, and so on.
The bit-serial CPU in `bit.vhd` is actually customizable: you can quite easily make a 32-bit, a 14-bit, or a 27-bit CPU from that VHDL if you want.