Simon N Goodwin's Technical analysis of the Multi-system

What follows is a hefty dump of my notes and memories. It is dense, as it's transcribed from pages of tiny hand-written notes.

My Konix folder contains the 67 page Flare 1 hardware reference guide. Slipstream was a Flare 1 souped up and with an 8086 CPU in place of the original Z80 CPU. And the Jaguar was a Panther souped up, and the Panther was souped up from Slipstream! By that point the CPU had become the 68000, and the capabilities of the Flare chips had expanded to use a 64 bit bus. In all these, most of the power is in the Flare custom chips, but the CISC processor is there to make it relatively easy to develop for. Consoles are not like PCs.

The Flare 1 bus was 8 bits wide - a Z80 system with hardware acceleration for things like plotting and calculation the Z80 couldn't do well. The Flare 2 or Slipstream just managed 16 bits (using a decade old 8086 processor with 64K paged addressing) and added a colour palette and more acceleration. Panther was the 32 bit version and Atari's Jaguar was the 64 bit remix.

Claims that the Slipstream was 32 bit are misleading, based on adding the two 16 bit paths for code and data to the DSP. If that were fair the PlayStation 2 CPU would be a 192 bit part. It isn't (most of the PS2 memory is actually only 16 bits wide, but accessed in very fast 128 bit bursts - such metrics are meaningless). However it was step two on the four step progression of doublings from Flare 1 to Atari Jaguar.

The wide memory on the later (Atari) versions was on chip and very, very small - only used for the tiny DSP programs. The same is true, but in blocks of a few K now, for the 128 bit paths in the PS2. Such systems are blisteringly fast if you rewrite programs - and scramble data - specifically for them, and almost useless otherwise.

The Achilles heel of the Slipstream was the way the 8086 was integrated with the other chips. Much of the time it was not able to access memory, as the other chips had priority. There were plans to mitigate this, though I suspect they were not followed through. The very limited DSP RAM, as on all Flare systems, was a key trade-off between price/performance and programmability - cheap to make at the expense of hard work for programmers - another typical console design decision.

The address space was divided into four 256K areas, for 16 bit screen RAM, ASIC (palette and DSP) registers, and two expansion RAM/ROM areas via the cartridge interface, which used a cheap 56 way edge connector.
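As a rough illustration of that four-way split, the sketch below decodes a 20 bit address into an area and an offset. The order of the areas simply follows the order listed above; the real base addresses are not given here, so the layout is an assumption, and `REGIONS` and `decode_region` are made-up names.

```python
REGION_SIZE = 256 * 1024  # four 256K areas, as described in the text

# ASSUMPTION: areas are laid out in the order the text lists them.
REGIONS = ("screen RAM", "ASIC registers", "expansion A", "expansion B")


def decode_region(address):
    """Return (area name, offset within that area) for an address in the 1 MB map."""
    if not 0 <= address < len(REGIONS) * REGION_SIZE:
        raise ValueError("address outside the 1 MB map")
    return REGIONS[address // REGION_SIZE], address % REGION_SIZE
```

With this assumed layout, `decode_region(0x40000)` lands on the first byte of the ASIC register area.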

The DSP had 256 bytes of data RAM, 512 of program RAM, 256 bytes of ROM plus 128 bytes (?) of constant data, and up to 128 bytes reserved for registers. That's BYTES, though it was all organised as 16 bit words. The palette was 512 bytes (256 words) but alternate bytes were only half-used (12 bit entries, like the original Amiga, but 256 of them). DSP programs could also run from the palette RAM - at the expense of colours ;-). This meant a lot more room for code in 16 colour mode (but slower blitting - see later).
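To make the palette arithmetic concrete, here is a minimal sketch of a 12 bit entry held in a 16 bit word. The 4:4:4 split follows the Amiga comparison, but the order of the fields within the word is an assumption, and the function names are invented for illustration. The last helper assumes every palette entry not needed for colours is free for DSP code, which may be optimistic.

```python
PALETTE_ENTRIES = 256
BYTES_PER_ENTRY = 2  # 16 bit words, only 12 bits of each used


def pack_rgb444(red, green, blue):
    """Pack three 4 bit components into one palette word (top 4 bits unused).

    ASSUMPTION: red in the high nibble, blue in the low nibble.
    """
    for component in (red, green, blue):
        if not 0 <= component <= 15:
            raise ValueError("components are 4 bit values")
    return (red << 8) | (green << 4) | blue


def unpack_rgb444(word):
    """Split a palette word back into its (red, green, blue) nibbles."""
    return (word >> 8) & 0xF, (word >> 4) & 0xF, word & 0xF


def palette_bytes_free(colours_in_use):
    """Bytes of palette RAM left over once the visible colours are reserved."""
    return (PALETTE_ENTRIES - colours_in_use) * BYTES_PER_ENTRY
```

On those assumptions a 16 colour mode leaves 480 of the 512 palette bytes spare, which is where the extra room for code comes from.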

The constant data was a 16 bit sine table, notionally 256 entries but (as it has only 128 bytes in the map) presumably decoded to exploit circular symmetry, storing only a quarter circle and working out the rest on the fly (details unsure; in theory just 45 degrees would be enough, but 90 is easier). Maybe the map is wrong and it was actually 512 bytes? My notes are contradictory but the effect is the same regardless - you can thereby rotate things fast, in steps of one or two degrees.
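The quarter-circle idea can be sketched as follows. This is not the actual ROM decoding, which is unknown; it is one common way to get 256 notional entries out of 64 stored words (128 bytes). The table contents, the half-step sampling offset and the name `sine` are all assumptions made for the example.

```python
import math

STEPS = 256            # notional entries per full circle (about 1.4 degrees each)
QUARTER = STEPS // 4   # 64 stored 16 bit words = 128 bytes

# HYPOTHETICAL ROM contents: one quadrant, sampled at half-step offsets so that
# exactly 64 entries mirror cleanly, scaled to signed 16 bit.
QUARTER_TABLE = [round(32767 * math.sin(2 * math.pi * (i + 0.5) / STEPS))
                 for i in range(QUARTER)]


def sine(step):
    """Look up the sine for step 0..255 using only the 64 entry quadrant table."""
    quadrant, index = divmod(step % STEPS, QUARTER)
    if quadrant & 1:                 # 2nd and 4th quadrants read the table backwards
        index = QUARTER - 1 - index
    value = QUARTER_TABLE[index]
    return -value if quadrant >= 2 else value   # lower half circle is negated
```

Because of the half-step offset, `sine(0)` is slightly above zero rather than exactly zero; a table sampled on whole steps would need a 65th entry to mirror without error.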

The DSP and sine table were used for FM sound via the stereo DAC (Digital to Analogue Converter) - the same approach used on current consoles like the PS3 and Xbox 360. The plan was to replace the external 14 bit DACs with an on-chip version, capable of rates up to about 200 kHz (compared with just 48 kHz for 'next gen' consoles now). Notionally this would be 16 bit but the chip noise (as on early SoundBlasters) meant the low order bits would be fairly random. Still pretty good audio by the standards of the day, though. Or since.

The following snippets from my memory and notes are technical but essential to understand both why Slipstream was interesting and why it was always most likely to fail. Look up terms like DSP and contention if you don't understand them, as they're crucial.

The project was underfunded from the start and Konix never had the money, or the backing, to produce what they promised.

The Konix logo in the photos in ACE magazine was glued on to the 'console' which was actually a solid lump of wood. I never saw a finished moulding. This is common for product previews.

The Power Chair never worked reliably and probably never would have done. It was not properly production-engineered and would have been a disaster if it'd ever reached retail.

The input interface used a custom 11 pin D-type joystick port, with 3 analogue axes, measuring +/-65 degrees of wheel turning, 0..100 degrees of column angle, and +/- 45 degrees of accelerator angle, each via 8 bit analogue inputs polled once every 50/60 Hertz display field. Not all the range was useful due to limited potentiometer travel, and the scale was non-linear, so the accuracy would be only a couple of degrees at best. Good enough, though.

Two-player mode involved the second machine running as a slave doing no processing and not even powered on, but just working as a joystick for the display and processors of the first. The three inputs were multiplexed to six (at half speed) for this case. 16 digital input and 8 output bits were also available, some of those used for the floppy control and input buttons.

The light pen or gun input could return X and Y co-ordinates for the beam, frozen when the pen/gun sensed light from the display. I don't know of any software designed to work with this, and suspect it'd have been more a gimmick than a useful feature.

ATD did their SDK development using an early 8 bit 100 pin version of the Flare chip with an 8088 processor. This was very slow indeed - slower than the Flare 1's Z80, in some respects. But - as with PS2 or PS3 - most of the processing power was in the custom chips, not the generic microprocessor, which functioned mainly as conductor (on any game that took proper advantage of the hardware). It would have been a very hard system to make ports for, of the CPC/Spectrum variety; it took a lot of custom coding and redesign to make it work tolerably well.

Five systems - DSP, blitter, 808x, video and disc control - contend for access to the 'video' RAM.

The planned 160 pin version would support the 16 bit 8086, with less (but still substantial) contention, and have the built in floppy controller (using 700 gates - so fairly simple by WDC standards). Like Amiga - a strong influence on Flare, via the Amiga HRM - it used video flyback time for floppy disc data DMA, at two bytes per line (up to 30K a second, limited to about 25K/s in practice). Floppy loading might have helped it succeed against Sega and NES, mainly by making cheap publishing - and piracy - practical. Like the Amiga, data was stored in 5.5K track units. A bus latch (as on Atari ST) was meant to eliminate RAM contention, bringing the effective CPU speed (with fast 16 bit RAM) up from 4.68 to the 'full' 6 MHz. I'm not sure this was ever implemented. Speeding up the CPU any further - 6 MHz was not fast even in those days - would have been tough as the RAM and bus were doing so many other things.
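The figures in that paragraph hang together if you run the arithmetic. The sketch below assumes the standard PAL line rate of 15,625 lines a second and one two-byte transfer on every line; the function names are invented for the example.

```python
PAL_LINES_PER_SECOND = 15625    # standard PAL line rate (625 lines x 25 frames)
DMA_BYTES_PER_LINE = 2          # from the text
TRACK_BYTES = int(5.5 * 1024)   # 5.5K track units, from the text


def peak_floppy_rate():
    """Bytes per second if every scan line carries one DMA transfer (assumption)."""
    return PAL_LINES_PER_SECOND * DMA_BYTES_PER_LINE


def track_load_seconds(bytes_per_second):
    """Time to pull one whole track across at the given sustained rate."""
    return TRACK_BYTES / bytes_per_second


def contention_loss(effective_mhz, nominal_mhz):
    """Fraction of CPU speed lost to RAM contention."""
    return 1 - effective_mhz / nominal_mhz
```

That gives a peak of 31,250 bytes a second (the 'up to 30K'), about 0.22 seconds per track at a practical 25K/s, and a contention penalty of roughly 22% for 4.68 MHz effective against 6 MHz nominal.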

Blitter programming involved a 13 byte preamble and then 3 bytes per scan line, to draw irregular shapes. Conversion from 3D to 2D co-ordinates involved at least 8 passes through the multiplier unit (four off-the-shelf Texas 74181 4-bit-slice arithmetic units) and 12 for correct perspective. This allowed at least 100,000 and maybe 200,000 vertices to be plotted per second (a tenth of PlayStation and perhaps 1% of a PS2 - and, like Jaguar but unlike the Sony systems, with no texturing hardware).
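For a feel of the sizes involved, this small sketch works out how big a blitter command list would be for a shape of a given height, and how many vertices fit into one display field at the quoted rates. It uses only the numbers above; the function names are hypothetical, and the contents of the preamble and per-line bytes are not modelled.

```python
PREAMBLE_BYTES = 13        # from the text
BYTES_PER_SCAN_LINE = 3    # from the text


def blit_list_bytes(scan_lines):
    """Size of the command list for an irregular shape this many lines tall."""
    return PREAMBLE_BYTES + BYTES_PER_SCAN_LINE * scan_lines


def vertices_per_field(vertices_per_second, field_rate_hz):
    """Whole vertices that can be transformed within one display field."""
    return vertices_per_second // field_rate_hz
```

A full-height 200 line shape needs 613 bytes of list, and 100,000 vertices a second is 2,000 per 50 Hz field.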

Even the Gouraud shading of the Jaguar was not directly supported by the Slipstream hardware, so solid blocks or un-scaled copies of source texture data were all it could draw without lots of extra work by the slow CPU. If it had come out it would have been obsolete quite quickly, but perhaps might have kept pace if it sold a million in the first couple of years - a tall order.

The blitter could access any address in the 1 MB map. A 64K full-screen block copy took 7.1 ms (compared with 20 or 16.7 ms for a PAL or NTSC field refresh) - several times faster than the Spectrum and with many times the data to push around (more colours).
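Turning the copy time into a throughput figure makes it easier to compare. This is plain arithmetic on the numbers above, with hypothetical function names.

```python
def copy_rate_mb_per_s(num_bytes, milliseconds):
    """Sustained copy rate in millions of bytes per second."""
    return num_bytes / (milliseconds / 1000) / 1_000_000


def share_of_field(copy_ms, field_ms):
    """Fraction of one display field consumed by the copy."""
    return copy_ms / field_ms
```

A 64K copy in 7.1 ms works out at a little over 9 million bytes a second, and uses about 36% of a PAL field or 43% of an NTSC one, so two full-screen copies per field would fit, but not three.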

There was no support for interlaced screen modes. The PAL output was therefore 624 lines, but (unlike Flare 1) NTSC was presumed so the display was in a 200 pixel high letterbox within the 250-288 lines of a PAL TV display (for US compatibility) though notionally the number of lines was programmable - but limited by the 128K VRAM. In theory 64K or 256K versions were possible - Martin Brennan told me 'we have left the exact memory recipe to the last minute'. Given the chip shortages of the 80s which had killed some micro manufacturers (e.g. Nascom), and the demanding requirements, this was sensible but still indicates a gap between plans and hopes.
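The VRAM limit on the number of lines is easy to check. The sketch assumes packed pixels with no padding between lines; pairing the 512 pixel width with 4 bits per pixel is my assumption for the example, not something stated above, and the function names are invented.

```python
VRAM_BYTES = 128 * 1024   # the planned 128K; 64K or 256K were also possible


def screen_bytes(pixels_per_line, lines, bits_per_pixel):
    """Bytes needed for one screen of packed pixels."""
    return pixels_per_line * lines * bits_per_pixel // 8


def max_lines(pixels_per_line, bits_per_pixel, buffers=1, vram_bytes=VRAM_BYTES):
    """Most lines that fit in VRAM for the given number of screen buffers."""
    bytes_per_line = pixels_per_line * bits_per_pixel // 8
    return vram_bytes // (bytes_per_line * buffers)
```

A 256 x 200 screen in 256 colours takes 51,200 bytes, so two buffers fit in 128K with room to spare, and double buffering tops out at 256 lines.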

As on Flare 1 or SAM, there were 256 or 512 pixels per line (compared with 320 or 720 for Commodore systems, a much better match with the 160 video colour clocks per TV line) so Konix TV colour resolution suffered, and the pixels were oblong - never square - which complicates 3D processing.

Pixels were packed, like PC VGA, not planar like the Amiga, PC EGA, or the ST or QL (sort of). This made 3D easier but scrolling more expensive (though still easier than on QL, ST or Spectrum).
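The difference between packed and planar pixels shows up as soon as you plot a single point. The sketch below is a generic illustration, not Slipstream or Amiga code: the bit order within a planar byte is an assumption, widths are assumed to be multiples of 8, and all the names are made up.

```python
def plot_packed(framebuffer, width, x, y, colour):
    """Packed 8 bit pixels: the whole colour lives in one byte, so one write."""
    framebuffer[y * width + x] = colour


def plot_planar(planes, width, x, y, colour):
    """Planar pixels: one bit of the colour per plane, so one
    read-modify-write for every plane."""
    index = (y * width + x) // 8
    mask = 0x80 >> (x % 8)          # ASSUMPTION: leftmost pixel is the top bit
    for bit, plane in enumerate(planes):
        if (colour >> bit) & 1:
            plane[index] |= mask
        else:
            plane[index] &= ~mask & 0xFF


def read_planar(planes, width, x, y):
    """Gather a pixel's colour back from its bit in each plane."""
    index = (y * width + x) // 8
    mask = 0x80 >> (x % 8)
    return sum(1 << bit for bit, plane in enumerate(planes) if plane[index] & mask)
```

Plotting one 3D point is a single store in the packed case and four masked updates with four planes, which is why packed pixels suit 3D. Scrolling sideways by one pixel is the reverse case: a planar display can shift whole bytes of each plane, while packed data has to be moved pixel by pixel.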

A 12 bit (4096 option) border colour could be set. The palette could only safely be accessed while the beam was in the border; otherwise the display messed up. This meant you had to wait 40 us or so for access to the palette - restrictive compared with other palette systems like ST, SAM and Amiga.

The bit-slice processor could do a 16 x 16 bit multiply to a 32 (? my notes say 36!) bit result in one cycle. In that respect alone - though multiplications are one of the fundamental measures of performance, they aren't much use on their own - it was about 50 times faster than an Amiga or ST (70 cycles per MUL, though at a slightly higher clock rate) assuming it ran at the same clock rate as the CPU, or 100 times if it was sync'd with the DSP (not sure; worth checking).
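The '50 times' and '100 times' figures can be reproduced with a line of arithmetic. The sketch assumes a round 8 MHz for the 68000 (the ST's clock; the Amiga's was a little over 7 MHz) and one multiply per cycle for the bit-slice unit; `multiply_speedup` is a hypothetical name.

```python
def multiply_speedup(mul_unit_mhz, m68k_mhz=8.0, m68k_cycles_per_mul=70):
    """Ratio of multiplies per second: single-cycle multiplier versus a 68000.

    ASSUMPTION: the multiplier completes one 16 x 16 multiply every cycle.
    """
    m68k_muls_per_microsecond = m68k_mhz / m68k_cycles_per_mul
    return mul_unit_mhz / m68k_muls_per_microsecond
```

At the CPU's 6 MHz this gives 52.5, and at the DSP's 12 MHz it gives 105, in line with the rough factors quoted.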

The DSP was designed by Ben, and known as 'OK'. It had conditional tests and indexed base addressing operations, using a fixed instruction format with 7 bits of opcode and 9 bits (hence 512 locations) of operand. Cycle time was 85 ns. All instructions (like a cut down ARM) could be conditional on the value of the Carry flag. So it was a 12 MHz 16 bit RISC chip - compare it with the 30 MHz 32 bit RISC R3000A in the PS1 five years later, though the PS1 CPU was a lot more general-purpose.
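A 7 + 9 bit split fills a 16 bit word exactly, which the following sketch shows. Which end of the word held the opcode is not recorded here, so putting it in the top bits is an assumption, and the opcode numbers used below are arbitrary rather than real 'OK' instructions.

```python
OPCODE_BITS = 7
OPERAND_BITS = 9   # 2**9 = 512 addressable locations


def encode(opcode, operand):
    """Build a 16 bit instruction word (ASSUMPTION: opcode in the top 7 bits)."""
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode must fit in 7 bits")
    if not 0 <= operand < (1 << OPERAND_BITS):
        raise ValueError("operand must fit in 9 bits")
    return (opcode << OPERAND_BITS) | operand


def decode(word):
    """Split an instruction word back into (opcode, operand)."""
    return word >> OPERAND_BITS, word & ((1 << OPERAND_BITS) - 1)


def clock_mhz(cycle_ns):
    """Clock rate implied by a cycle time in nanoseconds."""
    return 1000 / cycle_ns
```

An 85 ns cycle works out at just under 11.8 MHz, which is where the round '12 MHz' comes from.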

The ASIC used 18,000 gate array cells at the time I researched into it. That's more than the SAM one, far more than the Spectrum or QL (Ferranti ULAs) but less than the total of the Amiga custom chips, and a small fraction of what you can get on a cheap FPGA these days.

The video encoder was a 1377 chip, like a Sinclair QL or SAM (among others). This is not very good, but supports NTSC or PAL TV standards.

The blitter is very crude compared with Amiga, with no masking (storing of the background colour, though it did reserve one colour as 'transparent' or beam avoidance). It's more like the little-used one grafted onto the Atari STe. It just copies 4, 8 or 16 bit data around, blocking the CPU's access to RAM (though it does block copy a lot faster). Line drawing takes one cycle per pixel. Polygon drawing used the DSP to work out the co-ordinates and then the blitter to draw lines of the polygon. 16 colour (4 bit) video mode was slowed because the hardware had to do a read before each write, so (assuming no video contention - slower in practice!) it could plot 6 million 16 colour pixels per second or 12 million in 256 colour mode - a lot faster than the Amiga one, especially as the number of colours increases.
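The read-before-write penalty in 16 colour mode comes from two pixels sharing each byte. Here is a minimal sketch of the idea; the nibble order is an assumption, the functions are invented for illustration, and the returned access counts simply restate the reasoning above rather than measured hardware behaviour.

```python
def write_pixel_4bpp(framebuffer, x, colour):
    """Plot one 16 colour pixel; returns the number of memory accesses used."""
    if not 0 <= colour <= 15:
        raise ValueError("16 colour mode takes 4 bit colours")
    index = x // 2
    old = framebuffer[index]                 # the extra read: keep the neighbour
    if x % 2 == 0:                           # ASSUMPTION: even pixel in high nibble
        framebuffer[index] = (old & 0x0F) | (colour << 4)
    else:
        framebuffer[index] = (old & 0xF0) | colour
    return 2                                 # one read plus one write


def write_pixel_8bpp(framebuffer, x, colour):
    """Plot one 256 colour pixel; a whole byte, so a single write."""
    framebuffer[x] = colour
    return 1
```

If the blitter can make 12 million memory accesses a second, that is 12 million pixels at one access each but only 6 million at two, matching the figures quoted.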

It needed a two-stage DSP program to display solid 3D in perspective, swapping the program every 64 us (every scan line) to do the whole job. Maths and audio operations were mutually exclusive. DSP memory bandwidth was 24 megabytes/second EACH, for both code and data - faster than typical RAM of the day, as it was on-chip.
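Two of those numbers follow directly from the 85 ns DSP cycle time quoted earlier in these notes. The sketch assumes one 16 bit word moves on each path every cycle, and the function names are made up.

```python
CYCLE_NS = 85      # DSP cycle time
LINE_US = 64       # one scan line
WORD_BYTES = 2     # 16 bit code and data paths


def cycles_per_scan_line():
    """Whole DSP cycles available before the program has to be swapped."""
    return (LINE_US * 1000) // CYCLE_NS


def bandwidth_mb_per_s():
    """Millions of bytes per second on one path (ASSUMPTION: a word per cycle)."""
    return WORD_BYTES / (CYCLE_NS * 1e-9) / 1e6
```

That is 752 cycles per scan line for each stage of the 3D program, and about 23.5 million bytes a second on each of the code and data paths, which rounds to the 24 quoted.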

Fred Gill reckoned it really needed a second DSP, not just one. Flare thought about doing this but it would have required quite a lot of changes.