If you check the other articles about the PlayStation [1] and the Nintendo 64 [2], you'll see that the design of a 3D-capable console in the 90s was a significant challenge for every company. Thus, each one proposed a different solution (with different pros and cons), yet all very interesting to analyse and compare. That's the reason this article was written.
> you'll see that the design of a 3D-capable console in the 90s was a significant challenge for every company.
While this is true, I still think the PlayStation had the most interesting and forward-looking design of its generation, especially considering the constraints. It was significantly cheaper than both the Saturn and the Nintendo 64, it was fully 3D (unlike the Saturn, for example), using CDs as the medium was spot-on, and the MJPEG decoder gave the PlayStation not only significantly higher video quality than its rivals but also let video be used for backgrounds, for much better-looking graphics (see, for example, Resident Evil or the Final Fantasy series).
I really wanted to see a design inspired by the first PlayStation with more memory (since its low memory compared to its rivals seemed to be an issue, especially in e.g. 2D fighting games, where the number of animations had to be cut heavily compared to the Saturn) and maybe some more hardware accelerators to help fix some of the issues that plagued the platform.
It is not really any more 3D than the Saturn, as it still does texture mapping in 2D space, same as the Saturn. Its biggest advantage when it came to 3D graphics, aside from higher performance, was its UV mapping. They both stretch flat 2D textured shapes around to fake 3D.
The N64 is really far beyond the other two in terms of being "fully 3D", with its fully perspective-correct z-buffering and texture mapping, let alone mipmapping with bilinear blending and subpixel-correct rasterization.
This is very true. I consider the N64 to be the first to use hardware even vaguely similar to what the rest of the industry ended up with.
It is a shame that SGI's management didn't see a future in PC 3D accelerator cards. That led to the formation of 3dfx, and with competitors like that, SGI's value in the market was crushed astoundingly fast. They had the future, but short-term thinking blinded them to the path ahead.
But the N64 was a more expensive design, and it also came almost 2 years later, and from an architectural standpoint it also had significant issues (e.g. the texture cache size someone mentioned above).
This is why I said considering the constraints, I find the first PlayStation to be impressive.
This should have been no struggle for Sega. They basically invented the modern 3D game and dominated in the arcade with very advanced 3D games at the time. Did they not leverage Yu Suzuki and the AM division when creating the Saturn?
Then again rumor has it they were still stuck on 2D for the home market and then saw the PlayStation specs and freaked and ordered 2 of everything in the Saturn.
> This should have been no struggle for Sega. They basically invented the modern 3D game and dominated in the arcade with very advanced 3D games at the time
Way different challenges!
The Model 2 arcade hardware cost over $15,000 when new in 1993. Look at that Model 1 and Model 2 hardware: that's some serious silicon. Multiple layers of PCBs stacked with chips. The texture-mapping chips came from partnerships with Lockheed Martin and GE. There was no home market for 3D accelerators yet; the only companies doing it were folks creating graphics chips for military training use and high-end CAD work.
Contrast that with the Saturn. Instead of a $15,000 price target they had to design something that they could sell for $399 and wouldn't consume a kilowatt of power.
Although, in the end, I think the main hurdle was a failure to predict the 3D revolution that Playstation ushered in.
> The Model 2 arcade hardware cost over $15,000 when new in 1993. Look at those Model 1 and Model 2, that's some serious silicon.
That's an even bigger miss on Sega's part then.
Having such kit out in the field should have given Sega good insight into "what's hot and what's not" for (near-future) gaming needs.
Which features are essential, what's low hanging fruit, what's nice to have but (too) expensive, performance <-> quality <-> complexity tradeoffs, etc.
Besides having hardware & existing titles to test-run along the lines of "what if we cut this down to... how would it look?"
Not saying Sega should have built a cut-down version of their arcade systems! But those could have provided good guidance & inspiration.
But they had the insight. And the insight they got was that 3D was not there yet for the home market: it was unrealistic to have good 3D for cheap (e.g. no wobbly textures), as it was still really challenging to have good 3D even on expensive dedicated hardware.
Yeah. The 3D revolution was obvious in hindsight, but not so obvious in the mid 1990s. I was a PC gamer as well at the time so even with the benefit of seeing things like DOOM it wasn't necessarily obvious that 2.5D/3D games were going to be popular with the mainstream any time soon.
A lot of casual gamers found early home 3D games kind of confusing and offputting. (Honestly, many still kind of do)
We went from highly evolved, colorful, detailed 2D sprites to 3D graphics that were frankly rather ugly most of the time, with controllers and virtual in-game cameras that tended to be rather janky. Analog controllers weren't even really a prevalent thing for consoles at this point.
Obviously in hindsight the Saturn made a lot of bad bets and the Playstation made a lot of winning ones.
The secret to high 3D performance (particularly in those simpler days before advanced shaders and such) wasn't exactly a secret. You needed lots of computing horsepower and lots of memory to draw and texture as many polys as possible.
The arcade hardware was so ridiculous in terms of the number of chips involved, I don't even know how many lessons could be directly carried over. Especially when they didn't design the majority of those chips.
Shrinking that down into a hyper cost optimized consumer device relative to a $15K arcade machine came down to design priorities and engineering chops and Sega just didn't hit the mark.
In interviews, IIRC, ex-Sega staff have stated that they thought they had one more console generation before a 3D-first console was viable for the home market. Sure, they could do it right then and there, but it would be kind of janky. Consumers would rather have solid arcade-quality 2D games than glitchy home ports of 3D ones. Then Sony decided that the wow factor was worth kind-of-janky graphics (affine texture mapping, egregious pop-in, only 16-bit color, aliasing out the wazoo, etc.) and the rest is history.
Nintendo managed largely not-janky graphics with the N64, but it did come out 2-3 years after the Saturn and Playstation.
It will always be easy to make 3D games that look bad, but on the N64 games tend to look more stable than PS1 or Saturn games. Less polygon jittering[0], aliasing isn't as bad, no texture warping, higher polygon counts overall, etc.
If you took the same animated scene and rendered it on the PS1 and the N64 side by side, the N64 would look better hands down just because it has an FPU and perspective texture mapping.
[0] Polygon jittering is caused by the PS1 only being capable of integer math; there is no subpixel rendering, so vertices effectively snap to a grid.
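The difference between affine (PS1-style) and perspective-correct (N64-style) interpolation of a texture coordinate can be sketched numerically; the vertex depths below are made up for illustration:

```python
# Sketch: affine vs perspective-correct interpolation of a texture
# coordinate u along a screen-space edge. The two vertices carry a
# camera-space depth w; all values here are hypothetical.

def affine_u(u0, u1, t):
    # PS1-style: interpolate u linearly in screen space, ignoring depth.
    return u0 + (u1 - u0) * t

def perspective_u(u0, w0, u1, w1, t):
    # N64-style: interpolate u/w and 1/w, then divide per pixel.
    num = (u0 / w0) + ((u1 / w1) - (u0 / w0)) * t
    den = (1 / w0) + ((1 / w1) - (1 / w0)) * t
    return num / den

# Edge from a near vertex (w=1) to a far vertex (w=10), u going 0 to 1.
# Halfway across the screen, the correct u is far below 0.5, because
# perspective compresses the far half of the edge onto fewer pixels.
mid_affine = affine_u(0.0, 1.0, 0.5)                   # 0.5, visibly wrong
mid_correct = perspective_u(0.0, 1.0, 1.0, 10.0, 0.5)  # about 0.09
```

The gap between those two values is exactly the texture "swimming" you see on large floors and walls in PS1 games.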
I thought the problem was that it only had 12 or 16-bit precision for vertex coords, which is not enough no matter whether you encode it as fixed-point or floating-point. Floats aren't magic.
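Whatever the exact width, the visible effect of losing subpixel precision is easy to sketch: with no fractional bits, a smoothly sliding vertex jumps a whole pixel at a time, while even four subpixel bits (a made-up but typical figure) restore most of the motion:

```python
# Sketch: why integer-snapped vertices jitter. A vertex sliding smoothly
# across the screen lands on discrete positions; with zero fractional
# bits it jumps a whole pixel at a time, with 4 fractional bits the
# steps shrink to 1/16 of a pixel.

def snap(x, subpixel_bits):
    scale = 1 << subpixel_bits
    return round(x * scale) / scale

positions = [10.0 + i * 0.1 for i in range(10)]  # smooth motion
no_subpixel = [snap(x, 0) for x in positions]    # only ever 10.0 or 11.0
with_subpixel = [snap(x, 4) for x in positions]  # ten distinct steps
```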
Compare it to the Playstation, which could not manage proper texture projection and also had such poor precision in rasterization that you could watch polygons shimmer as you moved around.
The N64, in comparison, had an accurate and essentially modern (well, "modern" before shaders) graphics pipeline. The deficiencies in its graphics were not having nearly enough graphics-specific RAM (you only had 4 KB total as a texture cache, half that if you were using some features! Though crazy people figured out you could swap in more graphics from the CARTRIDGE if you were careful) and god-awful bilinear filtering on all output.
Interestingly, the N64 actually had some sort of precursor in the form of the RSP "microcode". Unfortunately, there was initially no documentation, so most developers just used the code provided by Nintendo, which wasn't very optimized and didn't include advanced features. Only in recent years have homebrew people really pushed the limits here with "F3DEX3".
> and a god awful bilinear filtering on all output.
I think that's a frequent misconception. The texture filtering was fine, it arguably looks significantly worse when you disable it in an emulator or a recompilation project. The only problem was the small texture cache. The filtering had nothing to do with it. Hardware accelerated PC games at the time also supported texture filtering, but I don't think anyone considered disabling it, as it was an obvious improvement.
But aside from its small texture cache, the N64 also had a different problem related to its main memory bus. This was apparently a major bottleneck for most games, and it wasn't easy to debug at the time, so many games were not properly optimized to avoid the issue and wasted a large part of the frame time waiting on the memory bus. There is a way to debug it with a modern microcode, though. This video goes into more detail toward the end: https://youtube.com/watch?v=SHXf8DoitGc
Fun trivia for readers: it isn't even normal 4-tap bilinear filtering, it's 3-tap, resulting in a characteristic triangular blurring that some N64 emulators recreate and some don't. (A PC GPU won't do this without special shaders.)
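For the curious, here's a sketch of the difference, using the 3-point formula as commonly reconstructed by emulator authors (the triangle split below is that reconstruction, not official documentation):

```python
# Sketch: 4-tap bilinear vs the N64's 3-point filter, which blends only
# the three texels of whichever triangle of the 2x2 texel quad the
# sample falls in -- the source of the characteristic triangular blur.

def bilinear(t00, t10, t01, t11, fx, fy):
    top = t00 + (t10 - t00) * fx
    bot = t01 + (t11 - t01) * fx
    return top + (bot - top) * fy

def three_point(t00, t10, t01, t11, fx, fy):
    if fx + fy <= 1.0:           # lower-left triangle of the quad
        return t00 + (t10 - t00) * fx + (t01 - t00) * fy
    else:                        # upper-right triangle
        return t11 + (t01 - t11) * (1.0 - fx) + (t10 - t11) * (1.0 - fy)

# At the exact centre of the quad the two filters agree...
center_bi = bilinear(0, 100, 100, 200, 0.5, 0.5)
center_3p = three_point(0, 100, 100, 200, 0.5, 0.5)
# ...but elsewhere they differ, because one texel is simply ignored.
off_bi = bilinear(0, 100, 0, 0, 0.25, 0.25)
off_3p = three_point(0, 100, 0, 0, 0.25, 0.25)
```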
Wiggling is down to lack of precision and lack of subpixel rendering, unrelated to Z buffering. Z buffers are for hidden surface removal, if you see wiggling on a single triangle floating in a void, it's not a Z buffer problem.
When you see models clipping through themselves because the triangles can't hide each other, that's the lack of Z buffer.
Thanks for clarifying. I knew I was getting something wrong, but can never remember all the details. IIRC PS1 also suffered from render order issues that required some workarounds, problems the N64 and later consoles didn't have.
The lack of media storage was the thing that kind of solidified a lot of those issues. Many who worked on the N64 have said that the texture cache on the system was fine enough for the time. Not great but not terrible. The issue was that you were working in 8 MB or 16 MB of space for the entire game. 32 MB carts were rare, and fewer than a dozen games ever used 64 MB carts.
Yeah. I'm not what one would call a graphics snob, but I found the N64 essentially unplayable even at the time of its release. With few exceptions, nearly every game looked like a pile of blurry triangles running at 15fps.
I always felt like N64 games were doing way too much to look good on the crappy CRTs they were usually hooked up to. The other consoles of the era may have had more primitive GPUs, but for the time I think worse may have actually been better, because developers on other platforms were limited by the hardware in how illegible they could make their games. Pixel artists of the time had learned to lean into and exploit the deficiencies of CRTs, but the same tricks can't really be applied when your texture is going to be scaled and distorted by some arbitrary amount before making it to the screen.
A part of this was due to Nintendo's TRC. It also didn't help that, due to the complexity of the graphics hardware, most developers were railroaded into using things like the Nintendo-provided microcode just to run the thing decently.
No, it's due to limited precision in the vertices. If you had 64 bit integers you could have 32.32 fixed-point and it would look as good as floating-point.
Quake did not use floating point in its rasterization math, and it exhibited none of the jittery polygon issues that the PS1 did. It's largely a lack of subpixel-accurate rasterization causing it (not even sure if the PS1 is pixel accurate, let alone subpixel :)).
Sure, but lack of perspective correct texturing is a separate issue, with a separate visual artifact.
Jittery polygons refers to the artifacts you get when polygon vertices are snapped to integer pixel coordinates, rather than taking into account subpixel positions. Quake did not have this issue, despite not using floating-point calculations in its rasterization. It did use floating point when texturing spans, after rasterization, but this was more of an optimization than a fundamental requirement for accurate texturing :)
A heroic attempt to consumerize exotic hardware, and ultimately an unnecessary one, considering the mundane reasons that actually slowed the N64 down.
The hardware was actually pretty great in the end. The unreleased N64 version of Dinosaur Planet holds up well considering how much more powerful the GameCube was.
Nintendo were largely the architects of their own misery. First, they set expectations sky high with their “Ultra 64” arcade games, then were actively hostile to developers in multiple ways.
I'm not 100% sure of the specifics, but Nintendo took a pretty different approach from Sony or Sega at this time. Sony and Sega both rolled their own graphics chips, and both of them made some compromises and strange choices in order to get to market more quickly.
Nintendo instead approached SGI, the most advanced graphics workstation and 3D modeling company in the world at the time, and formed a partnership to scale back their professional graphics hardware to a consumer price point.
Might be one of those instances where just getting something that works from scratch is relatively easy, but taking an existing solution and modifying it to fit a new use case is more difficult.
The cartridge ended up being a huge sore spot too.
Nintendo wanted it because of the instant access time. That’s what gamers were used to and they didn’t want people to have to wait on slow CDs.
Turns out that was the wrong bet. Cartridges just cost too much and if I remember correctly there were supply issues at various points during the N64 era pushing prices up and volumes down.
In comparison, CDs were absolutely dirt cheap to manufacture. And people quickly fell in love with all the extra stuff that could fit on a disc compared to a small cartridge. There was simply no way anything like Final Fantasy 7 could have ever been done on the N64: games with FMV sequences, real recorded music, just large numbers of assets.
Even if everything else about the hardware was the same, Nintendo bet on the wrong horse for the storage medium. It turned out the thing they prioritized (access time) was not nearly as important as the things they opted out of (price, storage space).
Tangentially related, but if you haven't already, you should read DF Retro's writeup of the absolutely incredible effort to port the 2 CD game Resident Evil 2 to a single 64MB N64 cartridge: https://www.eurogamer.net/digitalfoundry-2018-retro-why-resi...
Not just dirt cheap, the turn around time to manufacture was significantly lower. Sony had an existing CD manufacturing business and could produce runs of discs in the span of a week or so, whereas cartridges typically took months. That was already a huge plus to publishers since it meant they could respond more quickly if a game happened to be a runaway success. With cartridges they could end up undershooting, and losing sales, or overshooting and end up with expensive, excess inventory.
Then to top it all off, Sony had much lower licensing fees! So publishers got “free” margin to boot. The Playstation was a sweet deal for publishers.
>There was simply no way anything like Final Fantasy 7 could have ever been done on the N64.
Yes, but I don't see how a game like Ocarina of Time, with its streaming data in at high speed, would have been possible without a cartridge. Each format enabled unique gaming experiences that the other typically couldn't replicate exactly.
Naughty Dog found a solution - constantly streaming data from the disk, without regard for the hardware's endurance rating:
> Andy had given Kelly a rough idea of how we were getting so much detail through the system: spooling. Kelly asked Andy if he understood correctly that any move forward or backward in a level entailed loading in new data, a CD “hit.” Andy proudly stated that indeed it did. Kelly asked how many of these CD hits Andy thought a gamer that finished Crash would have. Andy did some thinking and off the top of his head said “Roughly 120,000.” Kelly became very silent for a moment and then quietly mumbled “the PlayStation CD drive is ‘rated’ for 70,000.”
> Kelly thought some more and said “let’s not mention that to anyone” and went back to get Sony on board with Crash.
Crash Bandicoot is a VERY different game from Ocarina Of Time. They are not comparable at all. They literally had to limit the field of view in order to get anything close to what they were targeting. Have you played the two games? The point still stands, Zelda with its vast open worlds is not feasible on a CD based console that has a max transfer rate of 300KB/s and the latency of an iceberg.
What ND did with Crash Bandicoot was really cool to see in action (page in/out data in 64KB chunks based on location) but you are right - this relied on a very strict control of visuals. OoT didn't have this limitation.
Nintendo did not approach SGI. SGI was rejected by Sega for the Saturn: Sega felt their offering was too expensive to produce, too buggy at the time (despite Sega spending man-hours helping fix hardware issues), and had no chance to make it to market in time for their plans.
For all we know, Nintendo had no plans past the SNES, except for the VirtualBoy. But then again, the VirtualBoy was another case of Nintendo being approached by a company rejected by Sega…
It's been years since I read the book "Console Wars", but if memory serves me correctly, SGI shopped their tech to Sega first before Nintendo secured it for the N64.
Yep, Sega had a look at SGI's offering and rejected it. One of the many reasons they did so was because they thought the cost would be too high due to the die size of the chips.
Kind of funny considering the monstrosity the Saturn ended up becoming.
Oh I was not criticizing the article per se, my apologies if it came out as such, I just thought this piece of information was important to understand why they ended up with such a random mash of chips.
Ah no worries! From my side I was only trying to explain more about the origins of the article, since I see it often mentioned/speculated in many forums.
The other big problem with the N64 was that the RAM had such high latency that it completely undid any benefit from the supposedly higher bandwidth that RDRAM had and the console was constantly memory starved.
The RDP could rasterize hundreds of thousands of triangles a second but as soon as you put any texture or shading on them, the memory accesses slowed you right down. UMA plus high latency memory was the wrong move.
In fact, in many situations you can "de-optimize" the rendering to draw and redraw more, as long as it uses less memory bandwidth, and end up with a higher FPS in your game.
That's mostly correct. It is as you say, except that shading and texturing come for free. You may be thinking of Playstation where you do indeed get decreased fillrate when texturing is on.
Now, if you enable 2-cycle mode, the pipeline will recycle the pixel value back into the pipeline for a second stage, which is used for two texture lookups per pixel and some other blending options. Otherwise, the RDP always outputs 1 pixel per clock at 62.5 MHz (though it will be frequently interrupted because of RAM contention). There are faster drawing modes, but they are for drawing rectangles, not triangles. It's been a long time since I've done benchmarks on the pipeline, though.
You're exactly right that the UMA plus high-latency memory murders it. It really does. Enable the z-buffer? Now the poor RDP is thrashing read-modify-writes and you only get 8-pixel chunks at a time. Span caching is minimal. Simply using the z-buffer will torpedo your effective fill rate by 20 to 40 percent. That's why stuff I wrote for it avoided using the z-buffer whenever possible.
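A rough back-of-envelope sketch of why the z-buffer is so expensive on a shared bus (all numbers below are illustrative, not measured; the real cost also includes latency and bus contention, not just raw bytes):

```python
# Illustrative per-pixel RDRAM traffic with and without a 16-bit
# z-buffer at 320x240, 16bpp colour, assuming every pixel is drawn
# exactly once per frame at 30 FPS (a simplification: real scenes
# have overdraw, and the z-read can reject pixels early).

width, height, fps = 320, 240, 30
pixels_per_sec = width * height * fps

color_write = 2                # bytes: 16bpp colour write
z_read, z_write = 2, 2         # read-modify-write on the z-buffer

bw_no_z = pixels_per_sec * color_write
bw_with_z = pixels_per_sec * (color_write + z_read + z_write)

overhead = bw_with_z / bw_no_z  # 3x the raw per-pixel traffic
```

Tripled raw traffic, plus RDRAM's latency on those scattered read-modify-writes, makes the 20-40% fill-rate hit described above plausible.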
The other bandwidth hog was enabling anti-aliasing. AA processing happened in two places: first in the triangle-drawing pipeline, for interior polygon edges; second in the VI when the framebuffer gets displayed, where it applies smoothing to the exterior polygon edges based on coverage information stored in the pixels' extra bits.
On average, you get a roughly 15 to 20 percent fillrate boost by turning both of those off. If you run only at low-res, it's a bit less, since more of your render time is occupied by triangle setup.
Another example from that video: changing a trig function from a lookup table to an evaluated approximation improved performance, because it uses less memory bandwidth.
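That trade-off is easy to sketch. The code below is illustrative, not the actual code from the video: a table lookup costs a memory access per call, while a short evaluated approximation (here a classic parabola, my choice for the example) stays in registers at the cost of some accuracy.

```python
import math

# Sketch: sine via lookup table vs via a cheap evaluated approximation.
# On the N64 the LUT costs a RAM access per call (the expensive part on
# a latency-bound bus); the polynomial is just a few multiplies.

TABLE_SIZE = 256
SIN_TABLE = [math.sin(math.pi * i / (TABLE_SIZE - 1)) for i in range(TABLE_SIZE)]

def sin_lut(x):
    # Table lookup for x in [0, pi]: one memory access.
    i = round(x / math.pi * (TABLE_SIZE - 1))
    return SIN_TABLE[i]

def sin_poly(x):
    # Evaluated approximation on [0, pi]: no memory traffic,
    # worst-case error around 0.056.
    return 4.0 * x * (math.pi - x) / (math.pi * math.pi)

errs = [abs(sin_poly(math.pi * i / 100) - math.sin(math.pi * i / 100))
        for i in range(101)]
```

For animation and geometry work, that sub-6% worst-case error is often invisible, and the saved bandwidth is not.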
Was the zbuffer in main memory? Ooof
What's interesting to me is that even Kaze's optimized stuff is around 8k triangles per frame at 30fps. The "accurate" microcode Nintendo shipped claimed about 100k triangles per second. Was that ever achieved, even in a tech demo?
There were many microcode versions and variants released over the years. IIRC one of the official figures was ~180k tri/sec.
I could draw a ~167,600 tri opaque model with all features (shaded, lit by three directional lights plus an ambient one, textured, Z-buffered, anti-aliased, one cycle), plus some large debug overlays (anti-aliased wireframes for text, 3D axes, Blender-style grid, almost fullscreen transparent planes & 32-vert rings) at 2 FPS/~424 ms per frame at 640x476@32bpp, 3 FPS/~331ms at 320x240@32bpp, 3 FPS/~309ms at 320x240@16bpp.
That'd be between around 400k and 540k tri/sec. Sounds weird, right? But that's extrapolated straight from the CPU counter on real hardware and eyeballing, so it's hard to argue.
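For what it's worth, that extrapolation checks out from the quoted frame times alone:

```python
# Reproducing the extrapolation above: triangles per second =
# triangles per frame / seconds per frame, for the three quoted modes.

tris = 167_600
frame_times_ms = [424, 331, 309]  # 640x476@32bpp, 320x240@32bpp, 320x240@16bpp

rates = [tris / (ms / 1000) for ms in frame_times_ms]
# roughly 395k, 506k and 542k triangles per second
```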
I assume the bottleneck at that point is the RSP processing all the geometry, a lot of them will be backface culled, and because of the sheer density at such a low resolution, comparatively most of them will be drawn in no time by the RDP. Or, y'know, the bandwidth. Haven't measured, sorry.
Performance depends on many variables, one of which is how the asset converter itself can optimise the draw calls. The one I used, a slight variant of objn64, prefers duplicating vertices just so it can fully load the cache in one DMA command (gSPVertex) while also maximising gSP2Triangle commands IIRC (check the source if curious). But there's no doubt many other ways of efficiently loading and drawing meshes, not to mention all the ways you could batch the scene graph for things more complex than a demo.
Anyways, the particular result above was with the low-precision F3DEX2 microcode (gspF3DLX2_Rej_fifo), it doubles the vertex cache size in DMEM from 32 to 64 entries, but removes the clipping code: polygons too close to the camera get trivially rejected. The other side effect with objn64 is that the larger vertex cache massively reduces the memory footprint (far less duplication): might've shaved off like 1 MB off the 4 MB compiled data.
Compared to the full precision F3DEX2, my comment said: `~1.25x faster. ~1.4x faster when maxing out the vertex cache.`.
All the microcodes I used have a 16 KB FIFO command buffer held in RDRAM (as opposed to the RSP's DMEM for XBUS microcodes). It goes like this if memory serves right:
1. CPU starts RSP graphics task with a given microcode and display list to interpret from RAM
2. RSP DMAs display list from RAM to DMEM and interprets it
3. RSP generates RDP commands into a FIFO in either RDRAM or DMEM
4. When output command buffer is full, it waits for the RDP to be ready and then asks it to execute the command buffer
5. The RDP reads the 64-bit commands via either RDRAM or the cross-bus, the 128-bit internal bus connecting the two chips, which avoids RDRAM bus contention.
6. Once the RDP is done, go to step 2/3.
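A toy model of that loop (sizes and commands are made up; the real FIFO is measured in KB of 64-bit commands, not counts):

```python
# Toy model of the RSP->RDP flow in the steps above: the RSP interprets
# a display list, fills a command buffer, and hands it to the RDP
# whenever the buffer fills up or the list ends.

FIFO_CAPACITY = 4  # commands per flush; hypothetical, real FIFOs are KBs

def run_task(display_list):
    fifo, flushes, executed = [], 0, []
    for cmd in display_list:            # RSP interprets the display list
        fifo.append(cmd)                # ...and emits RDP commands
        if len(fifo) == FIFO_CAPACITY:  # buffer full: RDP executes it
            executed.extend(fifo)
            fifo.clear()
            flushes += 1
    if fifo:                            # final partial buffer
        executed.extend(fifo)
        flushes += 1
    return executed, flushes

cmds = [f"tri{i}" for i in range(10)]
executed, flushes = run_task(cmds)      # 10 commands in 3 flushes
```

The XBUS/FIFO trade-off quoted below is essentially about where this buffer lives and how big it is.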
To quote the manual:
> The size of the internal buffer used for passing RDP commands is smaller with the XBUS microcode than with the normal FIFO microcode (around 1 Kbyte). As a result, when large OBJECTS (that take time for RDP graphics processing) are continuously rendered, the internal buffer fills up and the RSP halts until the internal buffer becomes free again. This creates a bottleneck and can also slow RSP calculations. Additionally, audio processing by the RSP cannot proceed in parallel with the RDP's graphics processing. Nevertheless, because I/O to RDRAM is smaller than with FIFO (around 1/2), this might be an effective way to counteract CPU/RDP slowdowns caused by competition on the RDRAM bus. So when using the XBUS microcode, please test a variety of combinations.
I'm glad someone found objn64 useful :) looking back it could've been optimized better but it was Good Enough when I wrote it. I think someone added png texture support at some point. I was going to add CI8 conversion, but never got around to it.
On the subject of XBUS vs FIFO, I trialled both in a demo I wrote with a variety of loads. Benchmarking revealed that, over 3 minutes, the two methods came in within a second of each other. So in my time messing with them I never found XBUS to help with contention. I'm sure in some specific application it might be a bit better than FIFO.
By the way, I used a 64k FIFO size, which is huge. I don't know if that gave me better results.
Oh, you're the MarshallH? Thanks so much for everything you've done!
I'm just a nobody who wrote DotN64, and contributed a lil' bit to CEN64, PeterLemon's tests, etc.
For objn64, I don't think PNG was patched in. I only fixed a handful of things like a buffer overflow corrupting output by increasing the tmp_verts line buffer (so you can maximise the scale), making BMP header fields 4 bytes as `long` is platform-defined, bumping limits, etc. Didn't bother submitting patches since I thought nobody used it anymore, but I can still do it if anyone even cares.
Since I didn't have a flashcart to test with for the longest time, I couldn't really profile, but the current microcode setup seems to be more than fine.
Purely out of curiosity, as I now own an SC64: is the 64drive abandonware? I tried reaching out via email a couple of times since my 2018 order (receipt #1532132539), and I still don't know if it's even in the backlog or whether I could update the shipping address. You're also on Discord servers, but I didn't want to be pushy.
I don't even mind if it never comes, I'd just like some closure. :p
Recently, Sauraen demonstrated on YouTube their performance profiling of their F3DEX3 optimizations. One thing they could finally do was profile the memory latency, and it is BAD! Out of a frame render time of 50ms, about 30ms is the processors just waiting on the RAM. Essentially, at least in Ocarina of Time, the GPU is idle 60% of the time!
Whole video is fascinating but skip to the 29 minutes mark to see the discussion of this part.
RAM latency is bad and the GPU spends half its time doing nothing, but in return, using RDRAM allowed for a 2-layer PCB, making the whole thing insanely cheap to manufacture.
The N64 had a 4KB texture cache while the PS1 had a 2KB cache. But the N64's mip-mapping requirement meant that it essentially had 2KB plus lower-resolution maps.
The streaming helped it a lot but I think the cost of larger carts was a big drag on what developers could do. It is one thing to stream textures but if you cannot afford the cart size in the first place it becomes merely academic.
The real problem was different. The PS1 cache was a real cache, managed transparently by the hardware. Textures could take the full 1 MB of VRAM (minus the framebuffer, of course).
In contrast, the N64 had a 4 kB texture RAM. That's it: all your textures for the current mesh had to fit in just 4 kB. If you wanted bigger textures, you had to come up with all sorts of programming tricks.
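For a sense of scale, here's the quick arithmetic on that 4 kB budget at common texel sizes (ignoring mipmaps and the half-size restriction some features impose):

```python
# Quick arithmetic on the N64's 4 kB TMEM budget: how many texels fit
# at common texel bit depths. Example dimensions are just one way to
# factor the totals.

TMEM_BYTES = 4096

def max_texels(bits_per_texel):
    return TMEM_BYTES * 8 // bits_per_texel

texels_4bpp = max_texels(4)    # 8192 texels, e.g. a 64x128 CI4 texture
texels_16bpp = max_texels(16)  # 2048 texels, e.g. 32x64 at 16-bit RGBA
```

At 16 bits per texel you're down to a 32x64 texture for the whole mesh, which is why so many N64 games lean on tiny palettized textures stretched by the filter.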
[1] https://www.copetti.org/writings/consoles/playstation/
[2] https://www.copetti.org/writings/consoles/nintendo-64/