
Combine RISC and CISC?


Arugela


https://ieeexplore.ieee.org/document/182090

https://renesasrulz.com/doctor_micro/rx_blog/b/weblog/posts/cisc-and-risc-debate-continues-hybrid-wins

Would this benefit KSP?

Someone mentioned they thought Macs may go RISC. I was wondering if you couldn't combine the two.

Or can they be combined in more ways to get more performance for things like physics? Maybe more low-key code for programmers.

https://www.quora.com/Is-the-Intel-Pentium-really-a-hybrid-between-RISC-and-CISC

Quote

No.

It is a CISC.

Also, One of the features of many RISC processors is that all the instructions could be executed in a single clock cycle. No CISC CPU can do this.

 


Pretty outdated information, for multiple reasons. 
First of all, back in the early 90s we had just passed one million transistors; nowadays we have billions, and 90% of them go to cache, branch prediction, and prefetch. Yes, CISC uses some hundred thousand more transistors :)  
Second, for the last 10 years or so the simple instructions have been hardwired, RISC style, on x86; only the more complex instructions use microcode.
Last, CPUs use pipelines, so they push one instruction into the pipe each clock cycle, but each instruction takes multiple cycles to pass through the pipe. Yes, "one instruction each clock cycle" was true at 1-200 MHz, not at multiple GHz.
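
Rough numbers, as a Python sketch (the pipeline depths here are just illustrative, not any specific chip):

```python
# Back-of-the-envelope sketch: a pipelined CPU retires one instruction per
# cycle once the pipe is full, but each individual instruction still takes
# pipeline_depth cycles from fetch to write-back.
def cycles_to_run(n_instructions, pipeline_depth):
    # First instruction needs the full depth; after that, one retires per cycle.
    return pipeline_depth + (n_instructions - 1)

for depth in (5, 20):
    n = 1_000_000
    total = cycles_to_run(n, depth)
    print(f"depth={depth:2d}: {n} instructions in {total} cycles "
          f"(~{n / total:.3f} instructions/cycle)")
```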

Now for small, low-power CPUs, RISC is better; this is why they are used in cell phones and lots of embedded systems. Granted, mobile CPUs are not small anymore, but since they use RISC there is no point in switching. 
Android and iOS require them to run. 
On the other hand, Microsoft dropped the RISC version of its Surface tablet because it could not run standard Windows programs. 
Xbox and PlayStation switched to x86 this generation and will continue using it for the next version. 
This has more to do with being able to use PC hardware rather than having to design their own systems. 

 


[Warning: this is detailed and probably too long.  I was into this type of thing back in the day]

further "prescript": if you want real performance increases, look at GPU architectures.  Unfortunately, they are pretty hostile to programmers and not very compatible with nearly all algorithms and programming methods, but anything that can be made to fit their model can get extreme performance (and even bitcoin mining wasn't a great fit, but makes a good example of CPU vs. GPU power)

CISC vs. RISC really belongs in the 1990s.  I think there is a quote (from early editions) in Hennessy and Patterson (once "The Book" on computer architecture, especially when this debate was still going on) that any architecture made after 1984 or so was called "RISC".

A quick post defining "RISC" (or at least where to place real processors on a RISC-CISC continuum), by a then-leading name in the field: https://www.yarchive.net/comp/risc_definition.html
[tl;dr: Indirect addressing was the big problem with CISC.  Any complexity in computation is a non-issue to RISC; they just want to avoid any addressing complexity.]

As magnemoe mentioned, early CPUs had limited transistor budgets (modern cores have power budgets - most of the chip isn't the CPU "core").

CISC really wasn't a "thing", just the old way of doing things that RISC revolted against.   Even so, I think the defining thing about CISC was the use of microcode.  Microcode is basically a bit of software that turns a bundle of transistors and gates into a computer, and you pretty much have to learn how to write it to understand what it is.  It also made designing computers, especially much more complex computers, wildly easier, so it was pretty much universally adopted for CPU design.
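
To sketch the idea (a toy example in Python, not any real machine's control store): a complex instruction is really just an index into an internal ROM of simpler steps that the hardware plays back.

```python
# Toy sketch of the microcode idea (not any real machine): a complex
# instruction is just a lookup into a ROM of simpler internal steps.
MICROCODE_ROM = {
    # A hypothetical memory-to-memory add, CISC style.
    "ADD [dst], [src]": [
        "load  temp1, [src]",
        "load  temp2, [dst]",
        "add   temp1, temp1, temp2",
        "store [dst], temp1",
    ],
}

def execute(instruction):
    # The "hardware" only knows how to run the simple micro-steps; the
    # microcode ROM is what makes the complex instruction exist at all.
    for micro_op in MICROCODE_ROM[instruction]:
        print("  micro-op:", micro_op)

execute("ADD [dst], [src]")
```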

Once CPU designers accepted microcode, they really weren't limited in the complexity of their instructions: the instructions were now coded as software instead of separate circuits.  This also led to a movement trying to "close the semantic gap" by making a CPU's internal instructions (i.e. assembly language) effectively a high-level language that would be easy to program.  The Intel 432 might be seen as the high point of this idea of CISC design, while the VAX minicomputer and the 68k (especially after the 68020 "improvements") are examples of success with extreme CISCyness.

The initial inspiration for RISC was the five-step pipeline.  Instructions wouldn't be completed in a single clock, but spread over a "fetch/decode/execute/memory access/write back" pipeline, with each instruction being processed on essentially an assembly line.  So not only could they execute an instruction per cycle, the clock speed could (theoretically) be five times faster.  Not only did RISC have the room for such things (missing all the microcode ROM and whatnot), it was also often difficult to pipeline CISC instructions.  Another idea was to allow only one memory operation per instruction, since any more made single-cycle execution impossible (this ruled out memory-indirect addressing, a restriction that would later make out-of-order execution much more viable). [Note that modern x86 CPUs have 20-30 stage pipelines and break instructions down to "load/store" level, so there isn't much difference here between CISC and RISC.]
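
Here's a toy cycle-by-cycle trace of that five-step pipeline in Python (idealized: no stalls, hazards, or branches), showing one instruction completing per cycle even though each one takes five cycles end to end:

```python
# Toy trace of the classic 5-stage RISC pipeline. A new instruction enters
# every cycle, so one completes every cycle once the pipe is full, even
# though each instruction occupies five consecutive cycles.
STAGES = ["fetch", "decode", "execute", "memory", "writeback"]
instructions = [f"i{n}" for n in range(8)]

n_cycles = len(instructions) + len(STAGES) - 1
for cycle in range(n_cycles):
    in_flight = []
    for idx, instr in enumerate(instructions):
        stage = cycle - idx            # which stage this instruction is in now
        if 0 <= stage < len(STAGES):
            in_flight.append(f"{instr}:{STAGES[stage]}")
    print(f"cycle {cycle:2d}: " + "  ".join(in_flight))
```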

"Also, One of the features of many RISC processors is that all the instructions could be executed in a single clock cycle. No CISC CPU can do this."

This is quite wrong.  First, RISC architectures did this by simply throwing out every instruction they could that would have these issues, which means using multiple instructions to do the same thing (and don't underestimate just how valuable the storage space for all those instructions was when RISC (and especially CISC) were defined).  Early RISCs couldn't execute jump or branch instructions in a single cycle either; look up "branch delay slots" for their kludge around this.  Finally, I really think you want to include things like a "divide" instruction: divide really doesn't pipeline well, but you don't want to stop and emulate it with simpler instructions (especially with an early, tiny instruction cache).

Once pipelining was effectively utilized, RISC designers moved on to superscalar processors (executing two instructions at once) and out-of-order CPUs.  These were hard to do with the simple RISC instruction sets and absolutely brutal for CISC.
VAX made two pipelined machines: one tried to pipeline the instructions, the other pipelined the microcode.  The "pipelined microcode" design was successful but still ran at 1/5 the speed of DEC's new Alpha RISC CPU.
Motorola managed pipelining with the 68040 and superscalar execution with the 68060.  That ended the Motorola line.  *NOTE* anybody who had to program x86 assembler always wished that IBM had chosen Motorola instead of Intel.  The kludginess of early x86 is hard to believe in retrospect.
Intel managed pipelining with the i486, superscalar execution with the Pentium, and out-of-order (plus 3-way superscalar) with the Pentium Pro (at 200 MHz, no less).  It was clear that at least one CISC could run with the RISCs in performance while taking advantage of the massive infrastructure it had built over the years.

Once Intel broke the out-of-order barrier with the Pentium Pro, not to mention the AMD Athlon hot on its heels, the RISC chips had a hard time competing on performance against chips nearly as powerful, plenty cheaper, and with infinitely more software available.  From a design standpoint, the two biggest differences between RISC and x86 were that decoding x86 was a real pain (lots of tricks have been used; currently a decoded µop cache is used by both Intel and AMD) and that x86 didn't have enough integer registers (floating point was worse).  This was fixed with the AMD64 instruction set, which has 16 integer registers (same as ARM).  The CISC-RISC division was dead, and the RISCs could only retreat to proprietary lock-in and other means to keep customers.

While that all sounds straightforward, pretty much every CISC chip made after the 386/68020 era was called a "RISC core executing CISC instructions".  Curiously enough, the chips for which this was really true tended to fail the hardest.  AMD's K5 was basically a 29k-based chip (the 29k being a real RISC) that translated x86 in microcode: it was a disaster (and led to AMD buying NexGen, who made the K6).  IBM's PowerPC 615 (a chip that could run both x86 and PowerPC code) never made it out of the lab, although this might be because IBM had already been burned by the "OS/2 problem" (emulating your competition only helps increase their market share).

There really isn't a way you'd want to combine "RISC and CISC" anymore; RISC chips are perfectly happy including things like vector floating point multiply and crypto instructions.  The only CISCy thing they don't want is anything like indirect addressing (something not hard to code with separate instructions that can then be run out-of-order and tracked).
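
As a sketch of that last point (mnemonics invented for illustration): a memory-indirect operand hides two dependent memory accesses inside a single instruction, while the split-up version gives the out-of-order machinery three small instructions with explicit register dependencies that it can schedule around.

```python
# Sketch of why indirect addressing is the unwanted part (mnemonics invented,
# not any real ISA): one CISC-style instruction hiding two dependent memory
# accesses, versus the RISC-style split into separately trackable instructions.

cisc_form = [
    "add r1, [[r2]]",          # load a pointer from memory, load the value it
                               # points to, then add -- all in one instruction
]

risc_form = [
    "load r3, [r2]",           # 1st memory access: fetch the pointer
    "load r4, [r3]",           # 2nd memory access: fetch the value
    "add  r1, r1, r4",         # plain register-to-register arithmetic
]

# An out-of-order scheduler sees three small instructions with an explicit
# dependency chain (r2 -> r3 -> r4 -> r1) and can interleave unrelated work
# around each one; the single CISC instruction hides that chain inside itself.
for instr in risc_form:
    print(instr)
```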

Here's an example of how to build "CISCy" instructions out of RISC ones and wildly increase the power/complexity ratio.  To me it is more a matter of making an out-of-order machine much more in-order, but you might see it in RISC/CISC terms: https://repositories.lib.utexas.edu/bitstream/handle/2152/3710/tsengf71786.pdf

I'd also like to point out that if you really wanted to make a fast chip somewhere between a 6502 and an ARM1, a stack-based architecture might have been strongly tempting.  The ARM1 spent half its transistor/space budget on 16 32-bit registers alone, and I'd think that doing that with DRAM might have worked at the time (later, DRAM and logic processes wouldn't be compatible, but I don't think that was true yet).  One catch with using a DRAM array for registers is that you can only access one operand at a time, which would work fine for a stack; instructions typically take two operands and write to a third.

The oldest architectures were accumulator machines (the 6502 was also an accumulator architecture): a single operand would either be combined with the accumulator (a single register), with the output replacing the accumulator, or the accumulator would be written to memory.  A stack[ish] machine would be an improvement on that, with the accumulator replaced by a "top of stack".  CISC machines would allow both inputs to come from registers (or memory) and write to a register (or memory) [with the exception that the output had to be the same as one of the inputs].  One of the defining characteristics of RISC was that it was load/store: instructions either worked on two registers and output to a third register (without the CISC requirement that one be the same), or loaded/stored between memory and a register.

The point of all this "single operand per instruction" business is that it would be compatible with the DRAM array (which could then fit whatever registers you needed into an early CPU).  The downside is that it would barely tolerate pipelining, and completely fail to go either superscalar or out-of-order (dying with the CISCs).  But for a brief window it would really fly (much like the 6502).
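
For what a "single operand per instruction" machine looks like, here's a toy stack-machine interpreter in Python (not any real ISA): every instruction names at most one memory operand, the rest being the implicit top of stack, which is the property that would let a one-port DRAM array stand in for the register file.

```python
# Toy stack machine (illustration only): every instruction names at most one
# explicit operand; arithmetic works on the implicit top of the stack.
def run(program, memory):
    stack = []
    for op, *arg in program:
        if op == "push":              # one explicit operand: a memory location
            stack.append(memory[arg[0]])
        elif op == "add":             # no explicit operands at all
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "pop":             # one explicit operand: where to store
            memory[arg[0]] = stack.pop()
    return memory

# d = (a + b) * c
memory = {"a": 2, "b": 3, "c": 4, "d": 0}
program = [("push", "a"), ("push", "b"), ("add",),
           ("push", "c"), ("mul",), ("pop", "d")]
print(run(program, memory))   # {'a': 2, 'b': 3, 'c': 4, 'd': 20}
```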

