mmoy said:
Yeah, I know. It has a bunch of registers in the background
that do a lot of work behind your back for you. But Intel and
AMD have done a pretty nice job at doing a lot of the little
byte and bit twiddling operations pretty fast.
In general, though, it's easier to talk about P4/K8 as CISC
vs RISC as most people won't know what you're talking
about if you say that P4/K8 is RISC.
A lot of good stuff came out of the demise of Alpha.
The StrongARM chip also contributed somewhat, as Intel bought the chip fabrication plant from DEC and five minutes later announced that the new Pentium design (what Linux calls the 586 architecture) was to use an ultra-small, high-density RISC core. Strangely enough, that is exactly what the SA was!
The SA had some lovely superfast bit twiddling and condition flag testing in its tiny instruction set. For example, you can do:
BR1
A=A-1 (setting condition flags)
If A is 0 set A = 8 (don't set condition flags)
If A is 0 bit shift B right by 4 (set condition flags)
If B is not zero branch to BR1
in only 4 machine code instructions that execute at 1 instruction per cycle, which leads to some **** fast (if highly unreadable!) code.
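For anyone who doesn't read ARM, here is a rough Python model of that loop, with the zero flag tracked by hand so you can see which steps touch it. It is only a sketch of my reading of the pseudocode above: the function name run_loop and the max_iters safety cap are made up for illustration, and the mnemonics in the comments (SUBS, MOVEQ, MOVEQS, BNE) are how I would expect the four steps to be written in classic ARM assembler, not a listing from the original program.

```python
def run_loop(a, b, max_iters=1000):
    """Model the four-instruction loop with an explicit zero flag (z).

    a and b stand in for two 32-bit registers. max_iters is a
    hypothetical guard so the model always terminates.
    """
    iterations = 0
    while True:
        iterations += 1

        # 1. A = A - 1, setting condition flags (roughly SUBS)
        a = (a - 1) & 0xFFFFFFFF
        z = (a == 0)

        # 2. If zero flag set: A = 8, flags left alone (roughly MOVEQ)
        if z:
            a = 8

        # 3. If zero flag set: B = B >> 4, flags updated (roughly MOVEQS
        #    with a shifted operand). The test uses the flag from step 1,
        #    because step 2 did not change it.
        if z:
            b >>= 4
            z = (b == 0)

        # 4. Branch back to BR1 unless the zero flag is set (roughly BNE)
        if z or iterations >= max_iters:
            break

    return a, b, iterations


if __name__ == "__main__":
    # Count A down from 8 three times, shifting B right by 4 each time.
    print(run_loop(8, 0x100))
```

The point is that steps 2 and 3 cost no branches at all: each instruction carries its own condition, so the only jump in the loop is the one at the bottom.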
Way back in the early nineties I had a BASIC program which simulated a globular cluster. It ran in around 8 minutes on a 486 DX, while the ARM3-based Archimedes I had at the time managed it in just under 3 minutes. A compiled version ran on the PC in around 4.5 minutes, and a hand-assembled version ran on the ARM3 in 1 minute.
When I bought my SA-powered RiscPC I tried out the hand-assembled version and it ran in 7 seconds. I then re-coded it to take advantage of the SA-only features and got the final program down to 1.7 seconds.
In those days DEC were unsurpassed when it came to processor fabrication, and I reckon a lot of the SA and Alpha design ended up in Intel offerings, especially the new 'M' range and the XScale processors.
Amen-Moses