Performance
The number of transistors available has a huge effect on the performance of
a processor. As seen earlier, a typical instruction in a processor like an 8088
took 15 clock cycles to execute. Because of the design of the multiplier, it
took approximately 80 cycles just to do one 16-bit multiplication on the 8088.
With more transistors, much more powerful multipliers capable of single-cycle
speeds become possible.
More transistors also allow a technology called pipelining. In a pipelined architecture, instruction execution overlaps. So even though it might take five clock cycles to execute each instruction, there can be five instructions in various stages of execution simultaneously. Once the pipeline is full, it looks like one instruction completes every clock cycle.
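The arithmetic behind that claim can be sketched in a few lines of Python. This is an idealized model, not a description of any particular chip: it assumes the pipeline never stalls, and the function names are made up for illustration.

```python
def cycles_unpipelined(num_instructions, stages):
    # Each instruction runs start to finish before the next one begins.
    return num_instructions * stages


def cycles_pipelined(num_instructions, stages):
    # Assumption: an ideal pipeline with no stalls. The first instruction
    # needs `stages` cycles to pass through; after that, one instruction
    # finishes on every cycle.
    if num_instructions == 0:
        return 0
    return stages + (num_instructions - 1)


print(cycles_unpipelined(1000, 5))  # 5000
print(cycles_pipelined(1000, 5))    # 1004
```

For 1,000 instructions on a 5-stage pipeline, the pipelined count works out to just over one cycle per instruction, which is where the "one instruction per clock cycle" appearance comes from.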
Many modern processors have multiple instruction decoders, each with its own pipeline. This allows multiple instruction streams, which means more than one instruction can complete during each clock cycle (a design usually called superscalar). This technique can be quite complex to implement, so it takes lots of transistors.
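As a rough illustration of why extra pipelines help, here is a small Python sketch. It assumes an ideal case in which every pipeline is always kept busy and no instruction has to wait for another's result, which real programs rarely allow. The function name and the `width` parameter are hypothetical.

```python
import math


def cycles_superscalar(num_instructions, stages, width):
    # Assumption: `width` identical pipelines, each `stages` deep, with
    # no dependencies between instructions. Instructions enter in groups
    # of `width`, and one group finishes per cycle once the pipelines fill.
    if num_instructions == 0:
        return 0
    groups = math.ceil(num_instructions / width)
    return stages + (groups - 1)


cycles = cycles_superscalar(1000, 5, 2)
print(cycles)         # 504
print(1000 / cycles)  # average instructions per cycle, just under 2
```

In practice the gain is smaller than this model suggests, because much of the added hardware goes into finding instructions that can safely run side by side.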
The trend in processor design has been toward full 32-bit ALUs with fast floating-point processors built in and pipelined execution with multiple instruction streams. There has also been a tendency toward special instructions (like the MMX instructions) that make certain operations particularly efficient, along with hardware virtual memory support and L1 caching on the processor chip. All of these trends push up the transistor count, leading to the multi-million-transistor powerhouses available today. These processors can execute about one billion instructions per second!