Computer systems have enjoyed exponential performance growth since the commercial introduction of the monolithic microprocessor in 1971. Fueled by rapid increases in transistor count per silicon die (summarized by Gordon Moore in 1965 in the "law" that bears his name), this performance growth has made possible the sequencing of the human genome, cheap and accurate weather forecasts, cellular handsets with video-streaming capability, and interactive computer games with near-photorealistic image quality.
Until recently, the frequency of the clock signal that synchronizes and orchestrates a processor's calculations was also increasing at a similar rate. A few years ago, rising power requirements made further such increases untenable. Further performance improvements have instead been provided through multicore technology, where each microprocessor consists of several processor cores that cooperate to solve the problem at hand. Indeed, almost all current microprocessors consist of multiple cores, and the number of cores per processor will continue to increase.
Realizing the performance promises of multicore technology is difficult, though. Simply put, the computations must be amenable to parallelization for multicore technology to be useful. For a tenfold speedup, it must be possible to keep ten processor cores busy, without any core stalling while waiting for input from another core. Unfortunately, many important workloads have little inherent parallelism, and therefore cannot take advantage of the theoretical peak performance of large multicore machines.
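The limit described above is commonly quantified by Amdahl's law (not named in the text; the numbers below are illustrative, not from this project): if a fraction s of a computation is inherently serial, then n cores can speed it up by at most 1/(s + (1-s)/n). A minimal sketch:

```python
def amdahl_speedup(serial_fraction: float, cores: int) -> float:
    """Upper bound on speedup when a fixed fraction of the work is serial."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

# Even a modest 10% serial fraction caps ten cores well below a 10x speedup:
print(round(amdahl_speedup(0.10, 10), 2))  # → 5.26
```

This is why keeping all ten cores busy, as the paragraph above demands, is so hard: a small serial remainder quickly dominates the achievable speedup.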
In CHAMPP, we intend to study a complementary path toward higher performance, namely, processor cores that adapt to their workloads in order to be used more efficiently. We aim to develop design principles (both at the circuit and the architecture levels) for adaptive accelerator blocks, as well as for the memory system that provides data and instructions to the heterogeneous multicore machine.
Last modified: December 13, 2011