The Evolution of the GPU
Volume 41, Issue 3 (Edition 3, 2018)

If ever there was an example of "If you build it, they will come," the GPU is the world's best. The "they" referred to here are the applications. When graphics semiconductor suppliers such as 3Dlabs, ATI, Nvidia, and TI began exploiting the density, speed, and economic benefits of Moore's law for the unlimited demands of computer graphics, they had one goal: build a graphics accelerator that would get us closer to photorealistic, real-time rendering.

And although a fixed-program, hard-wired function accelerator would be less expensive and easier to control than a programmable one, the nature of CG is such that new algorithms, tricks, and functions are developed almost weekly, and have been since the 1970s.

The trade-off between LUT-based, predetermined CG functions in a chip and a general-purpose, programmable CG device was rapidly tipping in favor of the programmable solution. TI led the parade in 1986 with the TMS34010, and in 1999, 3Dlabs introduced the geometry processor GLINT – the first GPU.

Shortly thereafter, Nvidia introduced the NV10 and marketed it as the first GPU (it wasn't, but Nvidia has always been an aggressive marketing company). Right on the heels of Nvidia's announcement, ATI revealed its VPU, an attempt to offer the same technology under a differentiated name. By this time, TI had faded from the scene, unwilling to continue investing in graphics R&D – a move I suspect it has regretted many times.

The concept of using VLSI (very-large-scale integration) and lots of ALUs (arithmetic logic units) for graphics can be credited to Henry Fuchs, who proposed the Pixel-Planes project at the University of North Carolina in 1981. Later, Bill Dally did foundational work in stream processing at MIT on the Imagine project in 1995, and in 1996 he moved to Stanford University. He's now chief scientist at Nvidia.

Dally's work showed that applications ranging from wireless baseband processing, 3D graphics, encryption, and IP forwarding to video processing could take advantage of the efficiency of stream processing. This research inspired other designs, such as GPUs from ATI Technologies, as well as the Cell microprocessor from Sony, Toshiba, and IBM. Stream processors, SIMD machines, parallel processors, and GPUs are all closely related family members.

At Stanford, the notion of doing stream computing using ATI GPUs was tried and proven successful, albeit difficult to program via OpenGL. To overcome that obstacle, Nvidia developed a C-like parallel processing language known as CUDA. A year later, Apple and Khronos introduced OpenCL. By 2006, parallel processing on the GPU had been established as a new paradigm, and one with tremendous potential. Wherever an application was begging for parallel processing, that’s where GPU computing took off.
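
To give a flavor of what that C-like, data-parallel model looks like, here is a minimal sketch of a CUDA kernel that adds two vectors, one element per GPU thread. The kernel and variable names are illustrative, not taken from any vendor sample.

#include <cstdio>
#include <cuda_runtime.h>

// Each thread adds one pair of elements: classic data-parallel style.
__global__ void vecAdd(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;               // roughly a million elements
    size_t bytes = n * sizeof(float);
    float *a, *b, *c;
    cudaMallocManaged(&a, bytes);        // unified memory keeps the sketch short
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&c, bytes);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(a, b, c, n);   // launch one thread per element
    cudaDeviceSynchronize();

    printf("c[0] = %f\n", c[0]);               // expect 3.0
    cudaFree(a); cudaFree(b); cudaFree(c);
    return 0;
}

Compiled with nvcc, those few lines launch on the order of a million threads without the programmer managing any of them individually.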

In the past, parallel processing was done with large numbers of general-purpose processors, such as x86 CPUs, but those systems were very expensive and difficult to program. The GPU, as a dedicated, single-purpose processor, offered much greater compute density per dollar, and it has subsequently been exploited for many math-acceleration tasks – GPUs designed for gaming are now crucial to HPC, supercomputing, and AI.

A Game Changer

But make no mistake, gaming is still king. There are tens of millions of gamers (some estimates exceed 100 million), and every year they buy more than 50 million graphics boards and 20 million laptops for gaming. That's in addition to the 25 million consoles with powerful GPUs in them. Contrast that with the 3.5 million workstations and the fewer than one million GPUs bought for data centers and supercomputers in 2017.

[Figure: Aerodynamics of a commercial aircraft.]

[Figure: Vibration analysis of a jet engine.]

Those ratios will change as AI training increases, but they won't double. As companies become more data-intensive, they will need GPUs to chew through all that data and make sense of it. The advent of IoT and smart sensors adds to the explosion of data – today, there are some 800,000 GPU developers.

For 30 years, the dynamics of Moore’s law held true. But CPU performance scaling has slowed. GPU computing is defining a new, supercharged law. It starts with a highly specialized parallel processor and continues through system design, system software, algorithms, and optimized applications.

More than 35 applications and areas of science now employ GPUs for compute acceleration, AI and machine learning (ML), and video processing and streaming: from medical imaging to audio signal processing (Alexa and Shazam run on GPUs in the cloud), weather forecasting, computational fluid dynamics, and finite-element analysis, to cryptography and massive data-reduction projects such as SETI and Folding@home.

You'll never look at or think about a GPU the same way again. You knew them for rendering and gaming; with the explosion in AI, they now do all those things and more. For example, raytracing on a GPU is not new. It's also not easy: GPUs aren't well suited to it because of the branching inherent in the algorithm, and raytracing takes time to resolve an image. Using AI-based predictive analysis, Nvidia has demonstrated how GPUs can speed up raytracing.
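
To illustrate the branching problem, here is a rough CUDA sketch (with made-up names, not Nvidia's code) of the most basic raytracing step, a ray-sphere intersection test. Each ray either hits or misses, so neighboring threads in the same warp can take different paths, which is exactly the divergence GPUs handle poorly.

#include <cmath>
#include <cstdio>
#include <cuda_runtime.h>

struct Ray { float3 origin, dir; };    // dir is assumed to be unit length

// One thread per ray; the hit/miss branch is where neighboring threads diverge.
__global__ void traceSphere(const Ray* rays, float* shade, int n,
                            float3 center, float radius) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    Ray r = rays[i];
    float3 oc = make_float3(r.origin.x - center.x,
                            r.origin.y - center.y,
                            r.origin.z - center.z);
    float b = oc.x * r.dir.x + oc.y * r.dir.y + oc.z * r.dir.z;
    float c = oc.x * oc.x + oc.y * oc.y + oc.z * oc.z - radius * radius;
    float disc = b * b - c;

    if (disc < 0.0f) {
        shade[i] = 0.0f;               // miss: background
    } else {
        float t = -b - sqrtf(disc);    // hit: distance to nearest intersection
        shade[i] = (t > 0.0f) ? t : 0.0f;
    }
}

int main() {
    const int n = 4;                   // four rays; only the first one hits
    Ray* rays;
    float* shade;
    cudaMallocManaged(&rays, n * sizeof(Ray));
    cudaMallocManaged(&shade, n * sizeof(float));
    for (int i = 0; i < n; ++i) {
        rays[i].origin = make_float3(0.0f, 0.0f, -5.0f);
        float x = 0.25f * i, z = 1.0f;
        float len = sqrtf(x * x + z * z);                 // normalize the direction
        rays[i].dir = make_float3(x / len, 0.0f, z / len);
    }

    traceSphere<<<1, n>>>(rays, shade, n, make_float3(0.0f, 0.0f, 0.0f), 1.0f);
    cudaDeviceSynchronize();

    for (int i = 0; i < n; ++i)
        printf("ray %d: %s (t = %f)\n", i, shade[i] > 0.0f ? "hit" : "miss", shade[i]);

    cudaFree(rays); cudaFree(shade);
    return 0;
}

A real raytracer branches far more than this (acceleration-structure traversal, shadow rays, materials), which is why the AI-assisted speedups matter.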

The same concepts have been used to make slow-motion movies from regular 30-fps video. In the old days, when an animator made a drawing (of, say, a duck), he or she would then make a second drawing with the duck in a different position, or maybe smiling. The studio would hire grunts, er, interns, to draw all the frames in between. Later, when animation software was developed, that was one of the first (2D) features: tweening.

To make slo-mo, you need a lot of tweening. But photographs are a zillion times more difficult than a simple 2D animation, so the slo-mo AI software, like the raytracing speedup software, figures out where things should be.
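
For the classic, non-AI flavor of that idea, here is a minimal CUDA sketch that produces an in-between frame by simple linear blending of two frames (a cross-fade). AI interpolation estimates where objects actually move rather than just averaging pixels, so treat this, with its made-up names and sizes, purely as an illustration of what an in-between frame is.

#include <cstdio>
#include <cuda_runtime.h>

// Produce an in-between frame by linear interpolation: t = 0 gives frameA,
// t = 1 gives frameB, and t = 0.5 lands halfway between the two.
__global__ void tween(const unsigned char* frameA, const unsigned char* frameB,
                      unsigned char* out, int numPixels, float t) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < numPixels)
        out[i] = (unsigned char)((1.0f - t) * frameA[i] + t * frameB[i] + 0.5f);
}

int main() {
    const int numPixels = 1920 * 1080;     // one grayscale HD frame, for illustration
    unsigned char *a, *b, *mid;
    cudaMallocManaged(&a, numPixels);
    cudaMallocManaged(&b, numPixels);
    cudaMallocManaged(&mid, numPixels);
    for (int i = 0; i < numPixels; ++i) { a[i] = 10; b[i] = 200; }

    int threads = 256;
    int blocks = (numPixels + threads - 1) / threads;
    tween<<<blocks, threads>>>(a, b, mid, numPixels, 0.5f);   // the halfway frame
    cudaDeviceSynchronize();

    printf("in-between pixel value: %d\n", mid[0]);           // expect 105
    cudaFree(a); cudaFree(b); cudaFree(mid);
    return 0;
}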

Thanks to the Gamers

The number of applications will continue to expand, quite possibly exponentially, especially as AI is applied to more aspects of science, engineering, medicine, and security. However, it's the volume of the game market that provides the economy of scale that makes processors as powerful as GPUs affordable.

So, all the scientists and analysts out there should say thank you to the gamers, whose playing pays for the R&D and manufacturing of these marvelous, massively parallel processors.

Jon Peddie is president of Jon Peddie Research, a Tiburon, CA-based consultancy specializing in graphics and multimedia that also publishes JPR’s “TechWatch.” He can be reached at jon@jonpeddie.com.