Data center-hosted artificial intelligence is rapidly proliferating in both government and commercial markets. It’s an exciting time for AI, but only a narrow set of applications is being addressed, primarily neural networks based on a convolutional approach. Other categories of AI include general AI, symbolic AI and bio-AI; all three have different processing demands and run distinctly different algorithms. Virtually all of today’s commercial AI systems run neural network applications, but the more control-intensive and powerful workloads that use symbolic AI, bio-AI and general AI algorithms are ill-suited to GPU/TPU architectures.
Today, commercial and governmental entities that need AI solutions are using workarounds to get more compute power for their neural net applications. Chief among them are specialty processors such as Google TPUs and NVIDIA GPUs, provisioned in data centers specifically for AI workloads.
However, TPUs and GPUs can still be problematic, even when they are dedicated to AI processing tasks. They drive up data center capital expenditures for AI-specific processors, and they drive up software development costs (GPUs are notoriously difficult to program). Most hyperscale data centers today combine standard CPUs for normal data center workloads with specialty TPUs or GPUs (comprising approximately 5-10% of server rack space) dedicated to AI/neural net processing.
CPUs are easy to program but become slow and power-hungry when tasked with highly parallel AI applications. Specialty AI processors are faster and more power efficient than CPUs for neural net applications, but they are difficult to program.
Today, if embarrassingly parallel computation is the goal (i.e., executing the same instruction on a large number of data sets), as in convolutional neural networks, TPUs/GPUs are the go-to solution. They are more efficient than CPUs for convolutional neural net processing (TPUs can be up to 30x faster). This is because fetching and scheduling an instruction uses significantly more power than actually executing that instruction on a single data set. A specialty AI processor, such as a GPU, will fetch a single instruction and execute it on 32 data sets simultaneously, maximizing throughput and minimizing power.
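To make the amortization argument concrete, here is a toy cost model in Python. It is only a sketch: the two cost constants are placeholder units I picked for illustration, not measurements of any real CPU, GPU or TPU, and the names `FETCH_COST`, `EXEC_COST` and `energy_per_result` are hypothetical.

```python
# Toy model: one instruction fetch is shared by every lane that executes it.
# The cost units below are illustrative assumptions, not measured values.
FETCH_COST = 10.0  # assumed cost to fetch and schedule one instruction
EXEC_COST = 1.0    # assumed cost to execute it on one data set


def energy_per_result(lanes: int) -> float:
    """Average cost per data set when one fetch drives `lanes` executions."""
    if lanes < 1:
        raise ValueError("lanes must be at least 1")
    return FETCH_COST / lanes + EXEC_COST


scalar = energy_per_result(1)   # one fetch per data set
wide = energy_per_result(32)    # one fetch shared by 32 data sets
print(f"scalar: {scalar:.2f}, 32-wide: {wide:.2f}, ratio: {scalar / wide:.1f}x")
```

Whatever the real constants are, the shape of the result is the same: the wider the execution, the closer the per-result cost gets to the bare execution cost, because the fetch overhead is divided across more data sets.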
Google recently announced its third-generation TPU, which is still nowhere near the performance needed for real-time human brain simulation projects. And general AI, bio-AI and symbolic AI algorithms are not a good match for GPU/TPU processors.
The human brain needs to process huge amounts of information in order to take action in real time, and this requires massive processing power. Today’s supercomputers don’t even come close to the processing power of the human brain (which is approximately 10^19 floating point operations per second). One of the fastest supercomputers on the planet today, China’s Sunway TaihuLight, with 10,649,600 cores, can achieve 93 petaflops (Rmax on Linpack benchmark suite). That’s a tiny fraction of what we need for simulation of the human brain in real time, which requires approximately 10^19 flops (that’s 10 exaflops, or 10,000 petaflops).
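The size of that gap is easy to check. The short Python sketch below uses only the two figures quoted above (the 10^19 flops brain estimate and TaihuLight’s 93 petaflops); the variable names are my own.

```python
# Figures taken from the text above.
BRAIN_FLOPS = 1e19         # estimated requirement for real-time brain simulation
TAIHULIGHT_FLOPS = 93e15   # 93 petaflops (Linpack Rmax)

PETA = 1e15
EXA = 1e18

# How many TaihuLight-class machines the brain estimate corresponds to.
shortfall = BRAIN_FLOPS / TAIHULIGHT_FLOPS

print(f"Brain estimate: {BRAIN_FLOPS / EXA:.0f} exaflops "
      f"= {BRAIN_FLOPS / PETA:,.0f} petaflops")
print(f"TaihuLight covers about {100 / shortfall:.2f}% of that, "
      f"a gap of roughly {shortfall:.0f}x")
```

In other words, the machine delivers a little under 1% of the estimate, so the shortfall is about two orders of magnitude.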
We have a long way to go, but we are getting there. In fact, I predict it will be about two years, give or take.
If you’re not yet familiar with ongoing efforts to build a super supercomputer, one capable of simulating a human brain, consider the Human Brain Project, which was established by the European Union in 2013 to unite the fields of neuroscience, medicine and computing for both commercial and research needs.
SpiNNaker (spiking neural network architecture), which is part of the Human Brain Project, is being led by Professor Steve Furber (a principal designer of the original ARM processor and current member of Tachyum’s Board of Advisors) at the University of Manchester. SpiNNaker’s goal is to simulate the equivalent of a rat brain (about 1,000 times smaller than a human brain) in real time, using around 1 million ARM processors configured as a spiking neural network, which simulates neuronal activity more accurately and uses much less power than “embarrassingly parallel” neural nets. If your brain were a conventional neural network, it would boil inside your skull.
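For readers who haven’t met the term, here is what “spiking” means in practice. The Python sketch below is a generic textbook leaky integrate-and-fire neuron, not SpiNNaker’s actual neuron model or software; the function name `simulate_lif` and every parameter value are assumptions chosen for illustration.

```python
# Minimal leaky integrate-and-fire neuron (a common textbook spiking model).
# All parameter values are illustrative assumptions.
def simulate_lif(inputs, leak=0.9, threshold=1.0):
    """Return the time steps at which the neuron spikes.

    inputs: input current per time step
    leak: fraction of membrane potential kept from one step to the next
    threshold: potential at which the neuron fires and resets
    """
    potential = 0.0
    spike_times = []
    for t, current in enumerate(inputs):
        # Old potential decays, new input is accumulated.
        potential = potential * leak + current
        if potential >= threshold:
            spike_times.append(t)
            potential = 0.0  # reset after firing
    return spike_times


# Constant weak input: the neuron stays silent on most steps.
spikes = simulate_lif([0.3] * 20)
print(f"{len(spikes)} spikes in 20 steps at t = {spikes}")
```

The neuron only emits output when its accumulated potential crosses the threshold, so most time steps produce nothing to communicate. That sparsity is the usual intuition for why spiking designs can use less power than networks that compute every output on every step.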
Alongside the efforts described above, my company, Tachyum, is working on a breakthrough processor architecture called Prodigy. The Prodigy architecture offloads heavy-lifting tasks normally done in hardware to a Tachyum-proprietary smart compiler.
It’s only taken you about four minutes to read this article. During that time, people searched the web almost 14 million times, logged into Facebook 3.8 million times, tweeted 1.8 million times, watched more than 17 million YouTube videos, and swiped right or left on 4.4 million Tinder profiles.
When cloud-based data centers offer users AI applications at a reasonable cost, tasks like manually looking at Tinder profiles and then swiping will seem downright archaic. The new data and AI centers will know which profiles to flag for you, and they will know which YouTube videos you will want to watch. Sooner than you think, data centers will be the place to access low-cost AI solutions for everyone.