Sunday 6 February 2022

Low Power Edge Computing

Many IoT applications require processing at the edge to convert data into insights in real time. Data processing and AI models are, in essence, methods that perform vector transforms at scale. This vector processing was traditionally done on CPUs, which are not optimized for it. The next step in the evolution was the GPU, originally designed for large-scale pixel (think vector) manipulation. While GPUs are optimized to perform vector manipulation at speed, they have high power requirements and are impractical for heavy processing on edge devices. GPU-based edge processing is power limited, and a compromise has to be made on the quality of the decision to achieve timeliness and lower power consumption. The alternative is to send the data over the network to a cloud location for processing. This approach introduces network latency and is time limited. It works well when the data processing requires high-quality decisions and can compromise on real-time decision making. It would fail for applications like brake assist, autonomous driving, and other assistive applications.
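
To make the "vector transforms at scale" point concrete, here is a minimal sketch (using NumPy, with made-up layer sizes) of a single neural-network layer. The dominant cost is a matrix-vector product, exactly the kind of operation GPUs parallelize well and general-purpose CPUs do not.

    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.standard_normal((64, 128))  # weight matrix: 64 outputs, 128 inputs
    b = rng.standard_normal(64)         # bias vector
    x = rng.standard_normal(128)        # input vector, e.g. sensor features

    # The dominant cost is this matrix-vector product; a full model
    # is just many such vector transforms chained together.
    y = np.maximum(0.0, W @ x + b)      # ReLU(W x + b)
    print(y.shape)                      # (64,)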

There are several concurrent research approaches to limiting the power requirements of edge processing. The first optimizes the traditional von Neumann computer architecture, increasing CPU clock speed while shrinking the transistor gate to lower power requirements. Transistor size has fallen exponentially over the past decades, allowing more transistors to be packed into the same space while consuming less power. However, as transistors shrink they become so small that they experience what is called quantum tunneling: electrons, instead of staying within the intended logic gate, leak from one gate to another, essentially making it impossible to achieve a reliable off state. One way to beat this "leaking electrons" problem is to build a quantum von Neumann computer. In a simplistic explanation, because a register of n qubits can exist in a superposition of 2^n states, a quantum computer could achieve orders of magnitude higher computation at the same clock speed. This eliminates the need to keep pushing against the "leaking electron" problem. It is unclear, however, whether a quantum von Neumann architecture would have low power consumption.
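
A rough back-of-the-envelope illustration of the quantum claim (a toy calculation, not a quantum simulator): an n-bit classical register holds exactly one of 2^n states at any instant, while an n-qubit register is described by 2^n amplitudes at once.

    for n in (8, 16, 32):
        print(f"{n} classical bits -> 1 state out of {2**n:,}")
        print(f"{n} qubits         -> {2**n:,} amplitudes in superposition")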

The other option is to observe how the brain achieves its processing capacity at low cycle speed and low power. Individual neurons fire at 200 Hz at most, and yet the brain is far superior to any artificial neural network. The classical von Neumann architecture marches data over a connecting bus between memory and CPU for the execution of binary instructions, to the beat of a synchronous clock. It is intrinsically power hungry because it requires a higher clock speed to achieve higher throughput. The brain works differently: memory and processing happen within the same neuron structure. This is the essence of neuromorphic computing, which promises energy efficiency, low latency, and high-quality decisions. There are several promising neuromorphic chips on the horizon. BrainChip has fully commercialized its neuromorphic Akida AIoT chip and MetaTF software framework, and Intel is planning to commercialize the Loihi 2 neuromorphic chip and the Lava software framework.
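
The co-location of memory and processing is easiest to see in a leaky integrate-and-fire (LIF) neuron, the basic unit of most spiking neuromorphic chips. The sketch below is a toy model with illustrative parameters, not code for Akida or Loihi 2: each neuron keeps its own state (the membrane potential) and does work only when input arrives, instead of shuttling data over a bus on every clock tick.

    import random

    def lif_step(v, input_current, leak=0.9, threshold=1.0):
        """One update of a leaky integrate-and-fire neuron."""
        v = leak * v + input_current  # integrate input, leak some charge
        if v >= threshold:            # cross threshold: emit a spike, reset
            return 0.0, 1
        return v, 0                   # otherwise stay silent

    random.seed(1)
    v, spikes = 0.0, []
    for t in range(20):
        v, spike = lif_step(v, random.uniform(0.0, 0.4))
        spikes.append(spike)
    print(spikes)  # a sparse spike train, e.g. [0, 0, 1, 0, ...]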

Low power edge computing has been one of my areas of interest since I worked on the topic for my master's thesis. I had proposed to achieve high computational throughput at low cycle speed by stacking several low-clock-speed processors together as a neural network instead of using one power-hungry GPU. That work led me to study brain-inspired vision algorithms, and I drifted away from the core question I had raised in my thesis. It is therefore exciting to see promising new developments in the field of neuromorphic computing. As IoT and edge computing become more mainstream, I hope this will be an exciting space for new research.