Enabling Machine Intelligence at High Performance & Low Power

Achieving high performance at ultra-low power is no ordinary challenge. Luckily, Myriad 2 is no ordinary processor.

Chip Architecture and Development Platform

Current mobile processing architectures aren’t well suited to computer vision and deep neural network workloads, given the fundamental challenges of balancing performance, heat dissipation and power. In addition, Moore’s law is slowing down, so the power and performance benefits of moving to the next process technology node are shrinking. We believe that the next decade will mark the start of a new era of special-purpose processors focused on decreasing the energy per operation.

The design of Myriad 2 builds on a number of improvements, including an increase in the number of programmable vector processors and additional dedicated hardware accelerators. As a Vision Processing Unit (VPU) System-on-Chip (SoC), Myriad 2 has a software-controlled, multi-core, multi-ported memory subsystem and caches that can be configured to support a wide range of workloads. Myriad 2 provides exceptionally high sustainable on-chip data and instruction bandwidth to support its twelve vector processors, two RISC processors and high-performance video hardware accelerators.

To guarantee sustained high performance while minimizing power, the Movidius proprietary processor, SHAVE (Streaming Hybrid Architecture Vector Engine), contains wide and deep register files coupled with a Variable-Length Long Instruction Word (VLLIW) that controls multiple functional units. These include extensive SIMD capability for high parallelism and throughput at both the functional-unit and processor level. SHAVE is a hybrid stream processor architecture that combines the best features of GPUs, DSPs and RISC, with 8/16/32-bit integer and 16/32-bit floating-point arithmetic as well as unique features such as hardware support for sparse data structures. The architecture is designed to maximize performance per watt while remaining easy to program, especially when designing and porting multicore software applications.
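To give a feel for the trade-off between 16-bit and 32-bit floating point, here is a minimal host-side sketch in Python. It is not SHAVE code and uses no Movidius API; it only relies on the standard IEEE 754 half- and single-precision encodings exposed by Python's struct module, and the helper names to_fp16 and to_fp32 are hypothetical.

```python
import struct

def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_fp32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

value = 0.1

# Storage cost per value: 2 bytes for fp16, 4 bytes for fp32.
fp16_size = struct.calcsize("<e")
fp32_size = struct.calcsize("<f")

# Rounding error introduced by each format for the same input.
err16 = abs(to_fp16(value) - value)
err32 = abs(to_fp32(value) - value)

print(f"fp16: {fp16_size} bytes, stored as {to_fp16(value)!r}, error {err16:.3e}")
print(f"fp32: {fp32_size} bytes, stored as {to_fp32(value)!r}, error {err32:.3e}")
```

Half precision halves the storage per value at the cost of a larger rounding error, which is why being able to choose the precision per workload matters.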

The Myriad Development Kit (MDK) is our suite of software tools, libraries, and frameworks. Its software development framework lets developers incorporate proprietary functions of their own and build arbitrary processing pipelines while taking advantage of the optimized software libraries provided. Mixing our off-the-shelf library functions with the developer’s own proprietary kernel functions in a single application pipeline is what makes prototyping and developing on the Myriad 2 family of VPUs efficient. Updating a kernel function and recompiling is much more straightforward when you’re not managing the data flow manually: our programming methodology incorporates auto-scheduling of functions, handled in the Myriad 2 VPU itself, so that debugging and prototyping become easier.
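The idea of mixing library functions with custom kernels and leaving the ordering to a scheduler can be sketched in a few lines of Python. This is only an illustration of the concept, not the MDK API: the Pipeline class, the kernel names and the sample data are all hypothetical, and the scheduling here runs on the host using the standard graphlib module rather than on the VPU.

```python
from graphlib import TopologicalSorter

class Pipeline:
    """Toy pipeline: stages declare their inputs and the run order is derived."""

    def __init__(self):
        self._stages = {}  # stage name -> (function, names of input stages)

    def add(self, name, func, inputs):
        self._stages[name] = (func, tuple(inputs))
        return self

    def run(self, source):
        # Build a dependency graph and let the sorter pick a valid order,
        # so the developer never sequences the data flow by hand.
        graph = {name: set(inputs) for name, (_, inputs) in self._stages.items()}
        results = {"source": source}
        for name in TopologicalSorter(graph).static_order():
            if name == "source":
                continue
            func, inputs = self._stages[name]
            results[name] = func(*(results[i] for i in inputs))
        return results

def box_blur_3(row):
    """Stand-in for an optimized library function: 3-tap box blur on one row."""
    padded = [row[0]] + list(row) + [row[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) // 3 for i in range(len(row))]

def threshold(row, level=100):
    """Stand-in for a developer's own kernel: binarize the row."""
    return [255 if v >= level else 0 for v in row]

def count_on(mask):
    """Count the pixels that passed the threshold."""
    return sum(1 for v in mask if v)

# Stages are registered out of order on purpose; the order is derived.
pipeline = (
    Pipeline()
    .add("count", count_on, ["mask"])
    .add("mask", threshold, ["blur"])
    .add("blur", box_blur_3, ["source"])
)
results = pipeline.run([10, 20, 200, 220, 210, 30, 10])
print(results["blur"], results["mask"], results["count"])
```

Replacing threshold with a different kernel changes one line of the pipeline definition; the ordering and the hand-off of intermediate results stay untouched.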

Deep Learning on the Myriad Platform

To deploy Deep Learning at the network edge, close to the sensors where data processing latency is lowest, performance and precision at very low power remain critical. The Myriad platform has a number of key elements suited to Deep Learning, and to convolutional neural network implementations in particular.

  • Performance: Myriad’s SHAVE processor engines achieve the hundreds of GFLOPS required for the fundamental matrix multiplication compute in deep learning networks of various topologies.
  • On-chip RAM: deep networks create large volumes of intermediate data. Keeping all of this on chip enables our customers to vastly reduce the off-chip bandwidth that would otherwise create performance bottlenecks.
  • Flexible precision: native support for mixed precision is what lets Myriad run Deep Learning networks with industry-leading performance at best-in-class power efficiency. Both 16-bit and 32-bit floating-point datatypes, as well as u8 and unorm8 types, are supported. Additionally, existing hardware accelerators are easily repurposed to provide the flexibility needed to achieve high performance for convolution computation.
  • High-performance libraries: the development kit includes dedicated software libraries that go hand in hand with the architecture to support sustained performance on matrix multiplication and multidimensional convolution.
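A rough back-of-the-envelope estimate shows why compute throughput, on-chip memory and precision interact. The Python sketch below counts the floating-point operations and the output size of a single convolution layer. The layer dimensions are hypothetical and chosen only for illustration, the function name conv_layer_cost is an assumption, and the formula assumes a stride-1 convolution whose output has the same height and width as its input.

```python
def conv_layer_cost(height, width, in_ch, out_ch, kernel, bytes_per_value):
    """Estimate FLOPs and output bytes for a stride-1, same-size 2-D convolution."""
    # Each output value needs in_ch * kernel * kernel multiply-accumulates.
    macs = height * width * out_ch * in_ch * kernel * kernel
    flops = 2 * macs  # one multiply plus one add per multiply-accumulate
    output_bytes = height * width * out_ch * bytes_per_value
    return flops, output_bytes

# Hypothetical layer: 56x56 feature map, 64 input and 128 output channels, 3x3 kernel.
flops, fp16_bytes = conv_layer_cost(56, 56, 64, 128, 3, bytes_per_value=2)
_, fp32_bytes = conv_layer_cost(56, 56, 64, 128, 3, bytes_per_value=4)

print(f"compute per frame: {flops / 1e9:.2f} GFLOP")
print(f"intermediate output at fp16: {fp16_bytes / 1024:.0f} KiB")
print(f"intermediate output at fp32: {fp32_bytes / 1024:.0f} KiB")
```

Under these assumptions one layer costs roughly half a GFLOP per frame, and storing its output at 16-bit precision takes half the memory of 32-bit storage. Multiplied across many layers and a useful frame rate, this is where hundreds of GFLOPS and on-chip storage of intermediate data come into play.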

Device makers look for platforms that offer the performance, power, and cost needed to run cutting-edge network topologies at the network edge. The Myriad platform delivers the right architectural and software elements to usher in a new era of deep learning in new, groundbreaking devices.
