
Zynq UltraScale+ board supports new Xilinx AI Platform

Oct 3, 2019 — by Eric Brown — 713 views

iWave unveiled a dev kit for its Linux-driven, Zynq UltraScale+ based iW-Rainbow G30M module with support for a new Xilinx AI Platform. Xilinx is baking related AI technology into its soon-to-ship, Linux-powered 7nm Versal processors.

iWave Systems has launched an “iW-Rainbow G30D Zynq Ultrascale+ MPSoC Development Kit” for its iW-Rainbow G30M compute module, which runs Linux on the Arm Cortex-A53/FPGA Xilinx Zynq UltraScale+ MPSoC. In announcing the kit, iWave focused mostly on the platform’s ability to test the new Xilinx AI Platform, which it calls Xilinx/Deephi core. The Xilinx AI Platform, which spans from the edge to the datacenter, is based largely on its acquisition of edge AI firm DeePhi.


iW-Rainbow G30M

Further below, we’ll take a closer look at the Xilinx AI Platform and how Xilinx is using some of this technology within its new 7nm, dual Cortex-A72/FPGA Versal ACAP chips. Xilinx showcased the Versal earlier this week at the Xilinx Developer Forum.

Also this week Xilinx announced a Vitis development platform for its FPGAs that is beyond the scope of this article. Based on open source libraries, Vitis is billed as an easier alternative to its Vivado Design Suite. The platform includes a Vitis AI component that appears to target the Versal.

 
iW-Rainbow G30D

The new Zynq Ultrascale+ MPSoC Development Kit with iW-Rainbow G30D carrier board extends the Linux 4.14-driven iW-Rainbow G30M module. The G30M module runs on a quad Cortex-A53 Zynq UltraScale+ MPSoC with 192K to 504K FPGA logic cells. The module ships with 4GB DDR4, 1GB for the FPGA, and 8GB of expandable eMMC. There’s also -40 to 85°C support among other features detailed in our Sep. 2018 iW-Rainbow G30M report.



iW-Rainbow G30D and block diagram

The 140 x 130mm iW-Rainbow G30D carrier board features 2x GbE ports and an SFP+ cage. You also get single DisplayPort, USB 2.0 host, USB Type-C, and debug console ports. Internal I/O includes SD, CAN, JTAG, and 20-pin I/O headers.

Dual FMC HPC connectors provide FPGA-related I/O including LVDS, 14 high-speed transceivers, dual 12-pin PMOD, SATA, PCIe x4, and more. The board has an RTC with battery holder plus a 12V input.

 
Xilinx AI Platform

The iW-Rainbow G30D announcement links to a web page for the Xilinx AI Platform, which it calls “Xilinx/Deephi Core.” The Xilinx AI Platform was developed in large part based on Xilinx’s acquisition of DeePhi Technology Co. in July 2018. DeePhi was a Beijing-based start-up with expertise in machine learning, deep compression, pruning, and system-level optimization for neural networks.



Xilinx AI Platform (left) and Xilinx Edge AI Platform architecture diagrams

The Deephi core algorithms can execute critical real-time tasks directly on the Zynq UltraScale+ FPGA, says iWave. The iW-Rainbow G30M supports “a huge portfolio of Deephi cores” for edge/AI applications, and the new dev kit now enables easier prototyping with the technology, says the company.


Baidu EdgeBoard

The Zynq UltraScale+ has previously been featured as an AI processor on Baidu’s EdgeBoard, which was announced in January. However, the recently released EdgeBoard uses Baidu’s own Baidu Brain AI algorithms.

The Deephi “sparse neural network” Core technology features a Convolutional Neural Network (CNN) pruning technology and deep compression algorithm to reduce the size of AI algorithms for edge applications. The “ultra-low latency real-time inference” Deephi algorithms support AI/ML acceleration in face recognition and image/pose detection for smart surveillance, says iWave. Other applications include intuitive ADAS for automotive assistance, industrial automation predictive maintenance, and smart healthcare for real-time monitoring and diagnosis.



Early Xilinx slide deck showing planned integration of Deephi technology

As explained in this EE Journal analysis of the acquisition, DeePhi optimized its algorithms for the Zynq 7000 before moving on to the Zynq UltraScale+ MPSoC. As suggested in the chart above from a 2018 Xilinx slide deck (PDF), the Deephi technology, including pruning, quantizer, compiler, runtime, models, and FPGA IP, went on to form the bulk of what would later be marketed as the Xilinx AI Platform. It forms almost all the edge/embedded side, which is called the Xilinx Edge AI Platform.


Xilinx Edge AI Platform DPU architecture (left) and available Xilinx AI Platform models

The FPGA IP component in the Xilinx Edge AI Platform is called the Deep-learning Processing Unit (DPU). The hardware block is optimized to work with Xilinx FPGAs to accelerate AI algorithms with low latency.

The Xilinx Edge AI Platform supports AI frameworks including TensorFlow, Caffe, and Darknet, among others. Xilinx lists 18 available models for object, face, pedestrian, ADAS-related recognition, classification, detection, estimation, and localization (see chart above).

The Xilinx Edge AI Platform features a Linux-ready DNNDK (Deep Neural Network Development Kit) for deploying AI inference on Xilinx Edge AI platforms with a lightweight C/C++ API. DNNDK’s DEep ComprEssioN Tool (DECENT) “can reduce model complexity by 5x to 50x with minimal accuracy impact,” says Xilinx. There’s also a Deep Neural Network Compiler (DNNC), a Neural Network Runtime (N2Cube), and a profiler.
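
To give a rough sense of where compression figures in that range can come from, here is a minimal Python sketch that combines magnitude pruning with 8-bit quantization on a toy weight vector. It is not DECENT or any DeePhi algorithm: the function names, the 90 percent sparsity setting, and the 16-bit index overhead per surviving weight are all illustrative assumptions.

```python
import random

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    # Indices ordered from smallest to largest magnitude; drop the first k.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

def quantize_int8(weights):
    """Symmetric linear quantization to integers in [-127, 127]."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def compression_ratio(num_weights, nonzero,
                      value_bits=8, index_bits=16, dense_bits=32):
    """Dense float32 size divided by a sparse (index, value) encoding size.

    The 16-bit index per surviving weight is an assumed storage format.
    """
    dense = num_weights * dense_bits
    sparse = nonzero * (value_bits + index_bits)
    return dense / sparse if sparse else float("inf")

# Toy "layer": 1,000 Gaussian weights (hypothetical data, not a real model).
random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(1000)]

pruned = magnitude_prune(weights, sparsity=0.9)   # keep the largest 10%
q, scale = quantize_int8(pruned)
nonzero = sum(1 for v in q if v != 0)
ratio = compression_ratio(len(weights), nonzero)
print(f"nonzero weights: {nonzero}, approx. compression: {ratio:.1f}x")
```

In practice a tool of this kind would also retrain the network after pruning to recover accuracy, a step this sketch leaves out entirely.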

The datacenter version of the Xilinx AI Platform lacks the DPU but instead adds Xilinx’s xDNN (Xilinx Deep Neural Network Inference) FPGA architecture on the lowest FPGA IP level. Supported by a related xfDNN compiler and runtime, xDNN maps a range of neural network frameworks onto the high-end VU9P Virtex UltraScale+ FPGA for datacenters.

 
Versal ACAP

Last October, Xilinx announced a major new Versal ACAP (adaptive compute acceleration platform) processor family. The heterogeneous accelerated Versal “is the first platform to combine software programmability with domain-specific hardware acceleration” and built-in adaptability via the ACAP architecture, says Xilinx.


Xilinx Versal

Built with a 7nm FinFET process compared to 16nm for the Zynq UltraScale+, Versal will comprise six separate processor series, two of which will start rolling out before the end of the year. The initial Versal Prime and Versal AI Core models, which are primarily aimed at datacenter and high-end edge-AI devices, respectively, started sampling in June.

The Versal Prime, Premium, and HBM series processors target high-end datacenter and networking applications. The AI Core, AI Edge, and AI RF series target AI-enabled networking and edge devices and add an AI Engine block designed for low-latency AI inference.

The AI Engine appears to be based in part on the Deephi and Xilinx Edge AI Platform technology. The AI Engine features 1.3GHz VLIW/SIMD vector processors deployable in a tile structure. The cores communicate at “terabytes/sec” bandwidth to other engines.

As detailed in this Versal slide deck (PDF), all the Versal processors feature dual 1.7GHz Cortex-A72 cores supported by an embedded Linux runtime and dual 750MHz Cortex-R5 cores supported by FreeRTOS.



Versal block diagram

The programmable logic component is referred to not as an FPGA, but as Versal Adaptable Engines. The logic includes “fine-grained parallel processing, data aggregation, and sensor fusion.” It also offers a programmable memory hierarchy with “high bandwidth, low latency data movement between the engines and I/O,” says Xilinx.

The Adaptable Engines provide 4x higher density per logic block, presumably compared to the UltraScale+. Separate from the programmable logic is a DSP Engines block with up to 1GHz performance designed for accelerating wireless, machine learning, and HPC. As noted, selected models also provide the AI Engine.

Tying all these pieces together is a multi-terabit-per-second network-on-chip (NoC) that memory maps access to all resources for easier programmability. It also enables easy swapping of kernels and connectivity between different kernels.

The NoC works with a “Shell” component that includes a Platform Management Controller that offers security and boot features. It also includes a scalable memory subsystem and host and I/O interfaces. The Versal works with the new Vitis unified software platform and is backward compatible with Zynq UltraScale+.

The first two Versal versions are the currently documented Versal Prime and Versal AI Core. The AI Engine-enabled AI Core is equipped with 256KB of on-chip RAM with ECC and more than 1.9 million system logic cells. There are also more than 1,900 DSP engines optimized for high-precision floating point with low latency.



Xilinx Versal architecture (left) and Versal AI Core VCK190 eval kit

There’s already a Linux-powered Versal AI Core VCK190 eval kit. The AI Core is aimed at very high-end systems such as 5G infrastructure, automotive, and datacenter. We imagine, however, that most LinuxGizmos readers will be more interested in the upcoming — and currently undocumented — embedded AI Edge and AI RF platforms.

 
Further information

iWave’s iW-Rainbow G30D Zynq Ultrascale+ MPSoC Development Kit is available now at an undisclosed price. More information may be found on its product page. More on the Xilinx AI Platform may be found here and more on the Versal processors may be found here.

 
