Intel has begun sampling the Altera Stratix 10, a 14nm SoC that combines four Cortex-A53 cores with a Stratix V-level FPGA while using up to 70 percent less power than Stratix V FPGAs.
Altera first announced the Stratix 10 SX back in 2013, but the SoC was delayed and has only now begun sampling. Beyond the challenge of building the first 14nm FPGA, fabricated with Intel’s 14nm 3D Tri-Gate process, the rollout was likely also set back by Intel’s recent acquisition of the chipmaker. Intel now says the SoC demonstrates “the most significant FPGA innovations in over a decade.”
The Stratix 10 is being targeted primarily at the datacenter. However, the Linux-supported SoC will likely end up in high-end embedded technology, as well. The SoC will target “data-intensive applications ranging from data centers, network infrastructure, cloud computing, and radar and imaging systems,” says Intel.
The Stratix 10 SX combines Stratix V-level FPGA circuitry with a quad-core Cortex-A53 subsystem clockable to 1.5GHz. There’s also 1MB of L2 cache and 32KB I/D caches.
Altera also lists a Stratix 10 GX line that lacks the ARM subsystem, but is otherwise identical. The SX and GX SoCs are available in nine different SKUs with varying counts of logic elements (LEs), adaptive logic modules (ALMs), and other FPGA resources. These range from 484K LEs on the SX 500 or GX 500 to over 4.4 million on the SX or GX 4500.
In addition, Altera mentions a Stratix 10 GT family that appears to be a future offering. The high-end GT supports transceivers with a path to data rates up to 56Gbps. The following Intel claims appear to refer to the ARM-ready SX models.
Intel compares the Stratix 10 with the Stratix V FPGA, but it’s really more comparable to the Linux-ready Cyclone V and higher-end Arria 10 SoC, which similarly combine ARM Cortex and FPGA components. In fact, the Stratix 10 is footprint compatible with the Arria 10, which is recommended for use in early development of Stratix 10-based products.
Compared to Stratix V FPGAs, the Stratix 10 uses up to 70 percent less power while offering equivalent performance with the integration of up to 5.5 million logic elements, says Intel. The SoC delivers twice the core performance and over five times the density “compared to the previous generation,” says the chipmaker. The processor is further claimed to offer up to 10 TFLOPS of single-precision floating point DSP performance.
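To put the power claim in perspective, the short Python sketch below converts "up to 70 percent less power at equivalent performance" into a performance-per-watt ratio. The function name and parameters are illustrative only, and because Intel's figure is an "up to" claim, the result should be read as an upper bound rather than a measured value.

```python
def perf_per_watt_gain(power_reduction, relative_performance=1.0):
    """Return new performance per watt divided by old performance per watt.

    power_reduction: fraction of power saved (0.70 means 70 percent less).
    relative_performance: new performance divided by old performance
                          (1.0 means equivalent performance).
    """
    if not 0 <= power_reduction < 1:
        raise ValueError("power_reduction must be in the range [0, 1)")
    return relative_performance / (1.0 - power_reduction)


# Intel's claim: up to 70 percent less power at equivalent performance.
print(round(perf_per_watt_gain(0.70), 2))  # 3.33
```

In other words, the best case implied by Intel's numbers is roughly a 3.3x improvement in performance per watt over Stratix V.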
Stratix 10 ARM subsystem block diagram (left) and DSP block in standard-precision fixed-point mode
The Stratix 10 is notable for its High-Bandwidth Memory (HBM2) system-in-package (SiP) integration, which enables a high interconnect density between the FPGA and the companion die. Thanks to the HBM2 technology, the Stratix 10 can deliver up to 1TBps of memory bandwidth, according to Intel.
With the SoC’s “HyperFlex” architecture, each routing segment on the device has its own associated “Hyper-Register.” These specialized registers are also implemented at the inputs of all functional blocks such as ALMs, embedded memory (M20K) blocks, and DSP blocks. The Hyper-Registers are “bypassable,” enabling design tools to “select the optimal register location automatically, after place-and-route, to maximize core performance,” says Intel.
The architecture’s “registers everywhere” approach is said to enable performance tuning without requiring additional ALM resources and without adding changes or complexity to the design’s place-and-route. Additionally, having Hyper-Registers built into the interconnect helps to reduce routing congestion.
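The idea of bypassable registers on every routing segment can be illustrated with a toy model. The Python sketch below is not Intel's algorithm and does not reflect how Quartus works internally. It assumes a single already-routed path described by hypothetical per-segment delays, then brute-forces which boundary registers to enable so that the slowest pipeline stage is as short as possible. All names and delay values are made up for illustration.

```python
from itertools import combinations


def stage_delays(delays, enabled):
    """Split a routed path into pipeline stages.

    delays:  per-segment delays in ns, in path order.
    enabled: set of boundary indices; index i means the bypassable
             register after delays[i] is switched on.
    Returns the total delay of each resulting stage.
    """
    stages, current = [], 0.0
    for i, delay in enumerate(delays):
        current += delay
        if i in enabled:
            stages.append(current)
            current = 0.0
    stages.append(current)  # final stage ends at the destination register
    return stages


def best_placement(delays, n_registers):
    """Try every way to enable n_registers boundary registers.

    Returns (worst_stage_delay_ns, enabled_boundaries) for the choice
    that minimizes the slowest stage. Brute force, so only suitable
    for short illustrative paths.
    """
    boundaries = range(len(delays) - 1)
    best = None
    for combo in combinations(boundaries, n_registers):
        worst = max(stage_delays(delays, set(combo)))
        if best is None or worst < best[0]:
            best = (worst, combo)
    return best


# Hypothetical path with four routing segments (delays in ns).
path = [1, 2, 1, 2]
for n in range(3):
    worst, where = best_placement(path, n)
    print(f"{n} register(s) at {where}: slowest stage {worst} ns, "
          f"max clock about {1000 / worst:.0f} MHz")
```

Because the placement is routed before the registers are chosen, enabling a register here changes only the stage boundaries, not the segment delays. That is the property the "registers everywhere" description points to. Note that each enabled register also adds a cycle of latency, which a real design flow would have to balance across parallel paths.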
Stratix 10 integration of HyperFlex and HBM2 SiP memory (left) and diagram showing “registers everywhere” HyperFlex design with Hyper-Registers
The HyperFlex architecture includes programmable clock tree synthesis, enabling enhanced core clocking. In addition, its “Hyper-Aware” design flow adds a Fast Forward Compile tool and a Hyper-Retimer step that supports performance optimization after place-and-route. The design flow also provides enhanced synthesis and place-and-route algorithms.
The Stratix 10 supports up to 144 transceivers with data rates up to 30Gbps, says Intel. The SoC supports over 2.5Tbps bandwidth for serial memory, and over 2.3Tbps for parallel memory interfaces, including support for DDR4 RAM at up to 2,666Mbps.
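For a rough sense of scale, the Python sketch below works through two back-of-the-envelope figures from the numbers quoted above: the raw aggregate line rate if all 144 transceivers ran at 30Gbps in one direction, and the peak transfer rate of a single DDR4 interface at 2,666Mbps per pin. The 64-bit data bus width is an assumption, not a figure from Intel, and neither result accounts for encoding or protocol overhead. These are not the same quantities as the serial and parallel memory bandwidth figures Intel cites.

```python
# Figures quoted in the article.
TRANSCEIVERS = 144
GBPS_PER_TRANSCEIVER = 30
DDR4_MBPS_PER_PIN = 2666

# Assumption, not from the article: a 64-bit DDR4 data bus without ECC bits.
DDR4_DATA_PINS = 64


def aggregate_serial_tbps(lanes, gbps_per_lane):
    """Raw one-direction line rate across all lanes, in Tbps."""
    return lanes * gbps_per_lane / 1000


def ddr4_peak_gbytes_per_s(mbps_per_pin, data_pins):
    """Peak transfer rate of one DDR4 interface, in GB/s."""
    bits_per_byte = 8
    return mbps_per_pin * data_pins / bits_per_byte / 1000


print(aggregate_serial_tbps(TRANSCEIVERS, GBPS_PER_TRANSCEIVER), "Tbps")
print(ddr4_peak_gbytes_per_s(DDR4_MBPS_PER_PIN, DDR4_DATA_PINS), "GB/s")
```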
Security features are led by a Secure Device Manager (SDM). The SDM creates a unified, secure management system for the entire device, and controls configuration, device security, single event upset (SEU) responses, and power management.
Stratix 10 vs. Zynq UltraScale+ MPSoC
The Stratix 10 competes most directly with Xilinx’s 16nm-fabricated Zynq UltraScale+ MPSoC, which similarly combines four Cortex-A53 cores with a high-end FPGA, and also adds dual Cortex-R5 MCUs for real-time control. This quad-core version recently entered production in 16nm and 20nm TSMC-fabricated models, and will be followed next year by a dual-core CG version aimed at the embedded market.
The Stratix 10 is more squarely aimed at the datacenter, reflecting the role of Altera FPGAs in datacenter communications infrastructure equipment, among other applications. Intel sees FPGAs as a way to efficiently process the highly parallelized workloads and huge number of inputs that datacenters will face as they aggregate IoT data and process immense volumes of multimedia data such as video chat.
FPGAs have a unique capability to be reprogrammed in the field to handle new scenarios. They are also more challenging to develop for, but integrating FPGAs with application processors, a trend that began with the Xilinx Zynq-7000, has made them more accessible.
Linux is the default OS in Altera’s SoC Embedded Design Suite (EDS), which incorporates ARM’s Development Studio 5 (DS-5) and Altera’s OpenCL SDK. There’s also the Altera Quartus Prime FPGA design software, which includes a new Spectra-Q engine that is optimized for the HyperFlex architecture and Hyper-Aware design flow, and which promises 8x faster compile times than previous Quartus compilers.
The anomaly of Intel selling ARM-based designs will likely be temporary, as Intel CEO Brian Krzanich has previously indicated he plans to combine Altera’s FPGAs with Intel’s x86 chips in both the IoT market and the datacenter. In the datacenter, the ARM cores would presumably be replaced with Xeons, and in the IoT space, Atoms. These changes will likely take time, however, so for the near term, Intel will be a major league ARM dealer.
The Stratix 10 is sampling now. More information may be found on the Intel/Altera Stratix 10 product page.