ONNC x NVDLA Release 1.3.0


  • Version: r1.3.0
  • Release Date: 2019/7/15


ONNC on NVDLA AI Starter Kits provides a software development kit (SDK) and a board support package (BSP) for IC design houses that plan to develop their own deep learning accelerator (DLA) based on NVDLA.

It is a comprehensive set of FPGA-based prototyping tools, including the following AI-related hardware and software components:

  • compiler (ONNC)
  • drivers (KMD/UMD)
  • Linux kernel
  • virtual platform (GreenSocs)
  • FPGA netlist
  • NVDLA Verilog RTL code

AI Starter Kits optimize your design process and ensure that your software runs on, and exploits the full power of, the underlying AI chips. Skymizer aims to provide a solution tailored to your needs, helping you build a strong hardware/software co-design team and save time ahead of AI chip fabrication.

Highlights


nv_large configuration support

The nv_large hardware configuration is the highest-performance spec in the NVDLA family. It contains 2048 MACs, a large 512 KB CONV buffer, and broad bus bandwidth (256-bit AXI width), and it targets emerging intelligent devices such as Advanced Driver Assistance Systems (ADAS) and smart factories. To support the nv_large configuration, we created a new compiler, onnc.nv_large, and upgraded the C-model of the GreenSocs virtual platform.

New Features

  • onnc-create provides nv_large configuration
  • new compiler - onnc.nv_large
  • kernel mode driver provides probe and initialization functions for nv_large
  • new GreenSocs virtual platform - vp.nv_large

23 New Optimization Options in ONNC - Enabling the Power of Compiler Optimization

ONNC has 23 new optimization options for you to exploit performance from the underlying microarchitecture. Most optimizations are intuitive and effective; you can find their descriptions here. ONNC's help manual also provides a brief description of all optimization flags: pass the --help flag with a verbose level of three or higher.

onnc.nv_large --help -verbose=3

Mathematically equivalent optimizations

All optimizations keep the neural network mathematically equivalent. Some optimizations split a layer into multiple layers because of hardware buffer-size limitations; others fuse multiple layers into one for higher MAC utilization. Although the topology of the network may change, its mathematical semantics do not.
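As a minimal illustration of this equivalence (a NumPy sketch, not ONNC code; the conv1x1 helper and the shapes are assumptions made for the example), splitting a 1x1 convolution into two halves along its output channels and concatenating the partial results reproduces the original layer exactly:

```python
import numpy as np

def conv1x1(x, w):
    # x: (C_in, H, W), w: (C_out, C_in) -- a 1x1 convolution is a
    # per-pixel matrix multiply over the channel dimension.
    return np.einsum('oc,chw->ohw', w, x)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))   # 8 input channels, 4x4 feature map
w = rng.standard_normal((16, 8))     # 16 output channels

# Whole layer in one pass.
full = conv1x1(x, w)

# The same layer split into two halves along the output channels
# (as a compiler might do when the result exceeds a hardware buffer),
# then concatenated back together.
split = np.concatenate([conv1x1(x, w[:8]), conv1x1(x, w[8:])], axis=0)

assert np.allclose(full, split)
```

The topology changes (one layer becomes two plus a concatenation), but the computed tensor is identical, which is the property every ONNC optimization preserves.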

Network enablers

A group of optimizations in a compiler for a deep learning accelerator (DLA) enables layers that are not directly supported by the microarchitecture. Take the Concat layer for example: the NVDLA microarchitecture has no Concat processing unit, so ONNC must transform a Concat layer into multiple convolution layers to ensure the network can run on NVDLA. ONNC turns on these network-enabling optimizations by default.
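To see why such a transformation is possible (a NumPy sketch under assumed shapes, not ONNC's actual lowering), a channel-wise Concat can be emulated with 1x1 convolutions whose weights are slices of an identity matrix: each convolution copies its input into a disjoint block of the output channels, and the results are summed:

```python
import numpy as np

def conv1x1(x, w):
    # x: (C_in, H, W), w: (C_out, C_in) -- 1x1 convolution as a
    # per-pixel channel-mixing matrix multiply.
    return np.einsum('oc,chw->ohw', w, x)

rng = np.random.default_rng(1)
a = rng.standard_normal((3, 4, 4))  # 3 channels
b = rng.standard_normal((5, 4, 4))  # 5 channels

# Direct channel-wise concatenation (the Concat layer).
concat = np.concatenate([a, b], axis=0)

# The same result from two 1x1 convolutions: eye[:, :3] routes a's
# channels into output channels 0..2, eye[:, 3:] routes b's channels
# into output channels 3..7, and the zero-filled remainders sum away.
eye = np.eye(8)
emulated = conv1x1(a, eye[:, :3]) + conv1x1(b, eye[:, 3:])

assert np.allclose(concat, emulated)
```

ONNC's actual rewrite may differ in detail, but the principle is the same: an unsupported layer is replaced by convolutions the hardware can execute, without changing the network's output.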

New optimization options are listed here.

Tutorial Program for AI Starter Kit

Version r1.3.0 contains a tutorial program to help users better understand the user mode driver (UMD). The program demonstrates how to use the UMD to write a handwritten digit recognizer for the MNIST dataset.

Resolved Issues

  • Fix ComputeGraph::topologicalSort() after parts of an operator are removed
  • Fix address-offset calculation error in image mode
  • Fix the order of Conv weight and bias
  • Fix AveragePool and MaxPool hardware implementation
  • Fix Conv-Relu fusion (if there was no bias, Conv did not execute Relu)
  • Fix crash when compiling a single-layer model
  • Fix crash caused by ONNX library exception output
  • Fix invalid loadable output when the model contains only a Softmax layer
  • Fix value-setting error in the LRN implementation
  • Fix read/write offset calculation error for group Conv
  • Fix shared-memory offset calculation error for Concat
  • Fix crash when a model output tensor is also an input to another operator