Tiny, Fast and Powerful Neural Networks on MCUs

Tiny, Easy, and Powerful

Tiny ONNC is an easy-to-use tool that converts popular PyTorch models into a C function built from a series of CMSIS-NN or Andes LibNN function calls. With the powerful Open Neural Network Collection (ONNC) inside, Tiny ONNC supports a rich set of PyTorch models designed for microcontrollers (MCUs). Compared with TensorFlow Lite for Microcontrollers, the converted program is extremely small: in the best case it uses only half the SRAM footprint, and its code size is less than one tenth.

Best Support for MCU Neural Network Libraries – CMSIS-NN and Andes LibNN

Tiny ONNC bridges the gap between data scientists and embedded software engineers. With Tiny ONNC, data scientists can transform PyTorch models directly into quality C source code, and embedded engineers can focus on adding AI functions to applications instead of handling the internals of neural networks.

Tiny ONNC is based on PyTorch version 1.8.1 and uses the Open Neural Network Collection (ONNC) as its backbone. PyTorch exports a neural network model into an ONNX file. Next, the ONNC calibrator imports the ONNX file and automatically quantizes the floating-point weights and activation data into Qm.n fixed-point format. After calibration, the ONNC compiler generates quality C source code containing a series of neural network library calls. The generated C source conforms to the C99 standard and can be compiled by most C/C++ compilers, including the GNU C/C++ compiler, the Clang C/C++ compiler, the ARM Keil compiler, and the IAR C/C++ compiler.
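The Qm.n idea behind the calibration step can be illustrated with a minimal sketch. This is not Skymizer's actual calibrator; the scale-selection rule below (cover the largest magnitude, spend the remaining bits on the fraction) is a deliberate simplification:

```python
import math

def choose_qformat(values, total_bits=8):
    """Pick a Qm.n format for signed q7 data: m integer bits, n fractional
    bits, plus one sign bit. m is the smallest integer-bit count covering
    the largest magnitude; n takes whatever bits remain."""
    max_abs = max(abs(v) for v in values)
    m = max(0, math.ceil(math.log2(max_abs))) if max_abs >= 1 else 0
    n = total_bits - 1 - m
    return m, n

def quantize(values, n, total_bits=8):
    """Round each float to the nearest fixed-point code and saturate
    to the signed total_bits range."""
    lo, hi = -(1 << (total_bits - 1)), (1 << (total_bits - 1)) - 1
    return [max(lo, min(hi, round(v * (1 << n)))) for v in values]

weights = [0.75, -1.5, 0.1, 1.99]
m, n = choose_qformat(weights)   # Q1.6 for this value range
print(m, n, quantize(weights, n))  # 1 6 [48, -96, 6, 127]
```

In q7 data the scale factor is implicit: every stored integer represents `code / 2**n`, so the compiler only has to record one (m, n) pair per tensor rather than per-element metadata.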

Figure 1 - Tiny ONNC software stack

To make Tiny ONNC a joy to work with, we put our effort into supporting the most popular NN libraries on MCUs. Currently, Tiny ONNC can transform PyTorch models into two types of neural network library calls – CMSIS-NN for ARM Cortex-M and Andes LibNN for Andes RISC-V. Both libraries provide more than 60 popular neural network functions for various data types, supporting fixed-point (fractional q7, q15, q31) and single-precision floating-point (32-bit) formats. The library functions are optimized for the ARM Cortex-M SIMD extensions, the Andes RVP (SIMD/DSP) instruction set, and the Andes RVV (vector) instruction set.
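What such fixed-point kernels compute can be modeled in a few lines. The sketch below mimics q7 arithmetic in Python for illustration only; it is not how CMSIS-NN or Andes LibNN are implemented (those use SIMD intrinsics), but the accumulate-then-shift-then-saturate pattern is the same:

```python
def sat_q7(x):
    """Saturate a value to the signed 8-bit range used by q7 data."""
    return max(-128, min(127, x))

def dot_q7(a, b, out_shift):
    """Fixed-point dot product: accumulate products in a wide register
    (like an int32 accumulator), then shift right to bring the result
    back to q7 scale, saturating on overflow."""
    acc = sum(x * y for x, y in zip(a, b))
    return sat_q7(acc >> out_shift)

# Two q7 vectors in Q0.7 format (floats scaled by 128)
a = [64, -32, 127]   # 0.5, -0.25, ~0.992
b = [64, 64, -64]    # 0.5,  0.5,  -0.5
print(dot_q7(a, b, 7))  # -48, i.e. -48/128 = -0.375
```

The wide accumulator is the key detail: summing q7 products in 8 bits would overflow immediately, so the narrowing happens only once, at the end.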

Extremely Tiny – Smaller Flash and SRAM Consumption

Tiny ONNC leverages the ONNC compiler to allocate memory and shrink code size. Figure 2 shows that ONNC saves more SRAM than TensorFlow Lite for Microcontrollers (TFLM) on every benchmark in Tiny MLPerf v0.1. ONNC uses advanced compiler technology to lay out memory at compilation time: because tensor lifetimes are known, all tensors in a network can be compacted tightly together, which increases memory locality and dramatically reduces peak memory size. It is worth mentioning that ONNC does not enable any tensor-splitting or operator-splitting algorithms in this experiment; in many cases, enabling them lets ONNC save a further 46%~57% of global SRAM on various accelerators.
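The compile-time layout idea can be sketched as a lifetime-based offset planner. This is an illustrative toy, not ONNC's actual algorithm; the tensor names, sizes, and inclusive layer-range lifetime model are all assumptions made for the example:

```python
def plan_memory(tensors):
    """Greedy offset assignment: place each tensor (largest first) at the
    lowest offset that avoids every already-placed tensor whose lifetime
    overlaps. Returns {name: offset} and the resulting peak memory size."""
    placed = []   # (offset, size, start, end)
    offsets = {}
    for name, size, start, end in sorted(tensors, key=lambda t: -t[1]):
        # Tensors whose live ranges [start, end] intersect ours
        conflicts = sorted((o, s) for o, s, st, en in placed
                           if not (end < st or en < start))
        offset = 0
        for o, s in conflicts:   # scan gaps between conflicting tensors
            if offset + size <= o:
                break
            offset = max(offset, o + s)
        placed.append((offset, size, start, end))
        offsets[name] = offset
    peak = max((o + s for o, s, _, _ in placed), default=0)
    return offsets, peak

# (name, bytes, first-use layer, last-use layer) for a toy 3-layer net
tensors = [("in", 3072, 0, 1), ("act1", 4096, 1, 2),
           ("act2", 2048, 2, 3), ("out", 40, 3, 3)]
offsets, peak = plan_memory(tensors)
print(peak)   # 7168 bytes, below the 9256-byte sum of all tensors
```

Because "in" dies after layer 1, "act2" can reuse its space, so peak SRAM is driven by the busiest moment of the schedule rather than by the total tensor footprint.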

Figure 2 - SRAM consumption (data+bss) of Tiny ONNC and TensorFlow Lite for Microcontrollers

Figure 3 - Flash consumption (text+data) of Tiny ONNC and TensorFlow Lite for Microcontrollers

Compared with TFLM, Tiny ONNC produces C source code directly, so only the functions actually called are statically linked into the program. An optimizing linker strips the unused functions from the NN library, leaving no extra burden on code size.

We compiled the generated C code into a runnable program and compared its code size with TFLM. Figure 3 shows Tiny ONNC has up to a 10x advantage over TFLM; on average, it reduces code size to 1/8 of TFLM's.

Supported Environments

Comprehensive Software Development Kits (SDK) and Board Support Packages (BSP)

Compiler experts build not only a compiler but a whole chain of tools. Tiny ONNC includes a series of tools for embedded software engineers:

Software Development Kits

  • Open Neural Network Calibrator (ONNC) for ARM Cortex-M or for Andes RISC-V
  • Open Neural Network Compiler (ONNC) for ARM Cortex-M or for Andes RISC-V

Board Support Packages

  • Tutorial program for neural network models in Tiny MLPerf v0.1


  • PyTorch, the PyTorch logo and any related marks are trademarks of Facebook, Inc.
  • TensorFlow, the TensorFlow logo and any related marks are trademarks of Google Inc.
  • ARM mbed, the ARM mbed OS logo and any related marks are trademarks of Arm Limited.
  • Andes, the Andes logo and any related marks are trademarks of Andes Technology Inc.

Contact for More Information

Email: [email protected] | Tel: +886 2 8797 8337

Request Further Information

Skymizer focuses on compiler and virtual machine technology. We provide our customers with an AI system development environment, including starter kits, reference designs, and turn-key solutions.



HQ: 12F-2, No.408, Ruiguang Rd., Neihu Dist., Taipei City 11492, Taiwan
BR: Center of Innovative Incubator R819, No. 101, Section 2, Kuang-Fu Road, Hsinchu, Taiwan

© 2020 Skymizer Taiwan Inc. | All Rights Reserved