+886 2 8797 8337
12F-2, No.408, Ruiguang Rd., Neihu Dist., Taipei City 11492, Taiwan
Center of Innovative Incubator R819, No. 101, Section 2, Kuang-Fu Road, Hsinchu, Taiwan
Enable AI SoCs to be easily adopted in rapidly evolving applications by automatically compiling AI algorithms into the chip’s machine code.
Leverage virtual platforms to conduct Performance-Guided Optimizations (PGO) that improve speed and reduce memory requirements by utilizing all available computing and memory resources in complex heterogeneous multicore systems.
Align software development with the SoC architecture exploration stage, much earlier in the design cycle. Provide architecture-aware calibration to maintain precision even in Int8 mode.
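To make the Int8 point concrete, here is a minimal sketch of the post-training calibration idea: derive a scale from representative activation data, then quantize with it. The max-abs scale rule and all function names are illustrative assumptions, not ONNC's actual algorithm.

```python
import numpy as np

def calibrate_int8_scale(activations: np.ndarray) -> float:
    """Derive a symmetric per-tensor Int8 scale from calibration samples."""
    max_abs = float(np.abs(activations).max())
    return max_abs / 127.0  # map [-max_abs, max_abs] onto [-127, 127]

def quantize_int8(x: np.ndarray, scale: float) -> np.ndarray:
    """Round FP32 values to Int8 codes using the calibrated scale."""
    return np.clip(np.round(x / scale), -127, 127).astype(np.int8)

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate FP32 values for error analysis."""
    return q.astype(np.float32) * scale

# Calibrate on representative activations, then check the round-trip error.
calib = np.random.randn(4096).astype(np.float32)
scale = calibrate_int8_scale(calib)
x = np.random.randn(16).astype(np.float32)
err = float(np.abs(dequantize(quantize_int8(x, scale), scale) - x).max())
print(f"scale={scale:.6f}  max abs error={err:.6f}")
```

Architecture-aware calibration refines this basic recipe with knowledge of the target hardware's numeric behavior, which is where the precision headroom comes from.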
LLMs will become a crucial interface for human-to-machine and machine-to-machine communication.
Inference and fine-tuning are transitioning to the edge and are poised for much greater growth than training.
Key benefits include security, privacy, personalization, response time, and cost-effectiveness.
LPUs offer superior performance, power efficiency, and cost-effectiveness compared to CPU/GPU/NPU solutions.
High memory bandwidth utilization.
Shortest response time with a minimal MAC (multiply-accumulate) requirement; see the back-of-the-envelope sketch below.
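These two points are linked: single-batch LLM decoding streams essentially every weight through memory once per generated token, so throughput is bounded by memory bandwidth rather than by MAC count. A back-of-the-envelope sketch, with illustrative (assumed) figures rather than LPU specifications:

```python
# Rough decode-throughput model for bandwidth-bound LLM inference:
# every generated token reads (roughly) all weights once, so
#   tokens/sec ~= effective_bandwidth / bytes_per_token.
# All figures below are illustrative assumptions, not LPU specifications.
params = 7e9            # 7B-parameter model
bytes_per_param = 1     # Int8 weights
bandwidth = 100e9       # peak DRAM bandwidth, bytes/sec
utilization = 0.9       # fraction of peak bandwidth actually sustained

bytes_per_token = params * bytes_per_param
tokens_per_sec = bandwidth * utilization / bytes_per_token
print(f"~{tokens_per_sec:.1f} tokens/sec")   # ~12.9 tokens/sec
```

This is why high bandwidth utilization, rather than raw compute, sets the response time for edge LLM inference.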
Minimal yet efficient LLM-specific instruction set supports diverse decoder-only transformers, including LLaMA2, LLaMA3, Mistral, Phi-2, Phi-3, and Gemma.
Currently focused on 7B-13B models; larger models require more DRAM capacity.
LLM Frameworks: HuggingFace Transformers, Nvidia Triton Inference Server, OpenAI API, and LangChain API (see the usage sketch below).
Fine-Tuning and RAG Toolkits: HuggingFace PEFT, QLoRA, LlamaIndex, and LangChain.
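Because the stack speaks the OpenAI API, existing clients can target it unchanged. A minimal sketch, assuming a local OpenAI-compatible endpoint; the base URL, API key, and model name are placeholders, not actual product values:

```python
from openai import OpenAI

# Point the standard OpenAI client at a local OpenAI-compatible endpoint.
# The base URL, API key, and model name are placeholder assumptions.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="llama-3-8b",  # hypothetical name of the deployed model
    messages=[{"role": "user", "content": "Why run LLM inference at the edge?"}],
)
print(response.choices[0].message.content)
```

Compatibility at the API layer means applications built on these frameworks can move to edge hardware without code changes.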
Colossal models like GPT-3 have become mainstream in the industry. See how ONNC meets this challenge by coordinating the compiler with the runtime, making heterogeneous multi-card and multi-server deployment possible.
Recommendation systems are extremely accuracy-sensitive since they directly affect enterprises' revenue streams. See how ONNC Calibrator minimizes precision error and helps you and your customers succeed.
It takes at least 18 months to develop an AI chip, while the AI models and applications used in smartphones evolve constantly. See how the ONNC compiler keeps your AI chip from becoming obsolete before tape-out.
Data movement optimization is a key breakthrough for minimizing inference latency. See how the ONNC runtime leverages this technology to deliver the best user experience to Smart TV users.
Heterogeneous multicore SoC designs tend to suffer from software fragmentation. See how ONNC's total solution reduces engineering man-hours and R&D risk.
Equipping MCUs with AI superpowers makes smart homes and smart factories possible. See how ONNC's compiler fits deep learning models into extremely resource-constrained MCUs.
The ONNC compiler is a bundle of C++ libraries and tools that accelerates the development of compilers for deep learning accelerators (DLAs). It targets diverse SoC architectures, from simple single-core systems to heterogeneous systems with multi-level memory hierarchies and bus connections.
Forest Runtime executes compiled neural network models on the hardware platform of your choice. It provides common C++ APIs with C and Python bindings for AI applications performing inference. Forest Runtime is '''retargetable''': it has a modular architecture, and we have ported it to diverse hardware platforms, including ''datacenter'', ''mobile'', and ''TinyML''.
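In spirit, a Python binding for such a runtime could be used along the following lines. Every name here — the module, the load call, the run signature — is a hypothetical illustration, since this document does not show the actual Forest Runtime API:

```python
import numpy as np
# Hypothetical sketch of a runtime Python binding: the module name
# "forest_runtime", load(), and run() are illustrative assumptions,
# not the actual Forest Runtime API.
import forest_runtime as frt

model = frt.load("mobilenet_v2.onnc")              # compiled model artifact
x = np.zeros((1, 3, 224, 224), dtype=np.float32)   # one input batch
outputs = model.run({"input": x})                  # blocking inference call
print(outputs["output"].shape)
```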
ONNC Calibrator leverages hardware architecture information to keep AI System-on-Chips highly accurate through post-training quantization (PTQ). The key indicator for validating a quantization technique is its precision drop: the accuracy lost when moving from the FP32 model to its quantized counterpart.
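Measuring precision drop is straightforward: evaluate both models on the same data and take the accuracy difference. A minimal sketch with stand-in random data in place of real model outputs:

```python
import numpy as np

def top1_accuracy(logits: np.ndarray, labels: np.ndarray) -> float:
    """Fraction of samples whose argmax prediction matches the label."""
    return float((logits.argmax(axis=1) == labels).mean())

# Stand-in data; in practice these are the FP32 and Int8 model outputs
# collected on the same evaluation set.
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=1000)
fp32_logits = rng.normal(size=(1000, 10))
int8_logits = fp32_logits + rng.normal(scale=0.05, size=(1000, 10))  # quantization noise

drop = 100 * (top1_accuracy(fp32_logits, labels) - top1_accuracy(int8_logits, labels))
print(f"precision drop: {drop:.2f} percentage points")
```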
CIRCLE offers a complete suite of services for ASIC turnkey projects, including design, manufacturing, packaging, and testing.
Region: South Korea
Contact: 감태오 (Taeo)
Email: [email protected]
Mobile: +82-10-5412-9275
Striving to design the world’s leading chip technologies and working with end-users to embrace an era of autonomous driving, cloud storage, and the Metaverse.