Nexa SDK

Run, build & ship local AI in minutes

sdk.nexa.ai

About Nexa SDK

Nexa SDK is a software development kit for running a wide range of AI models across devices and backends. It executes text, vision, audio, speech, and image-generation models on neural processing units (NPUs), graphics processing units (GPUs), and central processing units (CPUs), and supports hardware from major manufacturers including Qualcomm, Intel, AMD, and Apple, along with the Apple MLX and GGUF model formats. By integrating state-of-the-art models such as Gemma3n and PaddleOCR, Nexa SDK enables advanced multimodal, large language, and object detection models to run locally.

Among its notable capabilities, Nexa SDK supports OmniNeural-4B, the first NPU-aware multimodal model with native understanding of text, images, and audio. It optimizes models such as NPULlama3.2-3B for different NPUs, improving throughput in application-specific contexts, particularly on Qualcomm and Intel hardware. The platform showcases a collection of models including the NPULlama-3.2 series, NPUparakeet-v3 for high-throughput multilingual speech recognition, and expanded models from the Qwen3 family offering enhanced general capabilities. It also supports embedding models such as NPUembeddinggemma, built on Google DeepMind's technology, and PaddleOCR for multilingual text detection and recognition. Image generation and object detection are covered by models like Prefect-illustrious-XL-v2.0p and YOLOv12-N.

With these capabilities and per-model optimizations, Nexa SDK offers developers a comprehensive way to run complex ML tasks efficiently across a variety of platforms and hardware configurations.
