In the rapidly evolving landscape of artificial intelligence, hardware advancements are pivotal in driving innovation and accessibility. Enter d-Matrix Inc., a Santa Clara-based hardware startup backed by Microsoft, which has just unveiled its first AI processor, Corsair. This groundbreaking processor promises to revolutionize AI inference by eliminating the need for traditional GPUs and expensive high-bandwidth memory (HBM). With significant performance and cost advantages, Corsair is poised to make generative AI more accessible and efficient.
d-Matrix Inc. has positioned Corsair as a purpose-built solution for demanding AI inference tasks, particularly in the realm of generative AI models. Unlike conventional AI processors that rely heavily on GPUs and HBM, Corsair leverages advanced digital in-memory computation (DIMC) and versatile datatype processing to deliver exceptional performance at a lower cost.
Corsair achieves an impressive 60,000 tokens per second at just 1 millisecond per token when running Llama3 8B models on a single server. For more complex models like Llama3 70B, Corsair still manages to deliver 30,000 tokens per second at 2 milliseconds per token within a single rack. These performance metrics not only surpass traditional GPU-based solutions but also translate into substantial savings in energy and operational costs.
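One way to read these figures: the per-token latency bounds how responsive a single user's session feels, while the aggregate throughput comes from serving many generation streams at once. A rough back-of-envelope sketch, assuming the quoted per-token latency applies to each individual stream (d-Matrix's materials don't spell out the batching setup):

```python
def implied_concurrent_streams(total_tokens_per_s, latency_s_per_token):
    """Estimate how many concurrent generation streams are needed
    to reach the aggregate throughput at the quoted per-token latency."""
    per_stream_tokens_per_s = 1.0 / latency_s_per_token
    return total_tokens_per_s / per_stream_tokens_per_s

# Llama3 8B figures quoted for a single server:
print(implied_concurrent_streams(60_000, 0.001))  # 60.0
# Llama3 70B figures quoted for a single rack:
print(implied_concurrent_streams(30_000, 0.002))  # 60.0
```

Under that assumption, both quoted configurations correspond to roughly 60 simultaneous streams, which fits the high-interactivity, multi-user serving scenario d-Matrix emphasizes.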
Corsair’s architecture is built on Nighthawk and Jayhawk II tiles, fabricated on a 6nm manufacturing process. Each Nighthawk tile integrates four neural cores and a RISC-V CPU, designed specifically to support large-model inference. This integration enables efficient DIMC, processing data where it is stored rather than shuttling it to and from external memory.
The DIMC architecture delivers an ultra-high memory bandwidth of 150 TB/s, a critical factor in feeding the vast data requirements of generative AI models. Additionally, Corsair supports block floating point (BFP) datatypes, which share a single exponent across a block of values to approach floating-point accuracy at close to integer cost, letting the processor handle diverse numerical workloads efficiently.
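Corsair's exact BFP format has not been published, but the general technique is straightforward: pick one shared exponent for a block of values, then store only small integer mantissas per element. A minimal sketch of that idea (the block size, 8-bit mantissa width, and scaling rule here are illustrative assumptions, not d-Matrix's format):

```python
import numpy as np

def bfp_quantize(block, mantissa_bits=8):
    """Quantize a block of floats to block floating point:
    one shared exponent plus per-element signed integer mantissas."""
    max_abs = np.max(np.abs(block))
    if max_abs == 0:
        return np.zeros_like(block, dtype=np.int32), 1.0
    # Shared exponent chosen from the largest magnitude in the block.
    shared_exp = int(np.floor(np.log2(max_abs)))
    # Scale so the largest magnitude maps near the top of the mantissa range.
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1) + 1)
    mantissas = np.clip(np.round(block / scale),
                        -(2 ** (mantissa_bits - 1)),
                        2 ** (mantissa_bits - 1) - 1).astype(np.int32)
    return mantissas, scale

def bfp_dequantize(mantissas, scale):
    """Reconstruct approximate floats from mantissas and the shared scale."""
    return mantissas.astype(np.float64) * scale

block = np.array([0.5, -1.25, 0.03, 2.0])
m, s = bfp_quantize(block)
approx = bfp_dequantize(m, s)
```

The appeal for inference hardware is that the multiply-accumulate work happens on the small integer mantissas, with the shared exponent applied once per block, which is far cheaper in silicon than full floating-point arithmetic.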
Corsair adopts a chiplet packaging approach, tightly integrating memory and computation to maximize efficiency. The design conforms to the industry-standard full-height, full-length PCIe Gen 5 card form factor, ensuring compatibility with existing server infrastructure. Moreover, Corsair cards can be paired with DMX Bridge cards to scale performance further, making the platform a versatile solution for a wide range of AI applications.
Each Corsair card delivers 2,400 TFLOPs of peak 8-bit compute, coupled with 2GB of integrated performance memory and up to 256GB of off-chip memory capacity. This configuration lets Corsair handle intensive AI inference workloads without compromising speed or efficiency.
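To put the off-chip capacity in context, a rough sizing sketch helps, counting weights only and ignoring KV cache, activations, and other runtime overhead (an illustrative simplification, and treating GB as GiB):

```python
def params_that_fit(capacity_gb, bits_per_param):
    """How many model parameters fit in a given memory capacity
    at a given quantization width (weights only)."""
    total_bytes = capacity_gb * 1024**3
    return total_bytes * 8 / bits_per_param

# At 8-bit weights, 256GB of off-chip capacity holds roughly 275B
# parameters, comfortably above the 70B of the larger Llama3 model
# cited earlier.
billions = params_that_fit(256, 8) / 1e9
print(f"{billions:.0f}B parameters")
```

Real deployments need substantial headroom beyond the weights themselves, but the sketch shows why a 256GB ceiling pairs naturally with the 8-bit compute path for today's largest open models.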
A key aspect of Corsair’s success is its collaboration with industry giants. Micron Technology, best known as a major memory supplier to Nvidia, is working closely with d-Matrix to enhance Corsair’s capabilities. The partnership leverages Micron’s expertise in memory technology, further bolstering Corsair’s performance and reliability.
Additionally, Corsair’s development was a strategic response to the surging demand for generative AI. By reconfiguring its architecture, d-Matrix ensured that Corsair is optimized for transformer models and emerging applications such as agentic AI and interactive video generation. This foresight has positioned Corsair as a versatile and future-proof solution in the AI hardware market.
Corsair is currently available to early-access customers, with broader availability slated for the second quarter of 2025. Early adopters can experience firsthand the performance and cost benefits that Corsair offers, paving the way for widespread adoption in the AI community.
The phased rollout allows d-Matrix to refine and optimize Corsair based on real-world usage and feedback, ensuring that the processor meets the diverse needs of AI developers and enterprises. As generative AI continues to grow, Corsair is well-positioned to support the increasing computational demands.
Corsair’s architecture is particularly well-suited to transformers, the neural-network architecture underlying most modern generative AI models, including the large language models used for natural language processing. The processor’s ability to handle agentic AI, which involves autonomous decision-making, makes it a valuable tool for building intelligent systems that require real-time data processing and response.
Furthermore, Corsair’s support for interactive video generation opens new avenues for content creation, gaming, and virtual reality applications. By enabling high-speed inference without the need for expensive GPU infrastructure, Corsair democratizes access to advanced AI capabilities, allowing more creators and developers to innovate.
Sid Sheth, co-founder and CEO of d-Matrix, emphasized the company’s vision: “We saw transformers and generative AI coming, and founded d-Matrix to address inference challenges around the largest computing opportunity of our time.” He further highlighted that Corsair’s compute platform brings blazing-fast token generation for high-interactivity applications with multiple users, making generative AI commercially viable.
d-Matrix’s Corsair represents a significant leap forward in AI hardware, offering a GPU-free alternative that delivers exceptional performance and cost efficiency. Backed by Microsoft and supported by strategic partnerships with companies like Micron Technology, Corsair is set to transform the AI inference landscape.
As generative AI continues to expand its applications across various industries, Corsair provides the necessary infrastructure to support this growth. With its impressive performance metrics, innovative architecture, and strategic collaborations, Corsair is not just another AI processor—it is a catalyst for the next wave of AI innovation.
Early-access customers are already experiencing the benefits of Corsair, and as broader availability approaches in 2025, the AI community eagerly anticipates the widespread impact of this groundbreaking technology. Corsair’s introduction marks a new era in AI hardware, where performance, efficiency, and accessibility converge to unlock the full potential of artificial intelligence.