Optimal and Efficient AI Inference: A Better Approach to Hyperscale Computing

Apr 20, 2022

Why We Invested In d-Matrix

Today, we’re excited to announce our lead investment in the $44 million Series A funding round of d-Matrix, alongside a group of world-class investors that includes M12 (Microsoft Venture Fund) and SK Hynix. We’re joined in this round by existing investors Nautilus Venture Partners, Marvell Technology, and Entrada Ventures.

d-Matrix offers a one-of-a-kind AI compute platform that delivers best-in-class economics for AI inference acceleration in the datacenter. The platform is based on the digital in-memory compute (IMC) technique and is focused on attacking the physics of memory-compute integration, enabling clients to gain more than 10X compute efficiency within the same power envelope.

As has been well documented, demand for compute is growing exponentially, particularly for inference workloads. The decline of Moore’s Law means that traditional architectures are approaching their performance limits, and traditional CPUs will no longer suffice as inference workloads grow in size and complexity. d-Matrix offers a better way to break through this AI barrier, anchored by three differentiating technologies: 1) digital IMC, 2) mapper, numerics, and sparsity, and 3) proprietary chiplets and interconnect.

Learn more at https://www.d-matrix.ai/technology

The Advantage of d-Matrix’s Approach

Data centers have reached the limit of their ability to keep pace with the growing demand for inference workloads. This is largely because CPUs are constrained by the inefficiencies of the prevailing von Neumann architecture, in which instructions and data share a common memory and each processor core works through its instructions serially.

GPUs offer many more cores and allow more operations to be performed in parallel, but the scale of demand is enormous: hyperscale companies (those with a large, global footprint) make hundreds of trillions of inferences (approaching 1 quadrillion) and process billions of language translations daily. This outsized growth manifests itself across a few core areas that d-Matrix will address with its Corsair product: power, AI model growth, cost and scale of data center inference and training, real-time latency, and inference types.

While the von Neumann architecture is currently sufficient for blindly accelerating generic workloads, it fails to provide meaningful acceleration for the data- and compute-rich inference workloads that d-Matrix will address. d-Matrix has three core differentiating technologies that support its AI-first architecture:

Digital IMC: This notable advancement delivers very fast performance (thousands of times faster), scales to ever-increasing quantities of data, and simplifies access to a growing number of data sources. By storing data in RAM and processing it in parallel, it supplies real-time insights that enable businesses to deliver immediate responses.
Modularity: d-Matrix’s chiplets can be arranged in groups of 1, 2, 4, or 8 on an organic substrate. Arranging 8 chiplets on a cost-efficient organic substrate is a leap forward from current capabilities and has yet to be accomplished by any other chipmaker.
Software: The company has built software to efficiently compile, map, and distribute workloads across multiple chiplets and multiple cores, which will enable the full performance capability of the Corsair hardware product.
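
To make the idea of distributing a workload across chiplets more concrete, here is a minimal Python sketch. It is purely our own illustration and not d-Matrix's compiler or mapper: the function names (matvec, split_rows, sharded_matvec) are hypothetical, and we assume the simplest possible scheme, in which the rows of a weight matrix are split evenly across 1, 2, 4, or 8 chiplets and each chiplet computes its slice of a matrix-vector product independently.

```python
# Illustrative sketch only: NOT d-Matrix's software stack.
# All names below are hypothetical and chosen for this example.

def matvec(rows, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in rows]

def split_rows(rows, n_chiplets):
    """Split rows into n_chiplets contiguous, near-equal shards."""
    base, extra = divmod(len(rows), n_chiplets)
    shards, start = [], 0
    for i in range(n_chiplets):
        size = base + (1 if i < extra else 0)  # first `extra` shards get one more row
        shards.append(rows[start:start + size])
        start += size
    return shards

def sharded_matvec(rows, x, n_chiplets):
    """Compute each shard separately, then concatenate the partial outputs."""
    out = []
    for shard in split_rows(rows, n_chiplets):
        # On real hardware these shards would run in parallel; here we loop.
        out.extend(matvec(shard, x))
    return out

# An 8x4 weight matrix and a length-4 input vector.
weights = [[r * 8 + c for c in range(4)] for r in range(8)]
x = [1, 2, 3, 4]

reference = matvec(weights, x)
for n in (1, 2, 4, 8):  # the chiplet group sizes mentioned above
    assert sharded_matvec(weights, x, n) == reference
```

Because each shard only needs its own rows plus the shared input vector, the result is identical no matter how many chiplets share the work. A real mapper would also have to account for memory capacity, interconnect bandwidth, and sparsity, none of which this sketch attempts to model.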

The Growing Opportunity

All AI applications must first be trained with vast amounts of data before they can be deployed to production for inference. The cloud is an ideal location for training because it provides access to flexible, horizontally scalable compute and storage resources, and the more information an AI application reviews during training, the better its algorithm will become. Further, the cloud can reduce expenses because it allows GPUs and other expensive hardware to be shared across the training of multiple AI models.

d-Matrix has built a novel, defensible technology that can outperform traditional CPUs and GPUs across a wide variety of valuable inference workloads and will be the first AI accelerator to bring IMC into the data center. As Moore’s Law comes to an end, general-purpose CPUs and GPUs are falling behind. d-Matrix’s sophisticated architecture will yield major improvements in performance and energy efficiency across a number of inference workloads, from NLP to image classification.

In-Memory Compute Arrives in the Datacenter

We are honored and excited to be a part of the d-Matrix story as they fundamentally transform the economics of large multi-modal model inference in the datacenter. Learn more at www.dmatrix.ai

Written by Playground Global
