[Proposal] Add Matrix Multiplication (MatMul) as a Tensor-Level Abstraction in XLS IR #3983

@Waleed99i

Description

Motivation

XLS currently operates primarily at a scalar level, where computations are expressed as fine-grained operations in the IR. While this design is powerful for general computation, it is not well suited to emerging machine learning workloads, which rely heavily on tensor-level operations such as matrix multiplication.

Introducing a matrix multiplication (MatMul) abstraction at the IR level would allow XLS to better support hardware-efficient generation of ML accelerators.


Problem

Currently, matrix multiplication must be manually expressed as a combination of scalar operations, which:

  • Increases IR complexity
  • Limits optimization opportunities
  • Prevents the compiler from making tensor-level scheduling and architectural decisions
  • Leads to suboptimal hardware in terms of area, timing, and power (PPA)

Proposed Solution

I propose introducing a MatMul (or equivalent) tensor-level operation in XLS IR, along with multiple lowering strategies.

1. IR-Level Abstraction

  • Define a MatMul operation in IR
  • Represent matrix multiplication as a first-class operation instead of scalar expansion
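To pin down what the op would compute, here is a reference-semantics sketch in plain Python (the actual XLS IR signature, bit widths, and naming would be settled during design review; `matmul` here is purely illustrative):

```python
def matmul(a, b):
    """Reference semantics: C = A x B for an m x k and a k x n matrix,
    given as nested lists. Naive O(m*k*n) triple loop."""
    m, k = len(a), len(a[0])
    k2, n = len(b), len(b[0])
    assert k == k2, "inner dimensions must match"
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]

# Example: a 2x3 times 3x2 multiply.
a = [[1, 2, 3],
     [4, 5, 6]]
b = [[7, 8],
     [9, 10],
     [11, 12]]
print(matmul(a, b))  # [[58, 64], [139, 154]]
```

Keeping these semantics as a single first-class node is what lets later passes pick a lowering instead of pattern-matching scalar expansions back together.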

2. Lowering Strategies

Implement multiple lowering paths:

(a) Combinational Expansion

  • Fully unrolled scalar multiplications
  • High area, low latency
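For a fixed small shape, combinational expansion simply flattens every output element into independent scalar multiplies and adds, all evaluated in one combinational pass. A hypothetical 2x2 sketch (names like `c00` are illustrative only):

```python
def matmul_2x2_unrolled(a00, a01, a10, a11, b00, b01, b10, b11):
    """Fully unrolled 2x2 matrix multiply: 8 multiplies and 4 adds,
    with no loops or shared hardware -- maximal area, minimal latency."""
    c00 = a00 * b00 + a01 * b10
    c01 = a00 * b01 + a01 * b11
    c10 = a10 * b00 + a11 * b10
    c11 = a10 * b01 + a11 * b11
    return c00, c01, c10, c11

print(matmul_2x2_unrolled(1, 2, 3, 4, 5, 6, 7, 8))  # (19, 22, 43, 50)
```

This is essentially what manual scalar expansion produces today; exposing it as one of several lowerings makes the area/latency trade-off an explicit compiler decision.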

(b) Pipelined MAC Tree

  • Reduces critical path
  • Improves timing performance
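The timing win comes from reducing each k-term dot product with a balanced adder tree rather than a linear accumulate chain: depth drops from k-1 adds to ceil(log2(k)), and each tree level can become a pipeline stage. A small sketch of the reduction shape (assumed structure, not XLS code):

```python
def tree_reduce(terms):
    """Sum `terms` with a balanced binary tree; returns (sum, depth).
    Depth is the number of adder levels on the critical path."""
    depth = 0
    while len(terms) > 1:
        # Pair neighbors; an odd leftover passes through unchanged.
        terms = [terms[i] + terms[i + 1] if i + 1 < len(terms) else terms[i]
                 for i in range(0, len(terms), 2)]
        depth += 1
    return terms[0], depth

# Partial products of a 5-term dot product.
products = [1 * 2, 3 * 4, 5 * 6, 7 * 8, 9 * 10]
print(tree_reduce(products))  # (190, 3): 3 adder levels vs. 4 for a chain
```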

(c) Systolic Array Mapping (if possible)

  • Hardware-efficient for ML workloads
  • Enables scalable parallelism
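As a behavioral model of one common variant, here is a cycle-by-cycle sketch of an output-stationary systolic array: PE (i, j) accumulates c[i][j], row i of A enters from the left delayed by i cycles, and column j of B enters from the top delayed by j cycles. This is a simulation of the dataflow only, under assumed skewing, not a hardware mapping:

```python
def systolic_matmul(a, b):
    """Simulate an output-stationary m x n systolic array multiplying
    an m x k matrix by a k x n matrix. At cycle t, PE (i, j) sees
    operand index p = t - i - j and performs one MAC when p is valid."""
    m, k = len(a), len(a[0])
    n = len(b[0])
    c = [[0] * n for _ in range(m)]
    for t in range(m + n + k - 2):      # total latency in cycles
        for i in range(m):
            for j in range(n):
                p = t - i - j           # skewed input wavefront
                if 0 <= p < k:
                    c[i][j] += a[i][p] * b[p][j]
    return c

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(systolic_matmul(a, b))  # [[19, 22], [43, 50]] after m+n+k-2 = 4 cycles
```

The m + n + k - 2 cycle count illustrates why this lowering scales: latency grows linearly with the matrix dimensions while the MAC work is fully parallel across the array.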

3. Integration with Scheduling

  • Allow the XLS scheduler to exploit MatMul-level knowledge
  • Improve pipeline balancing and resource sharing

4. OpenROAD Integration

After RTL generation, use the OpenROAD flow to evaluate:

  • Area
  • Timing
  • Power
  • Routability

This enables a feedback loop:

Compiler → RTL → Physical Design → PPA Metrics → Compiler Optimization


Expected Outcomes

  • Introduction of tensor-level abstraction in XLS
  • Multiple hardware-efficient lowering strategies
  • Improved PPA compared to scalar-expanded implementations
  • A reproducible flow combining compiler + physical design tools

Why this matters

This work bridges the gap between:

  • Compiler design
  • Hardware generation
  • Physical design evaluation

It aligns with modern ML accelerator design trends and enables XLS to generate optimized hardware for real-world workloads.


Status

I am currently exploring the XLS codebase and have successfully set up the environment and started contributing via documentation improvements.

I am interested in implementing this proposal incrementally through further contributions.
