Intro to MLIR

Introduction


MLIR is a sub-project within the LLVM ecosystem. Traditionally, LLVM provides the backend infrastructure for compiling programs, offering APIs to analyze and optimize the Intermediate Representation (IR) as needed. However, LLVM IR relies on a fixed, low-level instruction set. For language creators, this has historically meant building a frontend to parse an AST and lowering it directly to LLVM IR to leverage existing optimization passes.

The challenge arises when developers try to implement domain-specific features. Too often, the compiler community ends up reinventing the wheel, re-implementing similar optimizations and structures repeatedly. MLIR was created to solve this problem, specifically designing tools to increase reusability and extensibility across compiler infrastructure.

I recently started my PhD at Virginia Tech and am currently taking a course on LLVM. To be honest, I enrolled because I have almost zero self-discipline to learn it on my own. My true motive isn’t just to learn LLVM, but to master MLIR. However, I realized I needed a solid understanding of the underlying LLVM infrastructure to effectively build projects in MLIR. I’m taking the Compiler Optimization course by Prof. Binoy Ravindran in Spring 2026. Since the course started, I’ve been trying to learn MLIR in parallel. The problem? Every tutorial out there focuses heavily on architecture design and high-level decisions, rather than showing me how to actually build something with MLIR.

I come from the “YouTube generation” of programmers. I learn best when someone shows me the absolute minimal code to get something running, and then we figure out the rest later. My philosophy is: First build, then ask questions. I’m writing this blog series to document my journey starting with MLIR with minimal prior knowledge. I hope that by recording my “build-first” approach, others who are struggling with the steep learning curve might gain something from it, too.

Installing LLVM & MLIR (The Hard Way is the Easy Way)

First things first: we need to install LLVM and MLIR. While you can sometimes find binaries, the only “real” way to work with MLIR—which changes almost daily—is to build from source.

  1. Install System Dependencies. We need a few packages before we even touch LLVM code.

    Fedora:

       sudo dnf install git cmake ninja-build gcc-c++ python3-devel libomp-devel
    

    Debian/Ubuntu:

       sudo apt-get update
       sudo apt-get install git cmake ninja-build build-essential python3-dev libomp-dev
    
  2. Clone the Source. I use --depth 1 to avoid downloading the entire commit history, which saves a massive amount of time and disk space.
       git clone --depth 1 https://github.com/llvm/llvm-project.git
    
  3. Configure and Build. Create a directory where we will eventually install the binaries, and then set up the build.

       mkdir ~/llvm-dev
    
       cd llvm-project
       mkdir build && cd build
    
       cmake -G Ninja -S ../llvm -B . \
           -DLLVM_ENABLE_PROJECTS="mlir;clang;openmp;lld" \
           -DLLVM_TARGETS_TO_BUILD="Native;NVPTX;AMDGPU" \
           -DCMAKE_BUILD_TYPE=Release \
           -DLLVM_ENABLE_ASSERTIONS=ON \
           -DCMAKE_INSTALL_PREFIX=$HOME/llvm-dev \
           -DLLVM_PARALLEL_LINK_JOBS=1
    
  4. Build and Install. Time to compile now! Adjust -j16 based on your CPU cores. I have 20 cores, but I strictly use 16 to leave some breathing room for the OS and my ever-running YouTube tabs.

       # This will take 30-60 minutes depending on your machine.
       ninja -j16
          
       # Install the binaries to ~/llvm-dev directory.
       ninja install
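If you are not sure how many cores your machine has, here is a quick sketch for picking a job count. The "leave a few cores free" margin is just my habit, not a rule:

```shell
# Pick a parallel job count: use all cores minus a small margin so the
# desktop stays responsive during the long, link-heavy build.
CORES=$(nproc)                              # total logical cores
JOBS=$(( CORES > 4 ? CORES - 4 : CORES ))   # leave ~4 free on big machines
echo "ninja -j$JOBS"
```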
    
  5. Final Step (Environment Management).

    I have a stable LLVM installed globally for my assignments, so instead of permanently polluting my global PATH (which can break other builds), I prefer to load the dev environment only when I actually need it. This keeps my system clean and makes it easier to switch between different LLVM versions later. Add the following alias to your .bashrc:

    1. Open the shell configuration file (.bashrc by default in most Linux distros).

       nano ~/.bashrc
      
    2. Scroll to the very end of the file and add this line:
       alias load-llvm-dev='export PATH="$HOME/llvm-dev/bin:$PATH" && echo "Dev LLVM Loaded"'
      
    3. Save and exit (Ctrl+O, Enter, then Ctrl+X).
    4. Reload the configuration so the changes take effect immediately.
       source ~/.bashrc
      

    Now, whenever you open a fresh terminal to work on this project, just type:

       load-llvm-dev
    

    And you are ready to go for that session.
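For reference, all the alias does is prepend the dev install's bin directory to PATH so its tools shadow any system-wide LLVM. A minimal sketch of what happens (the ~/llvm-dev path matches the CMAKE_INSTALL_PREFIX from step 3):

```shell
# What `load-llvm-dev` does under the hood: prepend the dev install's
# bin directory to PATH so its tools win the shell's lookup.
DEV_BIN="$HOME/llvm-dev/bin"   # matches CMAKE_INSTALL_PREFIX from step 3
export PATH="$DEV_BIN:$PATH"

# The shell takes the first match on PATH, so the dev tools shadow
# any system-wide LLVM for the rest of this session.
echo "$PATH" | cut -d: -f1     # the dev bin directory comes first
```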

Verify the Installation

Building LLVM takes a long time, and a lot can go wrong. Before we move on, let’s confirm that our shell can actually find the tools and that they run.

First, load your environment:

load-llvm-dev

Now check if mlir-opt is available:

mlir-opt --version

If you see something like LLVM version ... Optimized build..., you are in business.

Just seeing the version isn’t enough. As with any language or framework, let’s write our first “Hello World” in MLIR.

First create a file named hello.mlir with the absolute bare minimum valid MLIR code:

module {
  func.func @main() {
    return
  }
}

Now run it through the optimizer:

mlir-opt hello.mlir

If it prints the exact same code block to your terminal, congratulations! You have successfully built LLVM and parsed your first piece of MLIR. You are now officially just one step behind me.
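If you want to see mlir-opt actually do something beyond round-tripping, here is a slightly bigger sketch using the arith dialect (the syntax tracks upstream MLIR and can drift between versions):

```mlir
// fold.mlir — two constants and an add for the canonicalizer to fold.
module {
  func.func @fold() -> i32 {
    %a = arith.constant 2 : i32
    %b = arith.constant 3 : i32
    %sum = arith.addi %a, %b : i32   // constant-foldable
    return %sum : i32
  }
}
```

Running mlir-opt --canonicalize fold.mlir should fold the addition away into a single arith.constant, which is a quick sanity check that the pass machinery is wired up.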

I plan to use the rest of this post to explain the architectural design of MLIR—mostly for the sake of completeness, as every other resource seems to do. But I believe in learning by doing. So, if you prefer, jump ahead to the next post to start coding. You can treat this section as a reference to come back to once you’ve gotten your hands dirty.