Hello World in MLIR

Introduction


MLIR is not a programming language like C++ or Python; it is a compiler infrastructure—a framework of tools rather than a single tool. Because of this, there is no single “Hello World” program. As I’ve been learning, I’ve realized there are actually three different ways to use this framework, and you need to understand all of them.

  1. Write MLIR programs directly using existing dialects.
  2. Create an MLIR pass for either optimization or analysis.
  3. Build our own dialects by extending the language itself, defining new operations and types.

To really understand how MLIR works, I’m going to walk through a minimal “Hello World” for all three scenarios.

Write MLIR Program

The MLIR distribution includes a rich set of in-tree dialects, most of which target high-performance tensor compilers and hardware-specific code generation. However, before we tackle those, we need to understand the basics of the IR itself. We will start by manually writing a program using two core dialects: func (for function abstraction) and arith (for basic arithmetic operations). This will allow us to see how MLIR represents logic and how the infrastructure processes it without getting bogged down in complex types.

Our goal is to get from high-level abstractions down to executable code. To do that, we’ll take a high-level MLIR program (using func and arith), lower it into MLIR’s LLVM dialect, which is essentially a 1:1 mapping of LLVM instructions inside MLIR, and finally translate that into pure LLVM IR (.ll) so the machine can actually run it. Our MLIR program will define a function that takes two integer (i32) arguments, adds them together, and returns the result.

In MLIR, everything is an operation. Operations are the core unit of abstraction and computation, similar in many ways to LLVM instructions. Even defining a function is an operation! The func dialect has an operation called func to define functions. We use the @ sigil for global symbols (like function names) and the % sigil for local values (variables).

func.func @add(%arg0: i32, %arg1: i32) -> i32 {

This tells the compiler that we are defining a function called @add which takes two i32 arguments, %arg0 and %arg1, and returns an i32 value.

Now let’s take a look at the logic inside the block.

func.func @add(%arg0: i32, %arg1: i32) -> i32 {
    %0 = arith.addi %arg0, %arg1 : i32
    func.return %0 : i32
}

Here, arith.addi does the heavy lifting. It takes %arg0 and %arg1, adds them, and binds the result to %0. Since every block must end with an explicit terminator, we then use the func.return operation to send that %0 value back out.

Go ahead and save this code in a file called add.mlir. Next, we are going to use the mlir-opt tool to lower our high-level dialects (func and arith) down to the LLVM dialect. If that sounds confusing, don’t worry; it tripped me up at first, too. You might be wondering: why are we translating to an “LLVM dialect” inside MLIR instead of just spitting out actual LLVM IR? That question actually hits on the entire purpose of MLIR. By keeping LLVM instructions represented as an MLIR dialect, the framework can preserve high-level structure and debug information during the translation process. The ultimate goal for almost any MLIR pipeline is to gradually step down the abstraction ladder until you hit this LLVM dialect. Once you are there, escaping out to standard LLVM IR is trivial.

mlir-opt add.mlir --convert-arith-to-llvm --convert-func-to-llvm > add.llvm.mlir

So, what is happening with this command?

func  ----
          \____ llvm
arith ----/

As mentioned earlier, this process is called lowering: we are taking an MLIR program written using the func and arith dialects and lowering it into a single, lower-level dialect, llvm. You can use mlir-opt --help to explore all the other available mlir-opt flags.

Now, ensure that you have the file add.llvm.mlir, which contains the translated code in the LLVM dialect as follows:

module {
  llvm.func @add(%arg0: i32, %arg1: i32) -> i32 {
    %0 = llvm.add %arg0, %arg1 : i32
    llvm.return %0 : i32
  }
}

At first glance, it looks almost identical. But look closely at the prefixes. Every operation has shifted from a high-level abstraction to a specific LLVM instruction representation, and everything is now explicitly wrapped in a module. In MLIR, operations cannot just float in the void; they must live inside a block, and top-level operations like functions need a container. The module operation acts as that container. Even though we did not explicitly define a module ourselves, a module is implicit in MLIR programs: the parser lets you write top-level functions for convenience, but internally it treats them as if they were inside a module. module defines a symbol table and creates a scope where global names (like @add) are defined and unique. This roughly corresponds to a translation unit in C++ or an object file. Our current program is now written entirely in a single dialect, the LLVM dialect, translated using mlir-opt as above.
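To make this concrete, here is what our original func/arith program looks like with the implicit module written out explicitly; feeding this to mlir-opt is equivalent to the unwrapped version:

```mlir
module {
  func.func @add(%arg0: i32, %arg1: i32) -> i32 {
    %0 = arith.addi %arg0, %arg1 : i32
    func.return %0 : i32
  }
}
```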

We are almost there. We have lowered our high-level logic into the low-level LLVM dialect. But here is the catch: we still cannot execute this. The machine does not know what MLIR is. It doesn’t know about dialects and operations. To run this, we need to leave the MLIR framework entirely and generate actual LLVM IR, the standard text-based format (.ll) that the LLVM backend understands.

The MLIR framework has a tool specifically for this purpose: mlir-translate. While mlir-opt is for transforming code within MLIR (optimization, lowering), mlir-translate is for exporting code out of MLIR. We will use mlir-translate to take our LLVM-dialect module and serialize it into valid LLVM IR using the following command:

mlir-translate --mlir-to-llvmir add.llvm.mlir -o add.ll

If you open add.ll, you will see something familiar to any compiler engineer: standard, raw LLVM IR. We now have add.ll, which contains our @add function in valid LLVM IR. But if you try to turn it into an executable, the link will fail. Why? Because we don’t have a main function. Every C/C++ (and LLVM) program needs an entry point. Our @add function is just a library function right now; it’s waiting to be called, but nobody is calling it. Let’s create a simple driver program to fix this.

driver.ll

; Declare the external 'add' function (it exists in our other file)
declare i32 @add(i32, i32)

define i32 @main() {
  ; Call 'add' with arguments 10 and 32
  %result = call i32 @add(i32 10, i32 32)
  
  ; Return the result as the exit code
  ret i32 %result
}

This is raw LLVM IR. It tells the compiler: “There is a function called @add somewhere else. I want to call it with 10 and 32, and I want to return the answer as my program’s exit code.” Now that we have the main function too, let’s use clang to link them together into a real executable binary and run it.

clang add.ll driver.ll -o add
./add

But wait! Where is the output? If you run it, you won’t see anything. That’s expected! We didn’t tell the program to print anything (which requires library calls like printf). We told it to return the result as the exit code. To see the exit code in your terminal, run this:

./add ; echo $?

If everything worked, you should see: 42

Success! We did it. We started with high-level MLIR (func + arith), lowered it to the LLVM dialect, translated it to LLVM IR, linked it with a driver, and executed it on the bare metal.

You have officially written, compiled, and run your first MLIR program.

Create MLIR Pass

This is where things get real. Writing MLIR code (.mlir) is like writing Python: you are a user. Writing a pass means writing C++: you are now a compiler engineer. In this section, we aren’t just running mlir-opt; we are building our own version of it. If you have used LLVM before, you know the traditional workflow: compile your pass into a shared library (.so) and then load it dynamically into the standard opt tool using a flag like -load-pass-plugin. In MLIR, we rarely do that. Instead, we almost always build our own custom version of mlir-opt.

Why? Because MLIR relies heavily on C++ templates and static registration. Templates don’t play nicely across library boundaries, so trying to load an MLIR pass dynamically can lead to weird ABI issues or missing symbols. The “MLIR way” is therefore to create a new C++ executable that links statically against the core MLIR libraries, the standard dialect libraries, and our custom pass library. The result is a tool that behaves exactly like mlir-opt, but extended to include your pass.

Let’s create a new directory for our “Hello World” MLIR pass, called PrintOpsPass, which simply prints the name of each operation. If you look at the C++ code for a pass, it can be intimidating. There are templates nested inside templates and macros with names that look like they were shouted by a compiler. But once you strip away the noise, it’s actually quite elegant.

Create a file called my-opt.cpp, and start writing the pass by wrapping everything in an anonymous namespace (namespace {...}). In C++, this is a trick to make sure our class is only visible inside this specific file. We do this because eventually we might link dozens of passes together into one tool, and we don’t want our PrintOpsPass colliding with someone else’s PrintOpsPass.

#include "mlir/Pass/Pass.h"

using namespace mlir;

namespace {
struct PrintOpsPass : PassWrapper<PrintOpsPass, OperationPass<ModuleOp>> {};
} // end anonymous namespace

The above code defines our pass, but at first glance it looks scary, especially if you have not used CRTP (the Curiously Recurring Template Pattern) in C++ before. Let’s clear away the noise. We define our pass by inheriting from PassWrapper, a helper class MLIR provides to handle all the boilerplate of registering a pass. Also, notice the template arguments: PrintOpsPass and OperationPass<ModuleOp>. We are defining struct PrintOpsPass, but we are passing PrintOpsPass as a template argument to its own parent! This is known as CRTP.

In standard C++ OOP, runtime polymorphism is typically implemented using virtual functions, which introduce indirection via vtable lookups. While this overhead is negligible in most applications, it can be significant in compiler hot paths that execute at scale. Therefore we are passing PrintOpsPass into PassWrapper to give the parent class a compile-time pointer to its child. This allows PassWrapper to implement boilerplate methods (like clone() or getId()) for us, specifically customized for our class, without needing a single virtual function call. This is called static polymorphism.

The second argument we pass to PassWrapper is a trait. In MLIR, capabilities are often injected via these template arguments. This specific trait defines the behavioural constraints of your pass. By adding the OperationPass trait, we are telling the framework: “This pass is designed to run on specific operations.” By specializing it with <ModuleOp>, we add a further constraint: “This pass only works on module operations.” If you tried to run this pass on a standard function, the compiler (via this trait) would know it is illegal before the code even runs.

Now that we have created the basic class structure, we need to give our pass a unique identity and wire it up so we can actually trigger it from the commandline.

struct PrintOpsPass : PassWrapper<PrintOpsPass, OperationPass<ModuleOp>> {
    // This creates a unique id for the pass
    MLIR_DEFINE_EXPLICIT_INTERNAL_INLINE_TYPE_ID(PrintOpsPass);

    StringRef getArgument() const override { return "print-opt"; }
    StringRef getDescription() const override { return "Prints the name of each operation. This is an analysis pass."; }
};

  • MLIR_DEFINE_EXPLICIT_INTERNAL_INLINE_TYPE_ID injects a static variable into our pass class that exists at a unique memory address in the program. MLIR then uses that raw memory address as the unique identifier for our pass.
  • getArgument() hooks our C++ code directly into the compiler’s command-line parser. Later, when we compile our tool, we will be able to trigger this exact C++ class just by typing ./our-custom-mlir-opt --print-opt in the terminal.
  • getDescription() hooks its string argument automatically into the --help menu. If someone runs ./our-custom-mlir-opt --help, the MLIR framework will automatically list --print-opt right next to this description.

Finally, we reach the heart of the pass, the runOnOperation() method. If the metadata above is how we trigger the pass from the command line, runOnOperation() is what actually executes when that trigger is pulled.

struct PrintOpsPass : PassWrapper<PrintOpsPass, OperationPass<ModuleOp>> {
    // ... (metadata stuff we just wrote) ...

    // pass logic
    void runOnOperation() override {
        // getOperation() returns the top-level op this pass is running on (here, the module)
        // .walk() automatically iterates through every nested operation inside it
        getOperation()->walk([](Operation *op){
            llvm::outs() << "Visiting Op: " << op->getName() << ";\n";
        });
    }
};

The first thing we do is call getOperation(). Because we told the framework earlier that this is an OperationPass<ModuleOp>, this call comes with a guarantee: it will return the top-level module operation that acts as the container for our entire program.

Next we need to look at every single operation inside that module. Remember, an MLIR program is structured like a massive tree: Modules contain Functions, Functions contain Blocks, and Blocks contain Operations (which might contain even more Blocks!). For this purpose, MLIR gives us the .walk() utility. Finally, we do a simple print to grab the operation’s name.

Now we have our basic pass, but it’s just a C++ class floating in the void. To actually run it against our previous add.mlir file, we need to build an executable.

Remember earlier when we talked about building our own custom version of mlir-opt? We aren’t actually compiling a shared library to plug into an existing tool; we are compiling a brand new, standalone compiler tool from scratch. To do that we need a main() function. You can put this right at the bottom of the same C++ file.

#include "mlir/InitAllDialects.h"
#include "mlir/InitAllPasses.h"
#include "mlir/Pass/Pass.h"
#include "mlir/Tools/mlir-opt/MlirOptMain.h"
#include "llvm/Support/raw_ostream.h"

using namespace mlir;
// ... (our PrintOpsPass struct is up here) ...

int main(int argc, char **argv) {
    // create a registry and load all standard MLIR dialects
    DialectRegistry registry;
    registerAllDialects(registry);

    // register our custom pass with the command line parser
    PassRegistration<PrintOpsPass>();

    // run the tool (handles command line arguments, parsing, etc.)
    return asMainReturnCode(
        MlirOptMain(argc, argv, "My Custom MLIR Optimizer", registry)
    );
}

This looks surprisingly short for a custom compiler tool, right? That is the beauty of MLIR’s modular design.

When our tool reads the add.mlir file, it is going to encounter operations like func.func and arith.addi. If we don’t explicitly teach our tool what func and arith mean, it will throw a parsing error immediately. By creating a DialectRegistry and calling registerAllDialects, we are loading the definitions of every standard in-tree MLIR dialect into our tool’s memory.

After registering all in-tree dialects, we register our custom pass using PassRegistration<PrintOpsPass>(), so that the command-line parser knows to trigger our pass when the user passes the --print-opt flag.

Finally, the most important part: MlirOptMain. We could manually write the code to open a .mlir file, read the text, parse it into an MLIR module, initialize a pass manager, schedule our pass, catch errors, and print the output back to the terminal. But that would take hundreds of lines of tedious boilerplate. Instead, MLIR provides MlirOptMain. We just hand it our command-line arguments (argc, argv) and our loaded registry. It takes over, acts exactly like the standard mlir-opt tool we used earlier, and handles all the file I/O and threading for us automatically.

We have our C++ pass, and we have our driver. Now we just need to compile it. If you have spent any time in C++ compiler engineering, you know what time it is: CMake time. I won’t bore you with a deep dive into CMake syntax—our focus is on the IR, not the build system. Assuming you saved all that C++ code we just wrote into a file named my-opt.cpp, here is the CMakeLists.txt file you need to tie everything together.

cmake_minimum_required(VERSION 3.13)
project(my-opt)

find_package(MLIR REQUIRED CONFIG)

list(APPEND CMAKE_MODULE_PATH ${MLIR_CMAKE_DIR})
list(APPEND CMAKE_MODULE_PATH ${LLVM_CMAKE_DIR})

include(TableGen)
include(AddLLVM)
include(AddMLIR)
include(HandleLLVMOptions)

include_directories(${LLVM_INCLUDE_DIR})
include_directories(${MLIR_INCLUDE_DIR})
link_directories(${LLVM_LIBRARY_DIR})
add_definitions(${LLVM_DEFINITIONS})

# Define our executable
add_executable(my-opt my-opt.cpp)

llvm_update_compile_flags(my-opt)

# Link the necessary MLIR libraries
# We need MlirOptLib for the main entry point, and the others for IR/Passes.
target_link_libraries(my-opt PRIVATE
  MLIRIR
  MLIRSupport
  MLIRPass
  MLIROptLib    # contains MlirOptMain
  MLIRRegisterAllDialects    # contains standard dialects
  MLIRRegisterAllPasses      # contains standard passes
)

While we are skipping a line-by-line breakdown, pay attention to the target_link_libraries at the bottom. This is where the magic happens. We are taking our tiny, custom my-opt executable and linking it against the massive, pre-compiled MLIR libraries (like MLIRIR and MLIROptLib).

Once you have this boilerplate working, save it. You can reuse this exact setup later when you graduate from simple analysis passes to building more complex projects.

Let’s finally build. Open your terminal in the directory containing your my-opt.cpp and CMakeLists.txt, and run the standard CMake build commands (don’t forget to use load-llvm-dev to load the installed LLVM from the previous section; if CMake still cannot locate MLIR, you can point find_package at your install with -DMLIR_DIR=<path to lib/cmake/mlir>):

mkdir build
cd build
cmake ..
make

If everything is linked correctly to your LLVM installation, you should now have a shiny new executable named my-opt sitting in your build folder. Let’s use our custom tool to compile the add.mlir from earlier.

./my-opt --print-opt ./add.mlir -o /dev/null

Because our compiler tool inherits from MLIR’s core infrastructure, it acts exactly like the standard mlir-opt. By default, it will take the input file, run our pass, and then print the entire resulting MLIR code back to the terminal. Since we only want to see our pass’s analysis log and not a giant wall of MLIR code, we use -o /dev/null to tell the tool to quietly discard the final IR output.

When you run it, your terminal should output exactly what we asked for:

Visiting Op: builtin.module;
Visiting Op: func.func;
Visiting Op: arith.addi;
Visiting Op: func.return;

Success!!!

Our custom tool successfully parsed the .mlir file, walked the IR from the top-level module all the way down to the individual arithmetic instructions, and executed our custom C++ logic at every node.

Build MLIR Dialects