Introduction

Note: Mun and this book are under active development. None of the content in this book is final, and any of it may still change.

Mun is an embeddable scripting language designed for developer productivity.

  • Ahead of time compilation
    Mun is compiled ahead of time (AOT), as opposed to being interpreted or compiled just in time (JIT). By detecting errors in the code during AOT compilation, an entire class of runtime errors is eliminated. This allows developers to stay within the comfort of their IDE instead of having to switch between the IDE and target application to debug runtime errors.

  • Statically typed
    Mun resolves types at compilation time instead of at runtime, resulting in immediate feedback when writing code and opening the door for powerful refactoring tools.

  • First class hot-reloading
    Every aspect of Mun is designed with hot reloading in mind. Hot reloading is the process of changing code and resources of a live application, removing the need to start, stop and recompile an application whenever a function or value is changed.

  • Performance
    AOT compilation combined with static typing ensures that Mun is compiled to machine code that can be natively executed on any target platform. LLVM is used for compilation and optimization, which gives Mun access to mature, aggressive code optimizations. Hot reloading does introduce a slight runtime overhead, but it can be disabled for production builds to ensure the best possible runtime performance.

  • Cross compilation
    The Mun compiler is able to compile to all supported target platforms from any supported compiler platform.

  • Powerful IDE integration
    The Mun language and compiler framework are designed to support source code queries, allowing for powerful IDE integrations such as code completion and refactoring tools.

Case Studies

A collection of case studies that inspired the design choices made in Mun.

Abbey Games

Abbey Games uses Lua as its main gameplay programming language because of Lua's ability to hot reload code. This allows for rapid iteration of game code, enabling gameplay programmers and designers to quickly test and tweak systems and content. Lua is a dynamically typed, JIT compiled language. Although this has some definite advantages, it also introduces a lot of problems with bigger codebases.

Changes in Lua code can have large implications throughout the entire codebase, and since we cannot oversee the entire codebase at all times, runtime errors are bound to occur. Runtime errors are nasty beasts because they can pop up after a long period of time and after work on the offending piece of code has already finished. They are also often detected by someone different from the person who worked on the code. This causes great frustration and delay, let alone when the runtime error is detected by a user of the software.

Lua amplifies this issue due to its dynamic and flexible nature. It would be great if we could turn some of these runtime errors into compile time errors. That way programmers are notified of errors way before someone else runs into them. The risk of causing implicit runtime errors causes programmers to distrust their refactoring tools. This in turn reduces the likelihood of programmers refactoring their code.

Even though Lua offers immense flexibility, we noticed that certain opinionated patterns recur a lot and as such have become standard practice. Introducing these practices assists us in daily development a lot, but requires more code and complexity than desirable. Having syntactic sugar would greatly help reduce complexity in our codebase, but would also introduce magic or custom keywords that are foreign to both new developers and IDEs.

Rapid iteration is key to prototyping game concepts and features. Proper IDE-integration of a scripting language gives a huge boost to productivity.

Getting Started

Let's start your Mun journey by installing the Mun CLI and creating a simple Mun library. We'll then show you how to make it hot reloadable by embedding it into an application.

Installation

First we need to install the Mun CLI (command-line interface), which acts as an all-in-one tool for Mun application development. Pre-built binaries are available for macOS, Linux, and Windows (64-bit only). Download and extract the binaries to a location of your preference.

You are now ready to write your first Mun code!

Hello, fibonacci?

Most programming languages start off with a "Hello, world!" example, but not Mun. Mun is designed around the concept of hot reloading. Our philosophy is to only add new language constructs when those can be hot reloaded. Since the first building blocks of Mun were native types and functions, our divergent example has become fibonacci, hence "Hello, fibonacci?".

Creating a Project Directory

The Mun compiler is agnostic to the location of a project directory, as long as all source files are in the same place. Let's open a terminal to create our first project directory:

mun new hello_fibonacci

This command creates a new directory called hello_fibonacci with the following contents:

hello_fibonacci
├── src
│   └── mod.mun
└── mun.toml

The mun.toml file contains meta information about your package, such as the name, version, and author.

Writing and Running a Mun Library

Next, open the src/mod.mun source file and enter the code in Listing 1-1. Mun source files always end with the .mun extension. If your file name consists of multiple words, separate them using underscores.

Filename: src/mod.mun

pub fn fibonacci_n() -> i64 {
    let n = arg();
    fibonacci(n)
}

fn arg() -> i64 {
    5
}

fn fibonacci(n: i64) -> i64 {
    if n <= 1 {
        n
    } else {
        fibonacci(n - 1) + fibonacci(n - 2)
    }
}

Listing 1-1: A function that calculates a fibonacci number

Save the file and go back to your terminal window. You are now ready to compile your first Mun library. Enter the following command to compile the file:

cd hello_fibonacci
mun build

The mun build command compiles all source files in the project and generates the runtime assemblies required to run the code. After running mun build, an entry point assembly is created at target/mod.munlib, which can be used to run the code. Contrary to many other languages, Mun doesn't support standalone applications. Instead, it is shipped in the form of Mun libraries, recognizable by their *.munlib extension. That's why Mun comes with a command-line interface (CLI) that can both compile and run Mun libraries. To run a Mun library, enter the following command:

mun start target/mod.munlib --entry fibonacci_n

The result of fibonacci_n (i.e. 5) should now appear in your terminal. Congratulations! You just successfully created and ran your first Mun library.
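
To see where the value 5 comes from, the recursion can be traced outside of Mun. The following is a minimal sketch in Rust, assuming the Mun function behaves identically to this port; fibonacci_up_to is a hypothetical helper added only for illustration.

```rust
// Rust port of the `fibonacci` function from Listing 1-1.
// Assumption: the Mun version behaves identically for these inputs.
pub fn fibonacci(n: i64) -> i64 {
    if n <= 1 {
        n
    } else {
        fibonacci(n - 1) + fibonacci(n - 2)
    }
}

/// Hypothetical helper: returns fibonacci(0), fibonacci(1), ..., fibonacci(last).
pub fn fibonacci_up_to(last: i64) -> Vec<i64> {
    (0..=last).map(fibonacci).collect()
}
```

Calling fibonacci_up_to(5) yields the sequence 0, 1, 1, 2, 3, 5, whose last element is the value returned by fibonacci_n.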

Hello, Hot Reloading!

Mun distinguishes itself from other languages by its inherent hot reloading capabilities. The following example illustrates how you can create a hot reloadable application by slightly modifying the Hello, fibonacci? example. In Listing 1-2, the fibonacci_n function has been removed and the pub keyword has been added to both arg and fibonacci.

Filename: src/mod.mun

pub fn arg() -> i64 {
    5
}

pub fn fibonacci(n: i64) -> i64 {
    if n <= 1 {
        n
    } else {
        fibonacci(n - 1) + fibonacci(n - 2)
    }
}

Listing 1-2: A function that calculates a fibonacci number

Apart from running Mun libraries from the command-line interface, a common use case is embedding them in other programming languages.

Mun embedded in C++

Mun exposes a C API and complementary C++ bindings for the Mun Runtime. Listing 1-3 shows a C++ application that constructs a Mun Runtime for the hello_fibonacci library and continuously invokes the fibonacci function and outputs its result.

Filename: main.cc

#include <cstdint>   // int64_t
#include <iostream>
#include <string>    // std::to_string

#include "mun/mun.h"

int main(int argc, char *argv[]) {
    if (argc < 2) {
        return 1;
    }

    auto lib_path = argv[1];
    mun::RuntimeOptions options;
    mun::Error error;
    if (auto runtime = mun::make_runtime(lib_path, options, &error)) {
        while (true) {
            auto arg = mun::invoke_fn<int64_t>(*runtime, "arg").wait();
            auto result =
                mun::invoke_fn<int64_t>(*runtime, "fibonacci", arg).wait();
            std::cout << "fibonacci(" << std::to_string(arg) << ") = " << result
                      << std::endl;

            runtime->update();
        }

        return 0;
    }

    std::cerr << "Failed to construct Mun runtime due to error: "
              << error.message() << std::endl;

    return 2;
}

Listing 1-3: Hello, Fibonacci? embedded in a C++ application

Mun embedded in Rust

As the Mun Runtime is written in Rust, it can be easily embedded in Rust applications by adding the mun_runtime crate as a dependency. Listing 1-4 illustrates a simple Rust application that builds a Mun Runtime and continuously invokes the fibonacci function and prints its output.

Filename: mod.rs

extern crate mun_runtime;
use mun_runtime::Runtime;
use std::env;

fn main() {
    let lib_path = env::args().nth(1).expect("Expected path to a Mun library.");

    // Safety: We assume that the library that is loaded is a valid munlib
    let builder = Runtime::builder(lib_path);
    let mut runtime = unsafe { builder.finish() }.expect("Failed to spawn Runtime");

    loop {
        let arg: i64 = runtime.invoke("arg", ()).unwrap();
        let result: i64 = runtime.invoke("fibonacci", (arg,)).unwrap();
        println!("fibonacci({}) = {}", arg, result);

        unsafe { runtime.update() };
    }
}

Listing 1-4: Hello, Fibonacci? embedded in a Rust application

Hot Reloading

The prior examples both update the runtime every loop cycle. In the background, this detects recompiled code and reloads the resulting Mun libraries.

To ensure that the Mun compiler recompiles our code every time the mod.mun source file from Listing 1-2 changes, the --watch argument must be added:

mun build --watch

When saved, changes in the source file will automatically take effect in the running example application. For example, change the return value of the arg function and the application will log the corresponding Fibonacci number.

Some changes, such as a type mismatch between the compiled application and the hot reloadable library, can lead to runtime errors. When these occur, the runtime will log the error and halt until an update to the source code arrives.

That's it! Now you are ready to start developing hot reloadable Mun libraries.

Basic Concepts

This section describes the basic concepts of the Mun programming language.

Values and types

Mun is a statically typed language, which helps to detect type-related errors at compile-time. A type error is an invalid operation on a given type, such as an integer divided by a string, trying to access a field that doesn't exist, or calling a function with the wrong number of arguments.

Some languages require a programmer to explicitly annotate syntactic constructs with type information:

int foo = 3 + 4;

However, variable types can often be inferred from their usage. Mun uses type inference to determine variable types at compile time. You are still required to explicitly annotate types in a few locations, to ensure a contract between interdependent code.

fn bar(a: i32) -> i32 {
    let foo = 3 + a;
    foo
}

Here, the parameter a and the return type must be annotated because they solidify the signature of the function. The type of foo can be inferred through its usage.

Integer types

An integer is a number without a fractional component. Table 2-1 shows the built-in integer types in Mun. Each variant can be either signed or unsigned and has an explicit size. Signed types can represent negative as well as positive numbers, whereas unsigned types can only represent zero and positive numbers.

Length     Signed    Unsigned
8-bit      i8        u8
16-bit     i16       u16
32-bit     i32       u32
64-bit     i64       u64
128-bit    i128      u128
arch       isize     usize

Table 2-1: Integer Types in Mun

Signed integer types start with i, unsigned integer types with u, followed by the number of bits that the integer value takes up. Each signed variant can store numbers from -(2^(n-1)) to 2^(n-1) - 1 inclusive, where n is the number of bits that variant uses. Unsigned variants can store numbers from 0 to 2^n - 1. For example, i8 covers -128 to 127 and u8 covers 0 to 255. By default Mun uses 32-bit signed integers.

The size of the isize and usize types depends on the target architecture. On 64-bit architectures, the isize and usize types are 64 bits large, whereas on 32-bit architectures they are 32 bits in size.
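
The range formulas above can be checked numerically. The following is a small sketch in Rust, whose integer type names match those in Table 2-1; signed_range and unsigned_range are hypothetical helper names, and the sketch is only meant for widths of up to 64 bits.

```rust
/// Range of a signed integer with `bits` bits: -(2^(bits-1)) to 2^(bits-1) - 1.
/// Intended for 1 <= bits <= 64 in this sketch.
pub fn signed_range(bits: u32) -> (i128, i128) {
    let half = 1_i128 << (bits - 1);
    (-half, half - 1)
}

/// Range of an unsigned integer with `bits` bits: 0 to 2^bits - 1.
/// Intended for 1 <= bits <= 64 in this sketch.
pub fn unsigned_range(bits: u32) -> (u128, u128) {
    (0, (1_u128 << bits) - 1)
}
```

For 8 bits this gives (-128, 127) and (0, 255), which matches the limits of i8 and u8.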

Floating-Point Types

Real (or floating-point) numbers (i.e. numbers with a fractional component) are represented according to the IEEE-754 standard. The f32 type is a single-precision float of 32 bits, and the f64 type has double precision - requiring 64 bits.

pub fn main() {
    let f = 3.0; // f64
}
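
The practical difference between the two precisions can be illustrated with a short Rust sketch. Rust's f32 and f64 also follow IEEE-754, so the behavior is assumed to carry over to Mun; the function names are hypothetical.

```rust
use std::mem::size_of;

/// Sizes in bytes of the single- and double-precision float types.
pub fn float_sizes() -> (usize, usize) {
    (size_of::<f32>(), size_of::<f64>())
}

/// Rounds a double-precision value to single precision and widens it back.
/// Values that single precision cannot represent exactly come back changed.
pub fn round_trip_through_f32(value: f64) -> f64 {
    (value as f32) as f64
}
```

A value such as 0.5 survives the round trip unchanged, whereas 0.1 does not, because 0.1 has no exact binary representation and f32 keeps fewer digits of it.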

The Boolean Type

The bool (or boolean) type has two values, true and false, that are used to evaluate conditions. It takes up 1 byte (or 8 bits).

pub fn main() {
    let t = true;

    let f: bool = false; // with explicit type annotation
}

Literals

There are three types of literals in Mun: integer, floating-point and boolean literals.

A boolean literal is either true or false.

An integer literal is a number without a decimal separator (.). It can be written as a decimal, hexadecimal, octal or binary value. These are all examples of valid literals:

let a = 367;
let b = 0xbeaf;
let c = 0o76532;
let d = 0b0101011;
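
To make the notations concrete, the sketch below evaluates the same four literals in Rust, which happens to use the same prefixes. It assumes Mun interprets the prefixes in the same way; literal_values is a hypothetical name.

```rust
/// The decimal, hexadecimal, octal, and binary literals from the example,
/// returned so that their decimal values can be compared.
pub fn literal_values() -> [i64; 4] {
    [367, 0xbeaf, 0o76532, 0b0101011]
}
```

The hexadecimal literal 0xbeaf equals 48815, the octal literal 0o76532 equals 32090, and the binary literal 0b0101011 equals 43.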

A floating-point literal comes in two forms:

  • A decimal number followed by a dot which is optionally followed by another decimal literal and an optional exponent.
  • A decimal number followed by an exponent.

Examples of valid floating-point literals are:

let a: f64 = 3.1415;
let b: f64 = 3.;
let c: f64 = 314.1592654e-2;

Separators

Both integer and floating-point literals can contain underscores (_) to visually separate numbers from one another. They do not have any semantic significance but can be useful to the eye.

let a: i64 = 1_000_000;
let b: f64 = 1_000.12;

Type suffix

Integer and floating-point literals may be followed by a type suffix to explicitly specify the type of the literal.

Literal type      Suffixes
Integer           u8, i8, u16, i16, u32, i32, u64, i64, i128, u128, usize, isize, f32, f64
Floating-point    f32, f64

Table 2-2: Literal suffixes in Mun

Note that integer literals can have floating-point suffixes. This is not the case the other way around.

let a: u8 = 128_u8;
let b: i128 = 99999999999999999_i128;
let c: f32 = 10_f32; // integer literal with float suffix

When providing a literal, the compiler will always check if a literal value will fit the type. If not, an error will be emitted:

let a: u8 = 1123123124124_u8; // literal out of range for `u8`
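
The range check itself is simple to express. The following Rust sketch performs the equivalent check at runtime for the u8 type; it only illustrates the rule and is not how the Mun compiler implements it. fits_in_u8 is a hypothetical name.

```rust
use std::convert::TryFrom;

/// Returns `Some(value)` as a `u8` when it fits in the range 0..=255,
/// and `None` when the value is out of range.
pub fn fits_in_u8(value: i128) -> Option<u8> {
    u8::try_from(value).ok()
}
```

With this helper, 128 is accepted, while 1123123124124 and -1 are rejected.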

Numeric operations

Mun supports all basic mathematical operations for number types: addition, subtraction, division, multiplication, and remainder.

pub fn main() {
    // addition 
    let a = 10 + 5;

    // subtraction
    let b = 10 - 4;

    // multiplication
    let c = 5 * 10;

    // division
    let d = 25 / 5;

    // remainder
    let e = 21 % 5;
}

Each expression in these statements uses a mathematical operator and evaluates to a single value. This is valid as long as both sides of the operator have the same type.
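
Rust enforces the same same-type rule for its operators, so it can serve as an illustration. In the sketch below, which uses hypothetical function names, the second function shows the Rust way of combining operands of different types through an explicit conversion; whether Mun offers a comparable conversion is not covered here.

```rust
/// Applies the five operators to two operands of the same type.
/// Integer division truncates, so 21 / 5 is 4 with a remainder of 1.
pub fn arithmetic(a: i32, b: i32) -> [i32; 5] {
    [a + b, a - b, a * b, a / b, a % b]
}

/// Operands of different types must be converted to a common type first.
pub fn add_mixed(a: i32, b: i64) -> i64 {
    i64::from(a) + b
}
```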

Unary operators are also supported:

pub fn main() {
    let a = 4;
    // negate
    let b = -a;
    
    let c = true;
    // not
    let d = !c;
}

Shadowing

Redeclaring a variable by the same name with a let statement is valid and will shadow any previous declaration in the same block. This is often useful if you want to change the type of a variable.

let a: i32 = 3;
let a: f64 = 5.0;
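
Shadowing works the same way in Rust, which makes it possible to sketch a slightly larger example. Here the new variable is computed from the old one; the explicit f64::from conversion is Rust-specific and the function name is hypothetical.

```rust
/// Shadows `a` with a value of a different type.
pub fn shadowed() -> f64 {
    let a: i32 = 3;
    // The right-hand side still refers to the previous `a`;
    // from the next line onwards the name refers to the new f64 variable.
    let a: f64 = f64::from(a) + 2.0;
    a
}
```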

Use before initialization

All variables in Mun must be initialized before usage. Uninitialized variables can be declared but they must be assigned a value before they can be read.

let a: i32;
if some_conditional {
    a = 4;
}
let b = a; // invalid: a is potentially uninitialized

Note that declaring a variable without a value is often a code smell. The above is better written by returning a value from the if/else block instead of assigning to a, which avoids the uninitialized variable altogether.

let a: i32 = if some_conditional {
    4
} else {
    5
};
let b = a;
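
Rust applies a comparable initialization rule, so both styles can be compared side by side. This is a sketch with hypothetical function names; it assumes that Mun, like Rust, accepts a deferred assignment when every path assigns a value.

```rust
/// Preferred: every branch yields a value, so `a` is always initialized.
pub fn pick(some_conditional: bool) -> i32 {
    let a: i32 = if some_conditional { 4 } else { 5 };
    a
}

/// Deferred initialization compiles only because both branches assign `a`.
pub fn pick_deferred(some_conditional: bool) -> i32 {
    let a: i32;
    if some_conditional {
        a = 4;
    } else {
        a = 5;
    }
    a
}
```

Removing the else branch from pick_deferred makes the Rust compiler reject the function for the same reason as the invalid example above.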

Functions

Together with structs, functions are the core building blocks of hot reloading in Mun. Throughout the documentation you've already seen a lot of examples of the fn keyword, which is used to define a function.

Mun uses snake case as the conventional style for function and variable names. In snake case all letters are lowercase and words are separated by underscores.

pub fn main() {
    another_function();
}

fn another_function() {

}

Function definitions start with an optional access modifier (pub), followed by the fn keyword, a name, an argument list enclosed by parentheses, an optional return type specifier, and finally a body.

Marking a function with the pub keyword allows you to publicly expose that function, for usage in other modules or when hot reloading. Otherwise the function will only be accessible from the current source file.

Type Access Modifier

By default, a type or function defined in a module is not accessible outside of its file and submodules. You can expand accessibility in three ways:

  • pub: accessible within the package and externally (incl. marshalling to the host language)
  • pub(package): accessible within the current package but not by anything else
  • pub(super): accessible within parent module and its submodules

// This function is not accessible outside of this file
fn foo() {
    // ...
}

// This function is accessible from anywhere.
pub fn bar() {
    // Because `bar` and `foo` are in the same file, this call is valid.
    foo()
}

// This function is accessible from the parent module and its submodules
pub(super) fn baz() {
    // ...
}

// This function is accessible from the entire package but not externally
pub(package) fn foobar() {
    // ...
}

When you want to interface from your host language (C++, Rust, etc.) with Mun, you can only access pub functions. These functions are hot reloaded by the runtime when they or functions they call have been modified.
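
Rust has closely related visibility modifiers, which makes it possible to sketch the rules in a self-contained example. The mapping is an assumption: Rust's pub(crate) is used here as the counterpart of Mun's pub(package), and the module names parent and child are hypothetical.

```rust
mod parent {
    pub mod child {
        // Private: only visible inside `child`.
        fn foo() -> i32 {
            1
        }

        // Public: visible wherever `child` itself can be reached.
        pub fn bar() -> i32 {
            foo() + 1
        }

        // Visible to the parent module and everything inside it.
        pub(super) fn baz() -> i32 {
            3
        }

        // Visible in the whole crate, but not to external users.
        pub(crate) fn foobar() -> i32 {
            4
        }
    }

    // Inside `parent`, everything except `foo` can be called.
    pub fn sum_from_parent() -> i32 {
        child::bar() + child::baz() + child::foobar()
    }
}

// Outside of `parent`, `baz` is no longer accessible.
pub fn sum_from_root() -> i32 {
    parent::child::bar() + parent::child::foobar()
}
```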

Function Arguments

Functions can have an argument list. Arguments are special variables that are part of the function signature. Unlike regular variables you have to explicitly specify the type of the arguments. This is a deliberate decision, as type annotations in function definitions usually mean that the compiler can derive types almost everywhere in your code. It also ensures that you as a developer define a contract of what your function can accept as its input.

The following is a rewritten version of another_function that shows what an argument looks like:

pub fn main() {
    another_function(3);
}

fn another_function(x: i32) {
}

The declaration of another_function specifies an argument x of the i32 type. When you want a function to use multiple arguments, separate them with commas:

pub fn main() {
    another_function(3, 4);
}

fn another_function(x: i32, y: i32) {
}

Function Bodies

Function bodies are made up of a sequence of statements and expressions. Statements are instructions that perform some action and do not return any value. Expressions evaluate to a result value.

Creating a variable and assigning a value to it with the let keyword is a statement. In the following example, let y = 6; is a statement.

pub fn main() {
    let y = 6;
}

Statements do not return values and can therefore not be assigned to another variable.

Expressions do evaluate to something. Consider a simple math operation 5 + 6, which is an expression that evaluates to 11. Expressions can be part of a statement, as can be seen in the example above. The expression 6 is assigned to the variable y. Calling a function is also an expression.

The body of a function is just a block. In Mun, not just bodies, but all blocks evaluate to the last expression in them. Blocks can therefore also be used on the right hand side of a let statement.

fn foo() -> i32 {
    let bar = {
        let b = 3;
        b + 3
    };
    // `bar` has a value 6
    bar + 3
}

Returning Values from Functions

Functions can return values to the code that calls them. We don't name return values in the function declaration, but we do declare their type after an arrow (->). In Mun, a function implicitly returns the value of the last expression in the function body. You can however return early from a function by using the return keyword and specifying a value.

fn five() -> i32 {
    5
}

pub fn main() {
    let x = five();
}

There are no function calls or statements in the body of the five function, just the expression 5. This is perfectly valid Mun. Note that the return type is specified too, as -> i32.

Whereas the last expression in a block implicitly becomes that block's return value, explicit return statements always return from the entire function:

pub fn main() {
    let _ = foo();
}
fn foo() -> i32 {
    let bar = {
        let b = 3;
        return b + 3;
    };

    // This code will never be executed
    return bar + 3;
}

Control flow

Executing or repeating a block of code only under specific conditions are common constructs that allow developers to control the flow of execution. Mun provides if/else expressions and loops.

if expressions

An if expression allows you to branch your code depending on conditions.

pub fn main() {
    let number = 3;

    if number < 5 {
        number = 4;
    } else {
        number = 6;
    }
}

All if expressions start with the keyword if, followed by a condition. As opposed to many C-like languages, Mun omits parentheses around the condition. The subsequent code block (or arm) is executed only when the condition is true; in the example, that is when the number variable is less than 5.

Optionally, an else expression can be added that will be executed when the condition evaluates to false. You can also have multiple conditions by combining if and else in an else if expression. For example:

pub fn main() {
    let number = 6;
    if number > 10 {
        // The number is larger than 10
    } else if number > 8 {
        // The number is larger than 8 but smaller than or equal to 10
    } else if number > 2 {
        // The number is larger than 2 but smaller than or equal to 8
    } else {
        // The number is smaller than or equal to 2
    }
}

Using if in a let statement

The if expression can be used on the right side of a let statement just like a block:

pub fn main() {
    let condition = true;
    let number = if condition {
        5
    } else {
        6
    };
}

Depending on the condition, the number variable will be bound to the value of the if block or the else block. This means that both the if and else arms need to evaluate to the same type. If the types are mismatched the compiler will report an error.

loop expressions

A loop expression can be used to create an infinite loop. Breaking out of the loop is done using the break statement.

pub fn main() {
    let i = 0;
    loop {
        if i > 5 {
            break;
        }

        i += 1;
    }
}

Similar to if/else expressions, loop blocks can have a return value that can be returned through the use of a break statement.

fn count(i: i32, n: i32) -> i32 {
    let current = i;
    let loop_count = 0;
    loop {
        if current >= n {
            break loop_count;
        }

        // Advance towards `n`; otherwise the loop never terminates when i < n
        current += 1;
        loop_count += 1;
    }
}

All break statements in a loop must have the same return type.

let a = loop {
    break 3;
    break; // expected `{integer}`, found `nothing`
};
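
Rust's loop expression supports the same break-with-value form, so a runnable sketch can show a loop whose result is used directly as the function's return value. The function name is hypothetical, and the sketch is only meant for inputs up to 2^31, beyond which the doubling would overflow.

```rust
/// Returns the smallest power of two that is greater than or equal to `n`.
pub fn first_power_of_two_at_least(n: u32) -> u32 {
    let mut candidate = 1;
    // The loop is the last expression of the function, so the value
    // passed to `break` becomes the return value.
    loop {
        if candidate >= n {
            break candidate;
        }
        candidate *= 2;
    }
}
```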

while expressions

while loops execute a block of code as long as a condition holds. A while loop starts with the keyword while followed by a condition expression and a block of code to execute upon each iteration. Just like with the if expression, no parentheses are required around the condition expression.

pub fn main() {
    let i = 0;
    while i <= 5 {
        i += 1;
    }
}

A break statement inside the while loop immediately exits the loop.

Unlike a loop expression, a break in a while loop cannot return a value because a while loop can exit both through the use of a break statement and because the condition no longer holds. Although we could explicitly return a value from the while loop through the use of a break statement, it is unclear which value should be returned if the loop exits because the condition no longer holds.
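
A common way around this restriction is to store the result in a variable that is declared before the loop. The sketch below shows the pattern in Rust, where while loops have the same limitation; the Option type and slice parameter are Rust features used for illustration and are not claimed to exist in Mun.

```rust
/// Returns the index of the first element equal to `needle`, if any.
pub fn find_index(values: &[i32], needle: i32) -> Option<usize> {
    // Holds the result for both ways of leaving the loop.
    let mut found = None;
    let mut i = 0;
    while i < values.len() {
        if values[i] == needle {
            found = Some(i);
            break; // a `break` inside `while` carries no value
        }
        i += 1;
    }
    found
}
```

When the loop ends because the condition no longer holds, found keeps its initial value, which answers the question of what to return in that case.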

extern functions

Extern functions are declared in Mun but their function bodies are defined externally. They behave exactly the same as regular functions but their definitions have to be provided to the runtime when loading a Mun library. Failure to do so will result in a runtime link error, and loading the library will fail. Take this code for example:

extern fn random() -> i64;

pub fn random_bool() -> bool {
    random() % 2 == 0
}

Listing 2-1: Random bool in Mun

The random function is marked as an extern function, which means that it must be provided to the runtime when loading this library.

First building the above code as main.munlib and then trying to load the library in Rust using:

extern crate mun_runtime;
use mun_runtime::Runtime;

fn main() {
    // Safety: We assume that the library that is loaded is a valid munlib
    let builder = Runtime::builder("main.munlib");
    let mut runtime = unsafe { builder.finish() }.expect("Failed to spawn Runtime");

    let result: bool = runtime.invoke("random_bool", ()).unwrap();
    println!("random bool: {}", result);
}

Listing 2-2: Load listing 2-1 without adding extern function

will result in an error:

Failed to link: function `random` is missing.

This indicates that we have to provide the runtime with the random function, which we can do through the use of the insert_fn method. Let's add a function that uses the current time as the basis of our random function:

extern crate mun_runtime;
use mun_runtime::Runtime;

extern "C" fn random() -> i64 {
    let result = std::time::Instant::now().elapsed().subsec_nanos() as i64;
    println!("random: {}", result);
    result
}

fn main() {
    // Safety: We assume that the library that is loaded is a valid munlib
    let builder =
        Runtime::builder("main.munlib").insert_fn("random", random as extern "C" fn() -> i64);
    let mut runtime = unsafe { builder.finish() }.expect("Failed to spawn Runtime");

    let result: bool = runtime.invoke("random_bool", ()).unwrap();
    println!("random_bool: {}", result);
}

Listing 2-3: Load listing 2-1 with custom random function

Note that we have to explicitly cast the function random to extern "C" fn() -> i64. This is because each function in Rust has its own unique type.
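
The effect of the cast can be observed without the Mun Runtime. The following is a minimal Rust sketch with hypothetical functions zero and one: each function item has its own zero-sized type, and the cast converts them into one shared function pointer type that can be stored and called uniformly.

```rust
extern "C" fn zero() -> i64 {
    0
}

extern "C" fn one() -> i64 {
    1
}

/// Stores both functions in one array by casting them to a common
/// function pointer type, then calls each of them.
pub fn call_all() -> Vec<i64> {
    let table = [
        zero as extern "C" fn() -> i64,
        one as extern "C" fn() -> i64,
    ];
    table.iter().map(|&f| f()).collect()
}

/// Size in bytes of the unique function item type of `zero`.
pub fn function_item_size() -> usize {
    std::mem::size_of_val(&zero)
}
```

The function item occupies zero bytes because its identity is encoded in its type, whereas the function pointer produced by the cast is an ordinary pointer-sized value.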

When we run this now, the error is gone and you should have a function that returns a random boolean in Mun.

use keyword

So far we've worked with a single source file when writing Mun code. Let's look at an example of a multi-file project. We'll rewrite the fibonacci example from Listing 1-1 by splitting it into two files, src/fibonacci.mun and src/mod.mun (the former being a submodule of the latter), as shown in Listing 2-4 and Listing 2-5, respectively.

Alternatively, you can also create a fibonacci submodule using the path src/fibonacci/mod.mun. Both are equally valid and it's up to you to decide what you prefer.

Filename: src/fibonacci.mun

pub fn fibonacci(n: i64) -> i64 {
    if n <= 1 {
        n
    } else {
        fibonacci(n - 1) + fibonacci(n - 2)
    }
}

Listing 2-4: The fibonacci function extracted into its own submodule

To show Mun where to find an item in the module tree, we use a path in the same way we use a path when navigating a filesystem. If we want to call a function, we need to know its path.

A path can take two forms:

  • An absolute path starts from the package's root.
  • A relative path starts from the current module and uses self, super, or an identifier in the current module.

Both absolute and relative paths are followed by one or more identifiers separated by double colons (::). It's up to you to decide which style is preferable.
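
The relative forms can be sketched in Rust, whose path syntax is similar. The example uses inline modules instead of separate files, and the package module is a hypothetical stand-in for the root of a Mun package; helpers is likewise a made-up submodule.

```rust
// `package` stands in for the package root in this sketch.
pub mod package {
    pub mod fibonacci {
        pub fn fibonacci(n: i64) -> i64 {
            if n <= 1 {
                n
            } else {
                fibonacci(n - 1) + fibonacci(n - 2)
            }
        }

        pub mod helpers {
            // `super` steps up to the parent module.
            pub fn via_super(n: i64) -> i64 {
                super::fibonacci(n)
            }
        }
    }

    // A relative path that starts with an identifier in the current module.
    pub fn via_identifier(n: i64) -> i64 {
        fibonacci::fibonacci(n)
    }

    // A relative path that names the current module explicitly.
    pub fn via_self(n: i64) -> i64 {
        self::fibonacci::fibonacci(n)
    }
}
```

All three functions resolve to the same fibonacci function and therefore return the same result for the same input.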

Filename: src/mod.mun

fn arg() -> i64 {
    5
}

pub fn fibonacci_n() -> i64 {
    let n = arg();
    fibonacci::fibonacci(n)
}

Listing 2-5: The fibonacci function being called using its full path.

Writing full paths can result in inconveniently long and repetitive code. Fortunately, there is a way to simplify this process. We can bring a path into a module's scope once and then call the items in that path as if they are local items with the use keyword.

In Listing 2-6, we bring the fibonacci::fibonacci function into the module's scope, allowing us to directly call the function.

Filename: src/mod.mun

use fibonacci::fibonacci;

fn arg() -> i64 {
    5
}

pub fn fibonacci_n() -> i64 {
    let n = arg();
    fibonacci(n)
}

Listing 2-6: Bringing the fibonacci function into the module's scope with the use keyword

The Glob Operator

To bring all public items defined in a path into scope, we can specify that path followed by the glob operator (*):

use fibonacci::*;

This brings all public items defined in the fibonacci module into the current scope.

Be careful when using the glob operator!

Glob can make it harder to tell what names are in scope and where a name used in your program was defined.
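
The following Rust sketch illustrates the concern; Rust's glob imports are assumed to behave like Mun's for this purpose, and the lucas function is a hypothetical addition to the module. Nothing at the call site reveals that both names originate from the fibonacci module.

```rust
mod fibonacci {
    pub fn fibonacci(n: i64) -> i64 {
        if n <= 1 {
            n
        } else {
            fibonacci(n - 1) + fibonacci(n - 2)
        }
    }

    // Same recurrence with different starting values: 2, 1, 3, 4, 7, 11, ...
    pub fn lucas(n: i64) -> i64 {
        if n == 0 {
            2
        } else if n == 1 {
            1
        } else {
            lucas(n - 1) + lucas(n - 2)
        }
    }
}

// Brings every public item of the module into scope without naming any.
use fibonacci::*;

pub fn sum_of_both(n: i64) -> i64 {
    fibonacci(n) + lucas(n)
}
```

Listing the imported names explicitly, as in Listing 2-6, keeps the origin of each name visible at the cost of a longer use statement.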

Warning

Array functionality is still very basic as you cannot resize arrays (incl. pushing elements) at runtime. You can only get, set, and replace array elements. Future releases of Mun will extend this functionality.

Arrays

An array is a collection of elements that all have the same type. In this chapter we'll demonstrate how to use arrays and how Mun's hot reloading works for arrays.

Dynamically Sized Arrays

In Mun, the default array type is dynamically sized and heap-allocated. To create an array, write its values as a comma-separated list inside square brackets, as shown in Listing 3-1.

let array = [1, 2, 3, 4, 5];

Listing 3-1: Creating an array instance

An array's type is written using square brackets around the type of each element, as demonstrated in Listing 3-2. This is only necessary when the compiler cannot automatically deduce the type of the array based on the context, although you can always annotate your code manually.

let array: [u64] = [1, 2, 3, 4, 5];

Listing 3-2: Creating an array instance of type [u64]

Accessing Array Elements

An array is a single chunk of heap-allocated memory of dynamic size. You can access elements of an array using indexing, as illustrated in Listing 3-3.

fn main() {
    let array = [1, 2, 3, 4, 5];

    let first = array[0];
    let second = array[1];
}

Listing 3-3: Accessing elements of an array instance

In this example, the variable named first will get the value 1, because that is the value at index [0] in the array. The variable named second will get the value 2 from index [1] in the array.

Invalid Array Element Access

Warning

Mun is still in early development and should thus be considered unsafe! This applies to arrays in particular, as they allow out-of-bounds memory access.

Currently, Mun does not check for invalid array element access. As such, it will attempt to access an array element, even if it is out of bounds. You will have to manually prevent out-of-bounds access; e.g. as shown in Listing 3-4.

pub fn generate() -> [u64] {
    [5, 4, 3, 2, 1]
}

pub fn add_one(array: [u64], len: usize) -> [u64] {
    let idx = 0;
    loop {
        array[idx] += 1;
        idx += 1;

        if idx >= len {
            break array
        }
    }
}

fn main() {
    add_one(generate(), 5);
}

Listing 3-4: Preventing invalid element access of an array instance

When Mun implements a way to panic, this will change.
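Until such checks exist, the guard has to live in your own code. As a minimal sketch of the idea, the following Rust function mirrors add_one from Listing 3-4 on a plain slice and clamps the caller-supplied length to the real one. The name add_one_clamped is hypothetical and is not part of Mun or its runtime.

```rust
/// Adds one to the first `len` elements of `values`, never reading or
/// writing past the end of the slice. `add_one_clamped` is a hypothetical
/// helper used for illustration only.
fn add_one_clamped(values: &mut [u64], len: usize) -> usize {
    // Clamp the requested length to the actual length of the slice.
    let end = len.min(values.len());
    for value in &mut values[..end] {
        *value += 1;
    }
    // Report how many elements were actually touched.
    end
}
```

Note that the loop in Listing 3-4 executes its body once before comparing idx to len, so it would still touch index 0 of an empty array. Checking the bound before the first access, as above, avoids that case.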

Marshalling Arrays

When embedding Mun in other languages, you will probably want to marshal arrays to and from another language. Mun provides a homogeneous interface for marshalling any array through an ArrayRef, a reference to a heap-allocated array. The Mun Runtime automatically handles the conversion from a function return type into an ArrayRef and function arguments into Mun arrays.

Marshalling reuses the memory allocated by the Mun garbage collector for arrays.

Listing 3-5 shows how to marshal array instances from Mun to Rust and vice versa, using the previously defined generate and add_one functions.

extern crate mun_runtime;
use mun_runtime::{Runtime, ArrayRef};
use std::env;

fn main() {
    let lib_path = env::args().nth(1).expect("Expected path to a Mun library.");

    // Safety: We assume that the library that is loaded is a valid munlib
    let builder = Runtime::builder(lib_path);
    let mut runtime = unsafe { builder.finish() }
        .expect("Failed to spawn Runtime");

    let input: ArrayRef<'_, u64> = runtime.invoke("generate", ()).unwrap();

    assert_eq!(input.len(), 5);
    assert!(input.capacity() >= 5);

    let output: ArrayRef<'_, u64> = runtime
        .invoke("add_one", (input.clone(), input.len()))
        .unwrap();

    assert_eq!(output.iter().collect::<Vec<_>>(), vec![6, 5, 4, 3, 2]);
}

Listing 3-5: Marshalling array instances

Array methods

The API of ArrayRef contains two other methods for interacting with its data: capacity and len; respectively for retrieving the array's capacity and length:

    let input: ArrayRef<'_, u64> = runtime.invoke("generate", ()).unwrap();

    assert_eq!(input.len(), 5);
    assert!(input.capacity() >= 5);

Iterating elements

To obtain an iterator over the ArrayRef instance's elements, you can call the iter function, which returns an impl Iterator:

    let output: ArrayRef<'_, u64> = runtime
        .invoke("add_one", (input.clone(), input.len()))
        .unwrap();

    assert_eq!(output.iter().collect::<Vec<_>>(), vec![6, 5, 4, 3, 2]);

Structs

A struct - or structure - is a custom data type that groups related values together into a named data structure. In this chapter we'll compare the two types of supported structures, demonstrate how to use them, and how Mun's hot reloading works for structures.

Records vs Tuples

Mun supports two types of structures: record structs and tuple structs. A record struct definition specifies both the name and type of each piece of data, allowing you to retrieve the field by name. For example, Listing 4-1 shows a record struct that stores a 2-dimensional vector.

pub struct Vector2 {
    x: f32,
    y: f32,
}

Listing 4-1: A record struct definition for a 2D vector

In contrast, tuple struct definitions omit field names; only specifying the field types. Using a tuple struct makes sense when you want to associate a name with a tuple or distinguish it from other tuples' types, but naming each field would be redundant. Listing 4-2 depicts a tuple struct that stores a 3-dimensional vector.

pub struct Vector3(f32, f32, f32)

Listing 4-2: A tuple struct definition for a 3D vector

Create a Struct Instance

To use a record struct, we create an instance of that struct by stating the name of the struct and then add curly braces containing key: value pairs for each of its fields. The keys have to correspond to the field names in the struct definition, but can be provided in any order. Let's create an instance of our Vector2, as illustrated in Listing 4-3.

pub struct Vector2 {
    x: f32,
    y: f32,
}
let xy = Vector2 {
    x: 1.0,
    y: -1.0,
};

Listing 4-3: Creating a Vector2 instance

To create an instance of a tuple struct, you only need to state the name of the struct and specify a comma-separated list of values between round brackets - as shown in Listing 4-4. As values are not linked to field names, they have to appear in the order specified by the struct definition.

pub struct Vector3(f32, f32, f32)
let xyz = Vector3(-1.0, 0.0, 1.0);

Listing 4-4: Creating a Vector3 instance

Field Init Shorthand

It often makes sense to name function variables the same as the fields of a record struct. Instead of having to repeat the x and y field names, the field init shorthand syntax demonstrated in Listing 4-5 allows you to avoid repetition.

pub struct Vector2 {
    x: f32,
    y: f32,
}
pub fn vector2_new(x: f32, y: f32) -> Vector2 {
    Vector2 { x, y }
}

Listing 4-5: Creating a Vector2 instance using the field init shorthand syntax

Access Struct Fields

To access a record's fields, we use the dot notation: vector.x. The dot notation can be used both to retrieve and to assign a value to the record's field, as shown in Listing 4-6. As you can see, the record's name is used to indicate that the function expects two Vector2 instances as function arguments and returns a Vector2 instance as result.

pub struct Vector2 {
    x: f32,
    y: f32,
}
pub fn vector2_add(lhs: Vector2, rhs: Vector2) -> Vector2 {
    lhs.x += rhs.x;
    lhs.y += rhs.y;
    lhs
}

Listing 4-6: Using Vector2 instances' fields to calculate their addition

A tuple struct doesn't have field names, but instead accesses fields using indices - starting from zero - corresponding to a field's position within the struct definition (see Listing 4-7).

pub struct Vector3(f32, f32, f32)
pub fn vector3_add(lhs: Vector3, rhs: Vector3) -> Vector3 {
    lhs.0 += rhs.0;
    lhs.1 += rhs.1;
    lhs.2 += rhs.2;
    lhs
}

Listing 4-7: Using Vector3 instances' fields to calculate their addition

Unit Struct

Sometimes it can be useful to define a struct without any fields. These so-called unit structs are defined using the struct keyword and a name, as shown in Listing 4-8.

pub struct Unit;

Listing 4-8: A unit struct definition.

Struct Memory Kind

By default, Mun is a garbage collected language. This means that memory is allocated on the heap and automatically freed by the Mun Runtime when your memory goes out of scope. Sometimes this behavior is undesired, and you want to manually control when a value is freed.

Mun allows you to specify this so-called memory kind in a struct definition: gc for garbage collection or value to pass a struct by value; defaulting to gc when neither is specified. Listing 4-9 shows the previously created struct definition of a Vector2, which has the default gc memory kind.

pub struct Vector2 {
    x: f32,
    y: f32,
}

Listing 4-9: A record struct definition for a 2D vector, defaulting to the gc memory kind

To manually specify the memory kind, add round brackets containing either gc or value after the struct keyword, as illustrated in Listing 4-10.

pub struct(value) Vector2 {
    x: f32,
    y: f32,
}

Listing 4-10: A record struct definition for a 2D vector, with the value memory kind

Marshalling Structs

When embedding Mun in other languages, you will probably want to retrieve, modify and send structures across the boundary between the two languages. When this so-called marshalling occurs, there is often an associated performance penalty because the Mun Runtime needs to perform runtime checks to validate the provided data types.

Mun provides a homogeneous interface for marshalling any struct through a StructRef, a reference to a heap-allocated struct. The Mun Runtime automatically handles the conversion from a function return type into a StructRef and function arguments into Mun structs.

For structs with the gc memory kind, marshalling reuses the memory allocated by the garbage collector, but for structs with the value memory kind this requires their value to be copied into heap memory.

Listing 4-11 shows how to marshal Vector2 instances from Mun to Rust and vice versa, using the previously defined vector2_new and vector2_add functions.

extern crate mun_runtime;
use mun_runtime::{Runtime, StructRef};
use std::env;

fn main() {
    let lib_path = env::args().nth(1).expect("Expected path to a Mun library.");

    // Safety: We assume that the library that is loaded is a valid munlib
    let builder = Runtime::builder(lib_path);
    let mut runtime = unsafe { builder.finish() }
        .expect("Failed to spawn Runtime");

    let a: StructRef = runtime.invoke("vector2_new", (-1.0f32, 1.0f32)).unwrap();
    let b: StructRef = runtime.invoke("vector2_new", (1.0f32, -1.0f32)).unwrap();
    let added: StructRef = runtime.invoke("vector2_add", (a, b)).unwrap();
}

Listing 4-11: Marshalling Vector2 instances

Accessing Fields

The API of StructRef consists of three generic methods for accessing fields: get, set, and replace; respectively for retrieving, modifying, and replacing a struct field. The desired field is specified using a string field_name parameter, which is identical to the one used with the dot notation in Mun code.

extern crate mun_runtime;
use mun_runtime::{Runtime, StructRef};
use std::env;

fn main() {
    let lib_path = env::args().nth(1).expect("Expected path to a Mun library.");

    // Safety: We assume that the library that is loaded is a valid munlib
    let builder = Runtime::builder(lib_path);
    let mut runtime = unsafe { builder.finish() }
        .expect("Failed to spawn Runtime");

    let mut xy: StructRef = runtime.invoke("vector2_new", (-1.0f32, 1.0f32)).unwrap();
    let x: f32 = xy.get("x").unwrap();
    xy.set("x", x * x).unwrap();
    let y = xy.replace("y", -1.0f32).unwrap();
}

Listing 4-12: Accessing fields of a StructRef

Hot Reloading

Mun is able to hot reload structs, as well as arrays of structs. Both apply recursively, so a struct containing a struct member field and an array of arrays can also be hot reloaded.

To understand how we might use hot reloading, let's create the skeleton for a simulation. Start by creating a new project called buoyancy:

mun new buoyancy

and replace the contents of src/mod.mun with Listing 4-13.

The new_sim function constructs a SimContext, which maintains the simulation's state, and the sim_update function will be called every frame to update the state of SimContext. As Mun doesn't natively support logging, we'll use the extern function log_f32 to log values of the f32 type.

The subject of our simulation will be buoyancy; i.e. the upward force exerted by a fluid on a (partially) immersed object that allows it to float. Currently, all our simulation does is log the elapsed time every frame.

Filename: src/mod.mun

extern fn log_f32(value: f32);

pub struct SimContext;

pub fn new_sim() -> SimContext {
    SimContext
}

pub fn sim_update(ctx: SimContext, elapsed_secs: f32) {
    log_f32(elapsed_secs);
}

Listing 4-13: The buoyancy simulation with state stored in SimContext

To be able to run our simulation, we need to embed it in a host language. Listing 4-14 illustrates how to do this in Rust.

extern crate mun_runtime;
use mun_runtime::{Runtime, StructRef};
use std::{env, time};

extern "C" fn log_f32(value: f32) {
    println!("{}", value);
}

fn main() {
    let lib_dir = env::args().nth(1).expect("Expected path to a Mun library.");

    // Safety: We assume that the library that is loaded is a valid munlib
    let builder = Runtime::builder(lib_dir)
        .insert_fn("log_f32", log_f32 as extern "C" fn(f32));
    let mut runtime = unsafe { builder.finish() }
        .expect("Failed to spawn Runtime");

    let ctx = runtime.invoke::<StructRef, ()>("new_sim", ()).unwrap().root();

    let mut previous = time::Instant::now();
    const FRAME_TIME: time::Duration = time::Duration::from_millis(40);
    loop {
        let now = time::Instant::now();
        let elapsed = now.duration_since(previous);

        let elapsed_secs = if elapsed < FRAME_TIME {
            std::thread::sleep(FRAME_TIME - elapsed);
            FRAME_TIME.as_secs_f32()
        } else {
            elapsed.as_secs_f32()
        };

        let _: () = runtime.invoke("sim_update", (ctx.as_ref(&runtime), elapsed_secs)).unwrap();
        previous = now;

        unsafe { runtime.update() };
    }
}

Listing 4-14: The buoyancy simulation embedded in Rust

Now that we have a runnable host program, let's fire it up and see that hot reloading magic at work! First we need to start the build watcher:

mun build --watch --manifest-path=buoyancy/mun.toml

This will create the initial mod.munlib that we can use to run our host program in Rust:

cargo run -- buoyancy/target/mod.munlib

Your console should now receive a steady stream of 0.04... lines, indicating that the simulation is indeed running at 25 Hz. Time to add some logic.

Insert Struct Fields

Our simulation will contain a spherical object with radius r and density d_o that is dropped from an initial height h into a body of water with density d_w. The simulation also takes the gravity, g, into account, but for the sake of simplicity we'll only consider vertical movement. Let's add this to the SimContext struct and update the new_sim function accordingly, as shown in Listing 4-15.

pub struct SimContext {
    sphere: Sphere,
    water: Water,
    gravity: f32,
}

pub struct Sphere {
    radius: f32,
    density: f32,
    height: f32,
    velocity: f32,
}

pub struct Water {
    density: f32,
}

pub fn new_sim() -> SimContext {
    SimContext {
        sphere: new_sphere(),
        water: new_water(),
        gravity: 9.81,
    }
}

fn new_sphere() -> Sphere {
    Sphere {
        radius: 1.0,
        density: 250.0,
        height: 1.0,
        velocity: 0.0,
    }
}

fn new_water() -> Water {
    Water {
        density: 1000.0,
    }
}

Listing 4-15: Struct definitions of the buoyancy simulation

Runtime Struct Field Initialization

Upon successful compilation, the runtime will hot reload the new structs. Memory of newly added structs will recursively be zero initialized. This means that every field of a fundamental type in a newly added struct and its child structs will be equal to zero.

We can verify this by replacing the log_f32(elapsed_secs) statement with:

    log_f32(ctx.gravity);

Indeed, the console now receives a stream of lines reading 0. Luckily, we can use this behavior to our advantage and still initialize our memory to the desired values manually. Let's first add token: u32 to the SimContext:

    token: u32,

and set it to zero in the new_sim function:

        token: 0,

As before, the token value will be initialized to zero when the library has been hot reloaded. Next, we add a hot_reload_token function that returns a non-zero u32 value, e.g. 1:

fn hot_reload_token() -> u32 {
    1
}

Finally, we add this if statement to the sim_update function:

    if ctx.token != hot_reload_token() {
        let default = new_sim();
        ctx.sphere = default.sphere;
        ctx.water = default.water;
        ctx.gravity = default.gravity;
        ctx.token = hot_reload_token();
    }

This piece of code will be triggered every time the hot_reload_token function returns a different value, but only once - allowing us to initialize the value of SimContext.
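To make the mechanism easier to follow outside of a running Mun host, here is a minimal Rust sketch of the same token check. It is an illustration under stated assumptions, not Mun code: the simplified SimContext only holds gravity and token, and Default::default() stands in for the zero initialization that the runtime performs on hot reload.

```rust
/// Simplified stand-in for the simulation state. `Default` yields all
/// zeros, mimicking the zero initialization performed on hot reload.
#[derive(Debug, Default, Clone, PartialEq)]
struct SimContext {
    gravity: f32,
    token: u32,
}

/// Bump this value whenever the state has to be reinitialized.
fn hot_reload_token() -> u32 {
    1
}

fn new_sim() -> SimContext {
    SimContext { gravity: 9.81, token: 0 }
}

/// Returns `true` if the context was (re)initialized during this call.
fn sim_update(ctx: &mut SimContext) -> bool {
    if ctx.token != hot_reload_token() {
        let default = new_sim();
        ctx.gravity = default.gravity;
        // Store the current token so the branch is skipped next frame.
        ctx.token = hot_reload_token();
        return true;
    }
    false
}
```

Calling sim_update on a zeroed context initializes it on the first call and leaves it untouched on every later call, until hot_reload_token is edited to return a different value.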

Edit Struct Fields

Time to add the actual logic for simulating buoyancy. The formula for calculating the buoyancy force is force = submerged volume * water density * gravity.

fn calc_submerged_ratio(s: Sphere) -> f32 {
    let bottom = s.height - s.radius;
    let diameter = 2.0 * s.radius;
    if bottom >= 0.0 {
        0.0
    } else if bottom <= -diameter {
        1.0
    } else {
        -bottom / diameter
    }
}

fn calc_sphere_volume(radius: f32) -> f32 {
    let pi = 3.1415926535897;
    let r = radius;

    // Volume of a sphere: 4/3 * pi * r^3
    4.0 / 3.0 * pi * r * r * r
}

fn calc_buoyancy_force(s: Sphere, w: Water, gravity: f32, submerged_ratio: f32) -> f32 {
    let volume = calc_sphere_volume(s.radius);
    volume * submerged_ratio * w.density * gravity
}
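As a sanity check on the formula, the following Rust sketch evaluates the buoyancy force and the sphere's weight for the values used in this chapter (radius 1.0, sphere density 250, water density 1000, gravity 9.81). It assumes the usual sphere volume of 4/3 * pi * r^3; the function names are hypothetical and only loosely follow the Mun functions above.

```rust
use std::f32::consts::PI;

/// Volume of a sphere: 4/3 * pi * r^3.
fn sphere_volume(radius: f32) -> f32 {
    4.0 / 3.0 * PI * radius * radius * radius
}

/// force = submerged volume * water density * gravity
fn buoyancy_force(radius: f32, submerged_ratio: f32, water_density: f32, gravity: f32) -> f32 {
    sphere_volume(radius) * submerged_ratio * water_density * gravity
}

/// Downward force on the sphere: mass * gravity, with mass = volume * density.
fn weight(radius: f32, density: f32, gravity: f32) -> f32 {
    sphere_volume(radius) * density * gravity
}
```

With these values the upward and downward forces balance at a submerged ratio of 0.25, which is exactly the ratio of the sphere's density to the water's density. That is the level around which the sphere is expected to settle in this simplified model.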

Next we need to convert force into acceleration using acc = force / mass. We don't readily have the sphere's mass available, but we can derive it using the sphere's volume and density: mass = volume * density. Instead of doing this every frame, let's replace the sphere's density field with a mass field:

pub struct Sphere {
    radius: f32,
    mass: f32,      // density: f32,
    height: f32,
    velocity: f32,
}

and pre-calculate it on construction:

fn new_sphere() -> Sphere {
    let radius = 1.0;
    let density = 250.0;

    let volume = calc_sphere_volume(radius);
    let mass = density * volume;

    Sphere {
        radius,
        mass,
        height: 1.0,
        velocity: 0.0,
    }
}

To initialize the sphere's mass field, we can employ the same trick as before; this time only initializing the sphere and incrementing hot_reload_token to 2:

    if ctx.token != hot_reload_token() {
        let default = new_sphere();
        ctx.sphere = default;
        ctx.token = hot_reload_token();
    }

Editing a field's name is only one of three ways that you can edit struct fields in Mun. In order of priority, these are the changes that the Mun Runtime is able to detect:

  1. If an old field and new field have the same name and type, they must have remained unchanged. In this case, the field can be moved.
  2. If an old field and new field have the same name, they must be the same field. In this case, we accept a type conversion and the field can potentially be moved.
  3. If an old field and new field have different names but the same type, the field could have been renamed. As there can be multiple candidates with the same type, we accept the renamed and potentially moved field that is closest to the original index of the old field.

Some restrictions do apply:

  • A struct cannot simultaneously be renamed and its fields edited.
  • A struct field cannot simultaneously be renamed and undergo a type conversion.

In both of the above cases, the difference will be recognized as two separate changes: an insertion and a deletion of the struct/field.
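The priority order can be sketched as three passes over the old fields. The Rust code below is a simplified illustration of the rules as stated above, not the Mun Runtime's actual implementation; Field, Change, field, and map_fields are hypothetical names, and types are compared as plain strings.

```rust
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: &'static str,
    ty: &'static str,
}

#[derive(Debug, PartialEq)]
enum Change {
    Unchanged { new_index: usize },
    Converted { new_index: usize },
    Renamed { new_index: usize },
    Removed,
}

fn field(name: &'static str, ty: &'static str) -> Field {
    Field { name, ty }
}

/// Maps every old field to what happened to it in the new definition.
fn map_fields(old: &[Field], new: &[Field]) -> Vec<Change> {
    let mut taken = vec![false; new.len()];
    let mut changes: Vec<Option<Change>> = old.iter().map(|_| None).collect();

    // Rule 1: same name and same type, the field may only have moved.
    for (i, o) in old.iter().enumerate() {
        let found = (0..new.len()).find(|&j| !taken[j] && new[j] == *o);
        if let Some(j) = found {
            taken[j] = true;
            changes[i] = Some(Change::Unchanged { new_index: j });
        }
    }

    // Rule 2: same name, different type, accept a type conversion.
    for (i, o) in old.iter().enumerate() {
        if changes[i].is_some() {
            continue;
        }
        let found = (0..new.len()).find(|&j| !taken[j] && new[j].name == o.name);
        if let Some(j) = found {
            taken[j] = true;
            changes[i] = Some(Change::Converted { new_index: j });
        }
    }

    // Rule 3: same type, different name, pick the candidate closest to
    // the old field's index.
    for (i, o) in old.iter().enumerate() {
        if changes[i].is_some() {
            continue;
        }
        let found = (0..new.len())
            .filter(|&j| !taken[j] && new[j].ty == o.ty)
            .min_by_key(|&j| j.abs_diff(i));
        if let Some(j) = found {
            taken[j] = true;
            changes[i] = Some(Change::Renamed { new_index: j });
        }
    }

    // Anything left over has no counterpart and counts as a deletion.
    changes
        .into_iter()
        .map(|change| change.unwrap_or(Change::Removed))
        .collect()
}
```

Applied to the Sphere struct from this chapter, radius, height, and velocity match under rule 1, while density is mapped to mass under rule 3 because it is the only remaining f32 field. A field that changes both name and type matches none of the rules and is reported as removed, in line with the restrictions listed above.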

Remove Struct Fields

We now have all of the building blocks necessary to finish our buoyancy simulation. If the sphere is (partially) submerged, we calculate and add the buoyancy acceleration to the velocity. We also always subtract the gravitational acceleration from the velocity to ensure that the sphere drops into the water.

One important thing to take into account when running simulations is to multiply the accelerations and velocities with the elapsed time, as we are working in discrete time.

Last but not least, let's log the sphere's height to the console, so we can verify that the simulation is running correctly.

    let submerged_ratio = calc_submerged_ratio(ctx.sphere);
    if submerged_ratio > 0.0 {
        // Accelerate using buoyancy
        let buoyancy_force = calc_buoyancy_force(
            ctx.sphere,
            ctx.water,
            ctx.gravity,
            submerged_ratio
        );
        let buoyancy_acc = buoyancy_force / ctx.sphere.mass;
        ctx.sphere.velocity += buoyancy_acc * elapsed_secs;
    }
    
    // Accelerate using gravity
    ctx.sphere.velocity -= ctx.gravity * elapsed_secs;

    // Apply velocity
    ctx.sphere.height += ctx.sphere.velocity * elapsed_secs;

    log_f32(ctx.sphere.height);

When the simulation has been hot reloaded, the console should now log height values of the ball that are indicative of a sphere bobbing on the waves.
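If you want to check the numbers without a Mun runtime, the update step can be reproduced in a few lines of Rust. This is a sketch under stated assumptions: the Sphere struct and the step function are stand-ins that follow the Mun code above, the sphere volume is taken to be 4/3 * pi * r^3, and the radius is assumed to be positive.

```rust
#[derive(Debug, Clone, Copy)]
struct Sphere {
    radius: f32,
    mass: f32,
    height: f32,
    velocity: f32,
}

/// Fraction of the sphere's diameter that is below the water line (0..=1).
fn submerged_ratio(s: &Sphere) -> f32 {
    let bottom = s.height - s.radius;
    let diameter = 2.0 * s.radius;
    (-bottom / diameter).clamp(0.0, 1.0)
}

/// Advances the sphere by one time step of `dt` seconds.
fn step(s: &mut Sphere, water_density: f32, gravity: f32, dt: f32) {
    let ratio = submerged_ratio(s);
    if ratio > 0.0 {
        // Accelerate using buoyancy: acc = force / mass.
        let volume = 4.0 / 3.0 * std::f32::consts::PI * s.radius.powi(3);
        let force = volume * ratio * water_density * gravity;
        s.velocity += force / s.mass * dt;
    }
    // Accelerate using gravity, then apply the updated velocity.
    s.velocity -= gravity * dt;
    s.height += s.velocity * dt;
}
```

Starting from height 1.0 with zero velocity, the sphere only touches the water, so the first 0.04 second step applies gravity alone: the velocity becomes -9.81 * 0.04 = -0.3924 and the height becomes 1.0 - 0.3924 * 0.04 = 0.984304.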

Now that our simulation is completed, we no longer need the token field, hot_reload_token function, and if statement. The token field can be safely removed and the simulation hot reloaded without losing any state.

Well done! You've just had your first experience of hot reloading.

Developer documentation

These chapters provide some insight into the design of the Mun compiler and language server.

  • Salsa provides some information about Salsa and how we use it.
  • Building LLVM details how to build LLVM for Mun.

Salsa

The Mun backend makes extensive use of Salsa to perform on-demand, incremental computations. Salsa lets you define your program as a set of queries. Every query is used like a function K -> V that maps from some key K to a value of V. Changing the input K also invalidates the result of computing K -> V. However, if the result V does not change after changing K, the invalidation is not propagated to computations that use V as an input. This enables Mun to cache a lot of computations between compilations, enabling very fast rebuilding on incremental changes.
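The effect of not propagating an unchanged result can be illustrated with a hand-rolled cache. The Rust sketch below does not use Salsa's API and is far simpler than what Salsa does; Db, word_count, and report are hypothetical names chosen for this example only.

```rust
/// A toy "database" with one input and two derived queries.
struct Db {
    input: String,
    input_changed: bool,
    word_count: Option<usize>,
    /// Cached report together with the word count it was derived from.
    report: Option<(usize, String)>,
    word_count_runs: u32,
    report_runs: u32,
}

impl Db {
    fn new(input: &str) -> Self {
        Db {
            input: input.to_string(),
            input_changed: true,
            word_count: None,
            report: None,
            word_count_runs: 0,
            report_runs: 0,
        }
    }

    fn set_input(&mut self, input: &str) {
        if self.input != input {
            self.input = input.to_string();
            self.input_changed = true;
        }
    }

    /// Derived query: recomputed whenever the input changed.
    fn word_count(&mut self) -> usize {
        if self.input_changed || self.word_count.is_none() {
            self.word_count_runs += 1;
            self.word_count = Some(self.input.split_whitespace().count());
            self.input_changed = false;
        }
        self.word_count.unwrap()
    }

    /// Downstream query: recomputed only if `word_count` produced a
    /// different value than the one the cached report was built from.
    fn report(&mut self) -> String {
        let count = self.word_count();
        if let Some((seen, text)) = &self.report {
            if *seen == count {
                return text.clone();
            }
        }
        self.report_runs += 1;
        let text = format!("{count} words");
        self.report = Some((count, text.clone()));
        text
    }
}
```

Changing the input from "fn main" to "fn  start" forces word_count to run again, but because it still yields 2, the cached report is reused. Only an edit that changes the word count causes report to be recomputed.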

For more in depth information on Salsa please refer to the Salsa repository or the Salsa book.

Databases

Queries are grouped together in so-called databases defined as traits. Database traits can have super traits that enable them to build upon other queries.

Mun Salsa Database super trait relations

Figure 4-1: Super trait relations between Salsa databases

This design is heavily inspired by Rust Analyzer.

SourceDatabase

The SourceDatabase provides queries and inputs that form the basis of all compilation because it contains the original source text.

The SourceDatabase holds SourceRoots which are groupings of files. SourceRoots define the hierarchy of files relative to a certain root. For the system it's not interesting where the files come from and how they are actually organized on disk. SourceRoots provide an abstraction over the filesystem that provides just enough information for the backend to function.

Another relevant query is the line_index query which provides line and column information for characters in a source file. This is used for instance when emitting human-readable diagnostic messages with line and column numbers.
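To give an idea of what such a query computes, here is a minimal Rust sketch of a line index. It is an assumption-laden illustration rather than Mun's implementation: the LineIndex type and its line_col method are hypothetical, offsets and columns are counted in bytes, and both line and column are zero-based.

```rust
/// Stores the byte offset at which every line of a text starts.
struct LineIndex {
    line_starts: Vec<usize>,
}

impl LineIndex {
    fn new(text: &str) -> Self {
        // The first line always starts at offset 0.
        let mut line_starts = vec![0];
        for (offset, byte) in text.bytes().enumerate() {
            if byte == b'\n' {
                // The next line starts right after the newline.
                line_starts.push(offset + 1);
            }
        }
        LineIndex { line_starts }
    }

    /// Converts a byte offset into a zero-based (line, column) pair.
    fn line_col(&self, offset: usize) -> (usize, usize) {
        // Number of line starts that are <= offset, minus one, is the line.
        let line = self.line_starts.partition_point(|&start| start <= offset) - 1;
        (line, offset - self.line_starts[line])
    }
}
```

Building the index once per file and answering lookups with a binary search keeps diagnostics cheap, even when many messages refer to the same file.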

AstDatabase

The AstDatabase provides syntax tree transformations on top of the source text. The parse function returns the Concrete Syntax Tree (CST) of a file provided by the mun_syntax crate. Any errors related to parsing are stored in the tree itself.

As an example of where Salsa helps Mun achieve faster incremental compilation: changing a single character in a file will also change the CST which in turn will invalidate a lot of computations. However, most changes occur within blocks (like functions or structs). To not invalidate everything in the changed file, another query transforms the CST to an AstIdMap which assigns ids to all top-level items in a file. This enables referring to those ids in later stages of compilation which will change much less often.

InternDatabase

The InternDatabase links certain locations in the syntax tree to ids. For instance, it associates a function syntax node with a FunctionId. This id is then used everywhere in code to refer to the specific function instance. The same happens for other items like structs. The purpose of this interning is again to allow better incremental compilation for queries that only need to refer to certain constructs without having to refer to the volatile syntax tree elements.
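The general idea of interning can be sketched in a few lines of Rust. This is not the code used by Mun: apart from the FunctionId name mentioned above, the FunctionLoc and Interner types and their fields are hypothetical and serve only to illustrate the technique.

```rust
use std::collections::HashMap;

/// A small, cheap-to-copy id that stands in for a function.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct FunctionId(u32);

/// Hypothetical description of where a function lives in the source.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct FunctionLoc {
    file: String,
    item_index: u32,
}

#[derive(Default)]
struct Interner {
    ids: HashMap<FunctionLoc, FunctionId>,
    locs: Vec<FunctionLoc>,
}

impl Interner {
    /// Returns the existing id for `loc`, or assigns a fresh one.
    fn intern(&mut self, loc: FunctionLoc) -> FunctionId {
        if let Some(&id) = self.ids.get(&loc) {
            return id;
        }
        let id = FunctionId(self.locs.len() as u32);
        self.locs.push(loc.clone());
        self.ids.insert(loc, id);
        id
    }

    /// Maps an id back to the location it was created from.
    fn lookup(&self, id: FunctionId) -> &FunctionLoc {
        &self.locs[id.0 as usize]
    }
}
```

Interning the same location twice yields the same id, so later stages can compare and store plain ids instead of holding on to syntax tree nodes.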

DefDatabase

The DefDatabase provides definitions extracted from the syntax trees. It provides desugared or lowered versions of language constructs that enable other queries to query specific information for these constructs. Extracting this data again enables faster incremental compilation because, if the extracted data does not change, dependent queries will not change either.

HirDatabase

The HirDatabase resolves all types and names from the definitions. It also performs type inference. The resulting High-Level Representation (HIR) is the input for the LLVM Intermediate Representation (LLVM IR). The HIR data structures can also contain errors; generating IR is only valid if no errors are present.

CodeGenDatabase

Finally, the CodeGenDatabase provides queries to construct assemblies from groups of files. Assemblies are the final product of the Mun compiler. They are the binaries that can be executed by the runtime. This stage is where Salsa really shines; only if something actually changed in the HIR will an assembly be created.

Language Server

The language server uses the same backend as the compiler. Whereas the compiler creates assemblies, the language server queries the different databases to provide a user with code intelligence. It also uses the exact same diagnostics paths as the compiler, which means there is never a mismatch between the compiler and the language server. If compiling your code results in an error, it will also be immediately visible in your IDE.

Building LLVM

Most, if not all, dependencies can be built by cargo, except for LLVM. The Mun compiler makes heavy use of LLVM for all code-generation capabilities. Installing it, however, can be tricky. This document is a short guide on how to install LLVM on your machine so you can build Mun yourself.

Currently, Mun targets LLVM 14 so everything in this document refers to that version. However, these instructions should also hold for newer versions.

Prebuilt binaries

On some OSes prebuilt binaries are available.

NOTE: not all prebuilt releases contain all libraries required by Mun. For instance, prebuilt releases of LLVM for Windows are missing required executables and libraries.

Windows

For Windows, we maintain a repository which contains releases that can be used to build Mun. These releases are also used on our CI runners.

To use a release, download and extract it to your machine. To make sure the build pipeline can find the binaries, add an environment variable called LLVM_SYS_140_PREFIX that points to the folder where you extracted the release. It is also possible to add the bin folder of the release to your path but using the environment variables allows you to have multiple LLVM releases on your machine.

For LLVM 8 you should add the LLVM_SYS_80_PREFIX environment variable, for LLVM 14 add LLVM_SYS_140_PREFIX.

Debian & Ubuntu

LLVM provides APT repositories for several versions of LLVM which contain all the binaries required to build Mun. Visit the LLVM APT website to find the correct APT repository to use. To add the repository:

# Retrieve the archive signature
wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add -

# Add the repository
# ${REPO_NAME} should be something like:
# deb http://apt.llvm.org/focal/ llvm-toolchain-focal-14 main
#
# The `add-apt-repository` command is installed by the `software-properties-common` package:
# sudo apt install software-properties-common 
add-apt-repository "${REPO_NAME}"

Once you have the proper APT repository configured you can install the required LLVM binaries with:

apt install llvm-14 llvm-14-* liblld-14* libclang-common-14-dev

MacOS

Homebrew provides a formula for LLVM that can be used to build Mun:

brew install llvm@14

After installing LLVM, you can either add the bin folder of the release to your path; or you can add a release-specific environment variable called LLVM_SYS_140_PREFIX that points to the release:

export LLVM_SYS_140_PREFIX=$(brew --prefix llvm@14)

Adding the LLVM_SYS_140_PREFIX variable is usually easier because the LLVM binaries will not conflict with any preinstalled version of LLVM and it allows you to easily install another version of LLVM side-by-side.

For LLVM 8 you should add the LLVM_SYS_80_PREFIX environment variable, for LLVM 14 add LLVM_SYS_140_PREFIX.

Building from source

If there are no prebuilt packages available for your OS, your best bet is to build LLVM from source. The build time of LLVM is quite long, so this is a relatively time-consuming process.

You need at least:

Download a dump of the LLVM repository from the LLVM github repository and extract it somewhere, e.g.:

wget -qO- \
  https://github.com/llvm/llvm-project/archive/llvmorg-14.0.1.tar.gz | \
  tar xzf -

Then build the required components and install them to ~/local.

cd llvm-project-llvmorg-14.0.1/llvm
mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="lld;clang" -DCMAKE_INSTALL_PREFIX=$HOME/local -DLLVM_ENABLE_LIBXML2=OFF
make install -j

After LLVM is built, make sure to add $HOME/local/bin to your path or add an environment variable LLVM_SYS_140_PREFIX (or the equivalent for the LLVM version you installed, e.g. LLVM_SYS_80_PREFIX for LLVM 8) that points to $HOME/local.

RFCs

Requests for Comments (RFCs) are documents that outline a proposed feature to be added to Mun. They are added in a pull request where the community is able to comment on the proposal. This is the time to gather feedback and support, and to reach a consensus.

Eventually, somebody on the Mun core team will either accept the RFC by merging the pull request, at which point the RFC is 'active', or reject it by closing the pull request.

Once an RFC becomes active then authors may implement it and submit the feature as a pull request to the Mun repository. An active RFC does not mean the feature will ultimately be merged. It means all the major stakeholders have agreed to the feature and are amenable to merging it.

Modifications to active RFCs can be done in follow-up PRs.

Summary

This is an RFC to introduce the concept of dynamically sized arrays in Mun. These are arrays where the length of the array is not yet known at compile time. This is different from statically sized arrays, where the size of the array is known at compile time.

Motivation

Reasons for having dynamically allocated arrays.

  • Having only statically sized arrays limits the use of Mun to only use types which have a known size.

  • Dynamically sized arrays offer more flexibility than statically sized arrays. See C# and Swift as examples where statically sized arrays are rarely used.

  • Tuples can already be used for statically sized arrays (although not very ergonomically).

    struct(value) ArrayOfFiveFloats(f32,f32,f32,f32,f32)
    
  • Dynamically sized arrays are easily understandable from a user perspective

Detailed design

This RFC proposes to add the language construct of a dynamically sized array as well as several additions to the language which are required as supporting features.

The type of a dynamically sized array is introduced as a new language construct indicated as [T]. This is similar to Rust's array syntax as well as Swift's shortened array form.

let an_array: [f32] = construct_array()
fn construct_array() -> [f32] {
    // ... code
}

Arrays are reference types and can contain both reference and value types.

let x: [Foo] = // ...
let y: [f32] = // ...

Constructing arrays

Arrays can be constructed using array literals: a comma-separated list of values. Without any other information, Mun creates an array that includes the specified values, automatically inferring the array's element type. For example:

// An array of integers.
let odd_numbers = [1,3,5,7,9,11,13,15]

Since Mun doesn't allow uninitialized values, arrays cannot be preallocated with default values and then initialized in a second operation. To accommodate this common pattern, arrays can be dynamically resized.

let i = 0;
let array: [i32] = []
while i < count {
    array.push(i);
    i += 1;
}

This behavior is equivalent in Swift and is similar to a Vec<T> in Rust.

Every array reserves a specific amount of memory to hold its contents. When you add elements to an array and that array begins to exceed its reserved capacity, the array allocates a larger region of memory and copies its elements into the new storage.
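Rust's Vec<T>, which this RFC names as a comparable design, makes the described growth behavior easy to observe. The sketch below uses only the standard library; the fill function is a hypothetical helper for this example and says nothing about how Mun arrays will be implemented. The exact growth strategy of Vec is unspecified, so the code only counts how often the capacity changes.

```rust
/// Pushes `count` integers into an initially empty vector and counts how
/// often the reserved capacity changed, i.e. how often the vector had to
/// move to a larger region of memory.
fn fill(count: usize) -> (Vec<i32>, usize) {
    let mut array: Vec<i32> = Vec::new();
    let mut reallocations = 0;
    for i in 0..count {
        let before = array.capacity();
        array.push(i as i32);
        if array.capacity() != before {
            reallocations += 1;
        }
    }
    (array, reallocations)
}
```

Because the capacity grows in larger steps than the length, pushing many elements triggers far fewer reallocations than pushes. Reserving the required capacity up front, as the following TODO suggests, would avoid these reallocations entirely.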

TODO: In the future it would be nice if you can create an array with an initially allocated size. This will reduce the number of reallocations required when constructing a large array. To copy Swift:

let array = [i32]::with_capacity(some_initial_size);
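For comparison, Rust's Vec<T> already offers this through Vec::with_capacity. The sketch below (build_preallocated is a hypothetical helper name) shows that filling a preallocated vector up to the reserved size leaves its capacity unchanged, meaning no reallocation took place.

```rust
/// Builds a vector of `count` elements, reserving the space up front.
/// Returns the vector plus its capacity before and after filling it.
pub fn build_preallocated(count: usize) -> (Vec<i32>, usize, usize) {
    // Reserve room for at least `count` elements in a single allocation.
    let mut array: Vec<i32> = Vec::with_capacity(count);
    let capacity_before = array.capacity();
    for i in 0..count {
        array.push(i as i32);
    }
    let capacity_after = array.capacity();
    (array, capacity_before, capacity_after)
}
```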

TODO: In the future it would be nice if you could create an array by replicating a certain expression.

// constructs an array of `count` elements all initialized to 0.0
let array = [0.0; count] 

// constructs an array of `count` elements all initialized to `Foo {}`
let array = [Foo{}; count]

// constructs an array of `count` elements all initialized to `value`
let array = [value; count]
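The closest Rust equivalent for a dynamically sized array is the vec! macro, sketched below. The Foo struct and the replicate_examples helper are placeholders for this example only.

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct Foo {}

/// Creates two vectors of `count` elements by replicating an expression.
pub fn replicate_examples(count: usize) -> (Vec<f32>, Vec<Foo>) {
    // `count` copies of 0.0
    let zeros = vec![0.0; count];
    // `count` clones of `Foo {}`; this requires `Foo: Clone`
    let foos = vec![Foo {}; count];
    (zeros, foos)
}
```

Note that Rust evaluates the expression once and clones the result. Whether replicated elements of a reference type in Mun would share one instance or refer to distinct instances is left open by this RFC.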

Accessing Array Values

Arrays can be indexed with the index operator. Array indices are 0-based.

let an_array: [f32] = construct_array()
let first_element = an_array[0]
an_array[1] = 5.0

For now, accessing elements out of bounds results in undefined behavior. When we have support for exceptions/panics/error handling this should be resolved.
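As a point of reference for a future design, Rust makes out-of-bounds access well-defined: indexing panics, while slice::get returns an Option. A minimal sketch of the latter, where checked_get is a hypothetical helper and not a proposed Mun API:

```rust
/// Returns the element at `index`, or None when `index` is out of bounds.
pub fn checked_get(array: &[f32], index: usize) -> Option<f32> {
    array.get(index).copied()
}
```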

To get the number of elements in an array, use the len() member function. To add a new element, use the push(element: T) function.

Generics

Arrays are the first generic feature in Mun, so support has to be added in the backend (HIR and IR). The goal of this RFC is not to implement generics, so full parser support is not required. Element access, len(), and push(element: T) can be implemented by hardcoding them in HIR.
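One way to picture such hardcoding is a method lookup that special-cases array receivers. The sketch below is not Mun's actual HIR: the Ty and MethodSig types, the resolve_array_method function, and the usize return type of len() are all assumptions made for illustration.

```rust
/// Hypothetical, heavily simplified type representation.
#[derive(Clone, Debug, PartialEq)]
pub enum Ty {
    F32,
    I32,
    Usize,
    Unit,
    Array(Box<Ty>),
}

/// Signature of a built-in method, excluding the receiver.
#[derive(Debug, PartialEq)]
pub struct MethodSig {
    pub params: Vec<Ty>,
    pub ret: Ty,
}

/// Resolves the hardcoded array methods without general generics support.
/// The element type T is substituted directly from the receiver type.
pub fn resolve_array_method(receiver: &Ty, name: &str) -> Option<MethodSig> {
    let element = match receiver {
        Ty::Array(element) => (**element).clone(),
        _ => return None, // not an array: no built-in methods
    };
    match name {
        // Assumption: len() returns usize.
        "len" => Some(MethodSig { params: vec![], ret: Ty::Usize }),
        // push(element: T) with T replaced by the concrete element type.
        "push" => Some(MethodSig { params: vec![element], ret: Ty::Unit }),
        _ => None,
    }
}
```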

Features to be implemented

This is a high-level list of required features to be implemented to support arrays in Mun.

ABI support for array types

Similar to structs, arrays are complex types that reference another type. We can probably implement this by adding something along the lines of:

enum TypeInfoData {
    // ...
    Array(element: TypeInfo const*)
}
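Expressed in Rust, the same idea could look like the sketch below. The field names, the Primitive variant, and the element_type helper are hypothetical and only serve to show an array type referencing its element type.

```rust
/// Hypothetical type information record.
pub struct TypeInfo {
    pub name: &'static str,
    pub data: TypeInfoData,
}

pub enum TypeInfoData {
    // ... other variants (e.g. structs) elided
    Primitive,
    /// An array type refers to the type information of its element type.
    Array { element: &'static TypeInfo },
}

pub static F32_TYPE: TypeInfo = TypeInfo {
    name: "f32",
    data: TypeInfoData::Primitive,
};

pub static F32_ARRAY_TYPE: TypeInfo = TypeInfo {
    name: "[f32]",
    data: TypeInfoData::Array { element: &F32_TYPE },
};

/// Returns the element type if `ty` describes an array.
pub fn element_type(ty: &TypeInfo) -> Option<&'static TypeInfo> {
    match &ty.data {
        TypeInfoData::Array { element } => Some(*element),
        _ => None,
    }
}
```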

Syntax support for arrays

  • Parsing of array types: [T].
  • Parsing of array literals: [1,2,3]
  • Indexing expressions: a[x]

HIR support for arrays

  • Add a Ty for arrays.
  • Support for parsing array literals.
  • Support for array element type inference, e.g.:
    let a = [1,2,3,4]
    foo(a[12])
    // what is the type of a? [i32]? [i8]? [usize]?
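For reference, Rust answers this question by inferring the element type from how the elements are used and falling back to i32 otherwise. The sketch below demonstrates this with fixed-size Rust arrays; the function names are placeholders, and Mun is not required to adopt the same rule.

```rust
pub fn foo(value: u8) -> u8 {
    value
}

/// The element type is inferred as u8 because an element is passed to `foo`.
pub fn inferred_element_size() -> usize {
    let a = [1, 2, 3, 4];
    foo(a[2]);
    std::mem::size_of_val(&a[0])
}

/// Without any constraints, integer literals default to i32.
pub fn default_element_size() -> usize {
    let a = [1, 2, 3, 4];
    std::mem::size_of_val(&a[0])
}
```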
    

Code generation support for arrays

  • Construction operations. Probably requires a new intrinsic.
  • Indexing operations
  • Path-expression indexing operations

Garbage collection support

The garbage collector has to be able to construct arrays and traverse their elements.
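To illustrate what traversing array elements means, here is a minimal mark phase over a toy heap. It does not reflect the actual Mun garbage collector: the Object enum, the index-based ObjectId, and the mark function are assumptions made for this example.

```rust
use std::collections::HashSet;

/// Objects are identified by their index in the toy heap.
pub type ObjectId = usize;

pub enum Object {
    /// A struct instance holding references to other objects.
    Struct { fields: Vec<ObjectId> },
    /// An array whose elements are references; each one must be traced.
    ArrayOfRefs { elements: Vec<ObjectId> },
    /// An array of value types (e.g. f32); it contains nothing to trace.
    ArrayOfValues { byte_len: usize },
}

/// Returns the set of objects reachable from `roots`.
pub fn mark(heap: &[Object], roots: &[ObjectId]) -> HashSet<ObjectId> {
    let mut marked = HashSet::new();
    let mut worklist: Vec<ObjectId> = roots.to_vec();
    while let Some(id) = worklist.pop() {
        if !marked.insert(id) {
            continue; // already visited
        }
        match &heap[id] {
            Object::Struct { fields } => worklist.extend(fields),
            Object::ArrayOfRefs { elements } => worklist.extend(elements),
            Object::ArrayOfValues { .. } => {}
        }
    }
    marked
}
```

The distinction between the two array variants mirrors the statement that arrays can contain both reference and value types: only the former require the collector to visit each element.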

Runtime support

Similar to StructRef and RootedStruct we will need ArrayRef and RootedArray.

Optionally, it would be nice if you could create an empty array from Rust or C#. This will require implementing TypeInfo and TypeRef to be able to construct an array of a certain type.