Requirements

This document describes the requirements for the project.

Software Versions

  • GHC: 9.6.6
  • HLS: 2.9.0.1
  • Cabal: 3.10.3.0
  • Stack: 3.1.1

System Requirements

Linux Debian

Generic

The following distro packages are required: build-essential curl libffi-dev libffi6 libgmp-dev libgmp10 libncurses-dev libncurses5 libtinfo5

Version >= 11 && <= 12

The following distro packages are required: build-essential curl libffi-dev libffi7 libgmp-dev libgmp10 libncurses-dev libncurses5 libtinfo5

Version >= 12

The following distro packages are required: build-essential curl libffi-dev libffi8 libgmp-dev libgmp10 libncurses-dev libncurses5 libtinfo5

Linux Ubuntu

Generic

The following distro packages are required: build-essential curl libffi-dev libffi6 libgmp-dev libgmp10 libncurses-dev libncurses5 libtinfo5

Version >= 20.04 && < 20.10

The following distro packages are required: build-essential curl libffi-dev libffi7 libgmp-dev libgmp10 libncurses-dev libncurses5 libtinfo5

Version >= 20.10 && < 23

The following distro packages are required: build-essential curl libffi-dev libffi8ubuntu1 libgmp-dev libgmp10 libncurses-dev libncurses5 libtinfo5

Version >= 23

The following distro packages are required: build-essential curl libffi-dev libffi8ubuntu1 libgmp-dev libgmp10 libncurses-dev

Linux Fedora

The following distro packages are required: gcc gcc-c++ gmp gmp-devel make ncurses ncurses-compat-libs xz perl

Linux CentOS

Generic

The following distro packages are required: gcc gcc-c++ gmp gmp-devel make ncurses ncurses-compat-libs xz perl

Version >= 7 && < 8

The following distro packages are required: gcc gcc-c++ gmp gmp-devel make ncurses xz perl

Linux Alpine

The following distro packages are required: binutils-gold curl gcc g++ gmp-dev libc-dev libffi-dev make musl-dev ncurses-dev perl tar xz

Linux (generic)

You need the following packages: curl g++ gcc gmp make ncurses realpath xz-utils. Consult your distro documentation on the exact names of those packages.

FreeBSD

The following distro packages are required: curl gcc gmp gmake ncurses perl5 libffi libiconv

Windows

On Windows, msys2 should already have been set up during the installation, so most users should just proceed. If you are installing manually, make sure to have a working mingw64 toolchain and shell.

Comparative Study - Glados Project

This study evaluates the decisions made for the Glados Project, where we aim to design and implement a custom imperative programming language using Haskell. This language will be compiled into WebAssembly (Wasm) for efficient execution. The study explores the reasons for selecting Haskell, the imperative programming paradigm, and WebAssembly as the compilation target. We also compare tools, approaches, and technologies in parsing, runtime, and language design, emphasizing performance, scalability, and maintainability.


1. Programming Language for Development: Haskell

Relevance and Justification

Haskell was chosen as the development language due to its powerful capabilities for constructing parsers, interpreters, and compilers. Its functional nature and abstractions enable the development of highly modular and maintainable codebases.

  • Pros:

    • Strong support for parsing libraries like Parsec and Megaparsec, simplifying grammar and syntax design.
    • Immutability and pure functional programming paradigms facilitate reasoning about program correctness.
    • Rich type system and algebraic data types (ADTs) enable safe and clear representation of language constructs.
  • Cons:

    • Steep learning curve for developers unfamiliar with functional programming.
    • Runtime performance of Haskell programs might lag behind C++ for specific tasks; however, its suitability for compilation pipeline development outweighs this limitation.

Comparative Study:

  • Rust: While offering excellent performance and memory safety, Rust's low-level nature can make the development of parsers and compilers less expressive compared to Haskell.
  • Python: Easy to use for prototyping compilers but lacks the performance and static type-checking needed for complex language implementations.

2. Programming Paradigm: Imperative

Relevance and Justification

While Haskell is a functional language, we have chosen to design Glados as an imperative programming language, reflecting the needs of developers accustomed to control-flow structures and mutable state.

  • Pros:

    • Familiarity for most developers due to the prevalence of imperative languages like C, Java, and Python.
    • Straightforward translation into WebAssembly, which is inherently imperative.
    • Simplifies implementation of state-based computations and control-flow mechanisms like loops and conditionals.
  • Cons:

    • Imperative languages require careful management of mutable state, which may introduce bugs like race conditions or unexpected side effects.

Comparative Study:

  • Functional Languages: While offering declarative programming benefits, functional languages introduce a steeper learning curve for imperative-oriented developers.
  • Object-Oriented Languages: Overhead introduced by class hierarchies and object management is unnecessary for the lightweight, performance-oriented design of Glados.

3. Compilation Target: WebAssembly (Wasm)

Relevance and Justification

WebAssembly was selected as the compilation target due to its cross-platform compatibility, lightweight binary format, and growing adoption for high-performance web and server-side applications.

  • Pros:

    • Performance: WebAssembly offers near-native execution speed, making it ideal for computationally intensive programs.
    • Portability: Wasm can be executed in web browsers and standalone runtimes, enhancing deployment flexibility.
    • Security: Sandboxed execution model prevents unauthorized access to the host environment.
  • Cons:

    • Limited debugging tools compared to traditional binaries.
    • Learning curve for developers new to Wasm bytecode.

Comparative Study:

  • LLVM IR: Offers high-performance backend support and optimizations but lacks the portability and security guarantees of WebAssembly.
  • JavaScript: While highly portable, its performance and scalability fall short for resource-intensive applications.

4. Parsing Approach

Relevance and Justification

Efficient parsing is critical to language design. For Glados, we are using Megaparsec, a Haskell library for constructing parsers.

  • Pros:

    • Combines composability and expressiveness with strong error reporting.
    • Haskell’s monadic parsing simplifies handling complex grammar.
    • Modular design allows easy addition or modification of language constructs.
  • Cons:

    • Parsing performance might not match dedicated parser generators like ANTLR, but the trade-off is acceptable given Haskell’s strengths in parser design.

Comparative Study:

  • ANTLR: While high-performing and widely used, its Java-centric ecosystem can be less intuitive for Haskell developers.
  • Flex/Bison: Suitable for low-level parser generation but lacks the abstraction and readability of Haskell-based solutions.

5. Runtime Architecture

Relevance and Justification

The Glados runtime is designed to operate within a WebAssembly-compatible virtual machine for portability and performance.

  • Pros:

    • Lightweight: Wasm binaries are optimized for fast startup and low resource usage.
    • Cross-platform: Ensures compatibility across web, mobile, and server environments.
  • Cons:

    • Developing custom runtime features (e.g., garbage collection) requires additional effort.

Comparative Study:

  • Custom Virtual Machine: Provides more control but introduces significant development overhead compared to Wasm-based solutions.
  • Native Executables: Lack the portability and sandboxing advantages of Wasm.

6. Development Tools

Relevance and Justification

For development, we chose tools that complement Haskell's ecosystem and facilitate collaboration, testing, and documentation.

  • GitHub: Version control and CI/CD integration.
  • Stack: Simplifies dependency management and builds in Haskell.
  • Visual Studio Code: Primary editor, enhanced with Haskell plugins.

Comparative Study:

  • GitLab: Similar to GitHub but less widely adopted for open-source Haskell projects.
  • Cabal: Provides flexibility but lacks the simplicity and automation of Stack.

7. Documentation

Relevance and Justification

We prioritize interactive, markdown-based documentation to ensure clarity and maintainability. mdBook was chosen for its seamless integration with modern development workflows.

Comparative Study:

  • Sphinx: Offers advanced documentation features but has a steeper learning curve.
  • Doxygen: Excellent for API documentation but less effective for general-purpose guides.

8. Testing Strategy: Stack with Unit Tests

Relevance and Justification

Testing is a cornerstone of ensuring the correctness, stability, and maintainability of the Glados programming language. We have chosen Stack, a Haskell build tool, to build and run our test suites. Stack integrates seamlessly with Haskell's ecosystem and supports automated unit testing.

  • Pros:

    • Integration with Stack: Simplifies test suite setup and execution.
    • Automation: Enables continuous integration pipelines with GitHub Actions, ensuring code changes do not introduce regressions.
  • Cons:

    • Writing comprehensive test cases can be time-consuming, particularly for complex grammar and edge cases in the language design.

Testing Methodology

  1. Unit Tests:

    • Focus on testing individual components like the parser, AST transformations, and WebAssembly generation.
    • Example: Ensuring that specific syntactic constructs are correctly parsed into the corresponding AST nodes.
  2. End-to-End Tests:

    • Verify that complete programs written in Glados compile correctly to WebAssembly and execute as intended.
    • Example: A Glados program performing arithmetic should produce the correct output in a WebAssembly runtime.
  3. Property-Based Testing:

    • Tools like QuickCheck are used to test properties of language constructs.
    • Example: Testing that the order of arithmetic operations respects operator precedence.

Comparative Study

  • Stack:

    • Provides native integration with Haskell’s testing libraries.
    • Streamlines dependency management for test-specific modules.
    • Automatically integrates with CI/CD pipelines.
  • Cabal: Offers similar functionality but lacks the simplicity and automation of Stack, making it less suitable for rapid development and testing workflows.

  • External Testing Frameworks: While tools like Pytest or JUnit are effective for their respective ecosystems, they are not applicable in a Haskell-based project.


Conclusion

The Glados Project leverages Haskell for its robust parsing capabilities, develops an imperative language to cater to developer familiarity, and compiles to WebAssembly for portability and performance. This approach ensures that the language is both developer-friendly and technically optimized for modern environments. The choices, supported by comparative analysis, provide a strong foundation for a scalable and maintainable language implementation.

Keywords

Types

  • Signed Integer: i32, i64
  • Unsigned Integer: u8, u16, u32, u64 -- Not implemented yet
  • Floating point: f32, f64
  • Structure: { } -- Not implemented yet
  • Array: [ ] -- Not implemented yet
  • Enum: < > -- Not implemented yet

Syntax

  • Variable: let
  • Typing: :, ->
  • Conditions: if elif else
  • Loops: while
  • Qualifiers: mut, *
  • Functions: fn
  • Dereference: @
  • Reference: &
  • Type Definition: type

Examples

Variables

To declare a variable, use the let keyword followed by the variable name, a type, and a value. Typing variables is done using : and a combination of type, type qualifier, and pointer declaration.

let a: i32 = 42;

All variables are const by default. To make a variable mutable, declare it explicitly with the mut qualifier.

let a: i32 = 42; // const i32
let b: mut i32 = 42; // mutable i32

This also applies to pointers:

let ptr1: *i32 = 42; // pointer to const i32
let ptr2: *mut i32 = 42; // pointer to mutable i32

Pointers (Work in progress)

The syntax for pointers in Xenon is similar to C, with one important difference: pointers are dereferenced with @.

let a: i32 = 42; // const i32
let ptr: *i32 = &a; // pointer to a
let b: i32 = @ptr; // b = 42

Functions

Functions are declared with the fn keyword, followed by the function name, arguments, and return type, defined by ->.

fn add(a: i32, b: i32) -> i32
{
    return a + b;
}

Loops

Loops in Xenon are limited to while loops.

let i: mut i32 = 0;

while (i < 10) {
    i += 1;
}

Conditionals

Conditionals follow the same pattern as in other languages, using the if, elif, and else keywords.

let a: i32 = 42;

if (a == 42) {
    // do something
} elif (a == 43) {
    // do something else
} else {
    // do something else
}

Structs (Work in progress)

You can define structs using curly braces and the type keyword, which defines a new type such as a struct or an enum.

type Point = {
    x: i32,
    y: i32
};

let p: Point = {1, 2};

Declaring an unnamed struct is also possible.

let p: {x: i32, y: i32} = {1, 2};

Struct members can also be accessed by index, as you would with an array.

let p: {x: i32, y: i32} = {1, 2};
let x: i32 = p[0]; // x = 1
let y: i32 = p[1]; // y = 2

Arrays (Work in progress)

Arrays are declared with the [size: type] syntax.

let arr: [10: i32] = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];

Two-dimensional arrays (and more) are also possible.

let tab: [2: [2: i32]] = [[0, 1], [2, 3]];

Compilation Process Documentation

This document outlines the various stages involved in the compilation process of the Xenon language, from parsing the source code to generating the final WebAssembly (WASM) binary.

1. Lexer

The lexer is responsible for converting the raw source code into a stream of tokens. Tokens are the basic building blocks of the language, such as keywords, identifiers, literals, and operators.
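
As an illustration of this stage, here is a minimal tokenizer sketch. It is written in Python for brevity and is not the project's Haskell lexer; the token categories, the KEYWORDS set, and the tokenize function are assumptions based on the keywords and operators listed in this documentation.

```python
import re

# Hypothetical keyword set, taken from the Keywords section of this documentation.
KEYWORDS = {"let", "mut", "fn", "if", "elif", "else", "while", "return", "type"}

# One named group per token category; earlier alternatives win.
TOKEN_RE = re.compile(r"""
    (?P<comment>//[^\n]*)
  | (?P<float>\d+\.\d+)
  | (?P<int>\d+)
  | (?P<ident>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<op>->|==|!=|<=|>=|&&|\|\||<<|>>|\+=|[-+*/<>=!&|^~@:;,(){}\[\]])
  | (?P<ws>\s+)
""", re.VERBOSE)

def tokenize(source):
    """Turn source text into a list of (kind, text) pairs, skipping whitespace and comments."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind in ("ws", "comment"):
            continue
        if kind == "ident" and text in KEYWORDS:
            kind = "keyword"
        tokens.append((kind, text))
    return tokens

print(tokenize("let a: i32 = 42;"))
```

Keeping keywords as a post-processing step on identifiers avoids matching `let` inside a longer name such as `letter`.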

2. Parser

The parser takes the stream of tokens produced by the lexer and constructs an Abstract Syntax Tree (AST). The AST represents the hierarchical structure of the source code.
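
To show how a flat token stream becomes a tree, the following Python sketch parses arithmetic expressions with precedence climbing. It is only an illustration: the real parser is written in Haskell, and the PRECEDENCE table, the tuple-based AST shape, and the parse_expression function are all hypothetical.

```python
import re

# Hypothetical precedence table (higher binds tighter); not taken from the Xenon sources.
PRECEDENCE = {"||": 1, "&&": 2, "==": 3, "!=": 3, "<": 4, ">": 4,
              "+": 5, "-": 5, "*": 6, "/": 6}

def parse_expression(text):
    """Parse an expression into nested tuples such as ("binop", "+", left, right)."""
    tokens = re.findall(r"\d+|[A-Za-z_]\w*|\|\||&&|==|!=|[-+*/<>()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def primary():
        tok = advance()
        if tok == "(":
            node = binary(1)
            advance()  # consume the closing ")"
            return node
        return ("int", int(tok)) if tok.isdigit() else ("var", tok)

    def binary(min_prec):
        left = primary()
        while peek() in PRECEDENCE and PRECEDENCE[peek()] >= min_prec:
            op = advance()
            # Parsing the right side one level higher makes operators left-associative.
            right = binary(PRECEDENCE[op] + 1)
            left = ("binop", op, left, right)
        return left

    return binary(1)

print(parse_expression("1 + 2 * 3"))
```

The hierarchy of the resulting tuples encodes precedence: the multiplication ends up nested below the addition, so no parentheses need to be stored in the tree.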

3. Semantic Analyzer

The semantic analyzer checks the AST for semantic errors, such as type mismatches, undeclared variables, and other logical errors. It ensures that the program adheres to the language's rules.

4. Optimizer

The optimizer runs pre-computation steps to evaluate constant expressions, unwrap conditional branches whose outcome is known at compile time (discarding the branches that will never be executed), and perform loop unrolling where applicable.

More details about the optimization steps are given in the Optimizer section.

5. Fill WASM Module Data

The Fill WASM Module Data module is responsible for converting the IR into a WASM module structure. This involves collecting types, functions, globals, and other sections required for the WASM binary.

6. Write WASM

The Write WASM module encodes the WASM module structure into a binary format that can be executed by a WASM runtime. This involves encoding various sections such as types, functions, exports, and code.
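
As a rough illustration of what encoding a section involves, the Python sketch below builds the start of a WASM binary: the standard 8-byte header followed by a type section containing a single function type. The magic number, version, section id, and LEB128 integer encoding come from the WebAssembly binary format; the helper names (encode_u32_leb128, encode_section) are hypothetical and do not mirror the project's Haskell code.

```python
def encode_u32_leb128(value):
    """Encode a non-negative integer as unsigned LEB128 (7 bits per byte, low bits first)."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_section(section_id, payload):
    """A section is its id byte, the payload size in LEB128, then the payload."""
    return bytes([section_id]) + encode_u32_leb128(len(payload)) + payload

WASM_MAGIC = b"\x00asm"
WASM_VERSION = b"\x01\x00\x00\x00"
I32 = 0x7F          # value type code for i32
FUNC_TYPE = 0x60    # marker that introduces a function type

# Type section (id 1) holding one function type: (i32, i32) -> i32.
type_payload = encode_u32_leb128(1) + bytes([FUNC_TYPE, 2, I32, I32, 1, I32])
module_prefix = WASM_MAGIC + WASM_VERSION + encode_section(1, type_payload)

print(module_prefix.hex())
```

A complete module would go on to add function, export, and code sections in the same way, each wrapped by encode_section with its own id.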

Compilation Process (Mermaid Diagram)

graph TD
    style A fill:#ff99ff,stroke:#333333,stroke-width:4px
    style B fill:#99bbff,stroke:#333333,stroke-width:4px
    style C fill:#ffbb99,stroke:#333333,stroke-width:4px
    style D fill:#9f9,stroke:#333,stroke-width:4px
    style E fill:#ffb,stroke:#333,stroke-width:4px
    style F fill:#f99,stroke:#333,stroke-width:4px
    style G fill:#bbb,stroke:#333,stroke-width:4px
    style H fill:#c9f,stroke:#333,stroke-width:4px
    style S fill:#bbb,stroke:#333,stroke-width:4px

    subgraph Compilation Process
        S["<b style='color:black'>Source Code</b><br/><span style='color:black'>.xn source file</span>"] -->|Source File| A["<b style='color:black'>Lexer</b><br/><span style='color:black'>Converts source code to tokens</span>"]
        A -->|Token Stream| B["<b style='color:black'>Parser</b><br/><span style='color:black'>Generates Abstract Syntax Tree</span>"]
        B -->|Abstract Syntax Tree| H["<b style='color:black'>Interpreter</b><br/><span style='color:black'>Interprets Xenon code</span>"]
        B -->|Abstract Syntax Tree| C["<b style='color:black'>Semantic Analyzer</b><br/><span style='color:black'>Checks types and logic</span>"]
        C -->|Validated AST| D["<b style='color:black'>Optimizer</b><br/><span style='color:black'>Optimizes and simplifies code</span>"]
        D -->|Optimized AST| E["<b style='color:black'>Fill WASM Module Data</b><br/><span style='color:black'>Prepares WASM structure</span>"]
        E -->|WASM Module Structure| F["<b style='color:black'>Write WASM</b><br/><span style='color:black'>Encodes final binary</span>"]
        F -->|WASM Binary| G["<b style='color:black'>WASM VM</b><br/><span style='color:black'>Executes the binary</span>"]
    end

Optimizer

Using the Interpreter's tool set, the optimizer is able to modify the code structure by using multiple compiler optimization methods.

Constant Folding

By keeping track of defined constant variables, references to them are substituted with their actual value.

let x: i32 = 12;
return x;

Becomes

return 12;

This also works for binary operations that use literals and/or constants.

let x: i32 = 10 + 5;
return x * 2;

Becomes

return 30;
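
The transformation above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the optimizer's actual Haskell code: the tuple-based AST, the OPS table, and the fold_expr and fold_block functions are hypothetical, and the sketch ignores i32 overflow and division.

```python
import operator

# Hypothetical subset of foldable operators.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold_expr(expr, consts):
    """Replace known constants and evaluate operations whose operands are both literals."""
    kind = expr[0]
    if kind == "var" and expr[1] in consts:
        return ("int", consts[expr[1]])
    if kind == "binop":
        _, op, left, right = expr
        left, right = fold_expr(left, consts), fold_expr(right, consts)
        if left[0] == "int" and right[0] == "int" and op in OPS:
            return ("int", OPS[op](left[1], right[1]))
        return ("binop", op, left, right)
    return expr

def fold_block(stmts):
    """Fold a list of statements of the form ("let", name, mutable, expr) or ("return", expr)."""
    consts, out = {}, []
    for stmt in stmts:
        if stmt[0] == "let":
            _, name, mutable, value = stmt
            value = fold_expr(value, consts)
            if not mutable and value[0] == "int":
                consts[name] = value[1]  # remember the constant and drop the declaration
                continue
            out.append(("let", name, mutable, value))
        elif stmt[0] == "return":
            out.append(("return", fold_expr(stmt[1], consts)))
        else:
            out.append(stmt)
    return out

# let x: i32 = 10 + 5; return x * 2;
program = [
    ("let", "x", False, ("binop", "+", ("int", 10), ("int", 5))),
    ("return", ("binop", "*", ("var", "x"), ("int", 2))),
]
print(fold_block(program))
```

Only immutable declarations are recorded as constants, which matches the rule that variables declared with mut may change later and cannot be substituted blindly.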

Conditional Unwrapping

Using constant folding, conditions that have a constant value are pre-computed. The body of the branch that is taken (if or else) is then inserted in place of the conditional, and the other body is discarded.

let foo: mut f32 = 1.7;
let x: i32 = 1;

if (x) {
    foo = 12.2;
} else {
    foo = 4.6;
}

Becomes

let foo: mut f32 = 1.7;
foo = 12.2;
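
A minimal Python sketch of this step follows. It assumes the condition has already been folded to an integer literal and that any non-zero value counts as true; the ("if", cond, then_body, else_body) tuple shape and the unwrap_conditionals function are hypothetical and do not mirror the Haskell implementation.

```python
def unwrap_conditionals(stmts):
    """Replace each if whose condition is a literal by the body of the branch that is taken."""
    out = []
    for stmt in stmts:
        if stmt[0] == "if" and stmt[1][0] == "int":
            _, cond, then_body, else_body = stmt
            chosen = then_body if cond[1] != 0 else else_body
            # The chosen body may itself contain constant conditionals.
            out.extend(unwrap_conditionals(chosen))
        else:
            out.append(stmt)
    return out

# let foo: mut f32 = 1.7; if (1) { foo = 12.2; } else { foo = 4.6; }
program = [
    ("let", "foo", True, ("float", 1.7)),
    ("if", ("int", 1),
        [("assign", "foo", ("float", 12.2))],
        [("assign", "foo", ("float", 4.6))]),
]
print(unwrap_conditionals(program))
```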

Returning Branch Unpacking

When the main branch of an if statement is guaranteed to return, its else branch (if any) is unpacked into the enclosing body: the code that follows the if can only be reached when the main branch was not entered, so the else body can safely run there unconditionally.

if (foo == bar) {
    foo = 20;
    return 1;
} else {
    foo = 13;
}
bar = 20;
return 0;

Becomes

if (foo == bar) {
    foo = 20;
    return 1;
}
foo = 13;
bar = 20;
return 0;
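
The rewrite above could be sketched as follows in Python. The AST shape and both function names are hypothetical, and always_returns uses a deliberately simple check (the branch ends with a return) where a real optimizer would need a more thorough analysis.

```python
def always_returns(body):
    """Simplified check: the body is non-empty and its last statement is a return."""
    return bool(body) and body[-1][0] == "return"

def unpack_returning_branches(stmts):
    """Move the else body after the if when the main branch always returns."""
    out = []
    for stmt in stmts:
        if stmt[0] == "if" and always_returns(stmt[2]) and stmt[3]:
            _, cond, then_body, else_body = stmt
            out.append(("if", cond, then_body, []))
            out.extend(unpack_returning_branches(else_body))
        else:
            out.append(stmt)
    return out

program = [
    ("if", ("binop", "==", ("var", "foo"), ("var", "bar")),
        [("assign", "foo", ("int", 20)), ("return", ("int", 1))],
        [("assign", "foo", ("int", 13))]),
    ("assign", "bar", ("int", 20)),
    ("return", ("int", 0)),
]
print(unpack_returning_branches(program))
```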

Dead Code Elimination

Code that follows a return statement is discarded, as it will never be executed. This pairs well with returning branch unpacking, which can lead to multiple return statements on the same branch.

bar = 10;
return 0;
let foo: mut i32 = 4;
while (foo > 0) {
    bar = bar * 2;
}

Becomes

bar = 10;
return 0;
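
In Python, a sketch of this pass might look like the following. It assumes the same hypothetical tuple-based statements as a simple stand-in for the real AST and is not the project's Haskell implementation.

```python
def eliminate_dead_code(stmts):
    """Drop every statement that follows a return, recursing into nested bodies."""
    out = []
    for stmt in stmts:
        if stmt[0] == "if":
            _, cond, then_body, else_body = stmt
            stmt = ("if", cond, eliminate_dead_code(then_body), eliminate_dead_code(else_body))
        elif stmt[0] == "while":
            _, cond, body = stmt
            stmt = ("while", cond, eliminate_dead_code(body))
        out.append(stmt)
        if stmt[0] == "return":
            break  # nothing after a return can run
    return out

program = [
    ("assign", "bar", ("int", 10)),
    ("return", ("int", 0)),
    ("let", "foo", True, ("int", 4)),
    ("while", ("binop", ">", ("var", "foo"), ("int", 0)),
        [("assign", "bar", ("binop", "*", ("var", "bar"), ("int", 2)))]),
]
print(eliminate_dead_code(program))
```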

Function Inlining

Functions that do simple operations (including those that are optimized as such) are inlined when used as expressions. The function definition itself is not discarded.

fn compute(a: i32, b: i32) -> i32
{
    let c: i32 = 12;
    return a * b * c;
}

let x: mut i32 = compute(10, 2);

Becomes

fn compute(a: i32, b: i32) -> i32
{
    return a * b * 12;
}

let x: mut i32 = 240;

Loop Unrolling

Some while loops can have their content unrolled into the main branch when specific conditions are met:

  • The condition of the while loop is constant except for a single mutable variable (called the loop variable)
  • The body of the loop modifies the loop variable in a constant way (e.g. i = i + 1, or i = compute(i, 13) assuming compute is an inlinable function)
  • The loop is finite and runs at most 16 iterations

let i: mut i32 = 0;
let nb: mut i32 = 1;

while (i < 3) {
    nb = nb * nb + i;
    i = i + 1;
}

Becomes

let i: mut i32 = 0;
let nb: mut i32 = 1;

nb = nb * nb + i;
i = i + 1;
nb = nb * nb + i;
i = i + 1;
nb = nb * nb + i;
i = i + 1;
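
The decision of whether to unroll can be sketched in Python for the simplest case: a loop of the form while (i < bound) whose body ends by adding a constant step to i. The unroll function, its parameters, and the MAX_UNROLL name are hypothetical; only the limit of 16 iterations comes from the conditions listed above, and the real optimizer handles more general loop variable updates.

```python
MAX_UNROLL = 16  # iteration limit stated in the conditions above

def unroll(start, bound, step, body):
    """Unroll `while (i < bound) { body }`, where i starts at `start` and grows by `step`.

    Returns the repeated body as a list of statements, or None when the loop does not qualify.
    """
    if step <= 0:
        return None  # the loop variable never reaches the bound
    iterations = len(range(start, bound, step))
    if iterations > MAX_UNROLL:
        return None  # too many iterations to unroll
    return body * iterations

body = ["nb = nb * nb + i;", "i = i + 1;"]
print(unroll(0, 3, 1, body))
```

Returning None leaves the loop untouched, so the pass can be applied to every while loop without first checking whether it qualifies.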

Interpreter

The Xenon Interpreter allows you to run Xenon code without the need to compile it beforehand. Whether you should compile or interpret your code depends on your specific needs:

Compiled

  • Execution speed: Running compiled code is generally faster as there isn't the overhead that comes with interpreting code.
  • Program encapsulation: If your program is specifically made to execute a single task, it may be more appropriate to compile your code.

Interpreted

  • Quick Usage: Using an interpreter is quite useful when you need to execute some small code with varying input on the fly.
  • Control: Interpreters usually offer additional features that allow you to interact with your program in ways you cannot with compiled code. The Xenon Interpreter offers those features in the form of Commands.

Usage

If not already done, you'll need to clone the Xenon repository and build the interpreter (xin).

Then, simply run the program:

./xin

To run a single file without starting the command line interface, use the -e flag. Note that this execution mode will print the output of functions onto the terminal.

./xin -e main.xn

You can also pre-load Xenon files while launching it:

./xin foo.xn bar.xn

You'll then be put into a command line interface where you can write code. It supports standard CLI input features such as line editing and command history. When writing a single statement, you do not have to terminate it with a semicolon.

Xenon Interpreter x.x.x
Type '/help' for a list of commands.
>> [YOUR COMMAND]

Note: The interpreter does not support redirecting multiple lines of code into it when running it (e.g. ./xin < foo.xn). If you need to declare a complex multi-line construct, such as a function that cannot be written on a single line, you can write it into a file and load it into the interpreter.

Commands

When in the CLI, you can enter special commands to interact with the live execution. All commands must be prefixed with / to be recognized.

  • env: Look up data from the virtual environment (variables, functions, and custom types). Used without arguments, it dumps the entire environment; with arguments, it only shows the entries that match them.
  • load: Load Xenon files into the virtual environment. Each given argument must be a path to a valid Xenon file. If no arguments are given, this command is a no-op. Loading a Xenon file into the interpreter imports all of its variables, functions, and custom types. (Note: this also executes all standalone statements, such as function calls and variable reassignments.)
  • exit: Exit the interpreter.
  • help: Print a list of commands.

Interpreter process (Mermaid Diagram)

graph TD
    style A fill:#ff99ff,stroke:#333333,stroke-width:4px
    style B fill:#99bbff,stroke:#333333,stroke-width:4px
    style C fill:#ffbb99,stroke:#333333,stroke-width:4px
    style D fill:#9f9,stroke:#333,stroke-width:4px
    style E fill:#ffb,stroke:#333,stroke-width:4px
    style F fill:#f99,stroke:#333,stroke-width:4px
    style G fill:#bbb,stroke:#333,stroke-width:4px
    style H fill:#c9f,stroke:#333,stroke-width:4px
    style S fill:#bbb,stroke:#333,stroke-width:4px

    subgraph Interpreter Process
        S["<b style='color:black'>Source Code</b><br/><span style='color:black'>.xn source file</span>"] -->|Source File| A["<b style='color:black'>Lexer</b><br/><span style='color:black'>Converts source code to tokens</span>"]
        A -->|Token Stream| B["<b style='color:black'>Parser</b><br/><span style='color:black'>Generates Abstract Syntax Tree</span>"]
        B -->|Abstract Syntax Tree| H["<b style='color:black'>Interpreter</b><br/><span style='color:black'>Interprets Xenon code</span>"]
        B -->|Abstract Syntax Tree| C["<b style='color:black'>Semantic Analyzer</b><br/><span style='color:black'>Checks types and logic</span>"]
        C -->|Validated AST| D["<b style='color:black'>Optimizer</b><br/><span style='color:black'>Optimizes and simplifies code</span>"]
        D -->|Optimized AST| E["<b style='color:black'>Fill WASM Module Data</b><br/><span style='color:black'>Prepares WASM structure</span>"]
        E -->|WASM Module Structure| F["<b style='color:black'>Write WASM</b><br/><span style='color:black'>Encodes final binary</span>"]
        F -->|WASM Binary| G["<b style='color:black'>WASM VM</b><br/><span style='color:black'>Executes the binary</span>"]
    end

Virtual Machine (xrun)

This document describes a lightweight virtual machine designed to execute WebAssembly (WASM) files. Inspired by tools like Wasmer, this VM enables developers to run WASM binaries, invoke specific functions, and pass arguments directly from the command line.

Key Features

  1. Command-Line Interface: The VM accepts WASM files as input and allows invoking specific functions with optional arguments.
  2. Function Invocation: Supports invoking a default main function or any user-specified function within the WASM file.
  3. Input Flexibility: Accepts command-line arguments to be passed to the WASM module.
  4. Bytecode Parsing: Reads and interprets the binary content of the provided WASM file.
  5. Compact and Modular Design: Inspired by Wasmer's philosophy, focusing on simplicity and extensibility.

Argument Handling

  • Help (-h, --help): Displays usage instructions.
  • Invoke Specific Function: Command format: xrun <file.wasm> --invoke function_name [args...].
  • Default Function: If no function is specified, the VM invokes main in the WASM file.
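
The rules above can be sketched as a small argument parser. This Python version is only an illustration of the described command format; the parse_args function and the returned dictionary are hypothetical, and treating an empty command line as a request for help is an assumption.

```python
USAGE = "usage: xrun <file.wasm> [--invoke function_name [args...]]"

def parse_args(argv):
    """Parse arguments of the form: <file.wasm> [--invoke function_name [args...]]."""
    if not argv or argv[0] in ("-h", "--help"):
        return {"help": True}  # assumption: no arguments also shows the usage text
    wasm_file, rest = argv[0], argv[1:]
    function, args = "main", []  # default function when --invoke is absent
    if rest:
        if rest[0] != "--invoke" or len(rest) < 2:
            raise ValueError(USAGE)
        function, args = rest[1], rest[2:]
    return {"help": False, "file": wasm_file, "function": function, "args": args}

print(parse_args(["my_program.wasm", "--invoke", "myFunction", "arg1", "arg2"]))
```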

Similarity to Wasmer

  • Shared Goals:

    • Execute WASM files efficiently.
    • Allow invocation of exported functions with arguments.
    • Provide a flexible, CLI-based interface for developers.
  • Key Differences:

    • This VM is more lightweight and tailored to custom workflows.
    • It emphasizes modularity and simplicity, with a minimal dependency footprint.

Usage Examples

  1. Invoke Default main:

    xrun my_program.wasm
    
  2. Invoke Specific Function:

    xrun my_program.wasm --invoke myFunction arg1 arg2
    
  3. Show Help:

    xrun -h
    

Conclusion

This virtual machine bridges the gap between WebAssembly execution and custom command-line workflows. By focusing on modularity and performance, it serves as a lightweight alternative to Wasmer for executing and interacting with WASM modules.

Unit Tester (xtest)

This script provides a mechanism to test a single file containing both source code and inline test definitions. It compiles the source file to WebAssembly, runs the specified tests, and validates the outputs.


Features

  • Parses inline test definitions from the source file.
  • Compiles the source file to a WebAssembly (.wasm) file.
  • Executes the WebAssembly functions and validates their outputs against the expected results.
  • Provides clear color-coded feedback for test results.
  • Handles edge cases like floating-point comparisons (e.g., appending .0 to outputs).

Usage

Run the script with the following command:

./xtest <file>

Parameters

  • <file>: The source file to be tested. This file must exist and contain the expected inline test cases.

Example:

./xtest examples/my_test_file.xn

File Requirements

The source file must:

  1. Be in a format compatible with the compiler (xcc) and the interpreter (xin).
  2. Include test definitions in the following format:

    // TEST "function_name" IN "arguments" OUT "expected_output"

Example Test Cases

// TEST "add" IN "2 3" OUT "5"
// TEST "subtract" IN "10 4" OUT "6"

In this example:

  • The add function is expected to return 5 when invoked with 2 and 3.
  • The subtract function is expected to return 6 when invoked with 10 and 4.
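
For illustration, test definitions in this format could be extracted with a regular expression, as in the Python sketch below. The TEST_RE pattern and the parse_tests function are hypothetical and are not taken from the xtest script; splitting the arguments on whitespace is an assumption based on the examples above.

```python
import re

# Matches: // TEST "function_name" IN "arguments" OUT "expected_output"
TEST_RE = re.compile(r'^\s*//\s*TEST\s+"([^"]*)"\s+IN\s+"([^"]*)"\s+OUT\s+"([^"]*)"\s*$')

def parse_tests(source):
    """Return a list of (function_name, argument_list, expected_output) tuples."""
    tests = []
    for line in source.splitlines():
        match = TEST_RE.match(line)
        if match:
            name, args, expected = match.groups()
            tests.append((name, args.split(), expected))
    return tests

source = '''// TEST "add" IN "2 3" OUT "5"
// TEST "subtract" IN "10 4" OUT "6"
fn add(a: i32, b: i32) -> i32
{
    return a + b;
}
'''
print(parse_tests(source))
```

Lines that do not match the pattern, including ordinary comments and code, are simply ignored, so the tests can live anywhere in the source file.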

Script Workflow

  1. Argument Validation:

    • Ensures that a single file argument is provided.
    • Checks if the specified file exists.
  2. Test Case Parsing:

    • Scans the file for lines matching the // TEST format.
    • Extracts the function name, arguments, and expected output.
  3. Compilation:

    • Compiles the source file to WebAssembly using xcc.
    • Handles compilation errors gracefully.
  4. Test Execution:

    • Invokes each function defined in the test cases.
    • Compares the output with the expected result.
  5. Results:

    • Displays color-coded results for each test:
      • Green ✔ for passed tests.
      • Red ✘ for failed tests.
    • Cleans up the generated .wasm file.
  6. Exit Status:

    • Exits with 0 if all tests pass.
    • Exits with 1 if any test fails.

Example Output

Compiled examples/my_test_file.xn to examples/my_test_file.wasm.
Running test cases:
add(2 3) = 5 ✔
subtract(10 4) = 7 (expected 6) ✘
Some tests failed.

Integration of Tests in a File

  1. Write your source code as usual.

  2. Add inline test definitions for functions you wish to validate. Use the exact format:

    // TEST "function_name" IN "arg1 arg2 ..." OUT "expected_result"
    
  3. Ensure that your functions are accessible and invokable as described in the test cases.

  4. Save the file and run the script to validate it.


Dependencies

  • xcc: Compiler to convert the source code to WebAssembly.
  • xrun: Virtual machine to execute the generated WebAssembly files.
  • xin: Interpreter to run the source code directly.
  • wasmer: Tool to execute the generated WebAssembly files.

Ensure these tools are installed and accessible in your PATH and/or present in the same directory as where the script is run.


Troubleshooting

  • File Not Found: Verify the file path and ensure the file exists.
  • Compilation Errors: Check the source code for syntax or compatibility issues.
  • Unexpected Results: Ensure test definitions match the actual function behavior and output.

Grammar

<program> ::= <statement>*

<statement> ::= <variable_declaration>
              | <function_declaration>
              | <while_loop>
              | <if_statement>
              | <type_declaration>
              | <return_statement>

<variable_declaration> ::= "let" <identifier> ":" <type> "=" <expression> ";"

<function_declaration> ::= "fn" <identifier> "(" <parameters> ")" "->" <type> "{" <body> "}"

<parameters> ::= <parameter> ("," <parameter>)*
<parameter> ::= <identifier> ":" <type>

<while_loop> ::= "while" "(" <expression> ")" "{" <body> "}"

<if_statement> ::= "if" "(" <expression> ")" "{" <body> "}" <elif_statements> <else_statement>?
<elif_statements> ::= ("elif" "(" <expression> ")" "{" <body> "}")*
<else_statement> ::= "else" "{" <body> "}"

<type_declaration> ::= "type" <identifier> "=" <type_definition> ";"
<type_definition> ::= "{" <field> ("," <field>)* "}"
                    | "[" <integer> ":" <type> "]"
                    | "<" <identifier> ("," <identifier>)* ">"

<return_statement> ::= "return" <expression> ";"

<body> ::= <statement>*

<expression> ::= <literal>
               | <identifier>
               | <binary_operation>
               | <unary_operation>
               | <function_call>
               | "(" <expression> ")"

<binary_operator> ::= "+" | "-" | "*" | "/" 
                   | "&&" | "||" 
                   | "==" | "!=" | "<" | ">" | "<=" | ">=" 
                   | "&" | "|" | "^" | "<<" | ">>"

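<unary_operator> ::= "!" | "-" | "~"

<binary_operation> ::= <expression> <binary_operator> <expression>
<unary_operation> ::= <unary_operator> <expression>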

<function_call> ::= <identifier> "(" <arguments> ")"
<arguments> ::= <expression> ("," <expression>)*

<literal> ::= <integer_literal>
            | <float_literal>
            | <string_literal>
            | <boolean_literal>

<integer_literal> ::= [0-9]+
<float_literal> ::= [0-9]+"."[0-9]+
<string_literal> ::= "\"" [^"]* "\""
<boolean_literal> ::= "true" | "false"

<identifier> ::= [a-zA-Z_][a-zA-Z0-9_]*
<type> ::= "i32" | "i64" | "f32" | "f64" | <identifier>

Inspiration from Rust

Our language draws significant inspiration from Rust, a systems programming language that emphasizes safety, concurrency, and performance. Rust was designed to address common security issues found in other systems programming languages like C and C++. By adopting Rust's principles and features, we aim to create a language that ensures memory safety, provides strong concurrency guarantees, and maintains robust type safety.

Security Review of Rust

The following sections review Rust from a security perspective.

Memory Safety

One of Rust's primary goals is to ensure memory safety without sacrificing performance. Rust achieves this through its ownership system, which enforces strict rules on how memory is accessed and managed.

Ownership and Borrowing

Rust's ownership model ensures that each piece of data has a single owner, and the compiler enforces rules that prevent data races and dangling pointers. Borrowing allows references to data without transferring ownership, but the compiler ensures that these references do not outlive the data they point to.
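The ownership and borrowing rules described above can be illustrated with a minimal sketch. The function names (`consume`, `sum`, `push_twice`, `ownership_demo`) are made up for this example and are not part of Xenon.

```rust
/// Takes ownership of the vector; the caller cannot use it afterwards.
fn consume(values: Vec<i32>) -> usize {
    values.len()
}

/// Borrows the data immutably; the caller keeps ownership.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// Borrows the vector mutably; no other reference may exist during the call.
fn push_twice(values: &mut Vec<i32>, x: i32) {
    values.push(x);
    values.push(x);
}

fn ownership_demo() -> (i32, usize) {
    let mut data = vec![1, 2, 3];
    push_twice(&mut data, 4); // the mutable borrow ends when the call returns
    let total = sum(&data);   // shared borrow; `data` is still owned here
    let len = consume(data);  // ownership moves into `consume`
    // sum(&data);            // would not compile: `data` was moved above
    (total, len)
}
```

Calling `ownership_demo()` returns `(14, 5)`. Uncommenting the last `sum(&data)` call makes the compiler reject the program, which is how use-after-move bugs are caught before the code runs.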

Lifetimes

Rust uses lifetimes to track how long references to data are valid. This prevents use-after-free errors and ensures that references do not outlive the data they point to.
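As a small illustration of lifetimes, the sketch below uses an explicit lifetime parameter to tie the returned reference to both inputs. The names `longest` and `lifetime_demo` are chosen for this example only.

```rust
/// The returned reference is valid only while both inputs are valid.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

fn lifetime_demo() -> String {
    let outer = String::from("compiler");
    let result;
    {
        let inner = String::from("wasm");
        // Copy the text into an owned String before `inner` is dropped.
        result = longest(&outer, &inner).to_string();
        // Keeping the `&str` itself and using it after this block would be
        // rejected by the borrow checker (E0597), because `inner` ends here.
    }
    result
}
```

Because the signature says the result may borrow from either argument, the compiler treats the result as valid only while both arguments are alive, which is what rules out a use-after-free here.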

Concurrency

Rust provides strong guarantees for safe concurrency, which helps prevent common concurrency issues such as data races.

Fearless Concurrency

Rust's ownership and type system rule out data races in safe code at compile time. The borrow rules allow either one mutable reference or any number of shared references to a value at a time, so one thread cannot mutate data while another reads it unless access goes through a synchronization type such as a mutex.

Send and Sync Traits

Rust uses the Send and Sync traits to ensure that types are safe to transfer between threads (Send) and safe to reference from multiple threads (Sync). The compiler enforces these traits, preventing unsafe concurrency patterns.
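The following sketch shows these guarantees in practice using the standard library's `Arc` and `Mutex`. The function name `parallel_count` is an assumption made for this example.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawns `workers` threads that each add `increments` to a shared counter.
fn parallel_count(workers: usize, increments: usize) -> usize {
    // Arc<Mutex<usize>> is Send + Sync, so it may be shared across threads.
    // Replacing Arc with Rc would fail to compile, because Rc is not Send.
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    // The lock gives exclusive access for the update.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}
```

With 4 workers and 1000 increments each, the result is always 4000: the mutex serializes the updates, and the compiler refuses any version of this code that shares the counter without such protection.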

Type Safety

Rust's strong static type system helps catch many errors at compile time, reducing the likelihood of runtime errors.

Pattern Matching

Rust's pattern matching allows for exhaustive handling of different cases, reducing the chances of unhandled cases and logic errors.
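A minimal sketch of exhaustive matching follows. The `Shape` enum and the `area` function are hypothetical and exist only to illustrate the point.

```rust
enum Shape {
    Circle { radius: f64 },
    Rectangle { width: f64, height: f64 },
    Square(f64),
}

fn area(shape: &Shape) -> f64 {
    // Every variant must be handled. Deleting any arm below is a compile
    // error (E0004: non-exhaustive patterns), not a runtime surprise.
    match shape {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rectangle { width, height } => width * height,
        Shape::Square(side) => side * side,
    }
}
```

Adding a new variant to `Shape` later forces every `match` over it to be updated, so a newly introduced case cannot be silently ignored.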

Option and Result Types

Rust encourages the use of Option and Result types for handling nullable values and error handling, respectively. This reduces the likelihood of null pointer dereferences and unhandled errors.
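As an illustration, the sketch below returns an `Option` where a value may be absent and a `Result` where an operation may fail. The function names are chosen for this example.

```rust
use std::num::ParseIntError;

/// Returns the first even number, or None if there is none.
fn first_even(values: &[i32]) -> Option<i32> {
    values.iter().copied().find(|v| v % 2 == 0)
}

/// Parses an integer and doubles it; a parse failure is returned to the caller.
fn parse_and_double(text: &str) -> Result<i32, ParseIntError> {
    // `?` returns early with the error instead of ignoring it.
    let n: i32 = text.trim().parse()?;
    Ok(n * 2)
}
```

In both cases the caller has to unwrap, match, or propagate the returned value before it can use the inner `i32`, so the "no value" and "error" cases cannot be forgotten by accident.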

Undefined Behavior

Safe Rust aims to eliminate undefined behavior, a common source of security vulnerabilities in languages such as C and C++. Undefined behavior remains possible only inside unsafe code.

No Null Pointers

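Safe Rust has no null references, which eliminates a common source of bugs and security vulnerabilities. A value that may be absent is expressed as an Option, which must be checked before use. Raw pointers can be null, but dereferencing them requires unsafe.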

No Buffer Overflows

Rust's safe abstractions prevent buffer overflows: slice and array indexing is bounds-checked, and an out-of-range index causes a panic instead of reading or writing adjacent memory.
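A minimal sketch of bounds-checked access, using the standard slice method `get`. The helper name `read_checked` is an assumption for this example.

```rust
/// Returns the byte at `index`, or None when the index is out of range.
fn read_checked(buffer: &[u8], index: usize) -> Option<u8> {
    // `get` performs the bounds check and reports failure as None.
    // Plain indexing (`buffer[index]`) performs the same check but panics
    // on failure; neither form can read past the end of the buffer.
    buffer.get(index).copied()
}
```

For a three-byte buffer, `read_checked(&buf, 1)` yields the second byte and `read_checked(&buf, 3)` yields `None`.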

No Uninitialized Memory

Rust ensures that all variables are initialized before use, preventing undefined behavior due to uninitialized memory.

Unsafe Code

Rust allows the use of unsafe blocks for operations that cannot be checked by the compiler, such as interfacing with low-level hardware or other languages. However, the use of unsafe is explicitly marked, and developers are encouraged to minimize its use.

Explicit Unsafe

The unsafe keyword makes it clear where potentially dangerous operations are occurring, allowing developers to audit and review these sections of code more carefully.

Encapsulation

Unsafe code can be encapsulated in safe abstractions, allowing the rest of the codebase to remain safe while still performing necessary low-level operations.
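The sketch below shows one way such encapsulation can look: a safe function validates its input and only then performs an unchecked operation. The function `ends` is hypothetical. `get_unchecked` is the standard slice method that skips the bounds check.

```rust
/// Returns the first and last element, or None for an empty slice.
fn ends(values: &[i32]) -> Option<(i32, i32)> {
    if values.is_empty() {
        return None;
    }
    // SAFETY: the slice is non-empty, so indices 0 and len - 1 are in bounds.
    unsafe {
        Some((
            *values.get_unchecked(0),
            *values.get_unchecked(values.len() - 1),
        ))
    }
}
```

Callers see an ordinary safe function. The only code that needs a careful audit is the small `unsafe` block together with the check that justifies it.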

Conclusion

Rust's design prioritizes safety and security, making it a strong choice for systems programming where security is critical. Its ownership model, strong type system, and concurrency guarantees help prevent many common security vulnerabilities found in other languages. While unsafe code is allowed, it is explicitly marked and can be encapsulated to minimize its impact on the overall safety of the codebase. Overall, Rust provides a robust foundation for writing secure and reliable software.

Security Features Implemented

In the Xenon compiler, several security features have been implemented to ensure the safety and reliability of the compiled code. These features are inspired by the security principles found in languages like Rust, which prioritize memory safety and type safety. Below is a description of the key security features implemented in the Xenon compiler:

Memory Safety

Memory safety is a critical aspect of the Xenon compiler, ensuring that programs do not encounter common memory-related vulnerabilities such as buffer overflows, use-after-free errors, and null pointer dereferences.

No Memory Usage

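Xenon does not give programs any way to allocate or address memory directly. Without direct memory access there is nothing to overflow, free twice, or dereference incorrectly, which eliminates a common source of bugs and security vulnerabilities.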

Type Safety

Type safety ensures that operations on data are performed in a manner consistent with the data's type, preventing type-related errors and vulnerabilities.

Strong Static Typing

Xenon uses a strong static type system to catch type errors at compile time. This reduces the likelihood of runtime errors and ensures that operations on data are type-safe.

Const and Mutable Variables

Xenon distinguishes between const and mutable variables, allowing developers to control the mutability of data. This helps prevent unintended modifications to data and ensures that data is accessed safely.

Semantic Analysis

Semantic analysis checks the Abstract Syntax Tree (AST) for errors that the parser cannot detect, such as type mismatches, uninitialized variables, and missing return statements, ensuring that the program adheres to the language's rules.

Type Checking

The semantic analyzer checks for type mismatches, ensuring that operations on data are type-safe.
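To make the idea concrete, here is a minimal sketch of a type check over a tiny expression tree. It is not the Xenon implementation: the `Type` and `Expr` enums, the `type_of` function, and the specific typing rules are assumptions made for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Type { I32, F64, Bool }

#[derive(Debug)]
enum Expr {
    Int(i32),
    Float(f64),
    Bool(bool),
    Add(Box<Expr>, Box<Expr>),
    Less(Box<Expr>, Box<Expr>),
    Not(Box<Expr>),
}

/// Computes the type of an expression or reports the first mismatch found.
fn type_of(expr: &Expr) -> Result<Type, String> {
    match expr {
        Expr::Int(_) => Ok(Type::I32),
        Expr::Float(_) => Ok(Type::F64),
        Expr::Bool(_) => Ok(Type::Bool),
        Expr::Add(l, r) => {
            let (lt, rt) = (type_of(l)?, type_of(r)?);
            // Assumed rule: both operands share one numeric type.
            if lt == rt && lt != Type::Bool {
                Ok(lt)
            } else {
                Err(format!("cannot add {:?} and {:?}", lt, rt))
            }
        }
        Expr::Less(l, r) => {
            let (lt, rt) = (type_of(l)?, type_of(r)?);
            if lt == rt && lt != Type::Bool {
                Ok(Type::Bool)
            } else {
                Err(format!("cannot compare {:?} and {:?}", lt, rt))
            }
        }
        Expr::Not(inner) => match type_of(inner)? {
            Type::Bool => Ok(Type::Bool),
            other => Err(format!("cannot apply `!` to {:?}", other)),
        },
    }
}
```

Under these assumed rules, `1 + 2` has type `I32`, `1.0 < 2.0` has type `Bool`, and `1 + true` is rejected with an error before any code is generated.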

Variable Initialization

The semantic analyzer ensures that all variables are initialized before use, preventing undefined behavior due to uninitialized memory.
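A simplified sketch of such a check is shown below. It only handles straight-line code (no branches or loops), and the `Stmt` enum and `check_initialized` function are hypothetical names rather than Xenon's actual data structures.

```rust
use std::collections::HashSet;

enum Stmt {
    Declare(String), // declaration without an initial value
    Assign(String),  // assignment to a declared variable
    Use(String),     // any read of the variable
}

/// Walks the statements in order and rejects reads of unassigned variables.
fn check_initialized(stmts: &[Stmt]) -> Result<(), String> {
    let mut declared = HashSet::new();
    let mut initialized = HashSet::new();
    for stmt in stmts {
        match stmt {
            Stmt::Declare(name) => {
                declared.insert(name.as_str());
            }
            Stmt::Assign(name) => {
                if !declared.contains(name.as_str()) {
                    return Err(format!("assignment to undeclared variable `{}`", name));
                }
                initialized.insert(name.as_str());
            }
            Stmt::Use(name) => {
                if !initialized.contains(name.as_str()) {
                    return Err(format!("variable `{}` used before initialization", name));
                }
            }
        }
    }
    Ok(())
}
```

A real analyzer would also have to merge the initialization state across `if`/`elif`/`else` branches. This sketch leaves that out to keep the core idea visible.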

Return Statement Validation

The semantic analyzer checks that all function bodies have valid return statements, ensuring that functions return the expected types.
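One part of this validation, checking that every path through a function body reaches a return, could be sketched as follows. The `Statement` enum and `always_returns` function are assumptions for illustration, and the sketch checks only the presence of a return, not its type.

```rust
enum Statement {
    Return,
    Expr, // any statement that does not return
    If { then_body: Vec<Statement>, else_body: Option<Vec<Statement>> },
}

/// True when every control-flow path through `body` hits a return.
fn always_returns(body: &[Statement]) -> bool {
    body.iter().any(|stmt| match stmt {
        Statement::Return => true,
        Statement::Expr => false,
        // An `if` guarantees a return only when it has an else branch
        // and both branches themselves always return.
        Statement::If { then_body, else_body } => match else_body {
            Some(else_body) => always_returns(then_body) && always_returns(else_body),
            None => false,
        },
    })
}
```

An `if` without an `else` never counts as returning, because execution may fall through when the condition is false.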

Optimizer and Code Generation

The optimizer and code generation stages ensure that the compiled code is safe and efficient.

Optimizer Validation

The optimizer validates the code to ensure that it adheres to the language's rules and does not contain any undefined behavior.

Safe Code Generation

The code generation stage ensures that the generated WebAssembly (WASM) code is safe and free from common vulnerabilities such as buffer overflows and use-after-free errors.

Error Handling

Robust error handling ensures that the compiler can gracefully handle errors and provide meaningful feedback to the user.

Result Type

The Result type is used throughout the compiler to handle errors in a type-safe manner. This ensures that errors are explicitly handled and not ignored.

Descriptive Error Messages

The compiler provides descriptive error messages, helping users understand and fix issues in their code.
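The two points above, explicit error values and descriptive messages, can be combined as in the following sketch. It is illustrative only: the `CompileError` type, its variants, the message format, and the toy `lex_integers` function are assumptions, not the Xenon compiler's real API.

```rust
use std::fmt;

#[derive(Debug, PartialEq)]
enum CompileError {
    UnexpectedChar { ch: char, line: usize, column: usize },
    EmptyInput,
}

impl fmt::Display for CompileError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CompileError::UnexpectedChar { ch, line, column } => {
                write!(f, "{}:{}: unexpected character '{}'", line, column, ch)
            }
            CompileError::EmptyInput => write!(f, "input contains no tokens"),
        }
    }
}

/// Toy lexer: splits whitespace-separated integer literals into tokens.
fn lex_integers(source: &str) -> Result<Vec<String>, CompileError> {
    if source.trim().is_empty() {
        return Err(CompileError::EmptyInput);
    }
    let mut tokens = Vec::new();
    for (line_idx, line) in source.lines().enumerate() {
        let mut current = String::new();
        for (col_idx, ch) in line.chars().enumerate() {
            if ch.is_ascii_digit() {
                current.push(ch);
            } else if ch.is_whitespace() {
                if !current.is_empty() {
                    tokens.push(std::mem::take(&mut current));
                }
            } else {
                // Report a 1-based position so the user can find the problem.
                return Err(CompileError::UnexpectedChar {
                    ch,
                    line: line_idx + 1,
                    column: col_idx + 1,
                });
            }
        }
        if !current.is_empty() {
            tokens.push(current);
        }
    }
    Ok(tokens)
}
```

Because the function returns a `Result`, a caller has to handle the failure case or propagate it, and the `Display` implementation turns each error into a message that carries the source position.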

Conclusion

The Xenon compiler incorporates several security features inspired by Rust to ensure the safety and reliability of the compiled code. These features include memory safety, type safety, semantic analysis, safe code generation, and robust error handling. By prioritizing these security principles, the Xenon compiler aims to provide a secure and reliable foundation for writing and executing Xenon programs.