Vondel
A simple computer architecture, ISA, Assembler and Interpreter build with Rust.
Usage
First off all, we need to build our binaries using:
cargo build -r --bins
Then for testing the demonstration programs we need to assemble it and run with our uarch.
We can accomplish this by just using the following commands:
Running Factorial
cargo run -r --bin assembler -- -i programs/factorial_hardware.asm && \
cargo run -r --bin uarch -- --rom a.rom --ram a.ram
Running Power
cargo run -r --bin assembler -- -i programs/power_hardware.asm && \
cargo run -r --bin uarch -- --rom a.rom --ram a.ram
Running CSW
cargo run -r --bin assembler -- -i programs/csw.asm && \
cargo run -r --bin uarch -- --rom a.rom --ram a.ram
Running Div
cargo run -r --bin assembler -- -i programs/div_hardware.asm && \
cargo run -r --bin uarch -- --rom a.rom --ram a.ram
File Structure
The Vondel file structure follows a convetional Rust program, with a src folder that handles all of our source code.
More info on the chapter File Structure
Interpreter
The Vondel Interpreter was built on top of the Monkey Language using the
book Writing an Interpreter with GO but on Rust.
More info on the chapter Interpreter
Assembler
The Vondel Assembler was based on RISC V assemblers with our custom taste for Mnemonics and operations that our Uarch can accept
More info on the chapter Assembler
Microarchitecture
The basic idea behind Vondel's design is to follow some tips from the book Structured Computer Organization - Andrew S. Tanenbaum while implementing the ability to perform operations even if the main clock is in a non-edge level in order to reduce clock cycles.
More info on the chapter Microarchitecture
File Structure of Vondel
The Vondel file structure follows a convetional Rust program, with a src folder that handles all of our source code.
vondel
├── Cargo.lock
├── Cargo.toml
├── docs
│ ├── book.toml
│ ├── diagrams
│ └── src
├── LICENSE
├── programs
│ ├── csw.asm
│ ├── div.asm
│ ├── div_hardware.asm
│ ├── factorial.asm
│ ├── factorial_hardware.asm
│ ├── power.asm
│ └── power_hardware.asm
├── README.md
└── src
├── assembler
├── assembler.rs
├── bin
├── inter
├── inter.rs
├── lib.rs
├── main.rs
└── uarch
Comparision with the Proposed File Structure
The main difference between our project and the proposed one it's the amount of complexity.
For example, our assembler has it's own folder, where each step has it's own file, instead of a single file to handle lexing, parsing and enconding.
Our memory, alu, clock and computer are on a single module called uarch, this module contains all the business logic for handling instructions.
The disk module was embedded on both uarch and assembler, that are on the bin folder.
Docs
The place where all documentation of this program lives
Here we split chapters into folders where mdbook, the software responsible for this site, can generate it for us automatically
src
├── assembler
│ ├── comparision.md
│ ├── evaluator.md
│ ├── lexer.md
│ ├── parser.md
│ ├── README.md
│ └── specs.md
├── chapter_1.md
├── files.md
├── interpreter
│ ├── environment.md
│ ├── evaluator
│ ├── language.md
│ ├── lexer.md
│ ├── object.md
│ ├── parser.md
│ └── README.md
├── README.md
├── SUMMARY.md
└── uarch
├── implementations.md
├── README.md
└── uinstruction.md
Programs
The place where lives all of our assembly code for this lecture.
The ones that has _hardware implements functions that are built on hardware on the uarch, so they are faster than iteractive versions.
.
├── csw.asm
├── div.asm
├── div_hardware.asm
├── factorial.asm
├── factorial_hardware.asm
├── power.asm
└── power_hardware.asm
Src
The source code for generating our interpreter, assembler and uarch
Each module has a unique folder that has all files for creating a library crate.
The bin folder is responsible for creating our binaries that will be executed on someone's PC
.
├── assembler
│ ├── cli.rs
│ ├── evaluator.rs
│ ├── lexer.rs
│ ├── parser.rs
│ ├── sections.rs
│ └── tokens.rs
├── assembler.rs
├── bin
│ ├── assembler.rs
│ ├── interpreter.rs
│ └── uarch.rs
├── inter
│ ├── ast
│ ├── ast.rs
│ ├── cli.rs
│ ├── doc
│ ├── environment.rs
│ ├── evaluator
│ ├── evaluator.rs
│ ├── lexer.rs
│ ├── object.rs
│ ├── repl.rs
│ └── tokens.rs
├── inter.rs
├── lib.rs
├── main.rs
└── uarch
├── alu.rs
├── cli.rs
├── mem.rs
└── mod.rs
Interpreter
The Vondel Interpreter was built on top of the Monkey Language using the
book Writing an Interpreter with GO but on Rust 😂
Monkey is a simple programming language designed for educational purposes, focusing on simplicity and ease of understanding. This interpreter allows you to run Monkey code and see the results in real-time.
Language Specs
The Vondel Language is simpler than a common Monkey Lang implementation because we need to implement a custom evaluator using our own microarchitecture. And doing Strings, Structs and Vectors on this UArch will not be a good experience for us.
Basically, what we implemented in the Vondel Language, which includes integers, booleans, functions, and lambda functions, you have a foundation for building a variety of programs.
More info on the sup-chapter Language Specifications.
Lexer
Our lexer don't allow UTF-8 enconding for processing text and just rely on common ASCII.
It's a simple lexer that use a lot of String allocations instead of string slices for performance optimizations.
It also uses one of Rust most powerfull features enums for processing non Identifier tokens so that we can
rely on Rust strong type system.
More info on the sup-chapter Lexer
Parser
The parser for the Vondel language utilizes a powerful parsing technique called Pratt Parsing. It allows for efficient and flexible parsing of expressions with varying levels of precedence so that we can parse complex expressions without too much hassle.
More info on the sup-chapter Parser
Object
Because Vondel is a dynamic language we use Objects for an intermediate representation and evaluations
More info on the sup-chapter Object
Environment
It allows for the storage and retrieval of variables and their corresponding values in a flexible and efficient manner. The module supports nested environments, enabling scoping and variable lookup in outer environments.
More info on the sup-chapter Environment
Evaluator
For the evaluation module we just define a EvaluationError enum for handling most evaluation and runtime errors.
A Evaluator trait for creating custom evaluators and evaluate_buffer fn that handles receives anything that implements Evaluator, a Program and return a prints the result of this evaluation
We also define two differente evaluators for our users:
- Rust Evaluator - that uses Rust for doing all evaluations
- Custom Evaluator - that uses our custom microarchitecture for evaluting these ASTs
More info on the sup-chapter Evaluator
Vondel Language Specification
This specification outlines the syntax and behavior of the Vondel language, a simple programming language that supports arithmetic operations, functions, and lambda functions.
Basics
- Vondel is a dynamically-typed language, meaning that variable types are inferred at runtime.
- All statements in Vondel end with a semicolon (
;). - Vondel is whitespace insensitive, meaning that spaces and line breaks are ignored except where necessary for separating tokens.
Data Types
Vondel Suported Data Types:
- Integers: Represented by whole numbers without fractional or decimal parts.
- Booleans: Represented by the keywords
trueandfalse. - Null: Represented by the keyword
null.
Variables
Variables in Vondel are dynamically typed and declared using the let keyword.
let x = 10;
let name = "Alice";
let flag = true;
Arithmetic Operations
Vondel supports the following arithmetic operations:
- Addition:
+ - Subtraction:
- - Multiplication:
* - Division:
/ - Modulo:
%
let x = 10 + 5;
let y = x * 2 - 3;
let z = (x + y) / 2;
Functions
Vondel allows the definition and invocation of functions.
Function Definition
Functions are defined using the fn keyword followed by the parameter list in parentheses and the function body in curly braces.
let add = fn(x, y) {
return x + y;
};
Function Invocation
To invoke a function, use the function name followed by arguments in parentheses.
let result = add(3, 4); // result = 7
Return Statement
The return statement is used to exit a function and optionally return a value.
let multiply = fn(x, y) {
return x * y;
};
Lambda Functions
Vondel supports lambda functions, also known as anonymous functions or function literals.
Lambda Function Definition
Lambda functions are defined using the fn keyword followed by the parameter list in parentheses and the function body in curly braces. They can be assigned to variables.
let multiply = fn(x, y) {
return x * y;
};
let square = fn(x) {
return multiply(x, x);
};
Lambda Function Invocation
To invoke a lambda function, use the function name followed by arguments in parentheses.
let result = square(5); // result = 25
Higher-Order Functions
Vondel supports higher-order functions, which are functions that can accept other functions as arguments or return functions as results.
let applyFunc = fn(func, x, y) {
return func(x, y);
};
let result = applyFunc(multiply, 3, 4); // result = 12
Examples
let add = fn(x, y) {
return x + y;
};
let result = add(3, 4); // result = 7
let multiply = fn(x, y) {
return x * y;
};
let square = fn(x) {
return multiply(x, x);
};
let area = square(5); // area = 25
let fibonacci = fn(x) {
if (x == 0) {
0
} else {
if (x == 1) {
1
} else {
fibonacci(x - 1) + fibonacci(x - 2);
}
}
};
fibonacci(9) // 34
Limitations
The Vondel language described in this specification is a simplified version of a programming language and has several limitations, including but not limited to:
- Limited data types and operations.
- Lack of control flow statements such as loops and conditionals.
- Limited built-in functions and standard library.
- The language is intentionally designed to be minimalistic and educational, focusing on core concepts and syntax.
References
The Monkey programming language: monkey lang site
Lexer Design Decision
The lexer provided is responsible for tokenizing input strings into a sequence of tokens. This README explains the design decisions made in implementing the lexer.
Overall Design
The lexer is implemented as a struct called Lexer with the following fields:
ch: Represents the current character being processed.input: Represents the input string as a vector of bytes.read_position: Tracks the current position while reading the input.position: Tracks the position of the current character being processed.
The Lexer struct also includes several methods to perform different tasks:
new: Initializes a newLexerinstance with the given input string.skip_whitespace: Skips whitespace characters until a non-whitespace character is encountered.peek_char: Returns the next character in the input without consuming it.read_char: Reads the next character from the input and updates the lexer's internal state.next_token: Returns the next token from the input.parse_operator: Parses an operator token based on the current and next characters in the input.tokenizer: Tokenizes the current character based on its type.read_name: Reads an identifier token from the input.read_number: Reads an integer number token from the input.get_deez_toks: Returns a vector of all tokens found in the input.
The design follows a simple and straightforward approach to tokenizing the input string.
Tokenization
The tokenizer method is responsible for tokenizing the current character based on its type. It uses pattern matching to identify different types of characters and returns the corresponding token type.
- Whitespace characters are skipped using the
skip_whitespacemethod. - Single-character tokens like commas, semicolons, parentheses, and squirlies are identified directly.
- Operator tokens are parsed by the
parse_operatormethod, which takes into account both single and double-character operators. - Identifiers are recognized by checking if the current character is alphabetical or an underscore. The
read_namemethod is called to read the complete identifier. - Integer numbers are recognized by checking if the current character is a digit. The
read_numbermethod is called to read the complete number. - Any other character is considered illegal and is represented by an
Illegaltoken type.
Usage
To use the lexer, follow these steps:
-
Import the
Lexerstruct and theTokenTypeenum from the appropriate module or file. -
Create a new instance of the
Lexerby calling thenewmethod and passing the input string as a parameter.#![allow(unused)] fn main() { let input = String::from("let x = 5 + 3;"); let mut lexer = Lexer::new(input); } -
Retrieve tokens from the lexer by calling the next_token method. It returns the next token in the input string.
#![allow(unused)] fn main() { let token = lexer.next_token(); } -
Continue calling next_token to retrieve subsequent tokens until you reach the end of the input. The end of input is indicated by the TokenType::Eof token.
#![allow(unused)] fn main() { loop { let token = lexer.next_token(); if token == TokenType::Eof { break; } // Process the token as needed } } -
Or you can use
get_deez_toksmethod to return all tokens at once as aVec<TokenType>#![allow(unused)] fn main() { let all_toks = lexer.get_deez_toks(); } -
Use the retrieved tokens for further processing or analysis based on your specific requirements.
Please note that this example assumes you have the appropriate code structure and dependencies in place for the lexer to work correctly. Adjust the code snippets based on your specific implementation.
Testing
The lexer includes a set of unit tests implemented using the #[cfg(test)] attribute and the mod tests block.
These tests ensure the correctness of the lexer implementation by verifying the tokenization of different input strings.
The tests cover various scenarios, including skipping whitespace, reading identifiers and numbers, handling operators, and tokenizing complex input strings with multiple tokens.
Conclusion
The lexer design follows a modular and intuitive approach to tokenize input strings. It provides flexibility to extend the lexer's functionality if required. The unit tests provide confidence in the correctness of the lexer implementation.
Parser Design Decisions
The Vondel parser is designed to parse the Vondel programming language, which is built on top of the Monkey language. The parser follows the Pratt parsing approach to handle expressions and statements in the program.
Pratt Parser Explanation
Pratt parsing, also known as TDOP, is a parsing technique that uses a Recursive Descent Approach to parse expressions based on their precedence and associativity.
The main idea behind Pratt parsing is to associate each token or operator with two parsing functions: a prefix parsing function and an infix parsing function. The prefix parsing function is responsible for parsing expressions that start with the token, while the infix parsing function handles expressions that involve the token as an operator.
Let's illustrate how that bitch works with this simple expression (-5 + 3) * 2
-
Tokenization: The expression is tokenized as follows:
(,-,5,+,3,),*,2 -
Precedence and Associativity: Let's assign the following precedence levels:
*(highest),+(lower),-(lower) -
Prefix Parsing: The prefix parsing function for the
(token creates a grouping expression. -
Prefix Parsing for
-: The parser encounters the-token and checks if it's a prefix operator. Since it is, the parser creates a unary expression for it. -
Infix Parsing: The parser encounters the
)token and skips it since it doesn't have an associated infix parsing function. -
Right Binding Power: The right binding power for
*is the same as its precedence level. -
Right-hand Side Expression: The parser invokes the infix parsing function for
*and creates a binary expression with the*operator and the right-hand side expression. -
Recursive Descent: The parser examines the next token (
2) and creates an integer expression for it. -
Parsing Complete: The parsing process is complete, resulting in a fully parsed expression.
Here's the visualization of the Pratt parsing process for the expression (-5 + 3) * 2:
*
/ \
+ 2
/ \
- 3
|
5
Error Handling
The parser uses the anyhow crate for error handling. It defines a custom ParserError enum to represent different parsing errors that can occur.
The ErrorWithCtx struct is used to associate the error message with the corresponding context in the source code.
Precedence
The parser defines the Precedence enum to represent the precedence levels of different operators in the language.
The precedence_of function assigns a precedence level to each token type used in Pratt parsing.
Statement Types
The parser defines the StatementType enum to represent different types of statements in the program.
This includes Let statements for variable declarations, Return statements, Expression statements, and Block statements for code blocks.
Expression
The Expression module provides the definition and functionality for handling different types of expressions in the code.
PrefixOp
The PrefixOp enum represents the prefix operators available in the language, including Bang and Minus.
InfixOp
The InfixOp enum represents the infix operators available in the language, such as Plus, Minus, Asterisk, Slash, Equal, NotEqual, LessThan, and GreaterThan. It provides methods to retrieve the operator as a string representation.
Expression
The Expression enum represents various types of expressions, including Identifier, Integer, Prefix, Infix, Boolean, If, FunctionLiteral, and Call. It also provides methods to get the type of the expression as a string and constructors for creating specific types of expressions.
Constructors
new_ident(ident: &String): Creates a newIdentifierexpression with the given identifier.new_integer(int: &String) -> Result<Self>: Creates a newIntegerexpression with the parsed integer value from the input string. Returns an error if the parsing fails.new_prefix(op: &TokenType, exp: Expression) -> Result<Self>: Creates a newPrefixexpression with the specified operator and right-hand side expression. Returns an error if the operator is not allowed.new_infix(left: Expression, op: &TokenType, right: Expression) -> Result<Self>: Creates a newInfixexpression with the specified left-hand side, operator, and right-hand side expressions. Returns an error if the operator is not allowed.new_boolean(t: &TokenType) -> Result<Self>: Creates a newBooleanexpression with the specified boolean value. Returns an error if the boolean token is not allowed.new_if(cond: Expression, cons: StatementType, alt: Option<StatementType>): Creates a newIfexpression with the condition, consequence, and optional alternative statements.new_function(params: Vec<Expression>, block: StatementType): Creates a newFunctionLiteralexpression with the specified parameters and block statement.new_call(func: Expression, args: Vec<Expression>): Creates a newCallexpression with the specified function expression and argument expressions.
Program Structure
The parser uses the Program struct to store the parsed statements and errors. The get_deez_program method is responsible for parsing the entire program and populating the Program struct.
Lexer Integration
The parser expects a slice of TokenType tokens as input, which are generated by a lexer. This allow us to do some kind of decoupling between lexing and parsing.
Usage
To use the parser follow these steps:
-
Create a vector of tokens by hand or using a lexer.
#![allow(unused)] fn main() { let input = "let tubias = 123;" let toks = Lexer::new(input).get_deez_tokens(); } -
Create a new
Parserand pass this tokens as parameters.#![allow(unused)] fn main() { let mut parser = Parser::new(&toks); let program = parser.get_deez_program(); } -
Do whatever you want with these statements or errors.
#![allow(unused)] fn main() { // Access the parsed statements and handle errors if needed for statement in program.statements { // Process each statement... } // Handle parsing errors, if any for error in program.errors { // Handle each error... } }
Testing
The approach for testing this parser was decoupling it from the lexer. So that future internal changes of the current lexer can't interfere with our parse,
unless we change the contract between then that is TokenType
Conclusion
In conclusion, the Vondel parser, based on the Pratt parsing approach, provides efficient and accurate parsing of expressions and statements in the Vondel programming language. With its error handling capabilities, well-defined precedence levels, and support for various statement types, the parser offers a solid foundation for the development and expansion of the Vondel language, ensuring reliable and effective parsing of code structures.
Object
The Object module in the Vondel language is designed to handle various types of objects used in the language, including integers, booleans, return values, null values, and functions. It provides a Object enum with associated data for each object type. The module defines methods to inspect objects and retrieve their string representation. The Object enum implements the Debug, PartialEq, and Clone traits for debugging, equality comparison, and cloning, respectively. Additionally, it provides a type_as_string() method to retrieve the type of the object as a string for error handling.
Object Enum
The Object enum represents the different types of objects in the Vondel language. It has the following variants:
Integer(i64): Represents an integer value.Boolean(bool): Represents a boolean value.ReturnValue(Box<Object>): Represents a return value from a function.Null: Represents a null value.Function { params: Vec<Expression>, body: StatementType, env: Rc<RefCell<Environment>> }: Represents a function object, storing the function's parameters, body, and environment.
Object Methods
The Object module provides the following methods:
inspect() -> String: Returns a string representation of the object.type_as_string() -> &'static str: Returns the type of the object as a string for error handling.
Display Formatting
The Object enum implements the fmt::Display trait to provide a formatted string representation for display purposes. The Display implementation formats the object based on its variant, converting it to a string representation.
#![allow(unused)] fn main() { impl fmt::Display for Object { fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { // Format the object based on its variant // ... } } }
Environment
The Environment module provides functionality for storing variables and their corresponding values.
Features
- Variable Storage: The
Environmentstruct uses aHashMapto store variables and their values. - Nested Environments: The module supports nested environments, allowing for scoping and variable lookup in outer environments.
- Argument Matching: The
set_arguments_to_envfunction sets arguments to the environment, matching them with corresponding parameters. - Error Handling: The module uses the
anyhowcrate for error handling, providing detailed error messages for various scenarios.
Usage
To create a new environment, use the Environment::new() function. To create an environment with an outer environment, use Environment::new_with_outer(outer).
To set arguments to the environment, use the set_arguments_to_env(args, params) function, providing a vector of arguments and parameters.
To retrieve the value of a variable, use the get(name) function, providing the variable name. It automatically looks up the variable in outer environments if it's not found in the current environment.
To set the value of a variable, use the set(name, value) function, providing the variable name and value.
Examples
use std::{cell::RefCell, collections::HashMap, rc::Rc}; use anyhow::{bail, Result}; use crate::inter::{ast::Expression, evaluator::EvaluationError}; use super::object::Object; /// Represents an environment that stores variables and their corresponding values. #[derive(Debug, PartialEq, Clone)] pub struct Environment { store: HashMap<String, Object>, outer: Option<Rc<RefCell<Environment>>>, } impl Environment { // ... Implementation details ... } // Example usage fn main() { let mut environment = Environment::new(); environment.set( &Expression::Identifier("x".to_string()), Object::Integer(42), ).unwrap(); let result = environment.get("x").unwrap(); println!("Value of 'x': {}", result); }
Evaluator
The evaluator module provides the definition and functionality for evaluating code using different evaluators.
This document outlines the design decisions and considerations made in the implementation of the evaluator module.
EvaluationError Enum
The EvaluationError enum represents possible errors that can occur during evaluation.
It includes various error variants such as MissingIntegerToInvert, MismatchedTypesInfix, UnallowedBooleanComparisonOperator, IdentifierNotFound, ExpectedIdentifier, and WrongNumberOfArguments.
These errors cover different scenarios where evaluation can fail and provide descriptive error messages.
Evaluator Trait
The Evaluator trait defines the behavior of an evaluator. It includes a single method, eval, which takes an AST node and an environment and returns the resulting Object or an error if evaluation fails.
Implementations of this trait provide the specific evaluation logic for different languages or custom evaluators.
Evaluator Implementations
The evaluator module includes two sub-modules, custom and rust , which contain different implementations of the Evaluator trait.
These implementations can be used to evaluate code written in different languages or using custom evaluation logic.
evaluate_buffer Function
The evaluate_buffer function is a utility function that simplifies the evaluation process. It takes a boxed trait object implementing the Evaluator trait and an input string to evaluate.
The function performs the lexing, parsing, and evaluation steps, and prints the resulting object or error message. It can be used as a convenient way to evaluate code using different evaluators.
Usage
To use our evalutor follow theses steps:
- Create a
Custom Customevaluator or just use our custom and rust builtin evaluators#![allow(unused)] fn main() { use my_evaluator::{Evaluator, evaluate_buffer}; // Define a custom evaluator struct MyEvaluator; impl Evaluator for MyEvaluator { // Implement the `eval` method // ... } } - Create a new Evaluator instance
#![allow(unused)] fn main() { let my_eval = Box::new(MyEvaluator::new()); let rust_eval = Box::new(RustEvaluator::new()); let custom_eval = Box::new(CustomEvaluator::new(); } - Use
evaluater_bufferfn or create your own logic for parsingProgram#![allow(unused)] fn main() { let input = String::from("777 * 777"); println!("my_eval res: {}", evaluate_buffer(my_eval, input)?; println!("rust_eval res: {}", evaluate_buffer(rust_eval, input)?; println!("custom_eval res: {}", evaluate_buffer(custom_eval, input)?; }
Testing
The Evaluator module provides a set of tests to ensure the correctness of the evaluators and their functionalities.
The tests cover various scenarios, including error handling, function calls, and decoupling from the parser and lexer.
Conclusion
In conclusion, the Evaluator module plays a crucial role in interpreting and evaluating Rust code. It provides an essential component for executing programs, performing expressions, handling errors, and enabling function calls. By decoupling from the parser and lexer, the evaluators ensure flexibility and compatibility with various parsing and lexing implementations. The comprehensive test suite included in the module verifies the correctness of the evaluators, covering error handling, function calls, and specific behaviors of the RustEvaluator and CustomEvaluator implementations. With the Evaluator module, developers can confidently evaluate Rust code and rely on its robustness and accuracy in producing the expected results.
Rust Evaluator
The RustEvaluator is an implementation of the Evaluator trait specifically designed to evaluate Rust code. This document outlines the design decisions and considerations made in the implementation of the RustEvaluator struct.
RustEvaluator Struct
The RustEvaluator struct represents an evaluator for Rust code. It provides methods for evaluating different types of expressions and statements in Rust.
Usage
To use the RustEvaluator follow these steps:
-
Import the necessary modules and structs in your Rust file:
#![allow(unused)] fn main() { use std::{cell::RefCell, rc::Rc}; use anyhow::{Result}; use crate::inter::{ ast::expression::{InfixOp, PrefixOp}, environment::Environment, object::Object, // Import other necessary modules as needed }; use super::ast::*; // Import the RustEvaluator module use super::RustEvaluator; } -
Create an instance of the RustEvaluator:
#![allow(unused)] fn main() { let evaluator = RustEvaluator::new(); } -
Set up the environment and input for evaluation:
#![allow(unused)] fn main() { let mut environment = Environment::new(); let input = "1 + 2"; } -
Pass the program to the eval() method of the RustEvaluator instance to evaluate it:
#![allow(unused)] fn main() { let result = evaluator.eval(&program, &mut environment); } -
Handle the evaluation result using
Result:#![allow(unused)] fn main() { match result { Ok(object) => { // Handle the evaluated object // ... } Err(error) => { // Handle the evaluation error // ... } } } -
Repeat steps 4 and 5 as needed to evaluate multiple programs or statements.
By following these steps, you can integrate the RustEvaluator into your Rust project and evaluate Rust code.
Feel free to modify the code snippets or the instructions to fit your specific use case.
Methods
new()
The new() method creates a new instance of the RustEvaluator.
map_boolean()
The map_boolean() method maps a boolean value to the corresponding Object value used in the evaluator.
eval_bang_operator()
The eval_bang_operator() method evaluates the logical NOT (!) operator on the right operand.
eval_minus_operator()
The eval_minus_operator() method evaluates the arithmetic minus (-) operator on the right operand.
eval_prefix_expression()
The eval_prefix_expression() method evaluates a prefix expression (e.g., -, !) by applying the corresponding operator on the right operand.
eval_infix_expression()
The eval_infix_expression() method evaluates an infix expression (e.g., +, -, *, /, <, >, ==, !=) by performing the operation on the left and right operands.
is_truthy()
The is_truthy() method checks if an object is truthy, meaning it evaluates to true.
eval_if_expression()
The eval_if_expression() method evaluates an if expression by evaluating the condition and executing the appropriate branch based on the result.
eval_let_statement()
The eval_let_statement() method evaluates a let statement by evaluating the value expression and storing it in the environment with the specified name.
eval_expressions()
The eval_expressions() method evaluates a list of expressions within the given environment.
eval_fn_call()
The eval_fn_call() method evaluates a function call expression by evaluating the function expression and the argument expressions, and executing the function with the provided arguments.
eval_expression()
The eval_expression() method evaluates an expression by dispatching to the appropriate evaluation method based on the expression type.
eval_block_statements()
The eval_block_statements() method evaluates a block of statements by iterating over each statement and evaluating them in order.
eval_statements()
The eval_statements() method evaluates a statement by dispatching to the appropriate evaluation method based on the statement type.
Implementation of Evaluator Trait
The RustEvaluator struct implements the Evaluator trait, which defines the behavior of an evaluator. The eval() method is implemented to evaluate a program by iterating over each statement and evaluating them in order.
Conclusion
The RustEvaluator provides a comprehensive implementation of an evaluator for Rust code. It handles different types of expressions and statements and provides accurate evaluation results.The design decisions made in the implementation ensure flexibility and extensibility for future enhancements or modifications.
Assembler
The Vondel Assembler was based on RISC V assemblers with our custom taste for Mnemonics and operations that our Uarch can accept
This assembler produce 2 files a .ram and a .rom that can be used for our microarchitecture needs where:
- .ram: It's the ram dump produced by
.datasections - .rom: It's the firmware made by
.textsections
Usage
To use the Vondel Assembler you can open your terminal and type
cargo run -r assembler -- -i input.asm -o output
This commmand will generate lex, parse and evaluate the input.asm and therefore produce output.ram and output.rom for further usage
If the output argument isn't provided the default output file will begin with
a
Language Specifications
Our language specs are similar to RISC V, but with some tweaks
If you came from there you'll be in home
.data
tubias: .word 420
test: .byte 69
res: .word 42069
.text
main:
addi s0 <- test
lui s1 <- 254
add a0,a1,a2,a3 <- t1, t2
beq t1, a0, done
done:
read s2 <- tubias
write res <- t1
halt
More info on the sup-chapter Language Specs
Lexer
Just a common lexer that we'll ignore comments and parse all necessary tokens for further steps
More info on the sup-chapter Lexer
Parser
Here we define our syntax that is described on Language Specs, if any step here fail we should the user the error and context
More info on the sup-chapter Parser
Evaluator
We generate our ROM and RAM for our microarchitecture, it handles the logic of our firmware and all variables
that are beeing used
More info on the sup-chapter Evaluator
Comparision
The comparision between our language and assembler with the ones that our teacher proposed at class
The key difference between then are:
- Our assembler is easier to extend and define a fine grained syntax than the proposed one.
- We have a error handling with more context for the user
- We build the microinstruction itself while our teacher use opcode to navegate on the firmware own microinstructions
- We support immediates that can speed the process a lot
More info on the sup-chapter Comparision
Registers
The assembler supports 25 registers. These registers can be used for general-purpose operations or not.
| Register | Suggested Usage | A Bus | B Bus | C Bus |
|---|---|---|---|---|
| Mar | Memory Address | Output | ||
| Mdr | Memory Data | Input | Input | Output |
| Mbr | Memory Buffer | Input | ||
| Mbru | Memory Buffer | Input | ||
| Mbr2 | Memory Buffer | Input | ||
| Mbr2u | Memory Buffer | Input | ||
| Pc | Program Counter | Input | Output | |
| Cpp | Call Pointer | Input | Input | Output |
| Lv | Local Variable | Input | Input | Output |
| Ra | Return Address | Input | Input | Output |
| T0 | Temporary | Input | Input | Output |
| T1 | Temporary | Input | Input | Output |
| T2 | Temporary | Input | Input | Output |
| T3 | Temporary | Input | Input | Output |
| S0 | Saved | Input | Input | Output |
| S1 | Saved | Input | Input | Output |
| S2 | Saved | Input | Input | Output |
| S3 | Saved | Input | Input | Output |
| S4 | Saved | Input | Input | Output |
| S5 | Saved | Input | Input | Output |
| S6 | Saved | Input | Input | Output |
| A0 | Function Arg | Input | Input | Output |
| A1 | Function Arg | Input | Input | Output |
| A2 | Function Arg | Input | Input | Output |
| A3 | Function Arg | Input | Input | Output |
Sections
The assembler supports two sections: .text and .data. The sections are defined as follows:
.text Section
The .text section contains the executable instructions.
.data Section
The .data section is used for declaring and initializing data.
Instruction Format
Most vondel instructions in follow the format: opcode dest_regs <- source1, source2. Here's a breakdown of the components:
opcode: Specifies the operation to be performed.dest_regs: The registers where the result will be stored.source1andsource2: Registers or immediate values used as operands.
Supported Instructions
The assembler supports the following instructions:
- Arithmetic and Logic
- Shift
- Memory
- Branch
- Immediate
- Other
Add
Adds two registers and save the result into 1 up to 20 registers that are allowed in the C_bus
add t0, t1, s0, a2, ra <- a0, a1
Sub
Subtract x - y and store on registers
sub t0, t1, s0, a2, ra <- a0, a1
Mul
Multiplies the value of x and y and store on registers
mul s0, a2, ra <- a0, a1
WARNING: Mul cannot be used with t* registers
Mul2
Multiplies the value of x and y and store on registers
mul s0, a2, ra <- a0, a1
INFO: More performatic and can use t* registers
Div
Divides x by y and store on registers
div t0, t1, s0, a2, ra <- a0, a1
Mod
Gets the remainder of x by y and store on registers
mod t0, t1, s0, a2, ra <- a0, a1
And
Makes a bitwise and with x and y
and t0, t1, s0, a2, ra <- a0, a1
Or
Makes a bitwise or with x and y
or t0, t1, t2, s0, s1, a2, ra <- a0, a1
Xor
Makes a bitwise xor with x and y
xor t0, t1, t2, s0, s1, a2, ra <- a0, a1
Not
Stores the result of !x on the registers
not t0, t1, t2, s0, s1, a2, ra <- a0
It uses only a single register as source
Sll
Stores the result of x << 8 on the registers
sll t0, t1 <- a0
It uses only a single register as source
Sla
Stores the result of x << 1 on the registers
sla t0, t1 <- a0
It uses only a single register as source
Sra
Stores the result of x >> 1 on the registers
sra t0, t1 <- a0
It uses only a single register as source
Read
Load a value from memory on the address addr into a register
read t0, t1, t2 <- addr
read t0, t1, t2 <- 77
Addr can be both a label referencing a variable in
.datasection or a immediate with value of 0 to 255
Write
Store a value from a register x into memory on address addr
write addr <- t1
write 77 <- ra
It allows only a single register as source
Addr can be both a label referencing a variable in
.datasection or a immediate with value of 0 to 255
Jal
Jump inconditionally to a label
jal loop
Beq
Branch if equal (jump to a label if x and y are equal)
beq t0, t1, done
Bne
Branch if not equal (jump to a label if x and y are not equal)
bne t0, t1, done
Blt
Branch if less then (jump to a label if x is less then y)
blt t0, t1, done
Bgt
Branch if greater then (jump to a label if x is greater then y)
bgt t0, t1, done
Lui
Load upper immediate imm on registers
lui t0, t1,t2 <- imm
lui t0, t1,t2 <- 77
It takes only a single immediate as parameter
Imm can be both a label referencing a
.bytein.datasection or an immediate with value of 0 to 255
Addi
Adds a register x and a immediate imm and save this value on registers
addi t0, t1,t2 <- t0, imm
addi t0, t1,t2 <- a2, 77
The first argument must be a register and the second a immediate
Imm can be both a label referencing a
.bytein.datasection or an immediate with value of 0 to 255
Muli
Multiplies a register x and an immediate imm and save this value on registers
muli t0, t1,t2 <- t0, imm
muli t0, t1,t2 <- a2, 77
The first argument must be a register and the second an immediate
Imm can be both a label referencing a
.bytein.datasection or an immediate with value of 0 to 255
Divi
Divides a register x by an immediate imm and save this value on registers
divi t0, t1,t2 <- t0, imm
divi t0, t1,t2 <- a2, 77
The first argument must be a register and the second an immediate
Imm can be both a label referencing a
.bytein.datasection or an immediate with value of 0 to 255
Modi
Get the remainder of a div between a register x by an immediate imm and save this value on registers
modi t0, t1,t2 <- t0, imm
modi t0, t1,t2 <- a2, 77
The first argument must be a register and the second an immediate
Imm can be both a label referencing a
.bytein.datasection or an immediate with value of 0 to 255
Subi
Subtracts from register x the value of an immediate imm and save this value on registers
subi t0, t1 <- t0, imm
subi t0, t1 <- a2, 77
The first argument must be a register and the second an immediate
Imm can be both a label referencing a
.bytein.datasection or an immediate with value of 0 to 255
Andi
Makes a bitwise and of register x and an immediate imm, then saves this value on registers
andi t0, t1 <- t0, imm
andi t0, t1 <- a2, 77
The first argument must be a register and the second an immediate
Imm can be both a label referencing a
.bytein.datasection or an immediate with value of 0 to 255
Ori
Makes a bitwise or of register x and an immediate imm, then saves this value on registers
ori t0, t1 <- t0, imm
ori t0, t1 <- a2, 77
The first argument must be a register and the second an immediate
Imm can be both a label referencing a
.bytein.datasection or an immediate with value of 0 to 255
Xori
Makes a bitwise xor of register x and an immediate imm, then saves this value on registers
xori t0, t1 <- t0, imm
xori t0, t1 <- a2, 77
The first argument must be a register and the second an immediate
Imm can be both a label referencing a
.bytein.datasection or an immediate with value of 0 to 255
Mov
Store the value of the register x on the registers
mov t0, t1, t2 <- a1
mov ra, s1, s2, s3 <- t0
It supports only a single register as parameter
Halt
Stop the program execution
halt
Just halt
Data Declaration Instructions (.data section)
.byte: Declare a byte-sized data item.word: Declare a word-sized (4 bytes) data item
Labels and Branching
Labels can be defined and used as targets for branching instructions. Here are the guidelines for labels and branching:
- Labels are defined by placing a colon (
:) after the label name (e.g.,label:). - Branching instructions can use labels as targets (e.g.,
beq t1, t2, label).
Comments
Comments start with a semicolon (;) or (#) and continue until the end of the line. Comments are ignored during assembly.
Addressing Modes
The assembler supports two addressing modes: immediate and register addressing. Here's how they are used:
- Immediate values can be specified directly in the instruction.
- Registers have the ABI form of RISC V
This specification provides a general outline for the assembler language. Additional details, such as specific opcode mappings and assembly directives, can be added as needed.
Lexer
This lexer is specifically designed for a custom microarchitecture and includes RISC-V opcodes. It provides tokenization functionality for assembly code.
Usage
To utilize the Lexer in your project, follow these steps:
- Include the lexer code in your project's source files.
- Import the necessary modules and dependencies, such as
std::{iter::Peekable, rc::Rc, str::Chars}. - Instantiate a
Lexerobject, passing the input source code as a string to the new function. - Use the provided methods to tokenize the source code
next_token: Retrieves the next token in the source code.next_with_ctx: Retrieves the next token with context information, including line and column numbers.get_deez_toks: Retrieves all tokens in the source code as a vector.get_deez_toks_w_ctx: Retrieves all tokens with context information as a vector.
Example
#![allow(unused)] fn main() { use crate::assembler::tokens::{AsmToken, TokWithCtx}; let input_code = "..."; // Provide your assembly code here let mut lexer = Lexer::new(&input_code); loop { let token = lexer.next_token(); if token == AsmToken::Eof { break; } println!("Token: {:?}", token); } }
Parser
This is a parser designed for the Vondel Microarchitecture. It provides functionality to parse assembly code written for the Vondel Microarchitecture.
Usage
To use the parser, follow these steps:
- Create an instance of the
Parserstruct by providing the input tokens as aRc<[TokWithCtx]>parameter. - Use the
newfunction of theParserstruct to initialize it with the provided tokens. - Use the various parsing functions provided by the
Parserstruct to parse the assembly code.
Example
Using our lexer
#![allow(unused)] fn main() { let input = r" .data dividend: .word 10 # Dividend divisor: .word 3 # Divisor quotient: .word 0 # Quotient remainder: .word 0 # Remainder address: .byte 77 # Address "; let mut l = Lexer::new(input); let toks = l.get_deez_toks_w_ctx(); let rc_slice = Rc::from(toks.into_boxed_slice()); let mut p = Parser::new(rc_slice); p.get_deez_program() }
Using a Vector of TokWithCtx
#![allow(unused)] fn main() { let toks = toks_from_api()?; let mut p = Parser::new(rc_slice); p.get_deez_program() }
Parser Structure
The parser is implemented using the following structs:
Program
The Program struct represents the parsed assembly program. It contains the following fields:
sections: A vector ofSectionsrepresenting the different sections of the program.errors: A vector ofErrorrepresenting any parsing errors encountered during the parsing process.
Parser
The Parser struct is the main parser implementation. It contains the following fields:
toks: A shared reference to the input tokens.cur_tok: The current token being processed.peek_tok: The next token to be processed.idx: The index of the current token in thetoksvector.cur_line: The current line number.cur_column: The current column number.
The Parser struct provides the following methods:
new(toks: Rc<[TokWithCtx]>): Creates a new instance of theParserstruct with the provided input tokens.next_token(): Moves to the next token in the input.expect_peek(expected: AsmToken) -> Result<()>: Checks if the next token matches the expected token.curr_token_is(expected: AsmToken) -> bool: Checks if the current token matches the expected token.peek_token_is(expected: AsmToken) -> bool: Checks if the next token matches the expected token.get_label() -> Result<Rc<str>>: Retrieves the label from the current token.get_number() -> Result<Rc<str>>: Retrieves the number from the current token.get_register() -> Result<Rc<Register>>: Retrieves the register from the current token.get_opcode() -> Result<Rc<Opcode>>: Retrieves the opcode from the current token.get_pseudo_op() -> Result<Rc<PseudoOps>>: Retrieves the pseudo operation from the current token.guard_a_bus(reg: Rc<Register>) -> Result<Rc<Register>>: Checks if the provided register can be used in the A bus.guard_b_bus(reg: Rc<Register>) -> Result<Rc<Register>>: Checks if the provided register can be used in the B bus.guard_c_bus(reg: Rc<Register>) -> Result<Rc<Register>>: Checks if the provided register can be used in the C bus.parse_data_to_write() -> Result<DataWrited>: Parses the data to be written.parse_data_directive() -> Result<Sections>: Parses the data directive section.get_dest_regs() -> Result<Vec<Rc<Register>>>: Retrieves the destination registers.parse_instruction_til_rs1() -> Result<(Vec<Rc<Register>>, Rc<Register>)>: Parses the instruction until the RS1 register.get_instruction() -> Result<Instruction>: Retrieves the instruction.parse_labeled_section() -> Result<TextSegment>: Parses the labeled section.- `parse_text_directive() ->.
AsmEvaluator is a custom microarchitecture evaluator implemented in Rust.
It provides functionality for evaluating assembly code written in a custom instruction set architecture.
Usage
To use the AsmEvaluator, follow these steps:
- Create a new instance of
AsmEvaluatorusing the new method:
#![allow(unused)] fn main() { let mut evaluator = AsmEvaluator::new(); }
- Call the
evaluate_buffermethod to evaluate a buffer containing assembly code:
#![allow(unused)] fn main() { let buffer = r" .text main: lui t1 <- 77 "; let result = evaluator.evaluate_buffer(buffer); }
The evaluate_buffer method returns a Result containing the control store (CtrlStore) and a reference to the evaluated memory (&[u32]). You can handle the result accordingly.
- Alternatively, you can evaluate a Program directly by calling the eval_program method:
#![allow(unused)] fn main() { let program = get_program(); // Obtain the program somehow let result = evaluator.eval_program(program); }
The eval_program method follows a similar pattern as evaluate_buffer but takes a Program as input.
- You can also evaluate individual sections of the program using the
evalmethod:
#![allow(unused)] fn main() { let sections = get_sections(); // Obtain the sections somehow let control_store = evaluator.eval(§ions); }
The eval method takes a reference to the Sections enum, which can contain either a TextSection or a DataSection. It returns the CtrlStore generated from the evaluation.
Structure
The AsmEvaluator struct has the following fields:
values: A HashMap mapping labels to values (u8).addr: A HashMap mapping labels to addresses (u8).ram: A vector representing the random access memory.unreachable: A vector containing tuples of unreachable labels, their corresponding control store addresses, and microinstructions. The struct implements the Default trait using the derive attribute.
The AsmEvaluator struct provides the following methods:
new: Creates a new instance of AsmEvaluator.evaluate_buffer: Evaluates a buffer of assembly code and returns the control store and evaluated memory.eval_program: Evaluates a Program and returns the control store and evaluated memory.eval: Evaluates individual sections of the program and returns the control store. Additionally, there are internal helper methods for evaluating data and text segments, resolving unreachable instructions, and evaluating different types of instructions.
Error Handling
If there are errors while parsing the program, the evaluate_buffer method will print the errors and return a Result with an error message. It's important to handle this case to ensure proper error reporting and handling.
Comparision with proposed Assembler and Language
Output
The first and more obvious difference between our project and the proposed, is that we generate the Microinstruction itself
and not a tuple (Opcode, Byte_addr) that can be used by the microarchitecture.
We choose this method because of the poor performance that is accomplished using fetch operation for getting the next instruction
A common operation of goto or jal can be perfomed on a single instruction in Vondel because we know the next address without having to use fetch
| Index | Vondel | Proposed |
|---|---|---|
| 0 | 0b000000010_100_00000000_00000000000000000000_000_11111_11111_00000000 | 0b00001110_000_00110101_001000_001_001 |
| 1 | 0b00000000_100_00010100_001000_001_010 |
Another difference is the way that we interact with RAM, on Vondel we produce a .ram file that it's separated from the firmware file,
this, by itself, can reduce early gotos that are common in the proposed one. Because all words can be placed early and stop the program without executing anything
Combinations
Vondel has more than 700000000 possibilities for creating instructions.
This is because of our 23 register that can be combined in any way on C_Bus, 24 options for A_bus and 19 for B_bus. Where every register of
A or B bus can be replaced with a immediate or an address to a label.
On the proposed assembler, we only have operations with register X and everything in centered around him, limiting the way a programmer can
interact with a microarchitecture.
Extensibility
Adding new Mnemonics, Registers, PseudoOps, Sections or updating previous one is extremelly easy on Vondel.
Because of Rust powerful type-system, any new changes that you make to this project will be captured by both our unit tests and
Rust itself, leading to a more robust project.
The proposed one was written in python without any tests
Fine Grained Syntax
On Vondel, we can define the syntax of our Language in any way that you want. Because we created a parsing stage, our syntax will be defined there.
A syntax pattern that would be hard to do in the proposed one is our write function.
write 123 <- t0
write addr <- a1
Most Vondel instructions can receive up to 23 registers as destination, but this one only receives label or immediate for storing the result
and it must have an register as input
Implementing this on the proposed one would require to remake the entire bussiness logic that consists of opcode, dest, source
On Vondel we just create a arm on the match statement and it's done
Error Handling and Diagnostics
One of the most important features of a interpreter/compiler/assembler is how well it handle errors and show this information for a user.
On the proposed one, if something goes wrong the parser still tries to generate a file and just prints that have happened an error like this:
Error of syntax on line 15
We improved this error system using Rust powerful type-system that cries until we handle it on a good manner using this approachs:
- Show not only a line but also which token is required and at which column
- If some operation is forbidden like writing to MBR we send an error message to the user
Expected "Assign" found "Comma"
Line 13, column 9
Cannot write to MAR on C Bus
Line 18, column 9
Immediate Support
On Vondel we support Immediates, that are a type of data used in microarchitecture to provide immediate values or constants as operands in instructions. They allow for the inclusion of constant values directly within the instruction itself, without the need to load them from memory or registers.
It's not implemented on the proposed one
Microarchitecture
The basic idea behind Vondel's design is to follow some tips from the book Structured Computer Organization - Andrew S. Tanenbaum while implementing the ability to perform operations even if the main clock is in a non-edge level in order to reduce clock cycles.
The implementation of this design is provided by the uarch module.
Datapath structure
The way that a datapath is structured is not far from the usual three-bus design plus a IFU, the techniques involved a rather simple for now as you can see the diagram below:

This implementation provides 24 registers in total: 5 memory registers, 3 system registers (to manage function calls and variables) and 16 general 16 general purpose registers (from R0 to R15). The register list is as following:
- Memory
- MAR (Memory Address Register): 20 bits
- MDR (Memory Data Register): 32 bits
- PC (Program Counter): 20 bits
- MBR (Memory Buffer Reader): 8 bits
- MBR2 (Memory Buffer Reader 2): 16 bits
- System
- SP (Stack Pointer): 20 bits
- LV (Local Variables): 20 bits
- CPP (Constant Pool Pointer): 20 bits
- General Purpose
- R0: 32 bits
- ...
- R15: 32 bits
Each cycle the datapath is driven by a Microinstruction provided by the Control Store. For more info see the next chapter.
Data parallelism
For achieve data parallel processing we will use two datapaths with one ALU each, but the clock trigger of one is the opposite of the other, that is, ALU1 is falling-edge triggered and ALU2 is rising-edge triggered. Furthermore, the clock signal received in ALU2 is a function of the clock signal of ALU1.
Let's say that ALU1 has a clock α1 and the delayed version of this signal is α1'. So the ALU2 must have a clock α2 = (α1 ∧ α1') plus a delay of δ1 as shown below:

In other words, ALU1 will start its operation cycle on the falling-edge of α1 and end the operation on the rising-edge, while ALU2 will start its operation cycle on the rising-edge of α2 and end the operation on the falling-edge, taking advantage of the main clock (α1) even when it is at high level.
The advantage of this method is that we can share the control store and all the registers without big cost in the Misconstruction size, external hardware components and additional logic steps that may cost some clock cycles. The way that the components are shared is shown below:

Those two datapaths will become a thread in a future design of this Microarchitecture that is planned to have two task parallel threads.
Microinstruction
Microinstructions are stored in the control store and are fetched with the address stode at MPC (Microprogram Counter).
A Vondel microinstruction has 62 bits the following format:
| NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|
| 9 bits | 3 bits | 9 bits | 20 bits | 3 bits | 5 bits | 5 bits | 8 bits |
You can find a more detailed version of this diagram here
NEXT
The NEXT field stores the next microinstruction address. In other words, the
NEXT field stores the next value to be stored at MPC.
JAM
The JAM field stores in which conditions the program should jump. A Jump is
a change in the natural flow of the microprogram that represents that the next
instruction should be changed is some way.
We have 3 possible condition to jump: if the value returned by ALU is 0 (JAMZ),
if is negative (JAMN) and if we want to change the next instruction using
something stored at MBR (JMPC). The first bit (MSB) from JAM is JMPC, the
second is JAMN and the last is JAMZ.
On JAMN and JAMZ a jump is, in fact, a OR operation on the most sigficant bit
of MPC and the condition. In other words, if the condition is true, which happens
when its bit is 1, that 1 is bitwise ORed with the MSB of MPC, so if the value
of MPC is 000001010 the jump position is 100001010. But on JMPC,
a jump is a bitwise or with MBR and the 8 LSB's from MPC.
ALU
The ALU field actually controls 2 devices: the ALU itself and the shifter
connected to its autput.
The ALU has two input, A and B (that come from the A and B bus, respectively) and it's controled by the 6 LSB's from the ALU field, they are:
- F0, F1 and F2 (Controls the ALU function)
- ENA and ENB (Enables the input from A and B bus respectively)
- INVA (Inverts the bits of A)
- INC (Increments 1 to the ALU result)
from MSB to LSB. The logic and arithmetic functions that ALU can operate are managed by F0, F1 and F2 like this:
| F0 | F1 | F2 | Function |
|---|---|---|---|
| 0 | 0 | 0 | A AND B |
| 0 | 0 | 1 | A OR B |
| 0 | 1 | 0 | NOT B |
| 0 | 1 | 1 | A + B |
| 1 | 0 | 0 | A XOR B |
| 1 | 0 | 1 | A * B |
| 1 | 1 | 0 | A / B |
| 1 | 1 | 1 | A % B |
Where AND, OR and XOR are bitwise operations and +, *, / and % are
the addition, multiplication, division and remainder arithmetic operations respectively.
Some useful combinations of ALU signal can be found below:
| F0 | F1 | F2 | ENA | ENB | INVA | INC | Function |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 1 | 0 | 0 | 0 | A |
| 0 | 0 | 1 | 0 | 1 | 0 | 0 | B |
| 0 | 0 | 1 | 1 | 0 | 1 | 0 | not A |
| 0 | 1 | 0 | 1 | 1 | 0 | 0 | not B |
| 0 | 1 | 1 | 1 | 1 | 0 | 0 | A + B |
| 0 | 1 | 1 | 1 | 1 | 0 | 1 | A + B + 1 |
| 0 | 1 | 1 | 1 | 0 | 0 | 1 | A + 1 |
| 0 | 1 | 1 | 0 | 1 | 0 | 1 | B + 1 |
| 0 | 1 | 1 | 1 | 1 | 1 | 1 | B − A |
| 0 | 1 | 1 | 0 | 1 | 1 | 0 | B − 1 |
| 0 | 1 | 1 | 1 | 0 | 1 | 1 | −A |
| 0 | 0 | 0 | 1 | 1 | 0 | 0 | A AND B |
| 0 | 0 | 1 | 1 | 1 | 0 | 0 | A OR B |
| 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 |
| 0 | 1 | 1 | 0 | 0 | 1 | 0 | −1 |
| 1 | 0 | 0 | 1 | 1 | 0 | 0 | A XOR B |
| 1 | 0 | 1 | 1 | 1 | 0 | 0 | A * B |
| 1 | 1 | 0 | 1 | 1 | 0 | 0 | A / B |
| 1 | 1 | 1 | 1 | 1 | 0 | 0 | A % B |
The shifter has 2 inputs: The value of the ALU operation (let's call it X) and
the operation opcode that are which are the 2 MSB's from the ALU field. The
operations and its opcode are as following:
| Opcode | Operation | Output |
|---|---|---|
0b00 | None | X |
0b01 | Shift Right 1 bit | X >> 1 |
0b10 | Shift Left 8 bits | X << 8 |
0b11 | Shift Left 1 bit | X << 1 |
C BUS
The C BUS field represents which registers gonna be writen with the value
of the C BUS (which is the shifter output). This field has 20 bits because
there are 20 registers connected to the C bus, each bit 1 represents that the
register represented by that bit should be writen by the C BUS.
The relation between a bit n and which register it's represents is shown below (from MSB to LSB):
| Bit | Register |
|---|---|
| 1 | MDR |
| 2 | MAR |
| 3 | PC |
| 4 | LV |
| 5 | R0 |
| 6 | R1 |
| 7 | R2 |
| 8 | R3 |
| 9 | R4 |
| 10 | R5 |
| 11 | R6 |
| 12 | R7 |
| 13 | R8 |
| 14 | R9 |
| 15 | R10 |
| 16 | R11 |
| 17 | R12 |
| 18 | R13 |
| 19 | R14 |
| 20 | R15 |
MEM
The memory field represents which memory operations gonna happen in the cycle. The bit 1 (MSB), 2 and 3 represents the operations of WRITE, READ and FETCH respectively.
Each bit 1 in the field informs that the memory operation related to that field will be executed. What is actually done in operations can be found below:
- READ: Reads the memory word (32 bit) of the address stored in
MARintoMDR. - WRITE: Writes the memory word stored in
MDRinto the memory address stored inMAR. - FETCH: Reads the memory word of the address stored in
PCintoMBR(8-bit) andMBR2(16-bit).
The FETCH operation uses the IFU to store the remainder bytes in cache to only
access memory when needed.
A and B
A and B are fields that control which register writes to the A and B bus
respectively, but, since the A and B bus are connected directly to the A and B
inputs of the ALU, they can also be viewed as which register goes as a entry
to the ALU operation.
The values of A and B and which register they enable (NONE,
represents that none of them writes to the respective BUS, so the value of the
bus is 0) is shown below:
Output to BUS A:
| ID | BIN | Register |
|---|---|---|
| 0 | 00000 | MDR |
| 1 | 00001 | PC |
| 2 | 00010 | MBR |
| 3 | 00011 | MBRU |
| 4 | 00100 | MBR2 |
| 5 | 00101 | MBR2U |
| 6 | 00110 | LV |
| 7 | 00111 | CPP |
| 8 | 01000 | IMMEDIATE |
| 9 | 01001 | R0 |
| 10 | 01010 | R1 |
| 11 | 01011 | R2 |
| 12 | 01100 | R3 |
| 13 | 01101 | R4 |
| 14 | 01110 | R5 |
| 15 | 01111 | R6 |
| 16 | 10000 | R7 |
| 17 | 10001 | R8 |
| 18 | 10010 | R9 |
| 19 | 10011 | R10 |
| 20 | 10100 | R11 |
| 21 | 10101 | R12 |
| 22 | 10110 | R13 |
| 23 | 10111 | R14 |
| 24 | 11000 | R15 |
| .. | ... | NONE |
Output to BUS B:
| ID | BIN | Register |
|---|---|---|
| 0 | 00000 | MDR |
| 1 | 00001 | LV |
| 2 | 00010 | CPP |
| 3 | 00011 | IMMEDIATE |
| 4 | 00100 | R0 |
| 5 | 00101 | R1 |
| 6 | 00110 | R2 |
| 7 | 00111 | R3 |
| 8 | 01000 | R4 |
| 9 | 01001 | R5 |
| 10 | 01010 | R6 |
| 11 | 01011 | R7 |
| 12 | 01100 | R8 |
| 13 | 01101 | R9 |
| 14 | 01110 | R10 |
| 15 | 01111 | R11 |
| 16 | 10000 | R12 |
| 17 | 10001 | R13 |
| 18 | 10010 | R14 |
| 19 | 10011 | R15 |
| .. | ... | NONE |
IMMEDIATE
The immediate field allows us to send a arbitrary 8 bit number to the A or B
bus, i.e if we set 0x08 in the IMMEDIATE field and enable the immediate input
on the A and/or B bus, that 0x08 gonna be loaded in the corresponding bus.
Assembly
Some details about the relation between the assembly code and the microarchitecture.
Register nomeclature
In the assembly code the general purpose registers were renamed for convenience in when writing code, the rename table is as following:
| Register | Assembly Name |
|---|---|
| R0 | Ra |
| R1 | T0 |
| R2 | T1 |
| R3 | T2 |
| R4 | T3 |
| R5 | S0 |
| R6 | S1 |
| R7 | S2 |
| R8 | S3 |
| R9 | S4 |
| R10 | S5 |
| R11 | S6 |
| R12 | A0 |
| R13 | A1 |
| R14 | A2 |
| R15 | A3 |
WARNING: The registers T0 - T3 can be used in instructions like mul to
store temporary values, so the value of those can be changed in the instruction
implementations, therefore they are not guaranteed to have the value that you expect
using instruction like lui or read.
Assembly Implementations
You can find some implementatinos examples of assembly code into the microarchitecture in the next chapter.
Instruction Implementations
In this section we will show how some of the microinstruction of the Vondel Language are implemented in the microprogram.
Operations Summary
- Arithmetic
- Logic
- Shift
- Memory
- Branch
- Immediate
- Other
HALT
HALT is a special microinstruction to indicate that the program has ended. In this design this opcode is achieved by seting all 64 bits of the microinstruction to 1.
This constant is avalible as a shortcut in the CtrlStore struct.
MOV
Syntax: mov r0, ..., rn <- x.
Action: Store the value of the register x on the registers r0, ..., rn.
High Level
mov r15, r13 <- r14
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 000011000 | 00000000000000000101 | 000 | 01110 | 11111 | 00000000 |
| 1 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
ADD
Syntax: add r0, ..., rn <- x, y.
Action: Stores the result of x + y on the registers r0, ..., rn.
High Level
add r0, r1 <- r2, r3
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 000111100 | 00001100000000000000 | 000 | 01011 | 01100 | 00000000 |
| 1 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
SUB
Syntax: sub r0, ..., rn <- x, y.
Action: Stores the result of x - y on the registers r0, ..., rn.
High Level
sub r0 <- r2, r3
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 000111111 | 00001000000000000000 | 000 | 01100 | 00110 | 00000000 |
| 1 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
AND
Syntax: and r0, ..., rn <- x, y.
Action: Stores the result of x & y on the registers r0, ..., rn.
High Level
and r1 <- r2, r3
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 000001100 | 00000100000000000000 | 000 | 01011 | 01100 | 00000000 |
| 1 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
OR
Syntax: or r0, ..., rn <- x, y.
Action: Stores the result of x | y on the registers r0, ..., rn.
High Level
or r0, r1, r14, r15 <- r2, r3
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 000011100 | 00001100000000000011 | 000 | 01011 | 01100 | 00000000 |
| 1 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
XOR
Syntax: xor r0, ..., rn <- x, y.
Action: Stores the result of x ^ y on the registers r0, ..., rn.
High Level
xor r0, r1, r14, r15 <- r2, r3
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 001001100 | 00001100000000000011 | 000 | 01011 | 01100 | 00000000 |
| 1 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
NOT
Syntax: not r0, ..., rn <- x.
Action: Stores the result of !x on the registers r0, ..., rn.
High Level
not r0, r15 <- r3
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 000011010 | 00001000000000000001 | 000 | 01100 | 11111 | 00000000 |
| 1 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
SLL
Syntax: sll r0, ..., rn <- x.
Action: Stores the result of x << 8 on the registers r0, ..., rn.
High Level
sll r0, r15 <- r3
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 100011000 | 00001000000000000001 | 000 | 01100 | 11111 | 00000000 |
| 1 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
SLA
Syntax: sla r0, ..., rn <- x.
Action: Stores the result of x << 1 on the registers r0, ..., rn.
High Level
sla r0, r15 <- r3
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 110011000 | 00001000000000000001 | 000 | 01100 | 11111 | 00000000 |
| 1 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
SRA
Syntax: sra r0, ..., rn <- x.
Action: Stores the result of x >> 1 on the registers r0, ..., rn.
High Level
sra r0, r15 <- r3
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 010011000 | 00001000000000000001 | 000 | 01100 | 11111 | 00000000 |
| 1 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
READ
Syntax: read r0, ..., rn <- [addr].
Action: Stores the memory word at addr on the registers r0, ..., rn.
High Level
read r0, r15 <- 0x00008
Microprogram
Precondition: 0x00008 should be stored at MAR.
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 000000000 | 00000000000000000000 | 010 | 11111 | 11111 | 00000000 |
| 1 | 000000010 | 000 | 000011000 | 00001000000000000001 | 000 | 00000 | 11111 | 00000000 |
| 2 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
WRITE
Syntax: write [addr] <- x.
Action: Stores the memory value of x at addr on the memory.
High Level
write 0x00001 <- r14
Microprogram
Precondition: 0x00001 should be store at MAR.
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 000011000 | 10000000000000000000 | 000 | 01110 | 11111 | 00000000 |
| 1 | 000000010 | 000 | 000000000 | 00000000000000000001 | 100 | 11111 | 11111 | 00000000 |
| 2 | 111111111 | 111 | 110111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
JAL
Syntax: jal [addr].
Action: Jumps to the instruction of id (address) addr.
High Level
jal 0x0A
Microprogram
JMPC bit is used to nconditional branch, so we have to find some way of
storing 0x0A into MBR and then use JMPC to jump to that address.
For that the number 10 (0x0A) must be stored at memory and its address
should be stored in PC. So, let's store 10 in the current address of the PC,
by default the PC is 0, so let's store it there.
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 000000000 | 00000000000000000000 | 010 | 11111 | 11111 | 00000000 |
| 1 | 000000010 | 100 | 000000000 | 00000000000000000000 | 000 | 11111 | 11111 | 00000000 |
| - | - | - | - | - | - | - | - | - |
| 10 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
In a nutshell the operations are the following:
- Fetch the memory.
- Set
JMPCto 1.
BEQ
Syntax: beq x, y.
Action: Jump if the value of the register x is equal to the value of the
register y.
High Level
beq r14, r15
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 001 | 000111111 | 00000000000000000000 | 000 | 10111 | 10011 | 00000000 |
| - | - | - | - | - | - | - | - | - |
| 257 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
BNE
Syntax: bne x, y.
Action: Jump if the value of the register x is not equal to the value of the
register y.
High Level
bne r14, r15
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 010 | 000111111 | 00000000000000000000 | 000 | 10111 | 10011 | 00000000 |
| 1 | 000000010 | 010 | 000111111 | 00000000000000000000 | 000 | 11000 | 10010 | 00000000 |
| - | - | - | - | - | - | - | - | - |
| 257 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
| 258 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
BLT
Syntax: blt x, y.
Action: Jump if the value of the register x is less than the value of the
register y.
High Level
blt r14, r15
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 010 | 000111111 | 00000000000000000000 | 000 | 11000 | 10010 | 00000000 |
| - | - | - | - | - | - | - | - | - |
| 257 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
BGT
Syntax: bgt x, y.
Action: Jump if the value of the register x is greater than the value of the
register y.
High Level
bgt r14, r15
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 010 | 000111111 | 00000000000000000000 | 000 | 10111 | 10011 | 00000000 |
| - | - | - | - | - | - | - | - | - |
| 257 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
MUL
Syntax: mul r0, ..., rn <- x, y.
Action: Multiply the value of x by y .
High Level
mul r13 <- r6, r7
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 010 | 000111111 | 00000000000000000000 | 000 | 10000 | 01010 | 00000000 |
| 1 | 000000010 | 000 | 000011000 | 00001000000000000000 | 000 | 10000 | 11111 | 00000000 |
| 2 | 000000011 | 000 | 000111111 | 00000100000000000100 | 000 | 01111 | 11111 | 00000000 |
| 3 | 000000100 | 001 | 000110110 | 00001000000000000000 | 000 | 01001 | 11111 | 00000000 |
| 4 | 000000011 | 000 | 000111100 | 00000000000000000100 | 000 | 10110 | 00101 | 00000000 |
| 5 | 000000011 | 000 | 000111111 | 00000100000000000100 | 000 | 10000 | 11111 | 00000000 |
| 6 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
| - | - | - | - | - | - | - | - | - |
| 257 | 000000101 | 000 | 000011000 | 00001000000000000000 | 000 | 01111 | 11111 | 00000000 |
| - | - | - | - | - | - | - | - | - |
| 260 | 000000110 | 000 | 000000000 | 00000000000000000000 | 000 | 00000 | 00000 | 00000000 |
In a nutshell the operations are the following:
- Jump to
0b100000001(257) ifr6 - r7is negative. - Case 1 (has not jumped), therefore
r6 >= r7- Store the value of
r7onr0 - Store the value of
r6onr1andr13 - Store the value of
r0 - 1intor0or jump to theTERMINATEcode (260) if equals to 0. - Stores the value of
r13 + r1intor13.
- Store the value of
- Case 2 (has jumped), therefore
r6 < r7- Store the value of
r6onr0 - Store the value of
r7onr1andr13and goto the step 3 of the case 1.
- Store the value of
In another words, the smallest number is stored at r0 and the greater is stored
in r1 and we aways make r0 * r1. This optimization is necessary because a
multiplication is computed as sequential additions, and making less additions
save some clock cycles.
In the real assembly implementation the registers used to store the temporary
values are T0 (to store the minimum), T1 (to store the sum) and T2 (to store the maximum),
so be caution when using this registers at the same time as the mul operation.
MUL2
Syntax: mul2 r0, ..., rn <- x, y.
Action: Multiply the value of x by y .
The difference between mul2 and mul are that mul is a software addition,
in other words, a sequential sum of a number, on the other hand mul2 is
hardware implemented (uses a multiplication circuit).
High Level
mul2 r0, r1 <- r2, r3
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 001011100 | 00001100000000000000 | 000 | 01011 | 01100 | 00000000 |
| 1 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
DIV
Syntax: div r0, ..., rn <- x, y.
Action: Divide the value of x by y .
High Level
div r0, r1 <- r2, r3
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 001101100 | 00001100000000000000 | 000 | 01011 | 01100 | 00000000 |
| 1 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
MOD
Syntax: mod r0, ..., rn <- x, y.
Action: Calculate the remainder of the division of x by y.
High Level
mod r0, r1 <- r2, r3
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 001111100 | 00001100000000000000 | 000 | 01011 | 01100 | 00000000 |
| 1 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
LUI
Syntax: lui r0, ..., rn <- [byte].
Action: Store the the immediate value of byte on the registers r0, ..., rn.
High Level
lui r15 <- 0xFF
This program stores 0xFF (255) to r15, but 255 is not stored in any register (it is
a immediate value).
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 000011000 | 00000000000000000001 | 000 | 01000 | 11111 | 11111111 |
| 1 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
ADDI
Syntax: addi r0, ..., rn <- x, [byte].
Action: Stores the value of x + [byte] on the registers r0, ..., rn.
High Level
addi r14 <- r14, 0x05
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 000111100 | 00000000000000000010 | 000 | 01000 | 10010 | 00000101 |
| 1 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
SUBI
Syntax: subi r0, ..., rn <- x, [byte].
Action: Stores the value of x - [byte] on the registers r0, ..., rn.
High Level
subi r14 <- r14, 0x05
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 000111111 | 00000000000000000010 | 000 | 01000 | 10010 | 00000101 |
| 1 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
ANDI
Syntax: andi r0, ..., rn <- x, [byte].
Action: Stores the value of x & [byte] on the registers r0, ..., rn.
High Level
andi r14 <- r14, 0xFF
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 000001100 | 00000000000000000010 | 000 | 01000 | 10010 | 11111111 |
| 1 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
ORI
Syntax: ori r0, ..., rn <- x, [byte].
Action: Stores the value of x | [byte] on the registers r0, ..., rn.
High Level
ori r14 <- r14, 0x00
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 000011100 | 00000000000000000010 | 000 | 01000 | 10010 | 00000000 |
| 1 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
XORI
Syntax: xori r0, ..., rn <- x, [byte].
Action: Stores the result of x ^ [byte] on the registers r0, ..., rn.
High Level
xori r0, r1, r14, r15 <- r2, 0x02
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 001001100 | 00001100000000000011 | 000 | 01000 | 00110 | 00000010 |
| 1 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
MULI
Syntax: muli r0, ..., rn <- x, [byte].
Action: Stores the result of x * [byte] on the registers r0, ..., rn.
High Level
muli r0, r1, r14, r15 <- r2, 0x02
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 001011100 | 00001100000000000011 | 000 | 01000 | 00110 | 00000010 |
| 1 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
DIVI
Syntax: divi r0, ..., rn <- x, [byte].
Action: Stores the result of x / [byte] on the registers r0, ..., rn.
High Level
divi r0, r1, r14, r15 <- r2, 0x02
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 001101100 | 00001100000000000011 | 000 | 01000 | 00110 | 00000010 |
| 1 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
MODI
Syntax: modi r0, ..., rn <- x, [byte].
Action: Stores the result of x % [byte] on the registers r0, ..., rn.
High Level
modi r0, r1, r14, r15 <- r2, 0x02
Microprogram
| ID | NEXT | JAM | ALU | C BUS | MEM | A | B | IMMEDIATE |
|---|---|---|---|---|---|---|---|---|
| 0 | 000000001 | 000 | 001111100 | 00001100000000000011 | 000 | 01000 | 00110 | 00000010 |
| 1 | 111111111 | 111 | 111111111 | 11111111111111111111 | 111 | 11111 | 11111 | 11111111 |
Hardware
Some hardware details about some relevant components of the microarchitecture that differ from the provided components from the book Structured Computer Organization.
ALU
Vondel's ALU have some optimizations oven the Arithmetic Logic Unit provided by the book, the main difference is that the function inputs are 3-bit lenght instead of 2-bit, in other words we have 23 possible operations that are:
- Logic And
- Logic Or
- Logic Xor
- Logic Not
- Addition
- Multiplication
- Division
- Remainder
For that we change a little bit the hardware provide a circuit to any given possible operation. The ALU model can be found below:

Adder
The adder is a simple 32-bit full-adder that uses a combination of sequential
1-bit full-adders that conects to each other by the carry, in other words,
the Co of one is the Ci of the next adder.
Multiplier
The multiplier circuit uses a standard single cycle multiplier to multilpy A by B, for that we need to set the inputs and output width to 32-bit lenght.

Divider and Remainder
The divider circuit uses the division algorithm logic implemented on digital logic to divide A by B, the basic idea is:
- Compare the divisor with the selected bits of the dividend.
- If is less, then perform the subtraction, that results on a
Q = 1(Qis the Quotient). - If is greater, do not perform the subtraction (
Q = 0), go to step 2.
- If is less, then perform the subtraction, that results on a
- Add the next bit of the dividend to the result and shift the divisor to right by 1 bit, go to step 1.
- Repeat the procedure until all the bits of the dividend are covered.
This circuit uses a Process Unit (PU) to perform the comparison logic part of this algorithm, this simple circuit is ilustrated below:

Combining a matrix of PU's which each row determine a single Quotient bit we can buid a simple 32-bit x 32-bit adder.

You may notice that in the first row we feed the PU inputs only with B and zeros, the reason is that in the standard divider circuit the dividend lenght is aways 2 times the divisor lenght, that way whe fill the 32 MSB's of the pseudo 64-bit input with zeros, that way we can achieve a 32-bit x 32-bit division.
You can simulate a smaller sample of this circuit here.
IMM
The IMM circuit provides pre functionality of write the IMMEDIATE field
of the microinstruction in the A and/or the B buses. The circuit is fairly
simple and looks like this:

Others
Other integrated circuits of the microarchitecture like the Logic Unit, O and High Bit are exactly the same as illustrated in the book.