Misty Programming Language:The Parse Tree

This document describes the Misty parse tree. This is not a part of the Misty System Standard. A conforming implementation does not have to conform to this. This is descriptive of one particular implementation.

The tree can be fully realized as a JSON object.

Tokens

The subprogram tokenize.mst takes a source text and produces an array of tokens. The tokens are made up of these fields:

The kind and text fields will be meaningful to the parser that consumes the token list. The other fields are intended for error reporting.These token records are treated as nodes that are woven into a parse tree.

kind

"name", "text", "number", "comment", "newline", "space", and all of the intrinsics, functinos, and punctuators, including:

.  :  (  )  [  ]  {  }  +  -  *  /  //  &  &&  =   <>  <  >  <=   >=  /\  \/  |

text

The characters making up the token.

The kind: "text" tokens have already had escape sequences decoded and outer quotes removed.

A contigous run of spaces is made into a single kind: "space" token with a text field containing the entire run of spaces.

number

The numeric value of kind: "number" tokens.

at

The character position in the source text where the token begins, starting at 0.

from_row

The line number of the the token, starting at 0.

from_column

The column number of the start of the token, starting at 0.

to_row

The line number if the last character of the token.

to_column

The column number of the character past the token.

indentation

The number of spaces as an integer. This is in space tokens with a from_column of 0, and long texts.

 

If the token contains an error, then it will also contain these fields:

error

A brief description of the error. Numbers, texts, and functinos can contain errors.

error_at

The character position in the source text where the error was identified.

error_column

The column where the error was identified.

error_row

The row at which the error was identified.

Tree

The subprogram parse.mst turns a list of tokens into an abstract parse tree by adding new fields. The source notations are retained for error reporting and debugging.

Root

The root of the tree is a record. Its fields are

endowments

An array of endowment names.

functions

An array of functions. For each function there is a record in this array, plus function 0, the program or subprogram.

intrinsics

An array of intrinsics that are used. This information might be used by a linker.

kind

"misty"

logs

An array of log names.

misty

"program" or "subprogram"

name

The name of the file.

patterns

An array of pattern definitions.

scopes

An array of scopes. This array is parallel with the functions array. It contains records containing the names that are used in each function. The names can be the function name and inputs, and additional names created with def, use, var. The scope also contains names that are found in outer functions and intrinsics.

A scope is used to determine what a name in a function refers to.

uses

An array of subprograms that are mentioned in use statements. This information might be used by a linker.

Names

Names are use to represent functions, parameters, variables, and constants.

kind

"name"

function_nr

The number of the function in which this name is defined.

level

0 if a variable is declared in the current function
1 if a variable is declared in the outer function
2 or after and so on

make

"def" "endowment" "function" "intrinsic" "parameter" "use" "var"

text

The name, as a simple text.

More information about the name token can be found in the scope.

Functinos

A functino is an operator function. It looks like an operator, but it behaves as an intrinsic function. It has a name token, with the operator as the name.

Literals

Number

Number tokens have their values in text form so that the interpretation of the number can be delayed as long as possible. This is to minimize rounding and other representation errors.

kind

"number"

text

A number in text form. This is important during bootstrapping because JavaScript is not able to exactly represent Misty numbers.

number

A number.

Text

kind

"text"

text

A text.

indentation

On long text literals, this is the numeric indentation of the close quote.

Array

kind

"array"

first

An array (possibly empty) of expression nodes.

Record

kind

"record"

first

An array (possibly empty) of pair records. A pair record is

first

A simple text.

second

An expression node.

Space

kind

"space"

text

The spaces.

length

The number of spaces.

Operators

Infix operators

first

The left operand node.

second

The right operand node.

text

|  *  /  //  +  -  &  &&  =  <>  <  <=  >  >=  /\  \/

The . operator

text

"."

first

The left operand node.

second

A text token.

Subscript

kind

"["

first

The left operand node.

second

The right operand node. If right is missing, then [] are representing the push/pop operator.

The (invoke operator

text

"("

first

The left operand node, which should resolve to a function.

second

An array of expression nodes, an argument list.

The then operator

This is the ternary operator.

text

"then"

first

A condition node.

second

The expression node if true.

third

The expression node if false.

Statements

The assign statement

The first and second may also have a box field, which if true, indicates that the first result receives the last element of the destination array, removing it from the destination array, reducing its length by 1, or that the second result is appended to the destination array.

text

"assign"

first

The destination expression node.

second

The value expression node.

push

true if first ended with [].

pop

true if second ended with [].

The break statement

text

"break"

label

If the statement is labelled, this is a simple text.

The call statement

text

"call"

first

An invocation expression.

The def statement

text

"def"

first

A name token.

second

An expression node.

The disrupt statement

kind

"disrupt"

The do statement

text

"do"

statements

An array of statement nodes.

label

If the do statement is labelled, this is a simple text.

The if statement

text

"if"

first

A conditional expression node.

statements

An array of statement nodes.

else_if

An array of if statement nodes (without else_if or else) that are from else if.

else

an array of statement nodes (optional).

The jump statement

text

"jump"

first

An invocation expression.

The return statement

text

"return"

first

An optional expression node.

The send statement

kind

"send"

first

The address expression node.

second

The message expression node.

third

The optional reply function expression node.

The use statement

text

"use"

first

A name token.

second

An invocation expression.

The var statement

text

"var"

first

A name token.

second

An optional expression node. The default is null.

Function

The function operator does two things: It makes a function definition that it appends to the functions array, and it makes a reference to that function definition.

Function reference

kind

"function" "program" "subprogram"

function_nr

The location of the function definition in the functions array.

name

If it is an operator function, then name will be name of the operator (functino), such as "'+'". The name will also be noted in the intrinsics.

Function definition

disruption

An array of statement nodes, the function's disruption handler. (Optional.)

kind

"function"

parameters

A list of name nodes, the parameters. Each has a parameter_nr field, 0 or greater. If a parameter has a default value, it will be in the name token's first field.

name

If the function is named, this is a simple text.

outer

The number of the outer function that made this one.

statements

An array of statement nodes.

nr_close_slots

The number of slots that are protected during memory reclamation.

nr_slots

The number of slots required for the this function's inputs, vars, and defs.

Scope

The root node contains an array of scopes. The scopes array is indexed by function_nr, like the functions array. Each function definition has a scope record containing all of the names used in that function.

A scope record contains name fields, where the key is a variable name, and the field is a record containing:

closure

If the name is used by an inner function, then this is true.

function_nr

The number of the this function in the functions array..

functino

If this is an intrinsic functino, then true.

level

The distance to the outer scope.

make

The maker of the name:

 def  endowment  function  parameter  intrinsic  use  var

Only names that were made by the var statement can have their values replaced by the assign statement.

name

The name of the variable.

nr_uses

The number of appearences where the variable is actually used.

Pattern

The pattern operator does two things: It makes a function definition that it appends to the patterns array, and it makes a reference to that pattern definition.

Pattern reference

kind

"pattern"

pattern_nr

The location of the pattern definition in the patterns list.

Pattern definition

Coming soon.