ref: d6bd40e37b085aec922c6bb56e867e85a6c36982
parent: 15392a525cae61a9fc06d40cd4846664512661b2
author: Ori Bernstein <ori@eigenstate.org>
date: Thu Feb 2 19:06:41 EST 2017
Fix some looseness in the spec.
--- a/doc/lang.txt
+++ b/doc/lang.txt
@@ -73,7 +73,8 @@
To put it in words, /regex/ defines a regular expression that would
match a single token in the input. "quoted" would match a single
string. <english description> contains an informal description of what
- characters would match.
+ characters would match. In the case of ambiguity, longest match wins.
+ In the case of ambiguity with a quoted string, the quoted string wins.
Productions are defined by any number of expressions, in which
expressions are '|' separated sequences of terms.
@@ -85,9 +86,9 @@
2.2. As-If Rule:
- Anything specified here may be treated however the compiler wishes,
- as long as the result is observed as if the semantics specified were
- followed strictly.
+ Anything specified in this document may be treated however the
+ compiler wishes, as long as the result is observed as if the semantics
+ specified were followed strictly.
3. STRUCTURE:
@@ -96,7 +97,9 @@
The language is composed of several classes of tokens. There are
comments, identifiers, keywords, punctuation, and whitespace.
- Comments begin with "/*" and end with "*/". They may nest.
+ Comments begin with "/*" and end with "*/". This style of comment
+ may nest, meaning that /* and */ still have a meaning within the
+ comment. No other text in this comment is interpreted.
/* this is a comment /* with another inside */ */
@@ -108,8 +111,9 @@
// it will end on this line, regardless of the trailing \
Identifiers begin with any alphabetic character or underscore, and
- continue with alphanumeric characters or underscores. Currently the
- compiler places a limit of 1024 bytes on the length of the identifier.
+ continue with alphanumeric characters or underscores. The compiler
+ may place a reasonable limit on the length of an identifier. This
+ limit must be at least 256 characters.
some_id_234__
@@ -182,7 +186,7 @@
3.3. Declarations:
decl: attrs ("var" | "const" | "generic") decllist
- attrs: ("exern" | "pkglocal" | "$noret")+
+ attrs: ("extern" | "pkglocal" | "$noret")+
decllist: declbody ("," declbody)*
declbody: declcore ["=" expr]
declcore: name [":" type]
@@ -304,10 +308,11 @@
In Myrddin, declarations may appear in any order, and be used at any
point at which it is in scope. Any global symbols are initialized
- before the program begins. Any nonglobal symbols are initialized
- on the line where they are defined. This decision allows for slightly
- strange code, but allows for mutually recursive functions with no
- forward declarations or special cases.
+ before the program begins. Any nonglobal symbols are initialized on
+ the line where they are declared, if they have an initializer.
+ Otherwise, their contents are indeterminate. This decision allows for
+ slightly strange code, but allows for mutually recursive functions
+ with no forward declarations or special cases.
3.5.1. Scope Rules:
@@ -330,9 +335,9 @@
3.5.2. Capturing Variables:
- When a closure is created, it captures all of the variables
- that it refers to in its scope by value. This allows for
- simple heapification of the closure.
+ When a closure is created, it captures the stack variables that
+ are in its scope by value. This allows for simple heapification of
+ the closure.
For example:
@@ -761,6 +766,8 @@
the index operator can be used on the type. It is implemented
for slice and array types.
+ A user cannot currently implement this trait on their types.
+
4.6.5. sliceable
The sliceable trait is a built in trait which implies that
@@ -845,13 +852,13 @@
String literals begin with a ", and continue to the next
unescaped ".
- eg: "foo\"bar"
+ e.g. "foo\"bar"
Multiple consecutive string literals are implicitly merged to create
a single combined string literal. To allow a string literal to span
across multiple lines, the new line characters must be escaped.
- eg: "foo" \
+ e.g. "foo" \
"bar"
They have the type `byte[:]`
@@ -865,7 +872,7 @@
compiler (generally UTF8). They share the same set of escape
sequences as string literals.
- eg: 'א', '\n', '\u{1234}'
+ e.g. 'א', '\n', '\u{1234}'
They have the type `char`.
@@ -877,7 +884,7 @@
0o to indicate an octal value, or 0b to indicate a binary value.
Decimal values are not prefixed.
- eg: 0x123_fff, 0b1111, 0o777, 1234
+ e.g. 0x123_fff, 0b1111, 0o777, 1234
They have the type `@a::(numeric,integral)
@@ -887,7 +894,7 @@
Unsurprisingly, they evaluate to `true` or `false`
respectively.
- eg: true, false
+ e.g. true, false
They have the type `bool`
@@ -897,7 +904,7 @@
value, a value that takes zero bytes storage, and contains
only the value `void`. Like my soul.
- eg: void
+ e.g. void
They have type `void`.
@@ -907,7 +914,7 @@
digit and possibly separated by underscores. Floating point
literals are always in decimal.
- eg: 123.456, 10.0e7, 1_000.
+ e.g. 123.456, 10.0e7, 1_000.
They have type `@a::(numeric,floating)`
@@ -940,6 +947,9 @@
unindexed initializer sequence. For struct literals, the
initializer sequence is always a named initializer sequence.
+ All elements not initialized in the literal expression are
+ filled with zero bytes.
+
An unindexed initializer sequence is simply a comma separated
list of values. An indexed initializer sequence contains a
'#number=value' comma separated sequence, which indicates the
@@ -983,7 +993,7 @@
If a function is defined where stack variables are in scope,
and it refers to them, then the stack variables shall be copied
- to an environment on thes stack. That environment is scoped to
+ to an environment on the stack. That environment is scoped to
the lifetime of the stack frame in which it was defined. If it
does not refer to any of its enclosing stack variables, then
this environment will not be created or accessed by the function.
@@ -1019,7 +1029,7 @@
flow construct either. Labels are identifiers preceded by
colons.
- eg: :my_label
+ e.g. :my_label
They can be used as targets for gotos, as follows:
@@ -1303,9 +1313,11 @@
of the expression is stored on the left hand side after this
statement has executed.
- The fused assignment operators are equivalent to applying the
- arithmetic or bitwise operator to the lhs and rhs of the
- expression before storing into the lhs.
+ The expression is similar to applying the arithmetic or bitwise
+ operator to the lhs and rhs before storing into the lhs. However,
+ the lvalue is evaluated fully, and only once, before the result is
+ computed and stored into it, meaning that any side effects in its
+ subexpressions will only be applied once.
Type:
@@ -1738,7 +1750,7 @@
is omitted, then it is equivalent to matching it against a gap
pattern. If all elements match, then this is a successful match.
- Array pattenrs recursively check each member of the array that is
+ Array patterns recursively check each member of the array that is
provided. The array length must be part of the match. If all array
elements match, then this is a successful match.