RfD - Enhanced local variable syntax, v4
====================================
Stephen Pelc - 11 August 2008
20080811 Removed references to local buffers as appropriate.
20070914 Moved local buffers to separate proposal.
20070607 Wordsmithing. Corrected reference implementation.
20060822 Added explanatory text.
         Corrected reference implementation.
         Updated ambiguous conditions.
Problem
=======
1) The current LOCALS| ... | notation explicitly forces all locals
to be initialised from the data stack.
2) The current LOCALS| ... | notation defines locals in reverse
order to the normal stack notation.
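
The reversed ordering can be seen in a minimal sketch using only the
existing LOCALS| notation. The word name diff-old is hypothetical and
is used here purely for illustration.

```forth
\ Existing notation: names are listed in the reverse of the stack comment.
\ diff-old is a hypothetical name used only for this illustration.
: diff-old ( a b -- a-b )
   LOCALS| b a |      \ b is taken from the top of the stack, so it comes first
   a b - ;
10 3 diff-old .       \ prints 7
```

Note that the stack comment reads ( a b ) while the declaration reads
b a, which is the mismatch described in item 2.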
This proposal is derived from implementations that have existed for
more than 15 years.
Solution
========
Base version
------------
The following syntax for local arguments and local values is
proposed. The sequence:
{ ni1 ni2 ... | lv1 lv2 ... -- o1 o2 ... }
defines local arguments, local values, and outputs. The local
arguments are automatically initialised from the data stack on
entry, the rightmost being taken from the top of the data stack.
Local arguments and local values can be referenced by name within
the word during compilation. The output names are dummies to allow
a complete stack comment to be generated.
The items between { and | are local arguments.
The items between | and -- are local values.
The items between -- and } are outputs for formal comments only.
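
As a minimal sketch of the argument ordering, assuming a system that
already implements the proposed { ... } notation, the declaration
below reads in the same order as a conventional stack comment. The
word name diff is hypothetical.

```forth
\ Assumes a system that implements the proposed { ... } notation.
\ diff is a hypothetical name used only for this illustration.
: diff { a b -- a-b }   \ a and b are local arguments; "a-b" is ignored
   a b - ;              \ b was taken from the top of the data stack
10 3 diff .             \ prints 7
```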
All text between -- and } is ignored by the compiler. The outputs
are provided only so that the declaration can double as a complete
stack comment. This eases documentation, and current users of the
notation like this facility.
Local arguments and values return their values when referenced,
and must be preceded by TO to perform a store.
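
The fetch and store behaviour can be sketched as follows, again
assuming a system that implements the proposed notation. The word
sum-to and its local names are hypothetical. The local value acc is
stored before it is first read, because its initial contents cannot
be relied upon.

```forth
\ Assumes a system that implements the proposed { ... } notation.
\ sum-to is a hypothetical name used only for this illustration.
: sum-to { n | acc -- sum }   \ n: local argument, acc: local value
   0 TO acc                    \ store before the first read
   BEGIN n 0> WHILE
      acc n + TO acc           \ naming a local fetches it; TO stores
      n 1- TO n                \ local arguments can be changed with TO too
   REPEAT
   acc ;
4 sum-to .                     \ prints 10
```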
Any name ending in the '[' character is reserved for compatibility
with existing implementations.
In the example below, a and b are local arguments, and a+b and a*b
are local values.
: foo { a b | a+b a*b -- }
a b + to a+b
a b * to a*b
cr a+b . a*b .
;
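
For comparison, a rough equivalent written with the existing LOCALS|
notation might look like the sketch below. The name foo-old is
hypothetical. Because LOCALS| initialises every local from the data
stack, two dummy zeros have to be pushed for the locals that only
hold results, and the names appear in reverse order.

```forth
\ Existing notation only; foo-old is a hypothetical name.
: foo-old ( a b -- )
   0 0                        \ dummy initial values for a+b and a*b
   LOCALS| a*b a+b b a |      \ reverse order: top of stack is named first
   a b + TO a+b
   a b * TO a*b
   CR a+b . a*b . ;
3 4 foo-old                   \ prints 7 12
```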
Local types and extensions
--------------------------
Some current Forth systems use indicators to define local values
of sizes other than a cell. It is proposed that any name ending
in a ':' (colon) be reserved for this use.
: foo { a b | F: f1 F: f2 -- c }
...
;
At least one significant Forth implementation uses local value
names ending in the '[' character to indicate local buffers. This
character is reserved to prevent disenfranchising implementations
that have that behaviour. For similar reasons the use of '[' and
'\' as local argument or value names is also reserved.
Discussion
==========
The '|' (ASCII 0x7C) character is widely used as the separator
between local arguments and local values. Other characters
accepted in current Forth implementations are '\' (ASCII 0x5C) and
'¦' (0xA6, which is outside 7 bit ASCII). Since the ANS standard is
defined in terms of 7 bit ASCII, and with regard to
internationalisation, we propose only to consider the '|' and '\'
characters further. Only recognition of the '|' separator is
mandatory.
The use of local types is contentious as they only become useful
if TO is available for these. In practice, some current systems
permit TO to be used with floats (children of FVALUE) and other
data types. Such systems often provide additional operators such
as +TO (add from stack to item) for children of VALUE and FVALUE.
Standardisation of operators with (for example) floats needs to
be done before the local types extension can be incorporated into
Forth200x. Apart from forcing allocation of buffer space, no
additional functionality is provided by local types that cannot
be obtained using local buffers. More preparatory standardisation
needs to be done before local types can be standardised.
It has been noted that one widely used implementation uses brace
for multiline comments. However, inspection of the vendor's code
shows that this use only occurs during interpretation. The
interpretation semantics of brace in this proposal are undefined
in order for that implementation to coexist with this proposal.
Forth 200x text
===============
13.6.2.xxxx {
brace LOCAL EXT
Interpretation: Interpretation semantics for this word are undefined.
Compilation:
( "<spaces>arg1" ... "<spaces>argn" | "<spaces>lv1" ... "<spaces>lvn" -- )
Create up to eight local arguments by repeatedly skipping leading
spaces, parsing arg, and executing implementation defined actions.
The list of local arguments to be defined is terminated by "|",
"--" or "}". Append the run-time semantics for local arguments
given below to the current definition. If a space delimited '|' is
encountered, create up to eight local values by repeatedly skipping
leading spaces, parsing the "lv" token, and creating the local
element. The list of local values to be defined is terminated by
"--" or "}". Append the run-time semantics for local values
given below to the current definition. If "--" has been
encountered, further text between "--" and "}" is ignored.
Local argument run-time: ( x1 ... xn -- )
Initialize up to eight local arguments from the data stack.
Local argument arg1 is initialized with x1, arg2 with x2, and so
on up to argn, which is initialized with xn from the top of the
data stack. When invoked, each local argument will return its
value. The value of a local argument may be changed using
13.6.1.2295 TO.
Local value run-time: ( -- )
Initialize up to eight local values. The initial contents of local
values are undefined. When invoked, each local value returns its
value. The value of a local value may be changed using
13.6.1.2295 TO. The size of a local value is a cell.
The user may make no assumption about the order and contiguity of
local values in memory.
Ambiguous conditions:
a) The { ... } text extends over more than one line.
b) { ... } is declared more than once in a word.
c) Parsing units '|', ']', '--' and '}' are not whitespace delimited.
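
A declaration that avoids all three conditions might look like the
following sketch, which assumes a system implementing the proposed
notation. The word scale and its local names are hypothetical. The
commented-out line shows a form that would fall under condition (c).

```forth
\ Assumes a system that implements the proposed { ... } notation.
\ scale is a hypothetical name used only for this illustration.
\ One line, one declaration per word, every parsing unit space delimited.
: scale { n num den | tmp -- n' }
   n num * TO tmp
   tmp den / ;
6 2 3 scale .          \ prints 4

\ Ambiguous under (c): no space before '|' or '}'.
\ : bad { a b| c -- d} ... ;
```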
Reference implementation
=========================
(currently untested)
0 [if]
BUILDLV c-addr u +n mode
When executed during compilation, BUILDLV passes a message to the
system identifying a new local whose definition name is given by
the string of characters identified by c-addr u. The size of the
data item is given by +n address units, and the mode identifies
the construction required as follows:
  0   - finish construction of initialisation and data storage
        allocation code. C-addr and u are ignored. +n is 0
        (other values are reserved for future use).
  1   - identify a local argument, +n = cell
  2   - identify a local value, +n = cell
  3+  - reserved for future use
  -ve - implementation specific values
The result of executing BUILDLV during compilation of a definition
is to create a set of named local arguments and values, each of
which is a definition name, that only have execution semantics
within the scope of that definition's source.
[then]
: BUILDLV       \ c-addr u +n mode --
\ Dummy for testing: display the name, then +n, then mode.
  CR 2SWAP TYPE SPACE SWAP . .
;
: TOKEN         \ -- caddr u
\ Get the next space delimited token from the input stream.
\ Can be extended to permit multiple line declarations.
  PARSE-NAME
;
: LTERM?        \ caddr u -- flag
\ Return true if the string caddr/u is "--" or "}"
  2DUP S" --" COMPARE 0= >R
  S" }" COMPARE 0= R> OR
;
: LSEP?         \ caddr u -- flag
\ Return true if the string caddr/u is the separator between
\ local arguments and local values.
  2DUP S" |" COMPARE 0= >R
  S" \" COMPARE 0= R> OR
;
: {             ( -- )
  0 >R                          \ indicate arguments
  BEGIN
    TOKEN 2DUP LTERM? 0=
  WHILE                         \ -- caddr len
    2DUP LSEP? IF               \ if '|' or '\'
      2DROP                     \ discard the separator token
      R> DROP 1 >R              \ change to values
    ELSE
      R@ 0= IF                  \ argument?
        1 CELLS 1
      ELSE                      \ value
        1 CELLS 2
      THEN
      BUILDLV
    THEN
  REPEAT                        \ -- caddr len ; "--" or "}"
  BEGIN                         \ skip output names up to "}"
    S" }" COMPARE
  WHILE
    TOKEN
  REPEAT
  0 0 0 0 BUILDLV               \ finish construction
  R> DROP
; IMMEDIATE
--
Stephen Pelc, stephen@...
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeforth.com - free VFX Forth downloads