Have just been browsing in Python Land and have seen
how they deal with unicode chars & strings and lo
and behold it seems that the "u" prefix is how they
solve the unicode syntax problem.
So to keep as much REBOL compatibility as possible
for PRIMO here are my latest suggestions for PRIMO
syntax for chars & strings & unicode & immutables
etc.
>> type? #"A"
== char!
>> type? #'A'
== unicode-char!
>> type? "" ; or {}
== string! ; ascii
>> type? ."" ; or .{}
== immutable-string!
>> type? u"" ; or u{}
== unicode-string!
>> type? .u"" ; or .u{}
== immutable-unicode-string!
>> type? []
== block!
>> type? .[]
== immutable-block!
This way deeply quoted strings etc would work just as
they currently do in REBOL and much compatibility could
be retained.
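A minimal sketch of how a scanner might tell these literals apart by prefix alone. It is written in Python rather than REBOL, and the function name classify and the PREFIXES table are hypothetical; only the datatype names come from the list above.

```python
# Hypothetical prefix table; longest prefixes are listed first so that
# ".u" is tried before "u" and ".".
PREFIXES = [
    (".u", "immutable-unicode-string!"),
    ("u", "unicode-string!"),
    (".", "immutable-string!"),
    ("", "string!"),
]

def classify(literal: str) -> str:
    """Return the proposed datatype name for a char, string or block literal."""
    if literal.startswith('#"'):
        return "char!"
    if literal.startswith("#'"):
        return "unicode-char!"
    if literal.startswith(".["):
        return "immutable-block!"
    if literal.startswith("["):
        return "block!"
    for prefix, name in PREFIXES:
        rest = literal[len(prefix):]
        # A string literal must open with a quote or a brace after its prefix.
        if literal.startswith(prefix) and rest[:1] in ('"', "{"):
            return name
    raise ValueError(f"not a recognised literal: {literal!r}")
```

Because the decision only looks at the first two or three characters, the existing "" and {} rules can stay untouched behind the prefix.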
Is everyone agreeable?
cheers,
Mark Dickson
A small example of how I've implemented the datamodel so far in C++.
The attachment is a PDF document of the class hierarchy.
In this model the values have fixed sizes, with the exception of complex or variable-size values.
Series and larger datatypes are stored in Containers. Containers always have one or more Pointer values attached to them. When a Pointer value is constructed it increases the used integer in the container.
If the used integer hits zero the container is destroyed and all items in that container are released. This could start a chain reaction through a tree of series and deallocate all linked values.
When a Pointer is deallocated it decreases the used integer, removing the container when needed.
In this deallocation scheme cyclic memory structures will not be deallocated.
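The scheme above can be sketched roughly as follows. This is Python, not the C++ from the attachment, and the class and method names (Container, Pointer, release, destroy) are assumptions made for illustration only.

```python
class Container:
    """Storage shared by one or more Pointer values (hypothetical sketch)."""

    def __init__(self, items=()):
        self.items = list(items)
        self.used = 0            # the "used integer": number of live Pointers
        self.destroyed = False

    def destroy(self):
        self.destroyed = True
        items, self.items = self.items, []
        for item in items:
            # Releasing a contained Pointer may destroy its container too,
            # which is the chain reaction through a tree of series.
            if isinstance(item, Pointer):
                item.release()


class Pointer:
    """A fixed-size value that refers to a Container."""

    def __init__(self, container):
        self.container = container
        container.used += 1

    def release(self):
        c = self.container
        c.used -= 1
        if c.used == 0 and not c.destroyed:
            c.destroy()
```

With two containers that hold Pointers to each other, dropping the last outside Pointer leaves both counts at one, so neither is ever destroyed. That is the cyclic case mentioned above.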
I have constructed a parser. It works quite well but it is not finished yet.
On Thu, 1 Nov 2001 Robbo1Mark@... wrote:
> Firstly, your datamodel ideas are of much interest
> to me, just to ensure i understand your proposals
> fully, could you please perhaps show me by way of
> examples how you might implement a few particular
> datatypes. Integer! Decimal! Block! String! etc.
Ok, although variable sized values don't feel like such a good idea
anymore, here are some examples:
Unset! None! Logic! Char! Byte! Unichar! Short! 32bits
word1 bits 0-1 .size field (00)
bits 2-15 .datatype field
bits 16-31 .value field
; value unused for unset! and none!, on/off for logic!,
; 8 bits for char! and byte!, 16 bits for unichar! and short!
Integer! 64bits
word1 bits 0-1 .size field (01)
bits 2-15 .datatype field
bits 16-31 .padding
word2 bits 32-63 .value field
; padding unused, could perhaps be used for base-encoding (bin, oct, hex)
; and/or for additional exponents
Byte-Pair! 32bits
word1 bits 0-1 .size field (00)
bits 2-15 .datatype field
bits 16-23 .byte1
bits 24-31 .byte2
Short-Triple! 64bits
word1 bits 0-1 .size field (01)
bits 2-15 .datatype field
bits 16-31 .short1
word2 bits 32-47 .short2
bits 48-63 .short3
Super-Mega-Byte-Tuple! 128bits
word1 bits 0-1 .size field (11)
bits 2-15 .datatype field
bits 16-19 .index (if we like to treat it as a series :-)
bits 20-23 .length
bits 24-31 .byte1
word2 bits 32-39 .byte2
bits 40-47 .byte3
bits 48-55 .byte4
bits 56-63 .byte5
word3 bits 64-71 .byte6
bits 72-79 .byte7
bits 80-87 .byte8
bits 88-95 .byte9
word4 bits 96-103 .byte10
bits 104-111 .byte11
bits 112-119 .byte12
bits 120-127 .byte13
Block! 128bits
word1 bits 0-1 .size field (11)
bits 2-15 .datatype field
bits 16-31 .padding
word2 bits 32-63 .value field
word3 bits 64-95 .index field
word4 bits 96-127 .length field
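To make the 32bit layout concrete, here is a small Python sketch that packs and unpacks the first word. It assumes bit 0 is the most significant bit of the word (the layouts above do not say), and the datatype number 5 used in the test is made up.

```python
def pack32(size_code: int, datatype: int, value: int) -> int:
    """Pack the .size (2 bits), .datatype (14 bits) and .value (16 bits) fields."""
    if not 0 <= size_code < 4:
        raise ValueError("size field must fit in 2 bits")
    if not 0 <= datatype < (1 << 14):
        raise ValueError("datatype field must fit in 14 bits")
    if not 0 <= value < (1 << 16):
        raise ValueError("value field must fit in 16 bits")
    # Assumption: bit 0 is the most significant bit, so .size sits on top.
    return (size_code << 30) | (datatype << 16) | value


def unpack32(word: int):
    """Return (size_code, datatype, value) from a packed 32bit word."""
    return (word >> 30) & 0x3, (word >> 16) & 0x3FFF, word & 0xFFFF


def size_in_bits(size_code: int) -> int:
    """Size code 0..3 maps to 32, 64, 96 or 128 bits."""
    return 32 * (size_code + 1)
```

For the longer values only word1 is handled here; the remaining words would be read according to the datatype.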
> About decimal!, in my earlier post you will see
> that the REBOL docs do in fact say they are 64bit
> double precision values.
Yes I see, there was quite a lot I didn't know about decimal!s.
> Regards Ascii & Unicode string! types. I wanted to
> try to keep the concept of string syntax very similar
Fair enough.
> to what REBOL currently has where "" & {} are pretty
> much interchangeable and deeply quoted strings are
> quite trivial by mixing and matching quotes "" or
> braces {}.
Not quite. Braces work for deep quotes, as long as you balance them.
But in a "string" you need to escape all other quotation marks, those are
btw probably used more often within braced strings too.
> string syntax though braces {} seemed to be the right
> choice for unicode string!. As Larry Wall of Perl would say
> "Things that are different should look different"
> so hence "" for ascii and {} for unicode.
But it would be nice to not break Rebol compatibility if not needed. I won't
pretend I have a solution for the Unicode syntax, but we should definitely
examine some more options. Some prefix char could perhaps be used:
>> u"Hello"
** Script Error: u has no value
If we make the string rule a little smarter that is. :-)
> Another idea that might be possible is keeping strings
> as is with braces & quotes and requiring unicode strings
> to have a prefix u" or u{ to identify them as unicode
> however i don't think that's very pretty nor elegant.
I do.
> Again with immutable-string! and immutable-block! I like
> the suggestion of the prefix dot options ." .{ .[ as I
> think the dot / period is unobtrusive & elegant but also
Indeed! Btw, how do you like .u{"Hello", he said.} ?
> However my hex!, binary! & octal! are NUMERIC values and
> are stored as 32bit ( or any other number of bits ) values
> like integer! are. They are NUMERIC values and can be used
> just as easily with arithmetic and logic functions without
> conversion from any-string! to number! types as is currently
Aargh. Yes, you're right. Stupid Rebol to treat what should be numbers as
any-strings. I still like the idea of variable bases though, how about we
add that as an option:
ffffhex = ffffbase16 ; true
1011bin = 102base3 ; true
But perhaps this is approaching overkill. ;-)
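As a quick check of the two equalities above, here is a Python sketch of a reader for such suffixed numbers. The function name parse_based is hypothetical, and Python's int() only accepts bases 2 to 36, so larger bases are out of scope here.

```python
def parse_based(literal: str) -> int:
    """Parse literals such as 'ffffhex', '1011bin', '17oct' or '102base3'."""
    suffixes = {"hex": 16, "oct": 8, "bin": 2}
    for suffix, base in suffixes.items():
        if literal.endswith(suffix):
            return int(literal[: -len(suffix)], base)
    # General form: <digits>base<n>; split on the last occurrence of "base".
    digits, sep, base = literal.rpartition("base")
    if sep:
        return int(digits, int(base))
    raise ValueError(f"no base suffix in {literal!r}")
```

Both forms end up as plain integers, so the comparison is just an integer comparison.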
> About Memory-address! datatype, I look for PRIMO to one day
> have the features of the Languages I like best which are
> REBOL, MIT Scheme & F-PC FORTH. F-PC Forth is a fantastic
...
> I know this raises many issues from a practical and security
> perspective, however the power to "use with care" and the
> ability to have full access to any system feature or resource
> is more desirable in my view than artificial limits or
> constraints.
Though I in theory wouldn't mind being able to write system libraries and
drivers in Primo, it takes a lot more than memory access to make this
happen. And of course, you must never ever be able to corrupt the language
or the system below.
This would be one of the last things I'd implement, and only after having
carefully investigated the consequences.
Marcus
------------------------------------
If you find that life spits on you
calm down and pretend it's raining
On Thu, 1 Nov 2001, Ladislav Mecir wrote:
First of all, you did read my block proposal, right? Either way, it's
probably best to clear up some things. I see now that it will indeed slow
down and complicate element access, which is hardly acceptable. On the
other hand, if we want variable sized values in Primo we haven't got much
choice, do we? So let's have a look:
In Rebol:
A block consists of a number of places, that contain values. Each value is
96 bits long, so each place is 96 bits wide.
We know what a value is, so let's not repeat.
My suggestion:
A block consists of a number of places, that contain handles. Each handle
is 32 bits long, so each place is 32 bits wide.
What is a handle? First let's find out what a Primo value could be. A
Primo value is either 32, 64, 96, or 128 bits large. The size of the
value is defined by the two first bits. 0 means 32 ... 3 means 128. It
may be better to interpret it this way: 3 means "fetch 3 more longwords".
A handle is either a 32 bit value, or a 32 bit reference. A reference is
not a pointer, but consists of a 2 bit table id, and a 30 bit offset into
a table. An id of 3 means table 3, but also (note!) that each place in
the table is 4 longwords wide. You would need to multiply the
offset by 4 to get the correct index in the table.
Optionally, you could define a handle to always be a reference, so that 32
bit values are referenced too.
Anyway, you could consider a reference to be a sort of mini-container,
similar to a block, but with just the "length" and the "elements" fields.
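A rough Python sketch of the handle lookup described above. It assumes the two "first" bits are the two most significant bits of the 32 bit handle and that each table is a flat list of longwords; decode_handle, fetch and the tables dictionary are names invented for the sketch.

```python
def decode_handle(handle: int):
    """Split a 32 bit handle into (table_id, start_index, width_in_longwords).

    A table id of 0 means the handle itself is a complete 32 bit value.
    """
    table_id = (handle >> 30) & 0x3
    if table_id == 0:
        return 0, 0, 1
    offset = handle & 0x3FFFFFFF      # the 30 bit offset
    width = table_id + 1              # table 1: 2 longwords ... table 3: 4
    return table_id, offset * width, width


def fetch(handle: int, tables: dict) -> list:
    """Return the longwords that make up the value behind a handle."""
    table_id, start, width = decode_handle(handle)
    if table_id == 0:
        return [handle]               # inline 32 bit value, no table access
    return tables[table_id][start:start + width]
```

The extra work per element access is one shift, one mask, one multiply and one indexed read, which is the constant overhead being weighed against the smaller block size.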
> > that's a lot of space, considering that a regular memory pointer on a
> > 32bit machine is just, yes that's right, 4 bytes. So a Rebol block uses at
> > least 4 times as much memory as really would be needed to access a value.
Oops. That should of course be 3 times, not 4.
> Wrong. Ordinary values do not have to be collected at all. Only the part of
> a value not contained in the ordinary size part needs to be collected. That
> means, that the things, that are collected are:
So what really happens when you unset a word? Does every dictionary (what
you might call context data tables) contain the actual values, and when a
word is unset, the relevant value is replaced by an unset! value? Or, in
more block-like structures, is the word/value pair removed?
Ok, so there can never be such a thing as free "pending" values in Rebol.
That sure deflates much of what I said, like for example:
> > Each value is used exactly once, or not used at all, in which
> > case it is collected. There's no such thing as a shared value.
Oops again. Obviously there's no such thing as an unused value in Rebol,
either the value exists or it doesn't.
> > Recycle iterates through the system dictionary (system/words) and tag all
> > the values found in it as "don't collect". Then the rest is collected.
Replace "values" with "storages" above, and add some recursion (which I
did in the second version... did you miss that one?) and we almost get
your version:
> 1) all storages as above are condemned (marked as collection candidates)
> 2) recycle unmarks Global Context storage
> 3) recycle unmarks storages of all "running" functions
> 4) recycle recursively browses all unmarked storages and unmarks all
> storages of values that are contained in unmarked storages and storages of
> contexts of all words contained in unmarked storages
> 5) after finishing unmarking all storages that remain condemned are
> collected
And, if it turns out that Primo blocks _reference_ rather than _contain_
values, we must add that all unreferenced values remain condemned and are
destroyed. But then yet another problem appears: memory fragmentation!
Shudder. Perhaps variable value sizes weren't a good idea after all. :-(
Marcus
Marcus / Everybody,
Hi there,
You made some very interesting & pertinent points
in your posts, here's a few quick replies &
queries.
Firstly, your datamodel ideas are of much interest
to me, just to ensure I understand your proposals
fully, could you please perhaps show me by way of
examples how you might implement a few particular
datatypes. Integer! Decimal! Block! String! etc.
About decimal!, in my earlier post you will see
that the REBOL docs do in fact say they are 64bit
double precision values.
Secondly, your comments about my new / user defined
datatype ideas.
Regards Ascii & Unicode string! types. I wanted to
try to keep the concept of string syntax very similar
to what REBOL currently has where "" & {} are pretty
much interchangeable and deeply quoted strings are
quite trivial by mixing and matching quotes "" or
braces {}.
I chose to keep ascii char! #"A" types as is, so if
we follow on from that logic then ascii string! should
also use quotes "". I chose to use #'A' form for unicode
char! as I thought SINGLE tick ' would be more easy to
remember in association with UNIcode, besides #{} is
already used for other things, in keeping with REBOL
string syntax though braces {} seemed to be the right
choice for unicode string!.
As Larry Wall of Perl would say
"Things that are different should look different"
so hence "" for ascii and {} for unicode.
I didn't want things to look too different though as
I like syntax to be unobtrusive and elegant and not
in your face butt ugly syntax like perl.
However using quotes for ascii and braces for unicode
presents us with two little problems. The first one is quite
trivial: in REBOL quotes are for single line strings
and braces for multi-line strings, well at least at the
console input. That is easily solved: strings can
remain open across multiple lines until they are terminated
by the string closing character(s). The second problem
requires a little bit more thought: what to do about
deeply quoted strings?
Scheme language solves this problem by requiring all
quotes within strings to have the escape character
immediately preceding the quote ie ^" or \" as it is in
Scheme.
-- An Aside --------
What is the origin / choice / inspiration for the use of
the ^ as the escape character as opposed to \ which is used
in a lot of other languages?
My view is that ^ is much more pretty and unobtrusive.
-- End of Aside -----
However requiring quotes to be escaped can be a real
bummer, especially with deep nesting of quotes.
One way around this is to use a string terminating sequence
and that is why I chose the quote-dot ". and brace-dot }.
combination as string terminators. That way any quotes
or braces within the strings are treated as characters
and don't need to be balanced nor escaped. If you don't like
the dot then perhaps we can look at alternative suggestions
for string terminating character(s) sequences, don't worry
as it's only syntax 8-) , however anything we do choose
should be distinctive, elegant & unobtrusive ;-)
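A small Python sketch of how a scanner could read a string that ends with the quote-dot or brace-dot sequence. The function name read_terminated is hypothetical and the sketch ignores escapes entirely.

```python
def read_terminated(source: str, start: int = 0):
    """Read a string that opens with " or { and ends with ". or }. respectively.

    Returns (body, index_after_terminator). Quotes and braces inside the
    body are ordinary characters and need no balancing or escaping.
    """
    terminators = {'"': '".', "{": "}."}
    opener = source[start]
    if opener not in terminators:
        raise ValueError("expected an opening quote or brace")
    closer = terminators[opener]
    end = source.find(closer, start + 1)
    if end == -1:
        raise ValueError("unterminated string")
    return source[start + 1:end], end + len(closer)
```

One limitation to keep in mind: the body cannot itself contain the terminating sequence, so a quoted sentence that ends in a full stop inside the string would close it early.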
Another idea that might be possible is keeping strings
as is with braces & quotes and requiring unicode strings
to have a prefix u" or u{ to identify them as unicode
however i don't think that's very pretty nor elegant.
Again with immutable-string! and immutable-block! I like
the suggestion of the prefix dot options ." .{ .[ as I
think the dot / period is unobtrusive & elegant but also
makes a statement that these values are distinct & constant!
This will be their value period! hence they are immutable.
What do you think? Again it's only syntax so I am open to
suggestions. 8-)
About based numbers, in PRIMO the current REBOL datatype
binary! will become binary-string! as that more accurately
represents what they are. Sure issue! can still represent
hexadecimal values as well as 2#{} 16#{} and 64#{} binary
or indeed as you say binary-string! n#{} could represent
any number base for 'n as long as there are sufficient
digits and characters to represent a viable number / value
correspondence, so in that respect I like your idea and think
that binary-string! should be given those abilities.
However my hex!, binary! & octal! are NUMERIC values and
are stored as 32bit ( or any other number of bits ) values
like integer! are. They are NUMERIC values and can be used
just as easily with arithmetic and logic functions without
conversion from any-string! to number! types as is currently
required with REBOL hex issue! or Binary! types which cannot
be added, multiplied etc without prior number conversion to
integer!
>> number? #FF
== false
>> number? 16#{FF}
== false
>> any-string? #FF
== true
>> any-string? 16#{FF}
== true
>> add #FF #FF
** Script Error: add expected value1 argument of type: number
pair char money date time tuple
** Near: add #FF #FF
>> add 16#{FF} 1
** Script Error: add expected value1 argument of type: number
pair char money date time tuple
** Near: add #{FF} 1
>> add to-integer #FF to-integer 2#{11111111}
== 510
In my proposals hex!, octal! & binary! are NUMBER! values
and can be used by the arithmetic & logic functions, it is
only their datatype! value which causes them to be displayed
in their appropriate NUMBER format.
Consider the integer number! 1; in my suggestions these
would be the possible numeric formats....
1 integer!
1 decimal!
1b binary! ; 1bin - more visually distinctive?
1o octal! ; 1oct - ditto ?
1h hex! ; 1hex - ditto ?
1/1 ratio!
1+0i complex!
1i imaginary!
I've not totally made up my mind about the final display
format for hex!, octal! & binary! - opinions please?
the 1b 1o & 1h formats are those used in Scheme language
but maybe 1bin 1oct & 1hex is more visually distinctive?
and maybe a more definite sequence for the parser to
identify?
Again I'm canvassing for opinions as to which is more
preferable & elegant syntactically?
About garbage collection, I've bookmarked a few pages and
resources on the web a while ago and put them in my "FUTURE"
study folder as I've only got a vague conceptual understanding
of this topic so any help or advice on this matter would be
greatly appreciated! LADISLAV seems to have thought in depth
about what is happening and perhaps we should investigate his
ideas further.
About Memory-address! datatype, I look for PRIMO to one day
have the features of the Languages I like best which are
REBOL, MIT Scheme & F-PC FORTH. F-PC Forth is a fantastic
forth environment with an interpreter, native code assembler
compiler & disassembler / debugger as well as its own
editor, source & help.
It gives the programmer access to all system levels and
resources at either the native code or FORTH code level,
however with this power comes the need for caution and care
not to violate the registers & memory that F-PC uses for
its own purposes. It doesn't have garbage collection
built in although I do believe that people have implemented
garbage collection routines for FORTH F-PC.
I know this raises many issues from a practical and security
perspective, however the power to "use with care" and the
ability to have full access to any system feature or resource
is more desirable in my view than artificial limits or
constraints.
It is only by being fully open and changeable at ALL levels
can the system truly adapt & survive to meet unforeseen future
needs, otherwise it is a CLOSED system and sooner or later it
will collide with its own limitations or restrictions.
Yes this might necessitate system specific implementations or
perhaps different types of general purpose & system specific PRIMOs
but that is a good thing in my opinion rather than a one size fits
all solution.
Let me know what you think about all this, and MARCUS I would
especially like to see some examples of your datatype implementation
proposals.
cheers,
Mark Dickson
Marcus,
Hi
From the REBOL Core User Guide......
The decimal! data type includes 64-bit standard IEEE floating point
numbers. They are distinguished from integer numbers by a decimal
point.
Format
Decimal values are a sequence of numeric digits, followed by a
decimal point, which can be a period (.) or a comma (,), followed by
more digits. A plus (+) or minus (-) immediately before the first
digit indicates sign. Leading zeros before the decimal point are
ignored. Extra spaces, commas, and periods are not allowed.
1.23
123.
123.0
0.321
0.123
1234.5678
A comma can be used in place of a period to represent the decimal
point (which is the custom in some countries):
1,23
0,321
1234,5678
Use a single quote (') to separate the digits in long decimals.
Single quotes can appear anywhere after the first digit in the
number, but not before the first digit.
100'234'562.3782
100'234'562,3782
Do not use commas or periods to separate the digits in a decimal value.
Scientific notation can be used to specify the exponent of a number
by appending the number with the letter E or e followed by a sequence
of digits. The exponent can be a positive or negative number.
1.23E10
1.2e007
123.45e-42
56,72E300
-0,34e-12
0.0001e-001
Decimal numbers span from 2.2250738585072e-308 up to
1.7976931348623e+308 and can contain up to 15 digits of precision.
Creation
Use the to-decimal function to convert a string!, integer!, block!,
or a decimal! data type to a decimal number:
probe to-decimal "123.45"
123.45
probe to-decimal 123
123
probe to-decimal [-123 45]
-1.23E+47
probe to-decimal [123 -45]
1.23E-43
probe to-decimal -123.8
-123.8
probe to-decimal 12.3
12.3
If a decimal and integer are combined in an expression, the integer
is converted to a decimal number:
probe 1.2 + 2
3.2
probe 2 + 1.2
3.2
probe 1.01 > 1
true
probe 1 > 1.01
false
Related
Use decimal? to determine whether a value is a decimal! data type.
print decimal? 0.123
true
Use the form, print, and mold functions with a decimal argument to
print a decimal value in its simplest form:
integer. If it can be represented as one.
decimal without exponent. If it's not too big or too small.
scientific notation. If it's too big or small.
For example,
probe mold 123.4
123.4
probe form 2222222222222222
2.22222222222222E+15
print 1.00001E+5
100001
Single quotes (') and a leading plus sign (+) do not appear in
decimal output:
print +1'100'200.222'112
1100200.222112
....cheers,
Mark Dickson
Hi,
> On Tue, 30 Oct 2001 Robbo1Mark@... wrote:
>
> > First of from previous investigations it would
> > seem to be that the average size of a rebol value
> > is 16 bytes / 128 bits, see below...
>
> I had to read this twice before actually understanding it. Now I realize
> that's a lot of space, considering that a regular memory pointer on a
> 32bit machine is just, yes that's right, 4 bytes. So a Rebol block uses at
> least 4 times as much memory as really would be needed to access a value.
>
> The reason would be they trade memory for speed. A block actually holds a
> complete value, not just a pointer to it, so you save one memory access.
> It sounds insane, can it really be true, or what have I missed?
>
> Aha, maybe it has something to do with the garbage collection.
garbage collection & speed, sure.
> Values that
> are not bound to a word get collected,
Wrong. Ordinary values do not have to be collected at all. Only the part of
a value not contained in the ordinary size part needs to be collected. That
means, that the things, that are collected are:
context data tables, function data storages, port data storages, series data
storages, error data tables
...
> A problem is that we don't know how the garbage collection works in Rebol.
> Elan (author of some Rebol book) speculated on the list that each value
> has a reference counter. I think he's wrong, there are no reference counts
> involved.
You bet.
> Each value is used exactly once, or not used at all, in which
> case it is collected. There's no such thing as a shared value. This is
> true for all valuetypes, even for blocks. Consider this:
>
> a: []
> b: :a
>
> Did you think that 'b refers to the same value as 'a. Wrong! Values are
> always copied. That 'b and 'a may look the same is only due to the fact
> that they both contain the same pointer value. Their indexes are not
> shared. Their lengths are not shared either, my guess is that the length
> is connected to the pointer (i.e. they are the two values of a Vector).
>
> Anyway, let's get back to the issue. How does garbage collection work in
> Rebol? My guess is that the following happens in a Recycle run:
>
> Recycle iterates through the system dictionary (system/words) and tag all
> the values found in it as "don't collect". Then the rest is collected.
No, it has to work differently IMO:
1) all storages as above are condemned (marked as collection candidates)
2) recycle unmarks Global Context storage
3) recycle unmarks storages of all "running" functions
4) recycle recursively browses all unmarked storages and unmarks all
storages of values that are contained in unmarked storages and storages of
contexts of all words contained in unmarked storages
5) after finishing unmarking all storages that remain condemned are
collected
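The five steps above can be sketched in Python as follows. The storage ids, the roots list and the children mapping are stand-ins invented for the sketch; nothing here is known about how REBOL actually represents them.

```python
def recycle(all_storages, roots, children):
    """Return the set of storages that would be collected.

    all_storages: iterable of storage ids
    roots:        ids of the Global Context storage and of the storages of
                  all "running" functions
    children:     dict mapping a storage id to the ids of storages that are
                  contained in it or are contexts of words contained in it
    """
    condemned = set(all_storages)              # step 1: condemn everything
    worklist = []
    for root in roots:                         # steps 2 and 3: unmark roots
        if root in condemned:
            condemned.discard(root)
            worklist.append(root)
    while worklist:                            # step 4: unmark reachable ones
        storage = worklist.pop()
        for child in children.get(storage, ()):
            if child in condemned:
                condemned.discard(child)
                worklist.append(child)
    return condemned                           # step 5: the rest is collected
```

An explicit worklist replaces the recursion so deep structures do not overflow the call stack. Note that two storages which only reference each other stay condemned, so cycles are collected by this scheme.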
On Tue, 30 Oct 2001 Robbo1Mark@... wrote:
> any-string! / any-block! datamodel ; assumes 32bit word size
> word1 bits 0-31 .datatype field
> word2 bits 32-63 .value field ;32 bit base memory address
> word3 bits 64-95 .index field ; offset from series head
> word4 bits 96-127 .length field ; absolute series length
Some of these are definitely overkill. Didn't I read somewhere that Rebol
values actually fit in 96 bits? Note how first, second, third can be used
on many values, but not fourth.
There are no more datatypes in Rebol than fit in 6 bits. For a series
value, my guess is that value, index, length then would be 30 bits each,
or perhaps 32 bits for the value, and 29 bits each to the others.
Of course, we don't need to follow that model. :-)
> So with this datamodel we can accomodate all the series! type
> values and composite type values, although to confess I'm not
> sure how the bitset! type is implemented in this datamodel.
The data is 256 bits, so it's safe to assume it's a pointer. Wonder how
well a Unicode bitset would work? 65536 bits. :-)
> Tuple! has upto 32bits for .datatype field and 96bits which can
> accommodate 10 x 8bit = 80bits for unsigned byte values as well
> as leaving space for the tuple! length, maximum size 10 byte
> values which needs either 4bits or 3bits, 0-7 offset + 3 which
> gives the valid tuple range of 3-10 values.
Nope, as I said 6 bits for datatype. 80 bits for the values alright, the
size fits in 4 bits, which leaves us 6 bits spare in a 96bit value. If
128 bits were used, tuples could be 14 numbers long.
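The tuple lengths follow from a simple bit budget, worked here in Python. The 6 bit datatype field and 4 bit length field are the figures from the paragraph above; the function name is made up.

```python
def max_tuple_bytes(value_bits: int, datatype_bits: int = 6, length_bits: int = 4) -> int:
    """Number of whole 8 bit tuple elements that fit beside the header fields."""
    return (value_bits - datatype_bits - length_bits) // 8


print(max_tuple_bytes(96))    # 10
print(max_tuple_bytes(128))   # 14
```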
> However 16 bytes / 128bits does seem overkill for some of the
> smaller scalar type values.
> Even assuming 32bits for .datatype field, currently REBOL has
> approx 50 datatypes, which requires only 6bits, even allowing
Oops, sorry if I pointed out the obvious before. ;-)
Actually, let's use the bits to the maximum. As you point out, using 128
bits for all values is overkill. Although having all values be the same
length gives some advantages and is actually required for blocks in Rebol,
if we make Primo blocks contain pointers or references instead of values,
we can have variable-sized values. (read my other mail about blocks)
Some suggestions:
1. Use 2 bits to define if the value is 32, 64, 96 or 128 bits long.
2. Use 14 bits for the datatype number. It won't give us millions of
possible types, but 256 times more than Rebol has. You could even use
the 2 bits above as an offset, giving you 16384 possible types each of
32, 64, 96 or 128 bit values.
3. For 32bit values, use the remaining 16 bits for the value. This allows
storage of unset, none, logic, 16bit ints, chars, and Unicode chars.
For longer values, use them for something else, or simply ignore them.
4. For values >= 64 bits, use the remaining space as defined by the type.
Another cool thing now that I think about it. Let's say Primo blocks
contain 32 bit numbers. You can simply check the two first bits to find
out if it's the real value or a reference. If they are zero, it is an
actual 32bit value. If not, the remaining bits are a reference into the
value tables. Again, you can use the 2 bits to define 3 different value
tables (for 64, 96, 128bit values), which gives you 3 times as many
possible values (about 3 billion) in total. :-)
Ok, so all of this requires some additional logic, but should only add
some constant time to every value fetch. I think it's worth it.
> Note could use 32bit single precision floating point but less accurate
> although could be contained within a 64bit value, however from the
> REBOL docs I'm sure decimals are 64bit double precision.
I'm not sure what the precision currently is (though I have a sneaky
suspicion it's only single). Anyway, the interpreter seems to cut off
decimal! values after 16 digits, including the dot.
> --------- ALTERNATIVE DATAMODEL -------------------------------
> One possible alternative that I've been mulling over is a 64bit
> datamodel in which each PRIMO value has the following;
Could be an option, but I think variable size is much better. :-)
Marcus
On Tue, 30 Oct 2001 Robbo1Mark@... wrote:
> {Unicode string}.
> #'A' ; unicode-char! 2 bytes / 16bit value
Oh, I like this one! :-)
> >> "an "example" of a ""deeply"" quoted ascii string".
> == "an "example" of a ""deeply"" quoted ascii string".
This I don't like at all, it makes parsing much harder.
> An immutable string! .{some text}. or ."some text".
Immutable series are good, but I don't like the string terminator dot.
> A ratio! 355/133 ; similar to pair! but numerator / denominator
Nice! Currently unused too, reported as invalid dates.
> A co-ordinate! -4.56x3.57 ; similar to pair! except 2 decimals!
> A complex! 4.567+3.141i ; as above but real / imaginary
> An imaginary! 3.141i ; numeric as decimal! but postfix-joined i
All fine.
> A hex! FFFFhex ; numeric value as integer! but formatted
I still opt for using either issue! or binary!. There are two advantages
of using any of these:
1. Easier to parse. 16#ffff 16#{ffff} both start with [integer! "#"]
2. Can potentially use any base. 256#{wdowfd%$} is a 256-base integer,
2#1100100100101 is a binary (2-base) number.
> A memory-address! #0000:FFFF ; platform specific - segment / offset hex!
Sorry, but I can't see how this could work in any Rebol-like language.
First, the system will probably not let you access much memory (at least
not give you write access), perhaps just the interpreter's own memory.
Secondly, if you expose the interpreter's own memory, you raise a lot of
security issues, not to mention that all value handling including garbage
collection might no longer work. :-(
Marcus
On Tue, 30 Oct 2001 Robbo1Mark@... wrote:
> First of from previous investigations it would
> seem to be that the average size of a rebol value
> is 16 bytes / 128 bits, see below...
I had to read this twice before actually understanding it. Now I realize
that's a lot of space, considering that a regular memory pointer on a
32bit machine is just, yes that's right, 4 bytes. So a Rebol block uses at
least 4 times as much memory as really would be needed to access a value.
The reason would be they trade memory for speed. A block actually holds a
complete value, not just a pointer to it, so you save one memory access.
It sounds insane, can it really be true, or what have I missed?
Aha, maybe it has something to do with the garbage collection. Values that
are not bound to a word get collected, and in Rebol you can't bind a value
to a block. That's why blocks need to contain full values.
Is there perhaps another way to do it?
Purpose: To decrease the size of blocks to a quarter of Rebol blocks,
while making sure that "contained" (actually referenced) values don't
get collected.
A problem is that we don't know how the garbage collection works in Rebol.
Elan (author of some Rebol book) speculated on the list that each value
has a reference counter. I think he's wrong, there are no reference counts
involved. Each value is used exactly once, or not used at all, in which
case it is collected. There's no such thing as a shared value. This is
true for all valuetypes, even for blocks. Consider this:
a: []
b: :a
Did you think that 'b refers to the same value as 'a? Wrong! Values are
always copied. That 'b and 'a may look the same is only due to the fact
that they both contain the same pointer value. Their indexes are not
shared. Their lengths are not shared either, my guess is that the length
is connected to the pointer (i.e. they are the two values of a Vector).
Anyway, let's get back to the issue. How does garbage collection work in
Rebol? My guess is that the following happens in a Recycle run:
Recycle iterates through the system dictionary (system/words) and tags all
the values found in it as "don't collect". Then the rest is collected.
If this is true (or something similar), what we need might be an extended
form of dictionary. One which can bind both words and containers
(any-blocks) to values. The problem is that blocks are very different to
words. A word can be reset or unset, and then its old value is removed
from the dictionary. But a block is mutable (unless it's immutable ;-),
and it can contain more than one value. It would be better to use this
fact to our advantage.
The following technique will work I think, but it will make garbage
collection slow:
Recycle iterates through the system dictionary (system/words) and tags all
the values found in it as "don't collect". Now all the tagged blocks are
iterated through, and the values found in them are also tagged as "don't
collect". This continues recursively, until all tagged blocks have been
scanned. Then all untagged values can be collected.
Does this make sense, or is it flawed, uncorrect, or just too slow? ;-)
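The recursive tagging pass described above is what is usually called
mark-and-sweep. A minimal sketch in Python follows; the Value class, the
recycle function and the dictionary standing in for system/words are all
assumptions made for illustration, not REBOL's actual collector.

```python
class Value:
    """Hypothetical heap value; `items` is non-empty only for blocks."""
    def __init__(self, name, items=()):
        self.name = name
        self.items = list(items)
        self.marked = False

def recycle(system_words, heap):
    """Tag everything reachable from the dictionary, return the survivors."""
    for value in heap:
        value.marked = False
    pending = list(system_words.values())   # start from the bound values
    while pending:
        value = pending.pop()
        if value.marked:                     # already scanned (also stops cycles)
            continue
        value.marked = True                  # "don't collect"
        pending.extend(value.items)          # scan the contents of blocks
    return [value for value in heap if value.marked]

inner = Value("inner")
outer = Value("outer", [inner])
orphan = Value("orphan")
loop = Value("loop")
loop.items.append(loop)                      # unreachable block containing itself

heap = [inner, outer, orphan, loop]
survivors = recycle({"a": outer}, heap)
```

Here only `inner` and `outer` survive. The unreachable self-referencing block
is collected too, since tagging starts from the dictionary rather than from
counts held by the values themselves.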
Oh btw, more typed series (like strings for chars) would be nice too.
Marcus
------------------------------------
If you find that life spits on you
calm down and pretend it's raining
On Tue, 30 Oct 2001 Robbo1Mark@... wrote:
> >> add 1 1/1/2001
> == 2-Jan-2001
> >> add 1/1/2001 1
> == 2-Jan-2001
> >> add #"a" 1
> == #"b"
> >> add 1 #"a"
> == 98
This is another undocumented behaviour...
> I believe the datatype! number values are the basis of
> "precedence" between datatypes in REBOL for both sorting
> AND datatype! coercion for the result of arithmetic and
> logic calculations.
If that's true, it's a behaviour we should get rid of, since we will be
able to make new datatype values in Primo, which would all probably get
higher numbers.
Or was this only possible at startup?
datatype!: ___make ___datatype! 160
If so, we could define the order ourselves. Still though, I'd prefer that
at least arithmetic results are based on something more flexible than
datatype numbers.
> I say it is a general rule because you will see from the
> last two examples in adding integer! & char! types that
> in the first instance it operates as you might expect and
> increments the char! value, I'm not sure if the last example
> is correct behaviour as in establishing ASCII char! values
> or if it is a BUG / Feature. 8-)
Or perhaps it's an indication that datatype numbers are not strictly
related to the result of those operations...? :-)
Marcus
------------------------------------
If you find that life spits on you
calm down and pretend it's raining
On Mon, 29 Oct 2001 Robbo1Mark@... wrote:
> >> source integer?
> integer?: native ["Returns TRUE for integer values." value [any-type!] 29]
Ah, forgot to use the source. :-) Actually, I used ?? instead, didn't
realize these were two different functions.
> and that is where I based my value for integer!. I'm
> not sure if REBOL is actually zero indexed at this
> level or 29 is the offset value :-)
Most probably zero indexed, looks like the implementation shines through.
I'm surprised that these numbers are exposed like that at the user level.
Wonder what the purpose is?
> What is a datatype?
> It is a REBOL value with the datatype? == datatype!
> REBOL_VALUE_struct
> datatype 3 ; == datatype!
> value 29 ; == integer!
Sounds credible to me.
> In this way the datatype! values can be used with the
> native! function 'MAKE to set the datatype! field on
> the constructed REBOL_VALUE
...
> Hope this makes sense?
Yes it does. Nice example.
> How many datatypes should be available?
This is indeed not an easy question.
Marcus
------------------------------------
If you find that life spits on you
calm down and pretend it's raining
NEW / USER-DEFINED DATATYPES
Further to my recent post about number of possible
datatypes, here are my current thoughts about some
very useful datatypes, each modelled on an existing
datatype but with their own specific & "distinct"
format / parser token identifiers & rules.
"ASCII string".
#"A" ; ascii-char! 1 byte / 8bit value
{Unicode string}.
#'A' ; unicode-char! 2 bytes / 16bit value
>> "an "example" of a ""deeply"" quoted ascii string".
== "an "example" of a ""deeply"" quoted ascii string".
>> print "an "example" of a ""deeply"" quoted ascii string".
an "example" of a ""deeply"" quoted ascii string
>> {an "example" of a "{deeply}" quoted unicode string"}.
== {an "example" of a "{deeply}" quoted unicode string"}.
>> print {an "example" of a "{deeply}" quoted unicode string"}.
an "example" of a "{deeply}" quoted unicode string"
note quotes need NOT balance within strings! all that is
required is initial open string " or { and final string!
terminator ie postfix-joined period ". or }.
An immutable string! .{some text}. or ."some text".
prefix-joined period ." or .{
An immutable block! .[ apple 123 5.4 'literal ]
prefix-joined period .[
A ratio! 355/133 ; similar to pair! but numerator / denominator
A co-ordinate! -4.56x3.57 ; similar to pair! except 2 decimals!
A complex! 4.567+3.141i ; as above but real / imaginary
An imaginary! 3.141i ; numeric as decimal! but postfix-joined i
A hex! FFFFhex ; numeric value as integer! but formatted
An octal! 3770oct ; numeric value as integer! but formatted
A memory-address! #0000:FFFF ; platform specific - segment / offset hex!
These only scratch the surface, still thinking about degrees!,
radians!, kelvin!, ohms!, kilowatt! etc etc. etc. there are
literally hundreds of types that can be modelled on the numeric
datatypes that are basically just a .datatype & .value but for
which the .datatype field identifies semantic meaning & appropriate
display format in a human understandable / recognizable form.
Lots more to say on this........
I'm sure you all can think up some of your own
feel free to show them here, I'm all for new
ideas / inspirations.
cheers,
Mark Dickson
DATAMODEL possibilities?
This is a subject I've been playing around with
in my mind for quite a while now without reaching
any firm conclusions as to the best answer nor
solution.
First off, from previous investigations it would
seem to be that the average size of a rebol value
is 16 bytes / 128 bits, see below...
>> a: system/stats
== 1463472
>> make block! 10000
== []
>> b: system/stats
== 1625264
>> (b - a) / 10000
== 16.1792
>>
From this we see that the space allocated for each
value in the block! is approximately 16 bytes, note
the fraction part is most probably due to REBOL
allocating memory in fixed-size amounts to allow
a certain amount of space for the block to grow via
insertions & appends.
This 16 byte value size fits quite well with modern
CPU word sizes of either 2 x 64 bits or 4 x 32 bits.
What does this 16 byte / 128 bit value size afford us,
let's look at some examples,
any-string! / any-block! datamodel ; assumes 32bit word size
word1 bits 0-31 .datatype field
word2 bits 32-63 .value field ;32 bit base memory address
word3 bits 64-95 .index field ; offset from series head
word4 bits 96-127 .length field ; absolute series length
note the bits / word order of each field may be differently
arranged however on a 32 bit system the use of 32 bits for
the .value base address, .index & .length fields allows the
series! to grow to the maximum system limits if necessary.
Of course on a 64bit system these values could be packed into
2 x 64 bit word values, however this would only enable the
series to grow to the same maximum as a 32 bit system
implementation.
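As a rough illustration of the four-word layout above, here is a Python
sketch using the standard struct module. The little-endian byte order, the
function name and the sample address are assumptions; the type number 39 for
string! is the one quoted later in this post.

```python
import struct

# Four unsigned 32-bit words: .datatype, .value (base address), .index, .length
CELL = struct.Struct("<4I")   # little-endian is an assumption

def pack_series(datatype, base_address, index, length):
    """Pack one any-string! / any-block! value into a 16-byte cell."""
    return CELL.pack(datatype, base_address, index, length)

cell = pack_series(39, 0x0000FFFF, 0, 5)   # a 5-character string! at its head
fields = CELL.unpack(cell)
```

The packed cell is exactly 16 bytes, matching the size measured with
system/stats.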
So with this datamodel we can accommodate all the series! type
values and composite type values, although to confess I'm not
sure how the bitset! type is implemented in this datamodel.
Tuple! has up to 32bits for .datatype field and 96bits which can
accommodate 10 x 8bit = 80bits for unsigned byte values as well
as leaving space for the tuple! length. The maximum size is 10 byte
values, so the length needs either 4bits, or 3bits holding a 0-7
offset + 3, which gives the valid tuple range of 3-10 values.
However 16 bytes / 128bits does seem overkill for some of the
smaller scalar type values.
Even assuming 32bits for .datatype field, currently REBOL has
approx 50 datatypes, which requires only 6bits, even allowing
for growth to 256 types would require 8bits / 1 byte, here is
what some scalars values would need.
Integer! 64bits
word1 bits 0-31 .datatype field
word2 bits 32-63 .value field ; 32bit signed integer
Date! 64bits
word1 bits 0-31 .datatype field
word2 bits 32-39 .day field ; unsigned byte
bits 40-47 .month field ; unsigned byte
bits 48-63 .year field ; 16bits upto 65536 years
OR ALTERNATIVE Date! 64bits
word1 bits 0-31 .datatype field
word2 bits 32-63 .value field ; 32bit unsigned integer
4294967296 possible days == upto 11.76 million years
requires algorithm to calculate each date from days from
base date; 1st January 0000AD ?
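A small sketch of the day-count alternative, in Python. The base date used
here is Python's earliest representable date (1 January of year 1), which is
only a stand-in for whatever base date PRIMO would choose; the function names
are hypothetical.

```python
from datetime import date, timedelta

BASE = date(1, 1, 1)   # assumption: stand-in for the chosen base date

def date_to_days(d):
    """Encode a date as an unsigned day count from BASE."""
    return (d - BASE).days

def days_to_date(n):
    """Decode a day count back into a calendar date."""
    return BASE + timedelta(days=n)

# add 1/1/2001 1 becomes plain integer addition on the stored value
next_day = days_to_date(date_to_days(date(2001, 1, 1)) + 1)

# Range check for a 32-bit count, using the mean Gregorian year length
million_years = 2**32 / 365.2425 / 1e6
```

With this encoding date arithmetic needs no calendar logic at all; the
calendar algorithm is only needed when a date is loaded or printed.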
Unset! 64bits
word1 bits 0-31 .datatype field
word2 bits 32-63 .value field ; Does Unset require a value? 0?
Logic! 64bits
word1 bits 0-31 .datatype field
word2 bits 32-63 .value field ; requires only 1bit test off/on
None! 64bits
word1 bits 0-31 .datatype field
word2 bits 32-63 .value field ; Does None require a value? 0?
Char! 64bits
word1 bits 0-31 .datatype field
word2 bits 32-63 .value field ; requires only 8bit unsigned byte
Decimal! 96bits
word1 bits 0-31 .datatype field
word2 & 3 bits 32-95 .value field ; 64bit double precision floating point
Note could use 32bit single precision floating point but less accurate
although could be contained within a 64bit value, however from the
REBOL docs I'm sure decimals are 64bit double precision.
Pair! 96bits
word1 bits 0-31 .datatype field
word2 & 3 bits 32-95 .value field ; 2 x 32bit signed integers
So we can see that 16 bytes / 128bits easily accommodates all (bitset?)
REBOL values although it wastes a LOT of space for some smaller scalar
type values.
--------- ALTERNATIVE DATAMODEL -------------------------------
One possible alternative that I've been mulling over is a 64bit
datamodel in which each PRIMO value has the following;
Any-type! 64bits
word1 bits 0-15 .datatype field 16bits up to 65536 datatypes
word1 bits 16-31 .reserved for datatype! specific options
word2 bits 32-63 .value field ; 32bit scalar or pointer value
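The 64bit layout above can be sketched with plain shifts and masks. This
Python version treats bit 0 as the least significant bit, which is an
assumption (the post does not fix a bit order), and the function names are
hypothetical.

```python
def pack_value(datatype, options, value):
    """Pack a 16-bit type, 16 reserved bits and a 32-bit payload into 64 bits."""
    if not 0 <= datatype < (1 << 16) or not 0 <= options < (1 << 16):
        raise ValueError("datatype and options must each fit in 16 bits")
    return ((value & 0xFFFFFFFF) << 32) | (options << 16) | datatype

def unpack_value(cell):
    """Return (datatype, options, raw 32-bit payload)."""
    return cell & 0xFFFF, (cell >> 16) & 0xFFFF, (cell >> 32) & 0xFFFFFFFF

def as_signed32(raw):
    """Reinterpret a raw 32-bit payload as a signed integer."""
    return raw - (1 << 32) if raw & 0x80000000 else raw

five = pack_value(29, 0, 5)          # integer! 5
minus_seven = pack_value(29, 0, -7)  # integer! -7, stored as two's complement
```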
Examples
Integer!
word1 bits 0-15 .datatype field ; type 29 integer?
word1 bits 16-31 .unused
word2 bits 32-63 .value ; 32bit signed integer
Decimal!
word1 bits 0-15 .datatype field ; type 30 decimal?
word1 bits 16-23 .offset field ; base address offset
word1 bits 24-31 .size-of field ; value size field
word2 bits 32-63 .value ; 32bit base memory address pointer
consider
decimal! 3.141592643e0
datatype == 30
offset == 0
sizeof == 8 bytes / 64bits / 2 words ; size of a double
value == 0000:FFFF ; 32bit pointer to memory address
stored at address 0000:FFFF is the 64bit double precision
floating point value == 3.141592643e0
another example
string! "Hello"
datatype == 39 string?
offset == 8 bytes / 64bits / 2 words ; see note below
sizeof == 1 byte / 8bits ; size of each character
value == 0000:FFFF ; 32bit pointer to memory address
note1
offset is 2 words because the first two 32bit values
are the string.length and string.index fields.
note2
sizeof is 1 because each character occupies 1 byte
note3
stored at memory address 0000:FFFF is a series of values,
the first 32bits contain the string.length == 5, the next
32bits contain string.index == 0, each subsequent byte
contains the characters 'H', 'e', 'l', 'l', 'o' etc.
Where the datatype necessitates that the .value field contains
a pointer value then the data is accessed via
base-address + offset ( + .index for series! types )
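To make the access rule concrete, here is a Python sketch of the string!
layout from note3: a 32bit length, a 32bit index, then one byte per character.
The bytearray standing in for memory, the base address 16 and the function
names are all hypothetical, and little-endian order is assumed.

```python
import struct

memory = bytearray(64)   # stand-in for raw memory
BASE = 16                # hypothetical base address held in the .value field

def store_string(mem, base, text, index=0):
    """Write string.length, string.index and the characters at `base`."""
    data = text.encode("ascii")
    struct.pack_into("<II", mem, base, len(data), index)
    mem[base + 8 : base + 8 + len(data)] = data

def char_at(mem, base, offset, sizeof, position):
    """Read the character at base-address + offset + (.index + position) * sizeof."""
    length, index = struct.unpack_from("<II", mem, base)
    if not 0 <= index + position < length:
        raise IndexError("position is outside the series")
    return chr(mem[base + offset + (index + position) * sizeof])

store_string(memory, BASE, "Hello")
first = char_at(memory, BASE, offset=8, sizeof=1, position=0)
last = char_at(memory, BASE, offset=8, sizeof=1, position=4)
```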
Does this make sense? can anybody suggest obvious flaws or
improvements or modifications or even a whole different model?
Please discuss / correct / critique etc.
cheers,
Mark Dickson
More On Datatypes
Hello again.....
To follow on from my last post, if we accept
that the predicate numbers for datatypes are
correct and that integer! is 29, decimal! is
number 30 etc. then I think that this ordering
of the datatypes is NOT random, well at least
not for the numeric & logic type values.
Consider the following;
>> sort reduce [ decimal! char! integer! money!
pair! date! time! logic! none!]
== [none! logic! integer! decimal! money! time!
date! char! pair!]
>> add 1 2.3
== 3.3
>> add 3.3 -1
== 2.3
>> add 2.3 $1
== $3.30
>> add 1 1x1
== 2x2
>> add 1 1/1/2001
== 2-Jan-2001
>> add 1/1/2001 1
== 2-Jan-2001
>> add #"a" 1
== #"b"
>> add 1 #"a"
== 98
I believe the datatype! number values are the basis of
"precedence" between datatypes in REBOL for both sorting
AND datatype! coercion for the result of arithmetic and
logic calculations.
You will see from the above that the general rule is that
when there are operands of differing types then the result
is of the higher datatype value suggesting that REBOL does
a MAX function comparison of the operand datatype values to
establish type precedence.
I say it is a general rule because you will see from the
last two examples in adding integer! & char! types that
in the first instance it operates as you might expect and
increments the char! value, I'm not sure if the last example
is correct behaviour as in establishing ASCII char! values
or if it is a BUG / Feature. 8-)
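The proposed MAX rule is easy to state in code. This Python sketch uses only
the predicate numbers that have actually been listed on this list (integer!
29, decimal! 30, money! 31, char! 34); the function name is hypothetical and
the rule itself is a guess about REBOL, not documented behaviour.

```python
TYPE_NUMBER = {"integer!": 29, "decimal!": 30, "money!": 31, "char!": 34}

def predicted_result_type(left, right):
    """Result type under the guessed rule: the operand type with the higher number."""
    return max(left, right, key=TYPE_NUMBER.__getitem__)

int_dec = predicted_result_type("integer!", "decimal!")    # add 1 2.3
dec_money = predicted_result_type("decimal!", "money!")    # add 2.3 $1
int_char = predicted_result_type("integer!", "char!")      # add 1 #"a"
```

The first two predictions agree with the console session above. The third
predicts char!, whereas the session shows add 1 #"a" returning 98, an
integer!, so the rule cannot be the whole story.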
I will have more to write soon on my DATA-model theories which
I've been wrestling with for the past few months trying to
establish a coherent, simple, consistent & elegant approach to
the size of REBOL / PRIMO values, I've thought of perhaps a
couple of solutions which I'll outline here soon to get feedback,
corrections, improvements, clarity, modifications etc. etc.
More on this soon when I get time to properly write down my often
jumbled thoughts and scribbles on these matters.
cheers folks,
Mark Dickson
Marcus / Everybody,
Just to go over some "OLD" ground, what we previously
discovered about datatypes was the following;
Each Predicate for each datatype! has a number in the
last field of its spec.
for example;
>> source integer?
integer?: native ["Returns TRUE for integer values." value [any-type!] 29]
>> last third get 'integer?
== 29
>> last third get 'decimal?
== 30
>> last third get 'word?
== 22
>> last third get 'get-word?
== 24
>> last third get 'set-word?
== 23
>> last third get 'lit-word?
== 25
>> last third get 'any-word?
== 15
>> last third get 'char?
== 34
>> last third get 'money?
== 31
>> last third get 'unset?
== 1
>> last third get 'error?
== 2
>> last third get 'datatype?
== 3
>> last third get 'native?
== 5
>> last third get 'action?
== 6
>> last third get 'routine?
== 7
>> last third get 'struct?
== 11
>> last third get 'library?
== 12
>> last third get 'port?
== 13
>> copy/part first system/words 35
== [end! unset! error! datatype! context! native! action! routine! op! function!
object! struct! library! port! any-type! any-word!...
>>
If you do all the examples then only the mysterious
datatypes end!, context! & symbol! are missing a
predicate & associated number.
You're correct Marcus when you say that integer! is word 30 in a
one based indexing system. It is interesting that in a one based
indexing system unset! is word number 2 but has a predicate number
1. integer? has number 29 and that is where I based my value for
integer!. I'm not sure if REBOL is actually zero indexed at this
level or 29 is the offset value :-)
Either way there is an interestingly close correlation
between the predicate number and the datatype word!
position.
If we assume the predicate numbers as the correct
values then consider these examples of datatype!
operations.
What is a datatype?
It is a REBOL value with the datatype? == datatype!
If we use our model of REBOL values having a .datatype
part and in most cases a .value part then the following
model for datatype values makes most sense.
REBOL_VALUE_struct
datatype 3 ; == datatype!
value 3 ; == datatype!
This way Datatype! represents itself as a valid type.
Some more examples
REBOL_VALUE_struct
datatype 3 ; == datatype!
value 29 ; == integer!
REBOL_VALUE_struct
datatype 3 ; == datatype!
value 30 ; == decimal!
REBOL_VALUE_struct
datatype 3 ; == datatype!
value 0 ; == end!
REBOL_VALUE_struct
datatype 3 ; == datatype!
value 1 ; == unset!
REBOL_VALUE_struct
datatype 3 ; == datatype!
value 2 ; == error!
REBOL_VALUE_struct
datatype 3 ; == datatype!
value 22 ; == word!
In this way the datatype! values can be used with the
native! function 'MAKE to set the datatype! field on
the constructed REBOL_VALUE
for example
>> index? find first system/words 'make
== 189
>> Make integer! 5.5
== 5
can be reduced to
MAKE ; == .datatype 22 (word!) .value 189 (system/words index? )
INTEGER! ; == .datatype 3 (datatype!) .value 29 (integer!)
5.5 ; == .datatype 30 (decimal!) .value 5.5 (floating point)
RESULT is
5 ; == .datatype 29 (integer!) .value 5 ( 32 bit integer )
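The same reduction can be written as a small Python sketch. The Value tuple
and the make function are hypothetical stand-ins for REBOL_VALUE_struct and
the native MAKE; only the integer! case is filled in, and the type numbers are
the predicate numbers listed above.

```python
from collections import namedtuple

Value = namedtuple("Value", "datatype value")

DATATYPE, INTEGER, DECIMAL = 3, 29, 30   # predicate numbers from the listing

def make(type_value, source):
    """Build a new value whose .datatype field comes from a datatype! value."""
    if type_value.datatype != DATATYPE:
        raise TypeError("first argument must be a datatype! value")
    if type_value.value == INTEGER:
        return Value(INTEGER, int(source.value))   # drops the fraction
    raise NotImplementedError("only integer! is sketched here")

datatype_type = Value(DATATYPE, DATATYPE)   # datatype! represents itself
integer_type = Value(DATATYPE, INTEGER)     # the value the word integer! holds
result = make(integer_type, Value(DECIMAL, 5.5))
```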
Hope this makes sense?
More about datatypes & datamodels tomorrow.
Question
How many datatypes should be available?
How many bits should we allocate for the datatype! field?
8bits == 256 possible types
16bits == 65536 possible types
n bits == power 2 n possible types?
I have some interesting theories based on 32bit word /
64bit double word datamodel, but more on that soon
as I don't have time tonight.
cheers folks,
Mark Dickson
On Wed, 24 Oct 2001, Mark Dickson wrote:
> ADD: get-arg(s) check-arg(s) do-add-op make-type return ;
>
> get-arg(s) ; get next reduced value from input-block, set parameter-
> name & push value onto the data-stack.
To get next reduced value is easier said than done. I know this from
working on my interpreter emulator. There can be so much recursion
involved. And then I still haven't considered fetching literal or
unevaluated arguments (i.e func ['arg1 :arg2]).
> register. In REBOL integer! is currently datatype! type value 29.
index? find first system/words 'integer! ; == 30
Close enough. But perhaps you don't consider end! Please do, they are used
to mark end of input AFAIK. Or perhaps you used 0-base index? Sorry. :-)
> So in our example we now have a top down data-stack containing 2 3
> with 3 being the top value. As well as a value in the return-stack or a
> local register containg the byte or integer value of the result datatype!
Wonder if registers could be used to emulate the data-stack for a register
VM as Parrot? Hmm... :-/
> This is a simplified example and doesn't take into account the
> difference between byte, integer & floating point addition routines,
My guess is that a switch selects the correct "real" function.
> Lets start with PRISMATIC the language kernel.
> Iam sure both Daan & Marcus will have much to say, discuss, design
> and also implement for this.
Me? No, not much. ;-)
> Ideally PRISMATIC will be a small and efficient language kernel and
> written in either native assembly or C / C++. *** SPEAK UP DAAN & MARCUS
Well, considering that Daan has probably implemented quite a bit in C it's
no longer much of a choice. If starting over from scratch though, I'd
personally go for C++ with STL. But if we target a VM instead of writing
our own interpreter, it doesn't really matter much.
> PRISMATIC is the basis for a PRIMO/Tiny & PRIMO/Core language
Sounds good.
> We can use REBOL initially to parse input source code & convert it
> into a useable internal code for PRISMATIC, which is all just
Eventually I'll try to target Parrot, but so far it seems to be missing
some structures needed for datatypes... Hmm, suddenly this abstraction
doesn't feel quite right. At which point do we translate to VM code?
Does it need to know about Primo datatypes at all? :-/
> and provide complete access to your computer system PRISM would be
> the Macro Assembler capable of producing native machine code for host or
> target system.
Oh. Unless it's a JIT compiler, dynamic code must be banned. Yep, that's
right, only immutable code-blocks. :-)
Marcus
------------------------------------
If you find that life spits on you
calm down and pretend it's raining
PRISM, PRISMATIC, PRIMO
Hello again everybody,
The guy from the computer repair shop phoned
today to say that he managed to make good my
home pc although at the cost of a complete
modem replacement, so although I'll be a few
pounds worse off 8-( at least I'll be back online
on a regular basis. 8-)
As I said previously I've been down at the hardware
& machine code level for the past few months, looking
at languages from the perspective of what lies beneath.
Here are my current thoughts on the best way to bootstrap
the OSCAR / PRIMO project. As I see it there are three
separate interconnected aspects to the problem / solution.
PRISM: Primo Macro Assembler
PRISMATIC: Primo Language Kernel / Virtual Processor or VM
PRIMO: Top Level REBOL like language.
I'll say a bit more about each of these in a moment.
I see this as mirroring the current situation where we have
1. Native Machine Code Instructions
2. REBOL Language Kernel coded in ANSI C Standard Libraries*
3. REBOL Top Level Functional Language
*Level 2 would seem to be a stack based kernel / virtual machine
whose specification is platform agnostic but each platform is
coded in a combination of standard C libraries and system specific
libraries.
Ie the MS Windows implementation might make use of such things as ...
WSOCK32.dll , KERNEL32.dll , USER32.dll , MSVCRT.dll , GDI32.dll etc.
Each Implementation of the Kernel is cleverly mapped to take advantage
of host system features.
Back to what we might do.
I think just now, unless anybody strongly objects, we ought to limit
ourselves to designing and implementing a basic, but fully extensible,
console system for Linux & DOS (Yes DOS!) on Intel X86 architecture.
I say DOS instead of Windows because at this stage it is a much less
complex programming environment but still presents full input / output
access to the file system. Only later when we get to the requirements
of the network protocols & interfacing Winsock.dll should we step up and
re-implement the windows specific parts. I say this because although
the network protocols are very important to the functionality of REBOL
and also very interesting & powerful to program to & for, they are not
critical at this stage to the basis of the language and to getting a
working interpreter.
PRISM, PRISMATIC & PRIMO
Consider this REBOL expression...
>> ADD 2 3
this would LOAD to an input-block of ; [ add 2 3 ]
If we assume that the REBOL Language Kernel is a stack-based (ie postfix)
virtual machine then the internal definition of the action! :ADD might be
something like this....
ADD: get-arg(s) check-arg(s) do-add-op make-type return ;
get-arg(s) ; get next reduced value from input-block, set parameter-name &
push value onto the data-stack.
check-arg(s) ; type? check parameter against legitimate datatypes!
in implementation terms probably a byte or integer value comparison,
flag any errors!, push result-datatype! to the return stack or a local
register. In REBOL integer! is currently datatype! type value 29.
So in our example we now have a top down data-stack containing 2 3 with
3 being the top value, as well as a value in the return-stack or a local
register containing the byte or integer value of the result datatype!
do-add-op ; is a destructive addition of the top two data-stack elements &
replaces them on the data-stack with their sum value.
In PRISM/IntelX86 assembler it might look something like this...
do-add-op:
pop eax ;
pop ebx ;
add eax,ebx ;
push eax ;
next ;
This would leave the value 5 on top of the data-stack.
Make-type ; creates a REBOL/PRIMO value of the result datatype and
top of stack value.
In our example this is the integer! "5"
return ; passes control back to the interpreter
We are now at the tail of our input-block which is now reduced to [5]
which is the returned result.
REBOL/PRIMO prints the system/console/result & the 'result value in
the correct format according to its datatype!
== 5 ; integer!
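The whole ADD walk-through can be sketched in a few lines of Python. This is
only an illustration of the steps named above: the function name is
hypothetical, the arguments are taken as already reduced, and only integer!
(type value 29) is handled.

```python
INTEGER = 29   # datatype! type value for integer! quoted above

def evaluate_add(input_block):
    """Run get-arg(s), check-arg(s), do-add-op and make-type for [add x y]."""
    data_stack, return_stack = [], []

    # get-arg(s): fetch the next two values and push them on the data-stack
    args = input_block[1:3]
    for arg in args:
        data_stack.append(arg)

    # check-arg(s): type check, then push the result datatype
    if len(args) != 2 or not all(isinstance(arg, int) for arg in args):
        raise TypeError("add expects two integer! arguments in this sketch")
    return_stack.append(INTEGER)

    # do-add-op: destructive add of the top two data-stack elements
    right = data_stack.pop()
    left = data_stack.pop()
    data_stack.append(left + right)

    # make-type: pair the result datatype with the top of stack value
    return return_stack.pop(), data_stack.pop()

result = evaluate_add(["add", 2, 3])
```

Here `result` is the pair (29, 5), i.e. the integer! 5 that the console would
print as == 5.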
This is a simplified example and doesn't take into account the difference
between byte, integer & floating point addition routines, type coercions
etc.
What it does show is that all of REBOL can be reduced to relatively simple &
primitive operations on 8, 16, 32 & 64 bit values, creating higher level
constructs like Datatypes & polymorphic functions and the possibility of
defining WORD!s to represent them.
What I would like to see & discuss further is the three things I mentioned
above: PRISM, a macro assembler, PRISMATIC the kernel / vp / vm, and PRIMO
our REBOL like language.
Let's start with PRISMATIC the language kernel.
I am sure both Daan & Marcus will have much to say, discuss, design and
also implement for this. We need to design & discuss this as a matter of
priority as it is upon this that everything else is built.
Ideally PRISMATIC will be a small and efficient language kernel and written
in either native assembly or C / C++. *** SPEAK UP DAAN & MARCUS ***
There are lots of examples available to match our ideas & designs from.
I have details to hand for EFORTH, PARROT, JAVA-VM. Anybody got anything
about Tao/Amiga Virtual Processor? We should also be able to test out our
definitions of higher level PRIMO constructs in these environments,
especially if we limit ourselves to a few simple datatypes & functions like
the [ADD 2 3] example above.
PRISMATIC is the basis for a PRIMO/Tiny & PRIMO/Core language
implementation. With the correct mix of op-codes & functionality a fully
extensible language can be created. It is the inner interpreter that deals
only with OP-CODES (WORDS!) and 8, 16, 32 & 64 bit values (BYTES!, NUMBERS! /
POINTERS!)
We can use REBOL initially to parse input source code & convert it into
a useable internal code for PRISMATIC, which is all just combinations of
hex numbers & byte values anyway.
I am looking to Daan, Marcus and anybody else who wishes to give it their
best ideas & efforts to flesh out the design & specification of PRISMATIC
or come up with something better.
PRIMO is the higher level REBOL like language specification built on
PRISMATIC ( or JAVA-VM, PARROT, TaoVP, MSDotNet or whatever 8-)
We need to start writing down either algorithms and / or lower level code
examples of at least the absolute minimum core wordset required. REBOL
Make-Spec or the newer Make-Doc can help to produce quality HTML
documentation in this respect.
I am thinking here of essential words like 'make, 'print etc. the
datatypes!, the arithmetic & logic operators, series! operators &
functions, loop constructs etc.
Detailed description & design specification documentation will make
implementation so much easier. I am willing to do the donkey work here as
well as work on other areas like PRISMATIC, anybody else willing to do
documentation work please make yourself known. Please SHOUT!
I've left PRISM till last because although to me it is appealing and in
line with my current topic of interest it is not absolutely essential to
PRISMATIC or PRIMO in the short term. We can proceed by using other
assemblers or compilers whether these be GCC, Turbo C, VisualStudio, MASM,
Netwide Assembler, JAVA, Eforth or whatever.
To be truly extensible both above at the PRIMO level and below at PRISMATIC
level and provide complete access to your computer system PRISM would be
the Macro Assembler capable of producing native machine code for host or
target system.
The Parse capabilities of REBOL provide one of the most powerful tools for
writing parsers, compilers and assemblers. Ideally PRISM would initially be
written as a Macro Assembler for Intel X86 or Pentium as I imagine that's
what most of us are currently using. Apologies to MAC & PPC people, if my
assumptions are wrong let me know.
At some stage I see PRIMO being extensible with native code routines
assembled / compiled inline. I envisage a special kind of block! datatype
called native-code!
here's an example...
>> do-native code[ pop eax ; pop ebx ; xor eax,ebx ; push ebx ; next ]end
'do-native is a function which takes a native-code! block as its argument
and assembles / compiles the native code instructions.
code[ ]end are the token identifiers for the start & end of native-code!
blocks and direct the parser to detect valid assembler statements as
opposed to valid REBOL / PRIMO expressions.
The contents of code[ ]end are PRISM Macro Assembler statements which make
sense for either your host system or target machine.
Whilst working at the machine code / assembler level there are lots of free
or shareware tools available to use & learn from.
Amongst these are NASM Netwide Assembler, GAS GNU Assembler, and ASM
Project, which is interesting in that the same Macro Assembler Code can be
used for different CPUs simply by using a TARGET (CPU) directive
instruction and the back end outputs CPU specific binary according to the
appropriate CPU instruction set dictionary. For each new CPU target all you
need add is a block that matches the Macro Assembler Words to the CPU
Instructions & register Bytecodes.
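A minimal sketch of that table-driven idea, in Python. The dictionary layout
and the function names are hypothetical and say nothing about how ASM Project
itself is built. The four byte encodings are the standard 32-bit x86 forms
(POP r32 is 58+r, PUSH r32 is 50+r, ADD r/m32,r32 is 01 /r) and cover only the
do-add-op routine shown earlier.

```python
# One "instruction set dictionary" per target CPU: mnemonic text -> machine bytes
X86_32 = {
    "pop eax": b"\x58",
    "pop ebx": b"\x5b",
    "add eax,ebx": b"\x01\xd8",
    "push eax": b"\x50",
}
TARGETS = {"x86": X86_32}   # a new CPU would only add another dictionary here

def normalise(line):
    """Drop a trailing ; comment and collapse whitespace in one source line."""
    code = line.split(";")[0]
    return " ".join(code.split()).replace(", ", ",").lower()

def assemble(source, target):
    """Translate each non-empty source line using the chosen target's table."""
    table = TARGETS[target]
    statements = [normalise(line) for line in source.splitlines()]
    return b"".join(table[s] for s in statements if s)

machine_code = assemble(
    """
    pop eax      ;
    pop ebx      ;
    add eax, ebx ;
    push eax     ;
    """,
    target="x86",
)
```

The front end never mentions a CPU; retargeting means adding one more entry
to TARGETS.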
Another interesting piece of shareware is Ketman.exe which is a DOS Console
application which is an excellent Integrated Development Environment for
Intel X86 based assembler. It has an Editor, Assembler,
Debugger/Disassembler, Graphic Console which shows step by step changes in
Register Values & Flag Settings.
It also has an Assembler INTERPRETER, yes interpreter! which is the
complement of a debugger in that it interactively Assembles Mnemonic Source
code on a per line basis rather than assemble and link a whole source
program. All you need do is make & test changes to your line by line source
code and compile each change rather than the whole program. The debugger
can only poke changes into specific memory words / bytes which is okay
unless you need to insert additional or remove unnecessary or incorrect
instructions. I'll post it here. I recommend you take a look at it, KETMAN
is a very interesting approach to Assembly level programming.
The demo version of Ketman is limited to assembling 64k programs but if you
buy a registered copy you have full access to memory. Downsides to KETMAN
are that it only deals in a subset of INTEL X86 and does not deal with
newer 32 bit or Pentium specific instructions. But nevertheless Ketman is
interesting from a design point of view especially the Macro Assembly
Interpreter which is a really cool feature and allows interactive &
productive writing & testing of assembly code routines. Ketman also
includes a good series of introductory modular chapters on learning &
programming in assembly / machine code level for those who are interested.
I've also got some good documentation for those who wish to learn more
about how their computer really works, including the Art of Assembly
Language Programming book in its entirety in PDF format.
If anyone is interested in the PRISM route we could do a lot worse than
look at the design of some of the features of KETMAN and the ASM Project
with respect to producing our own Macro Assembler programming and debugging
tools which would truly enable us to bootstrap our own projects. However
there is plenty to be done at the higher levels of PRIMO and PRISMATIC and
there are plenty of quality tools already available, editors, assemblers,
compilers etc. that we can use and get free, so I'll understand if people
think that PRISM is not something we need right now, though I hope they'll
see the potential and necessity of them in building a truly extensible
PRIMO programming language and system.
Let me know what you all think about PRISMATIC, PRIMO & PRISM.
PS Did you read the article about REBOL on www.osnews.com
In the interview Carl Sassenrath talks about the tremendous progress REBOL
and the new IOS is about to make in the world of X-Internet & distributed
computing. He makes a great point about using REBOL as a tool for both the
semantic exchange and processing of information without the mish mash of
SOAP, XML and all the other multitude of XML based languages and protocols
that DotNet requires. I AGREE with him wholeheartedly, REBOL potentially is
a much better way! Although when a new technology like DotNet or an
existing one like JAVA have huge companies like Microsoft & Sun backing
them along with a whole slew of industry & corporate partners it is hard to
see how such corporate might won't crush the competitors in the market
place. The best technology doesn't always win where money talks and it
seems to me that a great chunk of corporate software & support industry
earns its income and continuing existence from preserving such complexity
and making and managing "upgrades" to their clients systems. I hope REBOL
wins but I wouldn't bet money on it, however their new secret partner who
will put REBOL on 30 million desktops (Microsoft? AOL?) should hopefully
provide them with a decent income and help promote the spread of REBOL as a
technology.
Hopefully the fact that REBOL is a closed system, and neither as fully
featured nor as widely supported as JAVA or DotNET, won't kill REBOL, and
those of us with REBOLlish interests will win the day.
However even if RT Inc doesn't succeed commercially in the market place,
REBOL the language is still a great tool to use and emulate and learn from.
Cheers everybody,
Mark Dickson
On Fri, 19 Oct 2001, Daan Oosterveld wrote:
> The dream of all software programmers is not coming true. You can't make
> software run faster than the hardware. Even if you use dynamic
> optimisation this would still cost you an amount of computing power.
> Directly compiled code will always be faster than a VM. VM will always stay
> beneath the 100% boundary.
AFAIK, those optimisation tricks are not strictly connected to the use of
a VM. Dynamic optimisation has been (or was) used for a long time in the
Alpha version of NT. They never approached 100% (it would have been noticed,
since the Alpha was so much faster than the emulated x86, though on the other
hand I think they emulated more than just the CPU).
Reaching 100% or above is extremely uncommon so far. I don't know any more
examples than HP's Dynamo project, and perhaps Transitive Dynamite (if
their claims are correct). These two definitely work by compiled (and
dynamically recompiled) code.
> VM could be seen as the RISC revolution in software:
> Make the interpreter fast, the instructions simple and let the compiler
> do the rest.
That is the basic type, but they don't get very fast until the bytecode
is compiled.
> Exactly the same with RISC. In hardware the RISC and CISC processors match
> each other these days. I don't see a reason for inserting another compiler
> layer like bytecode. I rather like the idea of the VP. These programs even
> load faster than native binaries. The difference between a VM and a VP is
> that the VM is just another RISC interpreter.
VM is just a generalisation of VP. Everything I've mentioned could be
seen as VPs.
Marcus
------------------------------------
If you find that life spits on you
calm down and pretend it's raining
...,
The dream of all software programmers is not coming true. You can't make
software run faster than the hardware. Even if you use dynamic
optimisation this would still cost you an amount of computing power.
Directly compiled code will always be faster than a VM. VM will always stay
beneath the 100% boundary. VM could be seen as the RISC revolution in
software:
Make the interpreter fast, the instructions simple and let the compiler do
the rest.
Exactly the same with RISC. In hardware the RISC and CISC processors match
each other these days. I don't see a reason for inserting another compiler
layer like bytecode. I rather like the idea of the VP. These programs even
load faster than native binaries. The difference between a VM and a VP is
that the VM is just another RISC interpreter.
cheers,
Daan Oosterveld
----- Original Message -----
From: "Marcus Petersson" <d4marcus@...>
To: <OSCAR-PROJECT@yahoogroups.com>
Sent: Friday, October 19, 2001 0:16
Subject: Re: [OSCAR Project] Virtual Machines
Good VMs typically reach 80-90% vs native speed. Some new ones which make
use of dynamic optimisation may go from 90% to 110% or more, for example
by aligning code better, eliminating branches and so on.
Hi all (gee it's been a while since I posted here ..... :-)
I've been following Oscar/Primo for *ages*. One of the other areas
of my interest is the Spirit parser generator (done in C++ - check
out -
http://sourceforge.net/projects/spirit
( I'm coming to the reasons for my somewhat OT posting .... )
I believe that if anyone here has a look at Spirit (and uses it) ,
they'll be convinced that it could be a "fast-track" to getting
Oscar/Primo fully up and away , *very* quickly .....
I posted about Spirit here a while back. Since that time, Spirit
has matured **enormously**. It now has the following parsers
included -
a) XML
b) Pascal
c) C
Spirit now has nine developers involved with it ( note - the site
lists me as an editorial/content writer, but I've pulled out , unable
to commit the time this deserves ... ) .
If you take a few mins to check Spirit out, I'm sure you'll like
what you see .... Apologies for the off-topic post .....
- Andy
On Wed, 17 Oct 2001, Mark Dickson wrote:
> I suppose it all really comes down to how much value
> you place on multi platform availability or whether
> you accept platform diversity, whether you believe in
> a single solution or many possible paths to achieving
> the same goal.
I like platform diversity, and see VMs not as a threat, but as a way of
ensuring that platforms can remain, just as high-level languages do. As
for hardware, and especially CPUs, there is very little to choose from
currently. I believe it will remain so until VMs become more common, and
truly make CPUs into a user choice, just as drivers, APIs, and protocols
make other hardware a user choice.
> Is there really one software system that meets the diverse
> and different performance and specification needs of user
> desktop or workstations, embedded or handheld devices,
> enterprise scale mega servers?
Not yet, because software has rarely been written that way. But of course,
the hardware will always define what's possible in terms of features and
performance. Still it's perfectly possible to design a system that scales
well for different platforms, though it takes lots of careful design.
> I would say no there is not and for that reason I am a bit
> concerned about the current trend towards trying to pretend
> that everything can be abstracted to appear as the same
> system ala Microsoft DotNET, Java JVM, Tao/AMIGA, REBOL/IOS,
> DotGNU, Mono, Parrot etc. etc.
That's what HLLs (of which Rebol is one) do today, but by using VMs, you
transfer portability from the language to the VM level. Binary compatibility
in other words.
One could think that this is a big performance hit, which however could
still be acceptable as hardware is getting faster every day, and software
development often can't keep up (isn't this the primary reason for all
abstraction?). But it doesn't necessarily have to be a performance hit.
Good VMs typically reach 80-90% vs native speed. Some new ones which make
use of dynamic optimisation may go from 90% to 110% or more, for example
by aligning code better, eliminating branches and so on.
> Now I am not saying these projects are not interesting or of
> value in themselves, I myself AM interested in them and follow
> their progress regularly, however trying to pretend that ALL
> systems can appear the same, and shoe-horning the abstraction
> level to present that as a reality across all systems only
> means settling for the lowest common denominators, losing any
> platform or hardware specific features that perhaps made the
> system attractive or beneficial in the first place.
That doesn't sound very likely to me. For example, adding another system
to run on top of your existing system won't make you lose anything more
than some drive space. If you add a new library or language to your
system, does this mean settling for the lowest common denominator of all
the other systems that library or language exist for? Well, perhaps it
does, but do you care?
As for hardware features, compilers usually can't make use of new CPU
features from the start, but when they do, the VM will too. The major
difference is that the code is compiled dynamically instead of statically
as with most languages. That's what a VM does, it adds a bytecode layer
between the languages and the CPU. You could settle for an interpreter,
but you could also compile the code, in which case the result will be about
the same (maybe a little worse, maybe a little better) as if it was compiled
from the HLL to the CPU directly.
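As a rough illustration of the point above (the same bytecode can either be
interpreted step by step, or translated once and then run), here is a minimal
Python sketch. The instruction set (push / add / mul), the function names and
the use of a Python lambda as a stand-in for "native code" are all assumptions
made for illustration; none of this is Parrot or any other real VM.

```python
# Hypothetical three-instruction stack bytecode, invented for this sketch.
PROGRAM = [("push", 2), ("push", 3), ("add", None), ("push", 4), ("mul", None)]


def interpret(program):
    """Execute the bytecode one instruction at a time on a value stack."""
    stack = []
    for op, arg in program:
        if op == "push":
            stack.append(arg)
        elif op == "add":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "mul":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError("unknown instruction: %r" % (op,))
    return stack.pop()


def compile_program(program):
    """Translate the bytecode once into a Python function.

    A real VM would emit machine code here; a Python lambda is only a
    stand-in for that step.
    """
    expressions = []  # stack of source-text fragments instead of values
    for op, arg in program:
        if op == "push":
            expressions.append(repr(arg))
        elif op in ("add", "mul"):
            right = expressions.pop()
            left = expressions.pop()
            symbol = "+" if op == "add" else "*"
            expressions.append("(%s %s %s)" % (left, symbol, right))
        else:
            raise ValueError("unknown instruction: %r" % (op,))
    return eval("lambda: " + expressions.pop())


compiled = compile_program(PROGRAM)
print(interpret(PROGRAM), compiled())  # both evaluate (2 + 3) * 4
```

Both paths start from the same bytecode, so portability lives at the bytecode
level; only the last step (interpret or compile) differs per platform.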
> I LOVE what REBOL makes possible, I'm also frustrated by what it prevents.
> A truly powerful system, PRIMO ??? , would ideally be as multiplatform
> as REBOL and as generally useful, each implementation however should
> be highly platform specific and allow direct access to that system or
> machine features / capabilities / peculiarities / limitations.
Hmm, perhaps. I really don't care much any longer.
To me, Rebol is just another language for the old world, although a good
one, yet sometimes a bad one. The entire old concept of monolithic
programs, huge libraries, stupid filesystems and anonymous data, that is
what frustrates me. And unfortunately, these old artifacts define how
languages are built. Even Rebol, though we don't see much of it, because
you need a Pro license to even call external programs and data.
<dream-mode>
Within a modern object system, all programs can be written as scripts,
compiled to bytecode, and then to native code. You call an object, it does
its thing, there's no logical difference between libraries, applications
and data "files", they are all objects. Data objects describe themselves,
you don't have to tell them how to do something, just what you want them
to do. Directory hierarchies are just an abstraction, the user has many
ways of grouping objects together, and they can be collected into groups
automatically based on their contents and attributes.
The system IS a language, and it is its own meta-language. Code can be
either native, or redirected to one out of several virtual machines.
Languages can be fully system dependent, this actually makes sense because
the system model is so different from traditional systems. And since the
system is platform independent itself, with retained binary compatibility,
any language can be ported with it.
There's no kernel, a group of native objects offer all the basic services.
These objects can be replaced and optimised individually, just as any
other object, so the system scales well for different platforms.
Settings follow the objects, and if no settings are defined for an object
it may fall back on the default settings of the group, the name, the class
or any other attribute.
There are several layers of protection, for objects on the one hand, for
users on the other. Everything can be protected individually, and code is
not modifiable, so viruses can't spread without the user's consent, unless
they use the backdoor approach, which is through the old stupid host
system or hardware.
The system is free for home users, schools and organisations. All the
interfaces are open, and most of the source as well. New programs become
part of the system automatically, and their building pieces can be reused.
Thanks to the simple programming model, most users are able to write
decent programs, just as easily (or more) as they would write a Rebol
script. In fact, graphical applications can be created by dragging and
dropping tools onto an area. Most user interfaces can be created this way,
or if you prefer, dynamically generated by scripts.
And still there's more to it, but I can't remember the remaining stuff...
</dream-mode>
> With regards to Virtual Machines mentioned previously in this thread,
> I would love it if REBOL & PRIMO were someday available on all these
Well, I wouldn't suggest you use Parrot as a platform, only to execute the
parts of the code that are, or may be, dynamic. Whether that code can be
compiled, JIT-compiled, or just interpreted depends I suppose on just how
dynamic it is. The static parts are better written and compiled in the
usual way, in C or C++. The static parts include most of the language -
the interfaces (console, windows), protocols, file handling.
But everything that can be used directly in actual code, from native
functions to word-value bindings, should be implemented on top of the VM.
If needed, you could also add your own instructions and datatypes to it.
(I'm not sure where I'd put the parser. As it seems Parrot will have
Perl-like RE handling built-in, which could be used. But do we really
want the parse rules to be dynamic?)
> platforms but here is my take on the merits, advantages &
> disadvantages of some of the systems we mentioned.
Nice list. :-)
> However as I said JAVA is here, it is quite powerful, it is widely
> known and understood and in Frank Sievertsen's FREEBELL we also have the
> basis of a REBOL for JAVA implementation. Frank's project seems to be on
> hold for now but if anybody is friendly with Frank they may wish to enquire
> if he's willing to hand the project on to somebody here or make his
> code public & let others take up the mantle.
That would be most welcome. Freebell is interesting, but without the code
available it's unfortunately useless. :-(
> So that's my take on the current virtual machine environments as
> possible targets for implementing PRIMO. We could also stick to the ANSI
> C & its standard libraries and hopefully achieve the same level of
> functionality and platform portability that REBOL/Core has.
We "could"? It's not a case of either/or, I recommend we use a combination
of standard languages and libraries, and a virtual machine such as Parrot.
We will need an interpreter anyway, so we might just as well use an existing
one. This will not just save time, but also make us part of a larger
community. Imagine Perl, Python, and Primo code being executed by the same
machine. :-)
Btw, I'd like to urge everyone who has unreleased Oscar/Primo code lying
around to release it, no matter if it works or not.
Marcus
------------------------------------
If you find that life spits on you
calm down and pretend it's raining
Virtual Machines
Marcus / Ladislav / Everybody
Regarding the implementation of REBOL / PRIMO on various
virtual machines, whilst this has certain obvious
attractions and is certainly desirable long term, at
this stage I feel it is maybe a little bit ahead of
where we are just now for a number of reasons which
I will outline here.
I suppose it all really comes down to how much value
you place on multi platform availability or whether
you accept platform diversity, whether you believe in
a single solution or many possible paths to achieving
the same goal.
A Single Way of doing things has obvious attractions for
both developers & enterprises in that it should in theory
limit the overall complexity but that comes with the down
sides of single vendor lock in / monopoly abuse / lack of
choice & freedom. If there is decreed a single true way of
doing things, then it may not be to everyone's tastes or
preferences, it might not be the optimal solution and we
all get locked into the "wrong" way of doing things if not
forever, then at least for a considerable time period.
The Windows Paradigm of the last decade(s) is an obvious
example.
Is there really one software system that meets the diverse
and different performance and specification needs of user
desktop or workstations, embedded or handheld devices,
enterprise scale mega servers?
I would say no there is not and for that reason I am a bit
concerned about the current trend towards trying to pretend
that everything can be abstracted to appear as the same
system ala Microsoft DotNET, Java JVM, Tao/AMIGA, REBOL/IOS,
DotGNU, Mono, Parrot etc. etc.
Now I am not saying these projects are not interesting or of
value in themselves, I myself AM interested in them and follow
their progress regularly, however trying to pretend that ALL
systems can appear the same, and shoe-horning the abstraction
level to present that as a reality across all systems only
means settling for the lowest common denominators, losing any
platform or hardware specific features that perhaps made the
system attractive or beneficial in the first place.
Anyway, my ranting against pretend uniformity aside, there are
clear and obvious benefits to multiplatform portability at the
LANGUAGE / PROTOCOL Level ala HTTP-HTML, email & instant messaging,
REBOL scripting running on fifty plus platforms with minimal
modification, but hey WE already understand the benefits for
network based semantic & informational applications that REBOL
can make possible, that's why we are here anyway 8-)
I LOVE what REBOL makes possible, I'm also frustrated by what it prevents.
A truly powerful system, PRIMO ??? , would ideally be as multiplatform
as REBOL and as generally useful, each implementation however should be
highly platform specific and allow direct access to that system's or
machine's features / capabilities / peculiarities / limitations.
As I said in my last post: as syntactically pretty and as generally
useful as REBOL / as powerful and as platform specific as FORTH.
However as OSCAR | PRIMO is an open source voluntary effort, people
will as a rule only develop for the platforms they have to hand and
for which they are specifically interested & motivated enough to
provide for. I use Windows ME & Redhat Linux on Intel Pentium so
I would be interested in PRIMO for those environments, it would take
somebody specifically determined to produce say a MAC OSX on PPC port
or a NetBSD on SPARC version, to give examples, because I probably
wouldn't do it because I don't have those systems and they don't
interest me personally. However hopefully other people would be
sufficiently interested in these platforms to port our REBOL
like language, whether this be for geek status, personal interest,
platform support or whatever, it will only happen if people want it
enough to create it. In the same way Linus Torvalds works on the
aspects of the Linux Kernel and the systems that matter to him
and leaves the more exotic or platform specific ports to people who
have an interest in them.
With regards to Virtual Machines mentioned previously in this thread,
I would love it if REBOL & PRIMO were someday available on all these
platforms but here is my take on the merits, advantages & disadvantages
of some of the systems we mentioned.
MICRO$OFT DotNET
Potentially the BEST and WORST thing that is going to happen in software
in the coming decade. With Microsoft's clout and massive captive base
it will inevitably be "the predominant" if not the best platform of its
type. There is already so much publicity, hype, and third party support
added to the M$ led momentum that it WILL happen. Rest assured a large
part of the software industry's future will in some way be based upon it
or around it, be this development, support, applications etc. Microsoft
is also betting the company on the whole DotNET initiative and has already
promised to vigorously protect its intellectual and financial interests,
so whether the DotNET Virtual Machine, Common Language Runtime Environment,
call it what you will, is available for Free or Low Cost, and is either
Windows Centric or Multiplatform, rest assured it will cost you somewhere
down the line. There will be some payback to Microsoft whether this be
for development tools licensing, the cost of the runtime, Web Services
subscriptions, restrictions on application licensing terms or whatever.
Microsoft WILL at all costs retain ownership and control of the platform
either legally or financially or both. That said C# & the DotNET CLI
seem to promise a comprehensive set of nice features so no doubt REBOL
and hopefully PRIMO will appear in this environment, but as you can only
get the Software Development Kit under Microsoft's terms and the whole
system is still heavily in beta, this system is maybe a little bit
"one for the future" at this stage.
MONO & DotGNU
These are slated to be Free Software alternatives to Microsoft DotNET
but are at an even more primitive state of development than the
aforementioned M$ DotNET, so these are even further into the future. The
precise legal status or completeness of these systems depends entirely on
how open Microsoft are with specifications and protocols and what
legal approach they take to defend their intellectual property rights.
I really hope that either MONO or DotGNU ( or both ) initiatives are
successful in providing an unrestrictive free software alternative or
re-implementation of the coming M$ DotNET system but the future success
of either project is by no means certain and there are a lot of uncertain
legal & technical obstacles that could potentially trip them up at some
stage.
JAVA & JVM
JAVA is currently the most popular computer language in terms of developer
numbers and job opportunities, the JAVA Virtual Machine is also the
most mature, stable and complete system of its type currently available.
Whether SUN & the JAVA industry have the will for a head on fight with
Microsoft DotNET or whether they will be eclipsed by it remains to be
seen, however JAVA is here, it is already widely used, and has numerous
implementations both commercial and some open source. It has a fair degree
of platform portability on major platforms of varying quality and
performance. Microsoft's decision to unbundle JAVA from all its future
platform releases will undoubtedly adversely affect the JAVA market.
However as I said JAVA is here, it is quite powerful, it is widely known
and understood and in Frank Sievertsen's FREEBELL we also have the basis
of a REBOL for JAVA implementation. Frank's project seems to be on hold
for now but if anybody is friendly with Frank they may wish to enquire
if he's willing to hand the project on to somebody here or make his
code public & let others take up the mantle. As JAVA is widely available
and pretty complete it may be the best place to start a VM implementation,
either by persuading Frank Sievertsen to share Freebell or starting
afresh.
Tao/AMIGA
From what I've read & seen it seems to be a very attractive platform
from a technical point of view, AMIGA has always been based on quality
leading edge technology. However like the APPLE MAC they've always been
a devotees' platform, you have to be an Amiga fan to develop for & use
them. Like many technically excellent products they don't always win
out in the market place and although I'd like to see them succeed I
don't see how they will gain much traction as opposed to MS DotNET,
JAVA, MAC OSX & Linux who have either massive industry support or a
loyal devoted following. AMIGA has a devoted following but whether it
is sufficient to create a significant marketshare remains to be seen,
especially as others like QNX, BeIA/Palm etc are also vying for a similar
space. From my point of view although I am interested in the technological
ideas that Tao/Amiga provides I am not sufficiently attracted to invest in
purchasing the software development kit, but that is a personal matter,
others no doubt are big AMIGA fans / developers so maybe they will make
REBOL / PRIMO a reality on that platform if they see that being a big part
of their future.
PARROT
PARROT is the new Common Language Runtime being developed for PERL6 and
also of interest to PYTHON development. It is an attractive idea to
share the same virtual machine and they do have some very skilled and
talented people exploring & developing the idea but again it is at a very
early stage of development so again it is a possibility "for the future".
The main advantage of PARROT is that being open source we have the
code available to learn from, study or piggyback on if we so desire.
The restrictions on the uses & distribution of a future PARROT should
be more favourable from our perspective than the potential licensing
implications that developing for commercial virtual machines like
MS DotNET, Tao/AMIGA etc. might entail. From our point of view PARROT
may be one to watch & learn from. Being able to share with PERL & PYTHON
might be an interesting plus feature if PARROT & PRIMO developed enough.
So that's my take on the current virtual machine environments as possible
targets for implementing PRIMO. We could also stick to ANSI C & its
standard libraries and hopefully achieve the same level of functionality
and platform portability that REBOL/Core has. It all depends on what your
language preferences and target platforms are.
As I've said numerous times before, people will only produce for what
specifically interests them and rightfully so, especially in open source
where there is no direct financial reward. So all I try to do is maximise
my understanding of REBOL's internal workings and how I might envisage
PRIMO to be and how I would like to express myself and control the
machines that I work with, as that is all I can do: test & write my own
code fragments & snippets until I understand the system completely enough
to implement it in full.
cheers,
Mark Dickson
On Tue, 16 Oct 2001, Ladislav Mecir wrote:
> is Parrot intended to be another interpreted JVM like machine, or shall it
> be more like Elate's VP or Microsoft's MSIL, that were meant to be
> translated to machine code instead of being interpreted?
Both. At the moment only interpreted, but it looks like there will be
compilers in the future (the register architecture helps with this). And
it will not be anywhere near as tightly coupled to Perl, as JVM is to
Java. Perhaps there will be a little Perl'ism included by default, but in
the worst case one could replace it with more Rebol-like behaviour.
Btw, it's not as nice as Elate VP, no "unlimited" amount of registers, and
only 32bit support so far. But it will probably evolve with time. Rebol
doesn't require 64bit support anyway, does it?
Of course, it's not clear if Rebol code can be compiled into native at
all, though I would think that a JIT-compiler could work. Or even bytecode
actually, if you consider some of the dynamic "features". Now, where are
those immutable code-blocks when you need them...? :-)
Marcus
------------------------------------
If you find that life spits on you
calm down and pretend it's raining
Hi Marcus,
is Parrot intended to be another interpreted JVM like machine, or shall it
be more like Elate's VP or Microsoft's MSIL, that were meant to be
translated to machine code instead of being interpreted?
...snip...
> Wouldn't it make some sense to put a virtual machine into the interpreter
> from the start? That might even save a little work, if we use an existing
> one. Perl and (perhaps) Python seem to be moving towards a new VM called
> Parrot. Could this be used for Primo as well?
>
> Introduction: http://www.perl.com/pub/a/2001/09/18/parrot.html
> Download: http://cvs.perl.org/snapshots/parrot/
> Home: http://www.parrotcode.org/
>
> Marcus
> ------------------------------------
> If you find that life spits on you
> calm down and pretend it's raining
On Fri, 12 Oct 2001, Daan Oosterveld wrote:
> Well no, it's just the time the project is consuming to build an interpreter.
> I have not been able to find spare time for this project because I have so
> little of it... We have made some progress before July... We've analysed the
> basic parts of REBOL. So far the design has never reached fully
> usable source code or a program....
Would it be ok to let us see your latest code? If it's been modified since
the 9-Feb release that is.
Wouldn't it make some sense to put a virtual machine into the interpreter
from the start? That might even save a little work, if we use an existing
one. Perl and (perhaps) Python seem to be moving towards a new VM called
Parrot. Could this be used for Primo as well?
Introduction: http://www.perl.com/pub/a/2001/09/18/parrot.html
Download: http://cvs.perl.org/snapshots/parrot/
Home: http://www.parrotcode.org/
Marcus
------------------------------------
If you find that life spits on you
calm down and pretend it's raining
Hello everybody,
I am resurrected! 8-)
Since early August I have had little or no time for
making any posts to the OSCAR | PRIMO or REBOL lists
for the following reasons.
1. I was ill for a few weeks thus had a major work backlog to clear.
2. I then moved offices in work - change, upheaval, settling in period etc.
3. Went to Lake Garda in Italy mid September for a few weeks' great holidaying,
fantastic, recommend it to anyone.
4. End of September / Start October MOVED HOUSE AGAIN!!!
3rd Year in succession, our lease was up so we're staying with Dawn's parents
till we find a new home that is suitable and we can afford, again more major
upheaval / time consuming.
5. Work still very busy!
6. The modem on my home machine has not been working for a few months but I've
not had time / money / opportunity to get it fixed properly so my internet
access has been extremely limited.
All those reasons aside, my thoughts on REBOL / PRIMO have not changed, work has
not stopped, only become less visible and dropped under the hood a few levels.
I've spent the last few months reading & studying extensively about Hardware |
Processors | Assembly Language & the Forth Language in its various incarnations
and means of implementation.
I Love the sheer raw power and simplicity and philosophy of Forth but dislike
the syntax & weird operators & word names etc.
I prefer REBOL style & syntax more, it "Looks" so much nicer in certain aspects,
however that's not everything.
I dislike the lack of low level ability in REBOL, everything is so high level
and abstract that you are "insulated" from the real machine and hardware you're
working on. That's fine up to a point if you only want to
exchange semantic information using it as a communication language, not so good
if you really want to tinker about with the internals & change things.
A Better Means of Expression and a Better Means of Operation, that's what
REBOL was intended to be, though the actual development and exploration of
the technology & concepts behind it have sadly been shoe-horned into pursuing
the commercial objectives of REBOL Technology and their goal of
a platform independent internet distributed graphical operating system,
REBOL/IOS.
In my view that has caused a stagnation in the /Core abilities of REBOL.
No new datatypes have appeared nor a means of creating them, no way to extend
the datatype range or in-built parser capabilities and no way to
escape the platform independent high level abstraction and access machine
specific low level bits.
That is what I am concentrating on designing & studying just now.
REBOL is great at what it does, I use it a lot and for the foreseeable future I
recommend you should too. If that doesn't suffice then PYTHON is
a great free language with tremendously wide capabilities across the whole
spectrum of computing. JAVA & C# are interesting, especially
from a Virtual Machine perspective. Assembler, C and Forth if you need to
do low level stuff.
I want PRIMO to be General ( Platform Neutral ) & Specific
( Machine Efficient ). High Level of Abstraction ( Just Like REBOL )
Low Level access & power ( Just Like Forth ).
So it should be possible to say things like,
>> Send Joe@... "Test Message" ; Just Like REBOL
>> 1 2 3 DUP STACK ; Just Like Forth
== [ 3 3 2 1 ]
Note 'DUP is a prefix function that takes a series! argument and
inserts a duplicate of the top / first element. STACK is a native! word
that returns a block! of the data-stack elements from top to bottom.
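To make the intended behaviour of DUP and STACK concrete, here is a small
Python sketch of a Forth-style data stack. It is only a model of the example
above: the class and function names are invented for illustration, and the
evaluator handles nothing beyond integers, DUP and STACK.

```python
# Hypothetical model of the DUP / STACK example; not PRIMO code.
class DataStack:
    def __init__(self):
        self._items = []  # the last element is the top of the stack

    def push(self, value):
        self._items.append(value)

    def dup(self):
        """Insert a duplicate of the top element."""
        if not self._items:
            raise IndexError("DUP on an empty stack")
        self._items.append(self._items[-1])

    def stack(self):
        """Return the elements from top to bottom, like the STACK word."""
        return list(reversed(self._items))


def evaluate(source):
    """Evaluate a whitespace-separated line of integers, DUP and STACK."""
    data = DataStack()
    result = None
    for token in source.split():
        word = token.upper()
        if word == "DUP":
            data.dup()
        elif word == "STACK":
            result = data.stack()
        else:
            data.push(int(token))
    return result


print(evaluate("1 2 3 DUP STACK"))  # [3, 3, 2, 1]
```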
or even more low level
>> get &EAX ; &label register! datatype
== #0000:FFFF ; #base:offset memory! datatype
here 'GET is polymorphic, &EAX is an identifier for a register!
datatype and the result #0000:FFFF is a hex memory! address datatype,
note here intel x86 platform specific format, alternatively there
might be a System/Option to return a 32 bit number in integer! , hex!
, octal! , or binary! base number format in these instances.
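The idea of a System/Option that selects how a 32 bit register value is shown
could look roughly like the Python sketch below. The function name and the
option names are made up for illustration. Splitting the value into its upper
and lower 16 bits to get the #base:offset form is a simplifying assumption and
does not reproduce real x86 segment arithmetic.

```python
# Hypothetical formatter for a 32 bit value; option names are invented.
def format_value(value, base="memory"):
    value &= 0xFFFFFFFF  # keep only the low 32 bits
    if base == "integer":
        return str(value)
    if base == "hex":
        return format(value, "08X")
    if base == "octal":
        return format(value, "o")
    if base == "binary":
        return format(value, "032b")
    if base == "memory":
        # Assumption: upper 16 bits as base, lower 16 bits as offset.
        return "#{:04X}:{:04X}".format(value >> 16, value & 0xFFFF)
    raise ValueError("unknown base option: %r" % (base,))


print(format_value(0x0000FFFF))             # #0000:FFFF
print(format_value(0x0000FFFF, "integer"))  # 65535
```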
Similarly we might say,
>> Apple: "Green"
== "Green"
>> Location? :Apple
== #0000:FFFF
>> Set #0000:FFFF "Red"
== "Red"
>> Apple
== "Red"
Note here LOCATION? is a func to return the memory! address of a value,
SET is polymorphic and takes either a word!, memory! or register! as its
first argument and a value as its second argument. Here we changed the
value at the address of 'Apple from "Green" to "Red".
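The Apple example can be modelled in a few lines of Python. This is only a
sketch of the semantics described above: the Address and Machine names are
invented, addresses are plain integers handed out by the model rather than
real memory locations, and register! targets are left out.

```python
# Hypothetical model of LOCATION? and a polymorphic SET; not PRIMO code.
class Address(int):
    """Stand-in for the memory! datatype."""


class Machine:
    def __init__(self, first_address=0xFFFF):
        self._cells = {}   # address -> value
        self._words = {}   # word name -> address
        self._next = first_address

    def set(self, target, value):
        """SET accepts either a word (str) or an Address as its target."""
        if isinstance(target, Address):
            if target not in self._cells:
                raise KeyError("nothing stored at that address")
            self._cells[target] = value
        else:
            name = target.lower()
            if name not in self._words:
                self._words[name] = Address(self._next)
                self._next += 1
            self._cells[self._words[name]] = value
        return value

    def get(self, name):
        return self._cells[self._words[name.lower()]]

    def location(self, name):
        """LOCATION? returns the address bound to a word."""
        return self._words[name.lower()]


machine = Machine()
machine.set("Apple", "Green")
where = machine.location("Apple")
machine.set(where, "Red")      # write through the address, not the word
print(machine.get("Apple"))    # Red
```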
I'm not sure how this direct manual manipulation of memory! or register!
values would interfere with the REBOL model of automatic storage allocation &
garbage collection, it may be incompatible, but it shows an example of the
Low Level tinkering which I would like to see available and what gives
Forth its power. The ability to specifically control bits, bytes and
memory addresses. With that capability anything becomes possible.
Forth is extensible in itself just like REBOL but it also has the
ability to define words & functions using inline native machine code
which when compiled look and act like any other normal words. Most Forths
have a built in assembler / compiler for their platform. This allows
for writing extremely efficient & optimised programs and also for
meta-compiling a new Forth System written in itself.
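The point that native words and user-defined words look and act alike can be shown with a very small Python sketch. It is an analogy only: Python lambdas stand in for inline machine code, and the class name Forth and the method define are hypothetical, not part of any real Forth system.

```python
# Hypothetical sketch: native and user-defined words share one dictionary
# and are invoked in exactly the same way.

class Forth:
    def __init__(self):
        self.stack = []
        # "Native" words: written directly in the host language.
        self.words = {
            "DUP": lambda vm: vm.stack.append(vm.stack[-1]),
            "+": lambda vm: vm.stack.append(vm.stack.pop() + vm.stack.pop()),
            "*": lambda vm: vm.stack.append(vm.stack.pop() * vm.stack.pop()),
        }

    def define(self, name, body):
        """Define a new word in terms of existing words."""
        tokens = body.split()
        self.words[name.upper()] = lambda vm: vm.run_tokens(tokens)

    def run_tokens(self, tokens):
        for token in tokens:
            word = self.words.get(token.upper())
            if word is not None:
                word(self)  # same call for native and defined words
            else:
                self.stack.append(int(token))

    def run(self, source):
        self.run_tokens(source.split())
        return self.stack


vm = Forth()
vm.define("SQUARE", "DUP *")
vm.define("CUBE", "DUP SQUARE *")
print(vm.run("3 CUBE"))  # [27]
```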
Anyway those are the longer term goals I have for the design of
PRIMO, I've now lost all worries about timescales. REBOL & other
languages are good enough for everyday stuff just now. PRIMO will
take as long as it takes, but the considerable time spent in careful
study of design & implementation considerations should save some
considerable amount of time and mistakes later on.
It's always easier if you know precisely what you want to do and
specifically how you intend to achieve it.
I know a lot of what I would like to see PRIMO become, I am still
figuring out and studying the bits I don't understand fully to
increase my understanding. It took Carl Sassenrath nearly twenty
years to bring REBOL to where it is today. Chuck Moore is
still rewriting and tinkering with Forth after more than thirty years.
I hope we don't plan to spend anywhere near that long, that would be
a tragic waste of all our lives, but design and understanding concepts
and implementation techniques is critically important and as I said
before time spent now is time saved later on.
I do hope for PRIMO to be something More and Different to what
REBOL is today, mainly because we've already got REBOL and that itself
is very good. If we think strictly along the lines of me too! then
we don't explore any new ground nor have anything more to offer, people
might just as well stick with REBOL as it's professionally produced
and supported and we're just re-inventing the same wheels people
can already get for free / little cost.
In my mind PRIMO is a path to discovery about how I would like to
express myself and operate my computers. It's also about learning and
understanding a REBOL like language and how that maps to the system
and platform you're working in. I use Windows at work, Linux & Windows
at home, I often sample & explore lots of other small commercial and
free operating systems & desktop environments however no matter how
pretty or cool they are, you still have to interface and build within
the parameters of that platform.
REBOL is way cool and available on so many platforms so it's a great way
to write code for multiple environments but often you need to do things
that are platform or machine specific to use a feature that makes the
hardware or system attractive in the first place, REBOL is not always
so good at that. I would like to have the power to define and control
my own environment and code above or below the operating system.
That is what PRIMO means to me and what I would like to see it become.
All help and discussion is welcomed.
My Home PC goes into the shop tomorrow to hopefully get the Modem fixed
and working again soon. My recently very busy life is also hopefully
stabilising and settling back into routine again and will give me more
time for my internet, computing & rebol activities.
Hello Daan, Ladislav, Marcus, Sharriff & everybody.
Mark Dickson
PS New Datatype Ideas
ratio! 22/7
complex! 0.33333+2.65e-49i
register! &EAX
memory! #0000:FFFF
hex! FFFFhex
octal! 3770oct
binary! 1010bin
fixed-string! ."Ladislav Mecir"
fixed-block! .[1 2 3]
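To see whether these literal forms can be told apart lexically, here is a rough Python sketch of a recogniser for the numeric ones. parse_literal is a hypothetical helper, the regular expressions are my own guess at the intended syntax, and the register!, memory! and fixed-series forms are not covered.

```python
# Hypothetical sketch: recognise a few of the proposed numeric literals.
import re
from fractions import Fraction


def parse_literal(text):
    """Return (datatype name, Python value) for a proposed literal form."""
    if re.fullmatch(r"[0-9A-Fa-f]+hex", text):
        return "hex!", int(text[:-3], 16)
    if re.fullmatch(r"[0-7]+oct", text):
        return "octal!", int(text[:-3], 8)
    if re.fullmatch(r"[01]+bin", text):
        return "binary!", int(text[:-3], 2)
    if re.fullmatch(r"-?\d+/\d+", text):
        return "ratio!", Fraction(text)
    if text.endswith("i"):
        # Python spells the imaginary unit "j" rather than "i".
        return "complex!", complex(text[:-1] + "j")
    raise ValueError("unrecognised literal: " + text)


for sample in ["22/7", "0.33333+2.65e-49i", "FFFFhex", "3770oct", "1010bin"]:
    print(sample, "->", parse_literal(sample))
```

The suffix forms keep the digits first, so a scanner can treat them like numbers and only inspect the tail to pick the base.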
In a message dated Fri, 12 Oct 2001 5:37:09 AM Eastern Daylight Time, "Daan
Oosterveld" <emptyhead@...> writes:
>
>
> Well no, it's just the time the project is consuming to build an interpreter.
> I have not been able to find spare time for this project because I have so
> little of it... We made some progress before July... We've analysed the
> basic parts of REBOL. So far the design has never reached a fully
> usable source code/program....
>
> cheers,
>
> Daan Oosterveld
>
> ----- Original Message -----
> From: "John Chludzinski" <jchludzinski@...>
> To: <OSCAR-PROJECT@yahoogroups.com>
> Sent: Friday, October 12, 2001 6:59
> Subject: [OSCAR Project] Newbie question???
>
>
> > It appears that there has been a dramatic fall off in posting to this
> > e-group. I hope that doesn't indicate a lack of interest??? What is the
> > status of Oscar???
> >
> > ---John
> >
> >
> >
> > To unsubscribe from this list, send an email to:
> > OSCAR-PROJECT-unsubscribe@yahoogroups.com
> > Project mailing list:
> > http://groups.yahoo.com/group/OSCAR-PROJECT
> >
> >
> > Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
> >
> >
>