Spark - Pre-Birth of a Modern Lisp
==================================

Introduction
------------

Spark is a modern dialect of Lisp currently being planned. This document is
not a formal functional (much less technical) specification for it, but rather
a braindump of some of the conclusions I (= Shlomi Fish) have reached
about the fundamentals of its behaviour. Nevertheless, some preliminary
(and still subject to change) specification will be given, and some
code examples will be brought.

Besides contemporary Lisp dialects such as Common Lisp, Scheme and Arc, Spark
draws a lot of inspiration from other modern languages, paradigms, and
technologies including Perl 5, Perl 6, Python, Ruby, Java and Haskell.

Some Spark Essentials
---------------------

TODO : FILL IN.

Spark is not another implementation of Scheme (or Common Lisp)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

There are far too many implementations of Scheme out there, and probably too
many of Common Lisp. But that is beside the point: we did not come
to praise the existing dialects of Lisp by implementing them again.

Spark will be a completely different dialect of Lisp. It won't be compatible
with either Scheme (or any of its implementations), Common Lisp or even with
Arc. It will still be Lisp, though, as we hope you will see.

Spark does not aim to compete with C and friends
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Spark does not aim to fill the same ecological niche as ANSI C, C++,
Objective-C, etc., much less Assembly. ANSI C and friends have been the
de-facto standard for writing applications whenever some or all of certain
constraints have been met:

http://www.shlomifish.org/philosophy/computers/when-c-is-best/

While their use has lately been diminishing somewhat due to the increasing
attractiveness of the Java and .NET frameworks and of the various dynamic
languages (Perl/Python/PHP/Ruby/etc.), they are nevertheless still very much
in vogue, and even the backends for the more high-level virtual machines
are written in C and C++.

There have been some efforts to compete with C and C++ on their own turf
such as http://en.wikipedia.org/wiki/D_%28programming_language%29[D]
or http://www.ecere.com/[Ecere] (and earlier efforts such as Ada 95, or
Object Pascal) and they can be commended for that, but unlike them, Spark does 
not aim to replace C in most of the valid use cases for it.

Spark will be a dynamic (so-called "scripting") programming language
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Spark will be an alternative to such languages as Perl, Python, Ruby,
PHP, Tcl, Lua or the Io language, which are all very different but have APIs
to achieve similar ends and are used for similar tasks. Like them and
Lisp (which is one of the oldest dynamic languages), it will be able to
determine a lot of behaviour at run time, and will support dynamic
"eval", call-by-name, run-time typing, run-time change of type for a datum,
multiple dispatch, polymorphic macros and other features of Lisp
and other dynamic languages.

Some people have been referring to Perl and friends as "scripting
languages" but that implies they are only useful for scripts. See:

http://xoa.petdance.com/Stop_saying_%22script%22

While that article by Andy Lester illustrates the problem with labelling
programming languages as "scripting languages", I still think saying
"script" and "scripting" is a valid way to distinguish a trivial program
from an application. For example, +/usr/bin/gcc+ is essentially
a script written in
ANSI C, which just passes control to the various compilation stages. When
we type +gcc+ at the command line, we are running this script, which does the
hard work of the compilation for us. (+/usr/bin/gcc+ should not be
confused with GCC, the GNU Compiler Collection, which is a compilation
framework for C, C++ and other languages, and is a crucial piece of the
infrastructure of the open-source UNIX-like operating systems.)

For an "in your face" anti-thesis to the aversion to call languages scripting
languages see Larry Wall's "Progrmaming is Hard. Let's Go Scripting":

http://www.perl.com/pub/a/2007/12/06/soto-11.html

Spark will have a rich type system but won't be strongly typed
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Like Common Lisp, Python, Ruby and Perl 6, and to some extent unlike Perl 5,
Spark will have a rich type system. However, it won't be strongly typed like
Haskell (in which case it would no longer be considered a Lisp). A variable can
be assigned values of different types during its lifetime, and functions
will be able to accept variables of any type (unless they specifically forbid
it).
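
As a hypothetical sketch of this (using the +(my ...)+ and +(:= ...)+ forms
that appear later in this document; like everything here, the exact syntax
may change):

---------------------
(my x)
(:= x 5)        ; x holds an integer
(say (* x x))   ; prints 25
(:= x "five")   ; the same variable may now hold a string
(say x)         ; prints "five"
---------------------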

The Spark type system will be extendable at run time, and will be analogous to
its Object Oriented Programming (OOP) system. As a result, one would be able
to call methods on pieces of data, on expressions, on S-expressions, and 
on functions, macros, classes and method declarations, and on their 
application.

In Spark "everything will be an object", but unlike Java, it won't be 
overly-OO. One won't need to instantiate a class and declare a method
just to print "Hello World" on the screen. This will work:

------------------------------------------------
$ spark -e '(say "Hello World")'
------------------------------------------------

Or:

------------------------------------------------
$ spark -e '(print "Hello World\n")'
------------------------------------------------

Or:

------------------------------------------------
$ spark -e '(-> "Hello World" say)'
------------------------------------------------

Spark will be capable of being used for Scripting
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

While Spark should not be called a scripting language, just as that name is
a misnomer for Perl 5 or for PHP, it should in fact be usable for writing
scripts, including command-line scripts at the prompt or at the REPL
(Read-Eval-Print Loop). Here are some examples of command-line scripts. Most
of these are taken from the descriptions in http://www.catonmat.net/blog/awk-one-liners-explained-part-one/[Peteris Krumins' "Famous Awk
One-Liners Explained"] series (which is now in the process of being
augmented with "Famous Perl One-Liners Explained"). I'm not going to study
the Awk implementations, due to my lack of knowledge of Awk and my lack of
will to learn it, as I already know Perl 5 - its far superior superset - but
I'll implement something similar in Spark. (Hopefully, Peteris will run a
"Famous Spark One-Liners Explained" series on his blog someday,
too.)

Line Count:
^^^^^^^^^^^

------------------------------------------------
$ spark -e '(-> (fh ARGV) foreach (++ i)) (say i)' [Files]
------------------------------------------------

Line Count Reloaded:
^^^^^^^^^^^^^^^^^^^^

------------------------------------------------
$ spark -e '(say (: len getlines (-> (fh ARGV))))' [Files]
------------------------------------------------

+(: ... )+ serves the same purpose as Haskell's +$+ - to chain function
calls without too many nested parentheses.
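
For comparison, here is a hypothetical sketch of the same computation with
explicit nesting and with +(: ... )+ chaining (illustrative syntax only):

---------------------
; Explicit nesting - one pair of parentheses per call:
(say (len (getlines (fh ARGV))))

; With (: ... ) - each function is applied to the result of the rest:
(say (: len getlines (fh ARGV)))
---------------------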

Double-space a file:
^^^^^^^^^^^^^^^^^^^^

------------------------------------------------
$ spark -pe '(say)'
------------------------------------------------

(Think Perl)

Number lines in each file:
^^^^^^^^^^^^^^^^^^^^^^^^^^

------------------------------------------------
$ spark -ne '(say "${^LINENUM} ${^LINE}")'
------------------------------------------------

Here we can see the string interpolation of variables in action.
+${....}+ interpolates a single variable, while +$(...)+ interpolates an
S-expression. Aside from that, Spark will also have sprintf,
http://search.cpan.org/dist/Text-Sprintf-Named/[sprintf with named
conversions similar to Python's] and something as similar as possible to
Perl's Template Toolkit (while still being Sparky). I find Common
Lisp's +format+ to be hard to understand and much less flexible than
Template Toolkit, so I'm going to drop it.
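
As a hypothetical sketch of both forms of interpolation (the variable name
is made up for the illustration):

---------------------
(my name)
(:= name "World")
(say "Hello, ${name}!")        ; interpolates a single variable
(say "2 + 2 = $((+ 2 2))")     ; interpolates a whole S-expression
---------------------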

Like in Perl 5, the +^VARNAME+ variables are reserved, and
are usually in all-capitals. Unlike Perl 5 (or Common Lisp), we are not a
Lisp-2, and we use the same symbol namespace for everything (like Scheme). So
we can assign a lambda
(could be +(fun ... )+ , +(sub ... )+ , +(lambda ...)+ or
+(function ... )+ - all precise synonyms) to a variable and call it with a
value:

---------------------
(my square)
(:= square (fun (x) (* x x))) ; Or (<- square) but I'm hazy about (= square)
(say (square 5)) ; Prints 25 followed by a newline.
---------------------

Like in Perl 5, however, a symbol-table entry has an arbitrary number of slots
in which we can put values. So we can say:

---------------------
(say :to (fh STDERR) "Warning, Will Robinson.")
---------------------

Which will print to STDERR.

In the case of the +:to+ named parameter to +(say)+ (or to +(print)+ or
to +(printf)+ or whatever we have or define), we don't need the explicit
+(fh ...)+ (= file handle) call, but it won't hurt.
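
So, as a hypothetical sketch, the following should behave the same as the
previous example:

---------------------
(say :to STDERR "Warning, Will Robinson.") ; the (fh ...) call is implied
---------------------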

Note about command line magic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Out of convenience, the +-e+ flag and the rest of the +-p+, +-n+, etc. flags
will involve some magic manipulation of the S-expression inside +-e+ or inside
the script. They also load some convenient modules. However, sometimes we may
wish to convert a command-line script into a full application. That's what
the +--dump-code=code.spark+ flag is for. It dumps the code of the program to
a file that can then be run with just +spark code.spark+.

For example:

---------------------
$ spark --dump-code=say.spark -pe '(say)'
$ cat say.spark
(no strict) ; you should probably remove that.
(use re)
(use cmd-loop)
(cmd-loop.set-implicit-print 1)
(say)
$ 
---------------------

Like all examples here, this is just for the sake of the illustration. Until
version 1.0.0 comes out, everything can change. But the concepts will remain
the same.

We encourage Perl, Ruby and the other dynamic languages with rich command-line
interfaces to steal the +--dump-code+ idea. Maybe one day someone will become
a multi-millionaire from selling a 300K-line program that evolved from a
simple +spark/perl/ruby --dump-code=code.... -e '....'+ invocation (after
a successful plain +-e+ invocation).

Spark aims to be popular and be actively used for real-world tasks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TODO : fill in.

Spark will have nested namespaces
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Spark will have a namespace system similar to Perl 5's, with nested namespaces
and the ability to selectively import symbols from namespaces at run-time -
similarly to http://search.cpan.org/dist/Sub-Exporter/[Sub-Exporter], and as
opposed to C++, where namespaces are all-or-nothing, and so mostly unusable.

As opposed to Java, one will be able to import several symbols from a
namespace at once, grouped by tags (see the sketch below). +Sub-Exporter+
gives much more than that for Perl 5, but I don't recall all the details from
the slides I saw about it. +:-)+

Like Perl 5 one will be able to import symbols at run-time.
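
Building on the +(use ...)+ form shown earlier, here is a hypothetical sketch
of how such imports might look (the namespace, symbol and tag names are
invented for the illustration):

---------------------
(use my-org.text.utils)                  ; import the default symbols
(use my-org.text.utils (trim wrap))      ; import only selected symbols
(use my-org.text.utils :tag formatting)  ; import a tag-group of symbols
---------------------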

As opposed to Perl 5, classes won't be automatically associated with
namespaces, and a namespace may contain one or more classes (or none).
Like CPAN, and unlike Java (+org.apache.jakarta...+), we will not enforce
namespace purity, but hopefully there will be a better mechanism than the
current CPAN and PAUSE (Perl Authors Upload Server) for being able to fork,
spin off, or branch CPAN distributions, or to choose between competing
alternatives. http://cpan6.org/[CPAN6] (orthogonal to Perl 6) is worth
a look for some ideas, as is http://www.cpan.org/misc/ZCAN.html["The Zen
of Comprehensive Archive Networks"].

Spark will be more succinct than most Lisps, but not overly terse
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TODO : fill in.

Spark will be written in plaintext
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

TODO : fill in.

Regexps and other important elements have dedicated syntax
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

But don't worry - it's in the "re" module, and it uses read-macros / char-macros
/ text-macros. With the help of such macros one could even create a parser
for Ruby-style syntax (or Perl's - +;-)+), but that will be actively
discouraged.

So for example:

---------------------
$ spark -pe '(~ ^LINE (re.s {ba([zro])(\s+)mozart} ma$1$2bozart))'
---------------------

And it will replace the first +baz[whitespace]mozart+ with
+maz[whitespace]bozart+, etc. The +~+ operator is similar to Perl 5's
+=~+, or to perl-5.10.0's and Perl 6's +~~+, in that it does a smart matching
of a datum (which could be a list) to an abstract operation.

We can also use other delimiters instead of +{...}+ in the +(re.s...)+
read macro:

---------------------
$ spark -pe '(~ ^LINE (re.s /ba([zro])(\s+)mozart/ ma$1$2bozart))'
---------------------

C/Perl/etc. conventions
~~~~~~~~~~~~~~~~~~~~~~~

1. Spark will be case-sensitive

2. Unicode (UTF-8) aware-and-safe

3. With C-style escapes - backslash does the right thing.

Spark will not encourage a proliferation of implementations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As you all know, Lisp is a family of languages, which includes Common Lisp,
Scheme, Arc, Emacs Lisp, etc. - and some people may say also Dylan - and
Perl 5, Perl 6, Ruby, Python and most other modern languages have many
Lispisms in them, up to being able to translate many programs written in Lisp
to them line by line. (See http://www.paulgraham.com/icad.html[Paul Graham's
"Revenge of the Nerds"] for the inspiration.)

However, some people on irc://irc.freenode.net/#scheme[#scheme] told me
that Scheme, due to the proliferation of incompatible implementations, was
not one Lisp dialect, but rather a family of languages called "Scheme" with a
common denominator. While I'm all for
http://en.wikipedia.org/wiki/Germanic_languages[Germanic languages]
being a major sub-division of the
http://en.wikipedia.org/wiki/Indo-European_languages[Indo-European languages],
one which also contains several mutually incomprehensible languages, I'm not
sure I want a "Scheme programming language"-family within Lisp. That generally
shouldn't happen with "man-made", computer-understood languages, which are
more under our control.

As a result, I'd like Spark to remain a single language with only a few
implementations, possibly only one for each target virtual machine (e.g.:
Parrot, the JVM, the .NET CLR, or a C-based interpreter). Spark will
be defined and compatible even in its internals, its foreign-function
interface, its core functionality, and its "standard library" (which will
also have something more like what CPAN is for Perl 5, where every J. Random
Hacker can upload their own INI parser under a different namespace).

Spark will have open-source source code (GPL-compatible BSD-style, or
possibly partially Artistic 2.0, in case some of the code is derived from
Parrot code), which naturally can be spun off, branched, and forked. However,
none of these poses a threat to the fact that the Spark implementation will
remain unified.

If someone changes Spark in incompatible ways, the result may either die, or
be forked into a new language. That language will also be a Lisp, and may be
Spark-like, but it won't be Spark. Perl 5, which has only one major
implementation (+perl5+ - currently at +perl-5.10.0+), recently spun off
http://search.cpan.org/dist/kurila/[kurila], which is a fork of perl 5 that is
incompatible with it and with Perl 5, on purpose. Nevertheless, while
Kurila may be considered a language in the Perl family, it is not Perl 5
any more than Perl 4, Perl 6, http://sleep.dashnine.org/[Sleep] or whatever
are. So Perl 5 has still not become a family of incompatible implementations.

Another factor that will dissuade people from creating multiple implementations
of Spark is that, as opposed to Scheme, creating a Spark implementation from
scratch is not going to be trivial. It's not that Spark will be needlessly
complexified, but that it would be needfully complex to implement in order to
be an expressive, feature-rich and high-quality language.

To quote Bjarne Stroustrup (the creator of C++) from his
http://www.research.att.com/~bs/bs_faq.html#Java[FAQ question about Java]:

[quote, Bjarne Stroustrup, FAQ Question about Java]
__________________________________________________________________________
Much of the relative simplicity of Java is - like for most new languages -
partly an illusion and partly a function of its incompleteness. As time passes,
Java will grow significantly in size and complexity. It will double or triple
in size and grow implementation-dependent extensions or libraries. That is the
way every commercially successful language has developed. Just look at any
language you consider successful on a large scale. I know of no exceptions, and
there are good reasons for this phenomenon. [I wrote this before 2000; now see
a preview of Java 1.5.] 
__________________________________________________________________________

(I should note that, like many other FOSS hackers, I normally prefer ANSI C
over C++, and am not a big fan of C++ for most stuff. However, Stroustrup
is a wickedly smart guy, and despite whatever faults his language may have,
he speaks straight, and what he says here seems to make a lot of sense.)

Spark hopefully won't start out as complex as C++ is today,
but it will be more complex than Scheme, to allow for better expression
and faster development. It also doesn't aim to be an incremental improvement
over Scheme (and Common Lisp), which seems to be the case for Arc
and Clojure, but rather something like what Perl 5 was to Perl 4, or what
Perl 6 is to Perl 5: a paradigm shift, which Lispers and non-Lispers alike
will appreciate.

The first version of Spark will not be the ultimate Lisp
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Some features of Common Lisp or other Lisps will be absent in Spark, some
things will be harder to do than in Common Lisp, or even than in other Lisps
or other non-Lisp programming languages, and some things will not work as
expected at first (bugs, etc.). A lot of this will stem from the fact that the
primary author of this document does not consider himself a Scheme expert
(and is very far from being a Common Lisp expert), and just likes Lisp and
Perl 5 and other languages enough to want to promote them.

As a result, some esoteric features of the popular Lisp languages of today, or
of some languages that he has not fully investigated yet, won't be available
at first. This is to be expected, given his ignorance, enthusiasm and eagerness
to get something out of the door first.

He would still be interested in learning about whatever core-library or
meta-programming features other languages have that may prove useful for
the core Spark language (or, alternatively, cool APIs that you think should
be ported to Spark). But he has little patience to learn entire languages
"fully" (if fully learning any non-trivial language is indeed possible)
before starting to work on Spark. And often ignorance is a virtue.

So the first versions of Spark will still have some room for improvement.
Hopefully, most of it will be solvable using meta-syntactic or
meta-programming user-land libraries (as is often the case for Lisps and
other dynamic languages). As for the rest, we can consider them bad design
decisions that still add to the language's colour and make it a bit more
interesting to program in than a 100% perfect language. Sometimes perfection
is in imperfection.