the simplest thing that can possibly work

We're not there yet. But Bard's getting simpler, and it feels like it's sort of settling down.


Base values and literals


null and undefined values


nothing

undefined

numbers


0 1 101 1234567890123456
#b1011 #o047 #xface
2/3 3.1415926

characters


\a \B \newline \space \u2122


names


'foo 'Bar :baz grault: :Wibble:

'bard.lang:Foo

sequences


()

[0 1 2 3 5]

[nothing :foo 17]

maps


{}

{:a 1 :b 2 :c 3}

{"name" "Fred" "friends" ["barney" "betty"]}

anonymous functions


#([x] (* x x))

Protocols


a protocol says what functions it defines, and gives the names of their parameters' categories:

(define-protocol Sequence
  ((empty? <Sequence>) => <Boolean>)
  ((first <Sequence>) => <Value>)
  ((rest <Sequence>) => <Sequence>)
  ((apply <Key> <Sequence>) => <Value>))

(define-protocol Map
  ((apply <Key> <Map>) => <Value>)
  ((merge <Map> <Map>) => <Map>))

Protocols say nothing about how their functions are implemented, and nothing about which representations their categories represent. Protocol and category names are module variables ("global" variables in some languages).

Representations


Structures are concrete maps defined as constructors with named arguments. The argument names become the keys used to access the structure fields. Constructor and accessor names are module variables.

(define-structure <cons-cell>
  (cons head tail))

We can specify the concrete representations of the fields. If we omit the qualification, then the fields are allowed to be <anything>. Here's the above, but with the field categories made explicit:

(define-structure <cons-cell>
    [(head <anything>)(tail <anything>)]
  (cons head tail))

A structure may have variant definitions. Here's a tree:

(define-structure <tree>
  ([] nothing)
  ([] (leaf val))
  ([] (node left-tree right-tree)))

We can qualify the argument categories, as before:

(define-structure <tree>
  ([(nothing <nothing>)] nothing)
  ([(val <anything>)] (leaf val))
  ([(left-tree <tree>)(right-tree <tree>)] 
    (node left-tree right-tree)))

The constructors become functions that create instances of the structure. The argument names become accessors:

(val (left-tree (node (leaf 1)(leaf 2)))) => 1

Vectors are indexed structures:

(define-vector <vector>)

This default form creates a vector representation that is allowed to be of any length and contain any values.

You can restrict the number of elements that are allowed; the following vector allows only 4 elements:

(define-vector <4vector> [<anything> <anything> <anything> <anything>])

You can also restrict the elements by representation; following is a vector representation that allows only 4 1-byte characters:

(define-vector <ostype> [<char> <char> <char> <char>])

A few conveniences are provided for defining other restrictions:

(define-vector <vector> [&]) ; any number of <anything>

(define-vector <char-vector> [<char> *]) ; zero or more <char>

(define-vector <at-least-1-char-vector> [<char> +]) ; one or more <char>

(define-vector <another-vector> [<char> <float> *]) ; one <char>, then zero or more <float>

Implementing functions


define-function can create a method definition without reference to any protocol:

(define-function (list & args) 
    (if (empty? args)
        nothing
        (cons (first args)
              (apply list (rest args)))))

Borrowing an idiom from clojure, it can also define more than one method,
overloaded on the number of parameters:

(define-function
  [(list) nothing] 
  [(list & args) (if (empty? args)
                   nothing
                   (cons (first args)
                         (apply list (rest args))))])

Implementing Protocols


define-methods is designed for defining implementations of protocols.
It uses the same notation as define-function for defining
multiple methods at once, but it can define methods for several
different functions at the same time, as long as they are all
members of the mentioned protocol.

If we omit a definition, define-methods automatically defines a default one for the representations we've declared. the default method signals an error if called. Bard warns us when defining default methods.

(define-methods Sequence [(c <cons-cell>) (n <Integer>)]
  [(empty? c) => false]
  [(first c) => (head c)]
  [(rest c) => (tail c)]
  [(apply n c) => (if (< n 0)
                    (error "Index ~A is less than 0" n)
                    (bind [nth (fn [p i]
                                (if (= i n)
                                    (head p)
                                    (if (empty? (tail p))
                                      (error "Index ~A is too large" n)
                                      (nth (tail p) (+ i 1)))))]
                      (nth c n)))])

Why define a method as simple as, for example, first, which just calls head?
Because head is a representation-specific accessor, whereas first is a
generic function that is defined for any member of the Sequence category.

(define-methods Map [(k <Anything>)(c <nothing>)]
  [(apply k c) => nothing])

(define-methods Map [(key <Anything>)(c <cons-cell>)]
  [(apply k c) => (bind [e (head c)
                        k (head e)
                        v (tail e)
                        more (tail c)]
                     (if (= k key)
                         v
                         (apply k more)))])

bind


bind does the work of let:

(bind [x 2
       y 3]
   (* x y))
=> 6

...and also of let*:

(bind [x 2
       y (+ x 1)]
   (* x y))
=> 6

...and also of multiple-value-bind

(bind [[x y] (values 2 3)]
   (* x y))
=> 6

Using functions and data


Defining a new protocol implementation pushes its category assertions onto the front of the precedence list:

(bind [s (list (cons 1 :one)
               (cons 2 :two)
               (cons 3 :three))]
  (apply 1 s))
=> :one

(categories <cons-cell>) => (<Map> <Sequence>)

When a representation is in more than one category, and there are methods defined on the same function for more than one protocol, we can explicitly say which protocol to use:

(bind [s (list (cons 1 :one)
               (cons 2 :two)
               (cons 3 :three))]
  (list
    (with Sequence (apply 1 s))
    (with Map (apply 1 s))))
=> (#<cons-cell>:[ 2 :two ] :one)

We can explicitly reorder categories:

(categories <cons-cell> <Sequence> <Map>)

(categories <cons-cell>) => (<Sequence> <Map>)

The implementation of Bard on Clozure Common Lisp is working (for small values of "working"), but it lags severely behind the design. It's highly optimistic to say that the design is settling down, but it seems to have at least found a comfortable part of the room, and has begun to walk around in a circle to tamp it down.




0 comments

A rich man is nothing but a poor man with money

A little while back, Faré asked me for an example of a value that could simultaneously be treated as an instance of two or more different categories. I've been thinking about that question for a while, because it highlights a couple of subtleties in Bard's design.

Here's the clearest example I know of:

Consider a sequence:

["Apple" "Banana" "Cherry"]

In Bard, a sequence is an instance of a type that is a member of the category <Sequence>. That is, the Sequencing protocol has methods that are specialized on its representation. In other words, you can do things like count its members, ask for a member by index, and all the sorts of things you might expect to be able to do with a sequence of values.

If a value is a sequence, then you can apply it to an integer greater than or equal to zero, and less than its count, to return a member of the sequence:


(["Apple" "Banana" "Cherry"] 1) => "Banana"


Now consider a mapping. A mapping is a member of <Mapping>; it supports the Mapping protocol, which means that it maps inputs (called "keys") to outputs (called "values"). Here's a mapping:


{ first: "Apple" second: "Banana" third: "Cherry" }


You can apply a mapping to a key, and Bard returns the corresponding value:


({ first: "Apple" second: "Banana" third: "Cherry" } third:) => "Cherry"

Naturally, if you apply it to a key that isn't present, you get nothing back:

({ first: "Apple" second: "Banana" third: "Cherry" } fourth:) => nothing

So far, so good.

Types in Bard are all abstract; the concrete things are called "representations". Types are defined by protocols. This means that a given type may be represented by a variety of different representations, and it means that the same value may simultaneously be treated as several different types.

A case in point is the mapping above: in general. mappings can also be treated as sequences. For example, the mapping { first: "Apple" second: "Banana" third: "Cherry" } can be treated as if it were the sequence of pairs [ [first: "Apple"] [second: "Banana"] [third: "Cherry"] ].

That being the case, we should expect this to be possible:


({ first: "Apple" second: "Banana" third: "Cherry" } 1) => [second: "Cherry"]
After all, if the mapping is a sequence of pairs, then we should be able to retrieve one of those pairs by index, right?

But if that's true, then what do we do with this expression?

({ "name" "Fred" age: 101 'a "Apple" 1 "One" } 1)

If that's a mapping in there, then it should return "One". But if it's a sequence, it should return [age: 101].

Uh-oh.

I know what I want to happen: I want the expression above to return "One", and I want a way for this expression:


({ "name" "Fred" age: 101 'a "Apple" 1 "One" } 2)

to conveniently be made to return ['a "Apple"]. Incoherent, you say? I don't think so, and I think I know how to make it work.


In Bard you can make assertions that say a representation is a member of certain categories, like this:

(categories <hashmap> [<Mapping> <Sequence>])

Besides saying that <hashmap> is a member of <Mapping> and of <Sequence>, this assertion also establishes a preference: if you're trying to decide what category a <hashmap> belongs to, and you have a choice between <Mapping> and <Sequence>, then you should pick <Mapping> unless there's some constraint that prevents you from doing so. This ordering of preference comes from Bard's strategy for linearizing multiple categories.

The problem of type linearization comes up when you combine polymorphic operators that dispatch on type with a type system that allows one type to be a subtype of several other types. If a method exists on <Mapping>, and another exists on <Sequence>, but there's no method on <hashmap>, then how does a function know which method to choose when it's applied to a <hashmap>? A linearization scheme is a procedure for deterministically answering that question.

Bard uses C3, a linearization scheme that was invented for Dylan, and as an improvement on the CLOS linearization scheme. C3 is also used by Python and by some Perl object systems. Its great virtue, besides being deterministic and computationally tractable, is that it yields results that are very rarely unexpected.

The order of categories in a categories assertion provides the precedence order that C3 needs to linearize types, but it also gives us another benefit: given the assertion above, Bard can decide that the value in this expression:


 ({ "name" "Fred" age: 101 'a "Apple" 1 "One" } 1)

should be treated as a mapping. That being the case, the expression returns "one".

What if we really did want to treat the mapping as a sequence? How about this:


(( as <Sequence> { "name" "Fred" age: 101 'a "Apple" 1 "One" }) 1)

We just tell Bard which category to use. Using the precedence established by the assertion, Bard will pick the category <Mapping>, because it has priority over <Sequence>. But we can add another constraint; we can tell it specifically that we want it to choose <Sequence>.

One thing I'm still wondering about is what this expression returns:


( as <Sequence> { "name" "Fred" age: 101 'a "Apple" 1 "One" })

I mean, clearly it returns a mapping equivalent to the input mapping, and just as clearly, it's a mapping that any enclosing expression views as a <Sequence> (rather than as a <Mapping>). But I'm still thinking about what that means down at the level of nuts and bolts. What I do not want to do is return a new value that has a different representation. The point of the exercise is to be able to treat a single value simultaneously as an instance of more than one type.

At the level of syntax, the above treatment says exactly what I mean, and at the level of implementation, it's clearly possible to do what it describes. I'm just still figuring out how Bard ought to keep track of the type constraints introduced by "as" expressions.

(Also, of course, it's no good writing something like (as <Refrigerator> 3), if <Refrigerator> isn't in the list of 3's categories. If you use an "as" expression to constrain the type of a value, you have to constrain the value to a type it can actually be.)

0 comments

Less Is More

Wise authors in many fields have extolled the virtues of simplicity, and the difficulty of achieving them. There is "Einstein's razor, often paraphrased as "Make everything as simple as possible, but not simpler." The Revised4 Report on the Algorithmic Language Scheme admonishes, "Programming languages should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary." Pascal once wrote to a friend that he had "made this longer because I have not had the time to make it shorter."

I've been spending such time as I have available to make Bard smaller and simpler. It's a challenging enterprise. I'm seeking ways to strip out as much as possible, and leave behind a language that is still appealing. At the same time, I'm trying to address some knotty issues that came up in discussion. Here's a brief overview of what I've reached so far.

Sequences

In Bard, an abstract type called a sequence takes the traditional place of lists in Lisp. I say that sequences are abstract; what I mean is that when you write a sequence expression, Bard is free to choose any representation that conforms to its sequence protocol. You can, of course, construct particular representations if you wish, but if you just write a sequence expression, you may not know which specific representation Bard chooses.

One advantage of that approach is that an expression that would be a list made of cons cells in an older Lisp may instead be an instance of some other sequence type that has convenient ways to store metadata about the expression. This happy quality addresses an issue pointed out in several places by David Moon: conventional s-expressions don't have a convenient place to store such metadata, which complicates building development tools. With older Lisps, you have to hang a bag on the side of your syntax trees to keep track of things like where the source code resides and so forth. Bard is free to choose a representation for source code that has storage for those sorts of things.

There are two simple syntaxes for sequences:

(x y z) means "apply the value of x to the values of y and z." This is the familiar Lisp list notation, or, more specifically, the notation used by Scheme for literal lists.

[x y z] means "construct a sequence consisting of the values of x, y, and z in that order." I always regretted that there wasn't a simpler notation for constructing a list that you don't intend to treat as a function call.

Values

Everything is Bard is an atom or a collection. An atom is a single non-composite value, such as an integer or a Boolean value.

Many atoms are familiar: integers and floats; true and false; the null value (which is presently named "nothing" in Bard). Cells are less familiar. A cell is an atom that contains another value. You can retrieve the cell's value by calling the value function. Cells are usually read-only, but you can create mutable cells. If a cell is mutable, you can change its value using the set! function.

A collection is a value that has component elements, like a sequence.

Collections map keys to values. You can extract a component of a collection by applying the collection to the appropriate key:

(["apple" "banana" "cherry" "durian" "eggplant"] 2) => "cherry"

A sequence is a collection that maps the natural numbers to elements. Some collections map other keys to values; for example:

( { name: "Fred" age: 47 shape: 'round } age:) => 47

A sequence is a collection that supports the Sequencing protocol, and supports using natural numbers as keys. A map is a collection that supports the Mapping protocol, and supports using arbitrary, unordered values as keys. Most collections are both maps and sequences; for example, besides applying the above table to .age, you could apply it instead to integers:

( { name: "Fred" age: 47 shape: 'round } 0) => [name: "Fred"]
( { name: "Fred" age: 47 shape: 'round } 1) => [age: 47]

Not all collections support both protocols, though.

Bard's collections include a few things that aren't usually thought of as collections. Functions are collections: they support the Mapping protocol. Modules are collections: they map names to cells.

Types

Bard's system of types occupies most of my time when I'm working on the language. It's the most interesting area to me, and the one with the toughest problems.

There are a bunch of straightforward datatypes: undefined, nothing, true and false, characters, integers and floats.

Sequences and maps are slightly more complicated, being aggregates, but the main work with them is just defining the Sequencing and Mapping protocols.

Then there are functions, modules, and streams--streams are treated as a sort of special kind of collection--basically sequences of (potentially) infinite length.

Text is what most languages call strings. I call them text in Bard because "string" usually connotes some specific representation, and like most common types in Bard, text is an abstract type. It might have several different representations. A text is a sequence of characters.

Names are what older Lisps call symbols and keywords. A name is an atom that is denoted by a text, like foo or Kalamazoo. Names are collected into modules. There are three ways to write a name, depending on what module you want to refer to:

bard.lang:foo means "the name 'foo' in the 'bard.lang' module". Each module has a unique name. If you write a name by prepending the name of a module on it, the resulting atom means that name in the given module. The module namespace is global; in other words, 'bard.lang' will always refer to the same module.

:foo means "the name 'foo' in the anonymous module." The name of the anonymous module is the empty text. Names in the anonymous module are what some older Lisps call "keywords": they are names that always evaluate to themselves:

:foo => :foo

By the way, for the sake of aesthetics, you can also write ":foo" as "foo:", when the name is in the anonymous module. Sometimes it's nice to be able to do this when you're using a keyword to label something, as, for example, in a map:

{ name: "Fred" age: 101 }

foo means, "the name 'foo' in the current module." All Bard code has a current module that determines the namespace it's using. If you write a name without any qualification (a "bare" name) then Bard treats it as a name in the current module.

Besides these straightforward datatypes, there are two more interesting areas: support for user-defined types, and support for functions that are polymorphic over the types of their arguments.

User-defined types

Bard provides four kinds of user-defined types:

Synonyms are simple new names for existing types. Synonyms are just a naming convenience. Often you can reuse an existing type, but your code will be clearer if you can give a name to the type that indicates its intended use. The type <bank-balance> is more informative than the type <float>.

Vectors are named sequences. You can optionally specify a type restriction on the elements that go in your vector type, and you can optionally set a minimum and maximum count.

Records are named maps. You can optionally restrict the types of a record's keys and values, and you can optionally specify that no keys may be added besides the ones enumerated in the definition.

Protocols are sets of operations that in turn define abstract types. Most of the work I've been doing on Bard is centered on making protocols the primary way to define types. Vectors and records are representations--that is, concrete ways to lay out data. Representations are not really types; they're just arrangements of storage. A real type is defined by what you can do with data, and that's what protocols are all about.

Here's a simple protocol:

(protocol JSONSerialization 
    [(toJSON <Value>) => <JSONString>]
    [(fromJSON <JSONString>) => <Value>])

(Yes, protocols declare return types.)

The protocol expression creates a protocol object, two functions, and two categories.

The protocol object's purpose is primarily organizational and diagnostic. It's useful to be able to ask Bard what protocols exist, and what their functions and categories are. Bard can also warnyou when protocols lack implementations, or when methods return the wrong types.

The functions are generic and abstract. They have no implementations, because we haven't given any yet. We use define-method for that.

Similarly, the categories are completely abstract types with no representations. We haven't told Bard what representations can be considered members of <Value> or <JSONString>.

The simplest way to tell Bard what representations are members of a category is to explicitly say it:

(categories <ascii-string> [<JSONString>])

The above expression asserts that the representation <ascii-string> is a member of the category <JSONString>. The categories special form has three forms:

(categories) returns a sequence of all known categories.

(categories x) returns a sequence of all categories of which x is a member. The returned sequence is stably-ordered--that is, it will always be returned in the same order, unless you explicitly do something to change it.

(categories x [y*]) asserts that x is a member of each category listed in [y*]. Furthermore, it asserts that the categories y* appear in the order given in the expression. Subsequent calls to (category x) will return those categories in the same order.

Bard uses the results of the categories operator to determine how to dispatch polymorphic functions. Suppose, for example that we call:

(foo "blue")

Now suppose that "blue" is an <ascii-string>, and suppose further that <ascii-string> is a member of <Name> and also a member of <Quality>. If there are methods for foo on both <Name> and <Quality>, how does Bard know which method to call? Simple: if (categories <ascii-string>) returns [<Name> <Quality>], then it choose the <Name> method, because <Name> appears first in the returned sequence.

More complicated sets of categories require a more complicated analysis, but the analysis is deterministic (Bard presently uses the C3 class-linearization method, because it has gotten good results in practice, and because I understand it).

Explicitly asserting membership in a category isn't the only way to establish category membership. define-method implicitly calls the categories operator:

(define-method toJSON ((x <ascii-string>))...)

This define-method form, besides giving an implementation of the method foo for the representation <ascii-string>, also has the effect of calling categories like this:

(categories <ascii-string> [<Value>])

In other words, it asserts that <ascii-string> is a member of <Value>.

I still have some implications to work out. This scheme provides convenient ways to define protocols and methods, and to bind representations to categories. It provides deterministic way to order dispatch that gets results that are mainly what a programmer would expect.

Calling categories more than once with the same first argument has a well-defined meaning:

(categories <ascii-string> [<Name>])
(categories <ascii-string> [<Quality>])

means the same things as

(categories <ascii-string> [<Name> <Quality>])

and that means that separate define-method forms also have a well-defined and understandable meaning. In addition, the form

(categories x)

also gives you a way to reliably determine what the dispatch order is.

The question that is currently bugging me is this: what does this mean?

(categories x [<Cat1> <Cat2> <Cat3>])
(categories x [<Cat2> <Cat1> <Cat4>])

Does the second expression completely replace the first? If so, it means that x is no longer a member of <Cat3>. That might be unexpected. But if not, then how do we resolve the fact that the two expressions disagree about the precedence of <Cat1> and <Cat2>?

I don't yet know the answer.

0 comments

I M VEE REE BEEZEE

When my daughter Greer was very small, I was in Silicon Valley working for Apple Computer. My wife Beth and I ran a martial arts studio on the side, and she handled most of the business in order to leave me free to concentrate on programming when I wasn't in the studio, discovering new and interesting ways to give myself stress injuries of the joints. We had two young children, Greer and her older sister Kate. As you can imagine, we were pretty darn busy, especially during hectic product-development phases, like when I was working on shipping Apple's Newton.

Like all small children, Greer tended to need quite a bit of parental attention, and like all harried parents, we would often tell her somewhat impatiently that we needed to temporarily concentrate on something else.

One day, when Beth was absorbed in handling something or other that needed handling, Greer came stmping out of her room with a colorfully-decorated piece of paper in her hand and an intense expression on her face. She stomped up to Beth and thrust the paper under her gaze. Startled, Beth took the sheet of paper, on which a legend was scrawled in large, colorful crayon letters: I M VEE REE BEEZY.

Beth collapsed laughing. Greer didn't think it was funny at al.

"I am very busy!"

She was adamant. She couldn't be bothered with Beth's hilarity. She was, by Goerge, very busy! She stomped away again to emphasize her point.

That's me, the past couple of months. VEE REE BEEZY.

I did some contract work that kept me quite busy, and which involved a fairly comprehensive nondisclosure agreement, so that's enough said about that. Well, I suppose it's safe to say it involved a large enterprise database system built to handle terabytes of data with billions of updates a day, and that is written in Common Lisp. It's certainly an interesting problem.

In my spare time, I had a gaming project that has been fairly intense. The wonderful player Gobblemoss of Shadowclan has, for the second time, created a large roleplaying player-versus-player event on Lord of the Rings Online's Landroval server. LOTRO has a game within the game called Monster Play, in which players of good-guy characters can create monster characters and play them in PVP battle against the good guys. Some players bring good guys to the battle; others bring bad guys, everyone fights, and a good time is had by all. Gobblemoss created a story around why the fight is taking place, and dozens of players on each side pitched in and brought that story to life. The bloody battle for domination of the Ettenmoors is taking place every night this week on Landroval from 10PM to 1AM, Eastern time.

I have had the honor of leading the main good-guy faction in the drama, a clan of Dwarves called the Frosthammer Clan. Preparation has absorbed all my spare time for months, but the thrill of the event has certainly been worth it. For four more nights I'll be fighting alongside the Frosthammers and their many allies, and dodging the bounty placed upon the head of my character, Thingvi Frosthammer, by the leaders of the wicked servants of the Great Eye.

So I haven't blogged for a couple of months, and I havent; reported progress on my implementations of Bard, or on the XG Software infrastructure. That doesn't mean no progress has been made.

I've been revising and expanding my very spartan notes on the Bard language. I received some thoughtful and thought-provoking comments on Bard from some very smart people, and I've been thinking every day about what to do about them. Except for the runtime representation of categories, all the basic pieces of Bard are working in the Common Lisp implementation. I've therefore concentrated on thinking about protocols and categories ad how they ought to work.

Frankly, some unanswered questions remain. I've decided to be conservative about them: I'll define and implement some set of behavior that seems minimal to me, and release Bard in that form, so that I can see what happens in practice. I imagine that anyone interested in actually using Bard will quickly discover deficiencies that need to be addressed, and those discoveries will drive subsequent revisions of the design.

Just lately, I've been working on describing the current state of the design, in preparation for completing an implementation of it. With any luck, I'll have something completed in a few weeks--that is, if implementing it doesn't uncover some hideous design mistake that forces me back to the drawing board.

I've also been working on a couple of other small programming projects, and on incrementally adding features to the XG software infrastructure. The small projects are steps toward small product releases, one in programming tools, and two others in apps for Apple's iPad. I have mixed feelings about the iPad apps. The programming environment for making iPad apps is pretty nice (mostly), and if an app is accepted by the App Store, the logistics of making sales is appealingly easy. But all of the unfortunate things about the iOS ecosystem that people are complaining about all over the web--unfortunate for programmers and unfortunate for consumers--bother me as well. If people thought Microsoft was uncomfortably like a malevolent monopoly in computing in the 1980s and 1990s, it's nothing compared to what Apple is trying for now. It creeps me out and makes me wonder if I shouldn't make a clean break.

I'm seriously considering replacing my iPad and iPhone with Android devices, or at least adding Androind devices so that I have more than one platform ready to hand. It doesn't help that iOS is not exactly friendly to any form of Lisp as a development platform.

I'm also seriously considering a new job and a move back to Colorado's front range, which is probably my favorite place in the world.

So there you have it. Nothing exciting to know about what's going on around here. Just lots of boring old work that might soon give birth to something exciting to the tiny population of people who care about what I work on.

Oh, and I've shipped off The Black Knight to a publisher for consideration, and am in the middle of a revision of it. Shipping it off to a publisher probably just means that it will get laid on top of a towering slush pile, but I'll keep at it. It must be time to finally dust off The Dream of the Witch, give it a going-over, and find another publisher to ship it off to.

Then I can think about which story to tell next. The fellow who returns from a failed mission to a neighboring star after thirty years...except that, because of relativity, four thousand years have passed on Earth? The consulting detective who has visions, and who has an anthropologist friend telling him he's a shaman while his psychiatrist tells him he's just schizotypal? The dark elf assassin who s sent to kill a major character from The Black Knight, and then has to do decide whether carrying out or aborting his mission is a better way to serve his oath? The amnesiac who discovers he's a shapeshifter when strangers try to kill him? One of these will be next; I'm not sure which.

So, again. Nothing much new to report, except that lots of stuff is coming, each thing at its own pace.


0 comments

iGhetto

So I like the iPad. It's a comfortable size and weight. The display is as good as people say. The multitouch UI is slick and fun, though awkward at times.

Unfortunately, there are problems on the software side.

I don't just mean the draconian restrictions that Apple has placed on developers, though those are a problem, to be sure. Developers can always adapt, though.

More troublesome are some fundamental limitations on application features.

By and large, apps on the iPad look pretty good, mainly because the resources Apple provides to developers are designed so that you have to go out of your way to build something that looks bad.

Looking good is only a small part of the story, though. How do the apps work? Well, Apple's UI guidelines are pushing app design in a nice direction, with a strong focus on the task at hand, and UI that tries hard to get out of your way, and to be simple and obvious when it has to be visible. These are all nice things.

The fly in the ointment is iPhoneOS' sandboxing, and what it does to application data, and worse, what it implies about the future of application data.

One of the most noticeable things that is not on the iPad or the iPhone is a file browser. There's no Finder, and apps don't have Open or Save dialogs. Apple gently discourages developers from designing document-based apps at all, and when an app is document-based, it encourages developers to use a UI that presents a flat list of all the documents you've created with the app—and nothing else.

The thing is, for the sake of safety and security, iPhoneOS does not allow apps to go rummaging around in a device's filesystem. An app has a sandbox—a folder that belongs to it—and it can look in there and nowhere else. That being the case, it doesn't make a lot of sense to provide an Open dialog, because there's only one place it can look. Might as well just show a list of what's in that one place. And the app can't write data to anyplace besides that one place, so there isn't much point in a Save dialog, either. The only thing you need to know when saving a document is the name of the document. Heck, you might as well generate a name automatically, and save the user the trouble of having to come up with one.

This is all great for ease of use. Users tend to find filesystem navigation confusing anyway. Developers and other geeks forget it, but you run smack up against it any time you spend time with an ordinary user (somebody without the propeller coming out of his head). People pretty much get instantly lost as soon as they have to interact in any way with the filesystem. So in this sense, iPhoneOS' jettisoning of the filesystem looks like a good thing. It makes everything much simpler and easier.

Except...

What about all the stuff on my desktop machine? I don't want to totally abandon all that stuff, just because I have an iPad.

Aha, you say, the iPad has nifty productivity apps, like iWork! You can use those for working with your data! And it has nifty new drawing and painting apps for working with graphic data. Yeah, that's all true, and some of those apps are pretty appealing...until you try to use them with the files you already have on your desktop machine. [Cue the orchestra to play the "deflating balloon" theme.]

Suppose you have a substantial document you created with Pages. Should be a piece of cake to view and edit it with the iPad version of Pages, right? Well...not so much. First of all, the iPad version of Pages has no idea that the file on your desktop exists, and there's no convenient way to tell it. Remember, there's no Finder on iPad (or on iPhoneOS generally), and so no straightforward way to look for a file, even on the iPad itself, much less on some other machine.

So there must be file-transfer solutions, right? Yeah...sort of. There's MobileStudio and AirSharing and even some simple FTP clients. The trouble is, each of them can only copy files to the app's own sandbox where, of course, Pages can't see it. If you want to use Pages to edit some file, the only way to make that happen is to use the rather clunky iTunes UI to copy the specific file you want into the sandbox of the app you want to use to edit it. Ouch. What's worse, that same iTunes UI doesn't provide any way to copy a modified file back onto your desktop machine after you've edited it. Double-ouch.

Pretty much the only halfway reasonable way you have to synchronize data files between iPad apps and your Desktop uses iTunes from desktop to iPad, and email from iPad to desktop. That's ugly.

In the specific case of iWork, Apple is building a solution that enables you to share documents between your iPad and any arbitrary other device or desktop, by "publishing" documents to a website at http://iWork.com. Of course, the first time I tried actually using this site with the working copy of my second novel, I ran into an error message that told me that my manuscript was too big. Apparently, there's a limit of fifty pages on Pages documents. My draft is three hundred fifty pages too long to share.

Steve Jobs has remarked that people don't want to create content on the iPad; they just want to consume it. This is classic Steve-speak. It means "we don't want you to interact with the filesystem on these devices, so we're going to tell you that you don't want to." My guess is that the problem arises originally from trying to make a consumer device for a large, broad market where there are difficult security implications of ordinary file-browsing. Where it leads is more interesting.

It's a piece of cake to make a file browser that runs on iPhoneOS. The trouble is, Apple doesn't really want iPhoneOS owners rummaging around in the device's filesystem, messing with arbitrary system files, opening security holes in the device that black hats can exploit to take control of other people's phones and so forth. Heck, professional geeks have full-time jobs keeping track of all the holes that users can accidentally and casually open in network security, and that's on machines that have full-time admins watching over them.

You can argue that there are now millions of personal computers in everyday use, and the world has not come to an end because of security holes that clueless users open. That's true, but there are giant botnets with thousands or even millions of enslaved computers, and they exist precisely because ordinary users don't know enough to secure their computers against being hijacked (or how to detect when they've been hijacked).

I get the security concerns. It isn't that iPhoneOS can't interoperate conveniently with desktops. It's that nobody knows how to solve the security problem in a way that is simultaneously secure and convenient. It's almost a law of nature that security is the opposite of convenience. And if a phone isn't convenient, forget it; nobody's going to buy it. Convenience is the whole point.

Which makes the entire concept of the iPad bizarre. What exactly is it for? Is it a netbook? No; it really isn't. Apple's solution in that space is the MacBook Air. It's a computer that interoperates nicely and runs plain old Mac OS X. Is it a giant iPod Touch? Yeah, sort of; except the size of the screen and the speed of the processor make it attractive to run stuff like the iWork apps and really full-featured games.

The iPad isn't really anything we're used to. It *could* be a nice, ultraportable extension of the desktop, a big step into a world of "cloud computing" that is finally appealing, except for the lack of integration—strike that, the anti-integration—with desktop filesystems, which turns it into a ghetto. Your data lives in the world of the desktop, or it lives in the world of the iPad, and never the twain shall meet. Getting files from one to the other is such a pain that you're probably just going to want to avoid doing that as much as possible.

But if you could somehow arrange for all of your important data to live on the web, "in the cloud", aaahhhh...

And that's where we're headed. That's why filesystem integration sucks. Apple doesn't want you to integrate with your desktop. It wants you to live in the cloud. All your data are belong to us. Apple's not trying to be Microsoft. It's trying to be AT&T. It's figuring that before long, we'll look up and discover that everyone has an iPhone and an iPad, and everyone's important stuff is stored on Apple servers, and every time you read a book or pay a bill or look up a phone number or watch a movie, or look for a nearby restaurant, or check your appointments, or leave a message, you'll pay a little fee to Apple.

Filesystem integration? Pfffft. What filesystem?


0 comments

No news is good news

It's been a while since I posted anything. There are a few people who follow my blog to see what I'm up to, and some of them send me mail when I'm quiet for too long. I guess I've been quiet for too long. So, with that in mind, here's what I'm working on at the moment:

I'm working on a contract for someone right now, and it's keeping me pretty busy. It's technically fun and interesting. There are some process lessons that are not necessarily fun, but they are certainly informative, and those are getting some good discussion in the bywicket group.

There's the bywicket group itself, of course. "Bywicket" is a made-up word from one of my fictional worlds. When I set up a server and a bunch of services for the loosely-affiliated group of developers I collaborate with, I named the domain bywicket, and it became convenient to refer to those developers as "the bywicket group".

There's Toybox, the first milestone for the XGO gaming project. "XGO" means "XG Online." "XG" is the abbreviation we use for "Explorers' Guild," which is the name of a loosely-organized group of around two or three dozen people who play games together and design and implement software together. Abbreviated "XG", it usually means our ongoing software project, which consists of server and client infrastructure for building games and social networks, plus specific game milestones we are working on using that infrastructure. "Toybox" is the first game we intend to release, currently slated to begin limited play testing by the end of this year.

There's the Bard language and compiler. The contract I'm working on takes up most of my time, and Toybox takes up most of what's left over. When I need a break from both and want to hack some code, I work on Bard. Mainly, I work on the Bard implementation that is built on Clozure Common Lisp. At present the reader works, modules and environments are implemented, and it has a comprehensive set of (simple) compiler tests, all of which pass except for two of the most important: compiling methods, and compiling most special forms. Actually, compilers for both of those exist, but each has several cases, and currently only the simplest of cases are implemented. When I'm tired or done with other work and want to write some code for fun, I add another case to one of those compilers. In this way, Bard inches toward an initial release.

I also think about Bard in quiet moments, like when I'm going to sleep. I wonder what else I can strip out of it without crippling it; how can I make it simpler and more uniform in character? It's possible that the initial implementation will be even simpler than the one I described for the Twin-City Lispers group, as long as making it simpler doesn't prevent me from writing a working program.

Finally, there's The Black Knight. I'm working on Chapter 18 (of 31) of this second novel. It took seven years to write the first novel; it looks like I might finish this second one in half the time. Woohoo! Recently, In order to give it to a friend, I printed it to PDF and was surprised to find that it was four hundred pages. Well, Michael King once told me my native form was the epic. Looks like he was right.

That's my story, and I'm sticking to it.

0 comments

Closos, Part 2

The first post about Closos generated a couple of interesting comments. Generally speaking, I direct my posts to a pretty small readership, consisting of my future self and a few close programmer and gamer friends. The posts are public, though, and because they're public, they sometimes generate some interesting comments from a wider readerhip. So it was with "Closos".

The first post was pretty much a confection: I got out of bed and wanted to just jot down something about the general idea of finding an organizing metaphor for a Lisp-flavored OS. I didn't say much of anything very specific, nor did I mean to; the topic of the post was the history of this idea—the idea of identifying an organizing metaphor that is Lisp-flavored.

One interesting comment was that the metaphor "everything is a lambda" doesn't mean anything. By contrast, mathrick wrote, "everything is a file" means something specific, because we have a concrete idea of what files are and what they do. But that's not quite fair, I think. We have a concrete idea of what files are and what they do because the metaphor has already been worked out concretely in practical detail. That's the trouble with a metaphor; it only communicates if the listener already knows what you're talking about.

Rainer Joswig knew what I was talking about. He provided a couple of comments that described the OS of the MIT Lisp Machines. He characterized it as "Flavoros"—an OS in which everything is a flavor. That's just about right. But just as the designers of Plan 9 moved on from Unix to design a new OS that takes the metaphor further, I imagine moving on from Genera to envision a modern Lisp OS with an updated vision of the organizing metaphor.

Let me just take a moment to emphasize again that I'm just imagining things now. I'm not actually working on anything. Or, rather, I'm working on a bunch of other things that either pay me right now or that I expect to pay me at some point in the future.

Just as Unix acquired features faster than they could be integrated into the everything-a-file metaphor, so too did the LispM systems acquire new features faster than they could be fully digested. Besides which, of course, the Unix computing culture thrived, while the LispM ecology, for various historical reasons, did not.

I come from a background of writing code for Apple Computer. I spent several years working on a Newton OS that was written in Lisp, and then on a great application-development system called SK8, also written in Lisp. Both of those systems (and indeed, the C++ Newton system, too) were built on frame languages.

In both Newton and SK8, everything was a frame, in the same way that in Genera, everything was a flavor. Newton had no files and no filesystem; instead, everything was stored in a "soup" of frames with automatic persistence. As in Rainer's description of the LispM system, all the system's resources were represented by objects linked in a graph.

Frames are something like Dictionaries in Smalltalk or Python; or like hashes in Perl, or like Maps in Clojure. They're objects that associate values with keys. Frames, though, have some semantics that these other associative arrays don't. A value in a frame is stored in a slot. A slot might be a simple container, as in these other examples, but it might not. It might not exist in the frame at all; it might be computed dynamically, on demand. Or it might exist, but not be just a simple container. In SK8's MacFrames language, for example, a slot can itself be a frame, with any number of additional slots describing what can be in the slot, how it can be gotten, what else happens when it's accessed, and so on.

Frames also support inheritance. A frame system can have many different kinds of inheritance—"is a", "has a", containment, implication, and others—operating all at the same time, and dynamically reconfigurable at runtime.

So why don't I say that "everything is a frame"? I don't think frames or flavors are quite right as the organizing metaphor. The file captures a great deal of what you want to think about in a system for computing. It enfolds the ideas of input and output, and when extended to directories, it includes the concept of network and hierarchical organization. At the same time, it remains appealingly simple. Frames and flavors don't achieve quite the same density of concept.

Files don't feel like Lisp, though. That's what had me looking for another metaphor. The one I settled on is clear in my mind, but doesn't have a one-word name that, in its technical sense, is in wide use. That shouldn't be surprising; the only reason that "file" has such a name name is that it's been worked out in practice for forty years, and the word and its technical usage are everywhere.

"Lambda" isn't the word I had in mind. Despite what mathrick appears to think, I'm not saying "everything is a lambda". As I said, the "Closos" post was about trying to think of this concept; it was not meant to be a clear explanation of the concept itself. A reasonably succinct description of the concept might be "a lexical closure over a lambda with an environment that includes zero or more possibly-lazy lists." Not exactly musical, I know; that's the problem.

It has been said that "objects are a poor man's closures." The lexical environment of a closure offers the same opportunities for managing references to data as a frame. The lambda in the middle of the frame provides a computational engine. Input and output streams are possibly-lazy lists, bound to variables in the closure's environment. Variable bindings provide the glue for connecting these objects into arbitrary graphs.

Moreover, because of the opacity of the closure's lexical contour, outsiders can only know what the closure wants them to know about its connections, and the graph that one closure sees might be completely different from the one that another sees. This quality is another insight pioneered by Plan 9: my filesystem doesn't have to look the same as your filesystem, even if they are made up of the same files. The entire computing world is connected in a single graph, sure; but that doesn't mean I see what you see. It's enough that we see enough of the same nodes in ways that are convenient to each of us.

How about, "everything is an actor"? That's what I said in reply to mathrick's objection. An actor, in the sense of the actor model, is an autonomous unit of computation that accepts inputs and sends outputs, and computes new states of itself. Scheme, from which I take the notion of a lexical closure over a lambda, was designed in part to serve as a language for exploring actors.

Aside from its utility in designing concurrent algorithms, the actor model offers a handy word and concept on which to hang a lot of computing—just as the concept of "file" does. Like the file metaphor, the actor metaphor offers convenient ways to talk and think about input, output, organization, access, resources, and so forth.

Maybe, instead of adopting a name that connotes a walled French vineyard, I should have swiped terminology from SK8. In SK8, the stage was the data structure that represented all objects that a user could interact with on-screen. Any data placed on the stage had to be rendered, and so it was wrapped in a data structure called a costume that facilitated the rendering. SK8 wasn't an actor system, but it certainly had the right vocabulary for one.

That must be why I named my hobby programming language "Bard". Maybe if it ever evolves a development environment, it should be called "Globe".


0 comments

Closos

In the middle of the 1980s, the same folks at Bell Labs who had previously invented Unix began a new OS-research project called Plan 9. Although Plan 9 is not in wide use outside Bell Labs, and it's not the sort of operating system you'd expect large numbers of ordinary computer users to take to—at least, not in its present form—it's nevertheless a very interesting project.

The characteristic of Plan 9 that is perhaps easiest to describe is also one of its most interesting features. When the creators of Unix conceived it, they invented a novel way to control I/O resources: they represented them all as files. In earlier systems, different kinds of I/O devices had been represented with a wide variety of different APIs. In Unix, these all became files, and could all be addressed in the same way. If you wanted data from a device, you read its file. If you wanted to change its state somehow, you wrote to its file. Suddenly, thousands of different kinds of devices all became one thing.

It worked very well, greatly simplifying system programming on Unix, and making possible the evolution of a distinct style of system programming, in which system tasks are organized in terms of file reads and writes, and piping data through sequences of file-like objects.

The insight of the Plan 9 developers was that they had not taken their model far enough. Once Unix escaped into the wild, it began to acquire new features, and those new features were not, by and large, carefully designed to preserve the Unix vision that everything was a file. Soon there was once again a bewildering variety of APIs that any Unix system programmer had to understand.

In designing Plan 9, its creators resolved to return to the roots of Unix, and this time push the file metaphor as far as they could. The result is an operating system in which everything is a file. Files on disk, of course, are files. But so is the network: every network-accessible system in the world is a directory in the Plan 9 filesystem. Websites are directories in the filesystem. Windows are files; to display something in a window, you write the something to the window's file. To burn data to a CD, you copy the data to the CD's file in /mnt/cd/.

Everything is a file.

When I first began to learn to program in Lisp, I was fascinated by Lisp machines. So much so, in fact, that I wangled a couple of different Symbolics machines to play with for several years. The Symbolics operating system, Genera, was not very much like Plan 9, but they both took a completely original, clean-slate approach to making computing hardware usable. That has been done relatively few times in computing history, and for good reason. Developing an operating system is time-consuming and expensive, and its odds of financial success are low. Technical excellence is no guarantee of success. Plan 9 itself, for example, is certainly technically excellent, but it hasn't seen the wide adoption of its ancestor Unix, in part precisely because Unix is so widely adopted and, whatever its failings, is adequate for the tasks to which its put. The cost of adapting to a system as new and different as Plan 9 are manifestly not worth the expense of giving up substantial investments in Unix.

Still, for a certain kind of mind, the lure of rethinking computing environments is irresistible, despite its costs and risks. I had the very good fortune to work for a few years on a clean-slate OS project at Apple. Apple's Newton engineers wrote an entirely novel OS for their handheld computer. Much of the work was done in a dialect of Lisp, my favorite language, and all of it was original, ignoring compatibility with any previously-existing operating system. It's delicious to try to think of everything in an OS as brand new; to try to imagine how you can do things in terms of the resources you have available, instead of in terms of what has been done before.

One lesson of Plan 9 is that the right organizing metaphor is extremely powerful. It was that same principle that made the Mac successful: Apple stumbled over the graphical desktop metaphor at Xerox, and realized it was just the kind of organizing metaphor that could pull together a great product.

So the whole time I was playing with Symbolics machines and working on Newton and fiddling with Plan 9, I was thinking about what organizing metaphor captures the essence of Lisp.

When Linus Torvalds created Linux, I thought that was great; he released a free Unix-like kernel that could be used to build up a Unix-like operating system, reusing the metaphors that had grown up in that world. If someone released a similar Lispy kernel, around which a Lispy culture of computing could grow, what would it look like? What would be the organizing metaphor of a new free Lisp-machine OS?

The old Lisp machines didn't have a single clear organizing metaphor in the way that Plan 9 or the Mac had. They had Lisp, of course; a machine whose system programming language is Lisp is quite wonderful, in that everything about the system is manifest and accessible right down to the hardware. You can easily inspect and modify everything, and with much greater safety than in, say, a Unix system. As Rainer Joswig likes to say, everything in the system is alive; programming is not a matter of manufacturing dead, static things, but of live interaction with running code.

Still, there is not the central organizing vision that everything is a file, or that everything is a cartoon object that you can grab and manipulate with your hand. If there was such a concept for a Lisp OS, what wold it be?

Gradually, I came to think that is was something like a closure. I started out thinking about lists, of course, because that's the obvious place to start when asking this kind of question about Lisp. Lists are not quite the right thing, though. They're important, but a couple of things keep them from being the right starting point. A list is a tree, of course, and Plan 9 shows how powerful a tree is in organizing resources; it makes every file a node in a galactic network that, from any specific user's point a view, is a tree rooted at that user. Since everything computable in Plan 9 is a file, Plan 9 organizes the entire computational universe into a single vast tree with many virtual roots. That's pretty cool, but it's only part of the picture.

What about lambda? Lambda is also obviously very important. From the point of view of Lisp, lambda is computation. It's why computers do things;why they are more than just collections of inert data. I wanted a way to think about computing that combined lambda with lists, and I wanted one more thing.

Lisp machines were promiscuous. They had very little notion of security. They were single-user workstations, and the user was a programmer who wanted convenient access to everything the machine could do. Security was just another word for obstacles to productivity.

Nowadays it would be irresponsible to design an operating system that couldn't protect a user's resources from abuse. Networks are ubiquitous, and so is misuse of them. Any operating system from now on that is designed for general use has to take into account that its users will want protection against burglars and vandals.

It was this thought, combined with my general affection for the Common Lisp Object System, that led me to coin the name Closos for an imaginary Lisp OS built on a single organizing principle. In French, the world clos means a private, walled vineyard. In Catalan it means a walled garden. I imagined creating an operating system built on CLOS in the same sense that Dylan was built on CLOS (though Dylan was not Common Lisp). CLOS plus OS, became Closos.

So what's the organizing principle? I think it's the closure, more or less. A lambda wrapped in an environment. The lambda means the closure is active; it can do things. The lexical environment it's embedded in gives it memory—references to resources. If you imagine that various closures can be applied concurrently, and that they can pass data to one another, then the computing environment starts to look like the Actor model. The Actor model has a long and distinguished history in distributed and concurrent computing, and Erlang has famously demonstrated how far it can take us in productivity and reliability. We might easily forget that a primary motivation for the design of the Scheme language was to support experimentation in Actor programming.

It's important to repeat that Closos is pretty much completely imaginary. From time to time I pick up a little piece of the idea and work it out in some code that I then squirrel away somewhere to think about and maybe come back to again later. I have the vague notion that I might get something really written some time in the future, but lately I've noticed that my age keeps advancing relentlessly, and the number of items on my To-Do list that are in front of Closos never seems to diminish. So perhaps it's destined to remain a fond daydream, just right for woolgathering on a rainy day.


5 comments

Bard intricacies

The post about Bard generated a lot of interest. Thanks for all the supportive and helpful remarks, folks!

I got some questions about availability. As I mentioned in the original post, neither of the main implementations is ready for distribution yet. The high-level VM is still in an early stage, as I experiment with control structures and how to model state. I know how to use Haskell's nice set of monadic libraries to do that, but I'm actually interested in trying something a little different—basically, just treating the VM registers as function parameters, so that machine state is entirely captured by the visible parameters. Generally, that's exactly what monadic models of state try to avoid, but I like the notion of having all the registers manifest in the arguments of the VM main function, so I'm experimenting with that.

The Common Lisp implementation is much farther along. It has a reader capable of handling all of the Bard surface syntax (see http://bywicket.com/groups/bard/wiki/8a872/Lexical_syntax.html) except for Pairs and module-qualified symbols. It has implementations of all the base representations. I used the very nice FSet library (http://common-lisp.net/project/fset/) to implement Sequences and Maps in a way that has reasonable amortized complexity.

I need to update an earlier version of the compiler to work with the current reader and data structures, and then it should be possible to write and run Bard programs at the REPL.

There are some tricky issues with Protocols and Categories, though. I'm pretty happy with the basic idea—even excited about it—but there are some ramifications that are tricky to work out in detail. Basically, as designed, Categories are the only kind of abstract type in Bard, and Categories are essentially logic variables that get unified with concrete types at method-definition time.

This scheme leads to a couple of wrinkles that I need to iron out. For example, if we are allowed to define methods on Categories, rather than just on representations, what should I do in the case that a Category given as the type of a function parameter cannot be unified with a concrete representation at definition time? Is that an error? Or do I allow it, assuming that the type will eventually get reified, at the cost of introducing runtime errors? Allowing it is handier for incremental program construction, but troublesome for reliability. Perhaps I should parameterize it (with some module variable like *unbound-categories-are-errors*)? I could always forbid Categories as argument specializers, which would make that problem go away entirely, but it would also make it less convenient to write polymorphic functions. Every define-method would always have to specialize its arguments on concrete types. If I go with the other approach, the requirement is only that every parameter must be unifiable to *some* concrete type.

A point raised by David Moon, that I had already been thinking about, is what to do about protocols with undefined functions. Presumably, when you define a protocol, you mean for all of the functions to be defined in order to support it for a given set of types. On the other hand, experience has shown us that programmers will often want to implement a subset of a declared protocol, just enough to get some functionality that they need, and will chafe at having to implement a lot of functions they don't want. If I design protocols to forbid you to partially implement a protocol, then I create a stumbling block for programmers, who are likely to kludge up workarounds. On the other hand, if I allow partially-implemented protocols, I don't know yet quite what that does to the type system. Does it still work? Can it decide whether a given value matches a defined method? I don't know yet. I don't understand my own design clearly enough yet.

David Moon also suggested that Bard really wants to have declared return types, because it helps so much with type inference. I agree, and my lame excuse for the lack of them at this point is that I haven't invented a syntax for them that doesn't make me wince (the Bard wiki currently documents the syntax-of-least-wincing so far). Never mind that I don't really know how to implement type inference properly yet.

So, as you can see, there are still some moderately hairy issues to resolve in the language. Nevertheless, a usable release (with some features either missing or implemented foolishly) is probably not too far in the future. Because I chose to build the initial implementation on Clozure Common Lisp, it will probably even be possible to write nontrivial programs with it, as long as I leave a way for you to make "foreign" calls to the underlying Lisp.

I continue to welcome comments and suggestions. If you know how to solve one of the problems I've mentioned (or a pointer to literature that addresses it), I'd be glad to hear from you. Or, if you know of an existing proof that some part of the design won't work, I'd be glad of that, too; it would save me wasted effort.


0 comments

Bard

I've made some substantial changes to the design of Categories for the 0.3 release. The changes are rather sweeping and incompatible; luckily, the community of users other than me is small, and I've cleared the changes with him.

Domains are out. So is the c3 domain and inheritance, for the moment; I'm willing to admit that there may still be a place for them, but I want to try an iteration without them, and see what happens.

In 0.3, protocols become first-class objects, and categories take on the central role in the object system that its name suggests. So far, the results are very gratifying. It seems like I may have hit upon the right organization of concepts, one that clearly reifies the ideas I expressed so many years ago in the "Protocols" article.

But this post actually isn't about Categories. It's about the programming language that Categories is a part of.

I wrote the first implementation of Categories to address things I wanted and couldn't get in Clojure. However, I've been working on my own design for a Lisp dialect since around 2002, a dialect that I've been calling "Bard". As Categories took shape, I realized that it was the last missing piece of the Bard design, and that, after several years of tentative implementations, I finally had all the pieces I needed for the language.

The latest version of Categories seems to seal the deal, and now I find that I'm ready to talk about Bard publicly.

I didn't set out to design a new language, originally. I just wanted a modernized implementation of the old Dylan language, from about 1992, before it lost its Lisp syntax. I had the incredibly good luck to spend 1992 through part of 1994 working on an operating system at Apple that was written in Dylan, and I've never had so much fun with a programming language before or since. I missed it; I wanted it back. I couldn't get it anymore, so I sort of decided to make one of my own in my spare time.

Along the way, a small handful of people have followed along and provided ideas and encouragement. I should particularly mention Joe Marshall, Michael Maynard, and most helpful of all, Philip McBride, all of whom gave me comments that either removed obstacles to my thinking, or provided encouragement when I needed it.

Bard owes a lot to Dylan, but it isn't Dylan. It isn't even old, circa-1992 Dylan anymore. I've been strongly influenced in its design by ideas from several other languages, especially Python, Clojure, Haskell, and ML.

Bard is definitely a Lisp—very much so. At the risk of drawing flames, I'd say that in some ways it's more a Lisp than even Scheme is. My reasoning there is that, technically, the source code of a Scheme program is text, and I think that's a misstep. Historically, and in most other Lisp dialiects, source code is explicitly or implicitly not text; it is s-expressions. Text is just a print representation of the s-expressions. Bard's design consciously adopts that convention, because I think it's an important feature.

Like Clojure, and like Dylan before it, Bard makes no attempt to be compatible with any previous Lisp. I'll a lot of grief for that, I know. There is a cadre of programming moralists who will wail and gnash their teeth that the world doesn't need Yet Another Object System, Yet Another Lisp, or Yet Another Language. All I can say in reply is, "So what? I couldn't care less what The World needs. Designing and implementing Bard is entirely about what I need." If you aren't interested in another Lisp, you are quite free to move right aong. There's nothing to C here.

Bard bears a resemblance to Clojure in some respects, although nearly every part of its design predates Clojure's. I did lift the architectural idea of distinguishing value from identity straight from Rich Hickey, and I'm grateful to him for it. It neatly solves some nagging problems I'd been having with how best to integrate features based on Common Lisp SERIES.

Whew! After all that, I suppose I should actually provide a little bit of description of the language itself.

Here are some examples of primitive values in Bard, using their standard lexical syntax:

  void     ; the "nothing" value
  10       ; an integer
  12.34    ; a float
  2/3      ; a ratio
  \space   ; a character
  \A       ; another character
  baz      ; a symbol.  Bard is case-sensitive; baz and Baz are
           ; different symbols.
  :Foo     ; a keyword.
  Bar:     ; also a keyword. :bar and bar: are different ways 
           ; to write the same keyword
  true     ; a Boolean
  false    ; the other Boolean
  ()       ; the empty sequence
  []       ; another way to write the empty sequence
  (sequence 0 1 2 3 4 5)
           ; an expression that returns the sequence 0,1,2,3,4,5
  [0 1 2 3 4 5]
           ; another way to return the sequence 0,1,2,3,4,5. 
           ; In Bard, as in other Lisps, "(foo bar baz)" applies 
           ; the function foo to the arguments bar and baz. 
           ; Written this way: "[foo bar baz]", the expression 
           ; denotes the same sequence, but also tells Bard to 
           ; treat it as a literal sequence, and not as a function 
           ; call. It's syntactic sugar for the "(sequence...)" 
           ; expression.
  ""       ; a text object
  "Hello!" ; another text object
  (0,1)    ; a pair
  (,)      ; the empty pair
  {}       ; the empty map
  { greeting: "Hello!",
    length: 5}
           ; a map containing two entries
Bard's object system is based on Categories. Protocols, (generic) functions, methods, representations, and categories are all central, first-class concepts in Bard. Types in general are represented by categories, which are purely abstract and defined by protocols. You never explicitly define a category; you define protocols, and they implicitly define categories.

This design falls out naturally from separating representation, protocol, and taxonomy, as I've discussed in previous posts about Categories. Let me define the terms a little more explicitly, and then show some more code.

Bard (and Categories) defines values in terms of representations, protocols, and categories.

A representation is a layout of data in memory.

A protocol is a definition of the operations that apply to some set of values.

A category is an abstraction over a set of values that play a role in a protocol.

For example, a 64-bit, unsigned, twos-complement integer is a representation. Here's a simple protocol:

(protocol Addable (add <Number> <Number>))

This expression defines a protocol named "Addable", a function named "add", and a category named "<Number>". It's a very simple protocol, with just one function; in most cases, protocols will define many functions and categories.

The protocol, function, and category are all first-class objects. They are also abstract; we don't yet have any thing we can call, nor any concrete values we can use. Well, we have the protocol, function, and category, but we can't add together any numbers with those, yet.

So what exactly is a <Number>? We don't know yet. We don't have any examples of a <Number>. We'll have to make one:

(define-method add ((x <integer>)(y <integer>))
   (+ x y))
Okay, now we have an example of an actual method for add that performs a concrete operation on two values whose representations are <integer>. Now we know that <integer> is an example of a <Number>, and that (+ x y) is an example of the function add.

After the above expression is evaluated, there is a method object associated with the function add, and if we ask Bard whether <integer> is a <Number>, it will say that it is:

  ? (isa? <integer> <Number>)
  true
Suppose instead that we ask this question:

  ? (isa? <float> <Number>)
  false
See, we haven't provided any examples that show <float> to be an example of <Number>, so Bard says it isn't one. Actually, I'm lying. In current running Bard systems, the answer would be true, because the standard library does, in fact, provide such examples.

This is what I meant when I said you don't explicitly define categories. Categories are defined by protocols. That was the central idea of the "Protocols" idea: that taxonomy is defined by protocol. The type of a particular representation is defined by the set of operations you can perform on it.

By the way, the functions and categories defined in one protocol can be freely reused in other protocols, as long as you don't try to define anything that contradicts already-existing protocols.

Defining a protocol creates a name for a class of operations, and for any abstract categories of values that they may operate on. You can ask Bard for a list of all the defined protocols, and you can ask each protocol for a list of all the categories it defines. At any given moment you can find out which representations belong to which categories.

You can extend any protocol by defining new methods for its functions. Bard provides functions for defining new representations, and you can add those new representations to any categories you need by defining methods.

That brings us back to the values I listed above. Let's look at the types of those values again. The convention in Bard is that representations are written like this: "<unsigned-32-bit-integer>", and categories are written like this: "<Number>"

  void     ; <void>
  10       ; <Integer>
  12.34    ; <Float>
  2/3      ; <Ratio>
  \space   ; <Character>
  \A       ; <Character>
  baz      ; <Symbol>
  :Foo     ; <Keyword>
  Bar:     ; <Keyword>
  true     ; <true>
  false    ; <false>
  ()       ; <empty-sequence>
  []       ; <empty-sequence>
  [0 1 2 3 4 5]
           ; <Sequence>
  ""       ; <empty-text>
  "Hello!" ; <Text>
  (0,1)    ; <Pair>
  (,)      ; <empty-pair>
  {}       ; <empty-map>
  { greeting: "Hello!",
    length: 5}
           ; <Map>

Notice that most of the types given are categories. Wherever it makes sense, Bard treats the types of values as categories, meaning that it doesn't promise to give you any particular representation (unless you specifically ask for one). That leaves the compiler free to use whatever representation it thinks will work best in a given context, while also providing ways for you to say that you want a specific representation when you know better than the compiler. In most cases, it doesn't matter which representation the compiler picks for a value, as long as it obeys the right protocol—and, because of the design of Categories, the value picked always obeys the right protocol.

Notable is the fact that

  (foo bar baz)
does not necessarily denote a traditional Lisp list. It might; a chain of cons cells is a perfectly valid thing for this expression to mean, as long as there are methods defined that make that representation a valid instance of <Sequence>. The above expression denotes a <Sequence>; in the absence of further explicit directions, the Bard compiler is free to represent the <Sequence> any way it chooses; semantically, it makes no difference. The point of the above list of types is that Bard treats most values and types this way. <Text>, for example, is Bard's equivalent of the string type in most languages, but <Text> is a category, not a representation. Using a <Text> value doesn't tie you down to any particular representation of text; you can use ASCII or Unicode, or Martian Hyper-Quantum Ropes, it doesn't matter, as long as the protocols defining <Text> are implemented for whatever representation you decide to use.

Like Clojure, ML, and Haskell, Bard primarily uses immutable data structures. Like Clojure and ML, it provides mutable containers for values, for those cases where you really need to assign a value to something. Bard calls these value containers cells, and they are instances of <Cell>.

By default, if you ask for a <Cell>, you'll get a <Versioned-Cell>. Both <Cell> and <Versioned-Cell> allow you to replace a value stored in the cell. <Versioned-Cell> remembers what the previous value was. You can ask a <Versioned-Cell> how many values it has held, and you can ask it for a particular one. In effect, a <Versioned-Cell> acts like a memoized (possibly infinite) stream of values, and also like an integer-indexed sequence.

Naturally, using <Versioned-Cell> involves some overhead; if you know for sure that you won't ever want to know a cell's previous values, you can explicitly ask for a non-versioned cell, and then the only overhead you'll pay for using it is Bard's thread-safety protections—and not even those, if your program is single-threaded.

The majority of Bard's data structures, which are immutable, are intended to be implemented using representations with good amortized algorithmic complexity. I have several partial implementations running. The Common Lisp one uses the FSet library for its aggregate data structures. The Haskell-based virtual machine uses nice data structures provided by several Haskell libraries. Clojure would make a reasonable substrate for a JVM-based implementation. You could reasonably expect to implement the language on Erlang, or, with the right data-structure libraries, on Gambit-C.

Bard uses a module system based on Dylan's. A module is a first-class object that contains a set of defined <Symbol> objects. When you refer to a symbol in Bard, you are referring to it in the context of a module. The syntax for module-qualified symbols looks like this:

  bard.lang.<Number>

That expression reads as "a reference to the <Symbol> <Number> in the module bard.lang". There is a single global namespace of names for modules, so that a module name is always unambiguous. A module is not a synonym for a source file; as in Dylan, I presume that someone might like an implementation that doesn't use discrete source files. I also presume, though, that most people will use plain old source files, and that at some point I'll define some straightforward way to map modules onto the files that define them.

Like Dylan modules, Bard modules manage only symbols, and by default, Bard makes symbols private. If you want other modules to be able to use a symbol, you export it. You can also import a symbol:

  (in-module user)
  (import (bard.lang.<Number>))
After the above expressions, a reference to <Number> in the user module means the same thing as bard.lang.<Number>. This only works, of course, if <Number> was previously exported from bard.lang.

As in Dylan, modules and files are orthogonal: all the code after an in-module expression uses the named module, until a subsequent in-module expression changes it.

There are a lot more mundane details, of course. Bard supports concurrency primitives generally similar to those in Erlang and Clojure; the exact set that will be available at first release is still under discussion. Most of the defining constructs of Lisp—including macros—are available in Bard. A reasonably complete library of functions for the above-described data structures will be available.

I said I have several implementations in the works. The most important ones are probably the one written in Clozure Common Lisp and the one written in Haskell. The Categories work in Gambit-C Scheme and in Ikarus Scheme could potentially evolve into Bard implementations, as well. I sort of hope they do; for one thing, I'd like to use Bard for some product work I'm currently doing in Gambit.

The CCL version is important because it's the farthest along. It's possible to write simple programs in it now, though it's not close to being ready for release. The roadmap for it is very clear, though; it's a simple matter of programming. It's worth keeping in mind, though, that Bard is a hobby project, and it takes lower priority than my paid work and the product work that I fund. So, although the amount of work to get it ready for a release is not really large at this point, it may take a while, anyway.

Other important features of the CCL implementation are that we may reasonably expect it to run on Mac OS X, Linux, and Windows once it's ready for release; and we may also reasonably expect it to have some sort of foreign-function interface. It will also be compiled to native code (though not necessarily very good native code in early releases).

It already supports interactive, incremental compilation, and can save and load images. These nice features are no credit to me; I'm just piggybacking on what CCL can do. They're important, though; image-based and incremental development are crucial features in my mind. I'd like Bard to eventually have an interactive environment that is reminiscent of SK8 and Smalltalk.

The interesting thing about the other main implementation is that it involves writing a rather quirky and maybe-interesting virtual machine in Haskell. I freely admit that I don't know how well it will work, but I'm keen to find out. There are certain aspects of Bard that are reminiscent of Haskell, and I'm basically curious to see what happens when I try making a VM for it in that langauge. I'm also using that implementation as an excuse to try writing a very high-level VM, one whose design is very close to the source language, instead of very close to the hardware beneath it. I've written several low-level ones before, using tricks like exploiting GCC's computed labels for speed in dispatch, and so forth; I wanted to try something really different and see what happened.

I'm going to talk to the tclisper's group on February 22nd about designing and implementing Lisp (by telepresence; I'm not physically going to be there). I imagine that talk will involve some more stuff about Bard. Who knows? I might even be able to run some example code for them.

If you couldn't care less about Bard, but are very interested in Categories, don't despair. Work on Categories 0.3 is progressing as well. In fact, they are in part the same thing: Categories is central to Bard's object model.

Really, though, work on Categories is actually work on Bard. Bard's the real goal.


7 comments

gregarious introvert

In an exchange with Zach Beane I made the observation that I work alone, but am way more productive if I can talk to someone about my work. That was my motivation when I began blogging. It seems to have worked well.

I was discussing with a couple of collaborators the development process for the XG games we're working on, and one of them, probably Philip or Keith, mentioned something about needing to arrange things so we don't get distracted from our goals. I replied that I don't get distracted; I just add To-Dos. I have a lot of To-Dos. A lot of them.

No, I mean a lot.

When I think about it, it seems a weird phenomenon. Not like I'm running low on weirdness, generally, but bear with me. I have a hard time working productively when other people are physically around me. They are incredibly distracting, even if they aren't talking to me or paying attention to me. Other phenomena seem related to this one: when driving a car, I cannot simultaneously carry on a conversation and navigate. My family thinks it's hilarious (I have to admit, I think they're right). I can drive just fine. I just can't deliberately go anywhere in particular. Start a conversation with me behind the wheel, and there's no telling where we'll end up. My daughter Greer gets entirely too much entertainment out of the "let's chat with mikel and see where we end up" game.

A similar thing happens in MMORPGs with voice chat. I understand the appeal of voice chat for most people. It's just way faster to simply talk to the people you're playing with, rather than having to try to type when something fast and dramatic is happening to you. That is, it's faster if you aren't me. It's like the part of my brain that parses other people's speech is also the part of my brain that organizes goal-oriented behavior, and it can only do one or the other at a time. If people are talking to me in-game, I involuntarily listen to what they're saying, and completely forget about whatever it is that I'm trying to accomplish with my character until the talking stops.

Oddly, this does not happen when I'm reading text.

So anyway, I'm comfortable working alone, and will keep plugging away at the things I believe in, even if nobody else pays any attention to them. But, on the other hand, I get a big chunk of my motivation for work by talking to other people about it. My brain is a never-ending stream of "hey! you know what we could do...?" but not all the ideas are of equal quality, of course. Some are total rubbish. Even the good ones are generally half-formed, and need to be banged around in conversation before their value really becomes apparent. So I like talking to people about them.

Besides that, I like to see people's reactions. If they get excited about something, I get excited about it, too. That doesn't just go for my work; it goes for everything. I'd rather see a movie with someone. My mother and my daughter are both great movie companions, because they'll talk with me about the movie afterward. I love that. The beauty of nature is wonderful when I'm alone, and even better when someone is standing beside me going, "ooohhhh....aaaahhhh".

The same goes for my work. I'll keep writing code and stories and songs, drawing maps and illustrations and conclusions, playing games and pretend and guitar, even if nobody pays any attention. But I'll do more of it longer and with much greater energy if someone is sharing it.

I'm an oddly gregarious introvert.

0 comments

powerhouse

Do you know the song "Powerhouse"? It's great. I must have first heard it watching Warner Brothers' cartoons as a child.

YouTube - Raymond Scott Quintette - Powerhouse - Hit Parade

There are a zillions fantastic versions of it. Check out this one:

YouTube - The Philharmonicas : Powerhouse

I think my favorite performance of it was Don Byron's.

YouTube - Fast Portrait of a Filmmaker

(The video is irrelevant, though sort of amusing, but the snippet of Don Byron recording is great).

Anyhow, I always think of "Powerhouse" when I've got a lot of activity going on. It's the perfect "holy ned, I'm crazy busy" music. And I am crazy busy lately.

I'm working a new contract that is super interesting. I'm sworn to secrecy, but it's a great project.I'm having a ton of fun and learning a lot about a bunch of things I need to know about. A lot of it is stuff that is directly relevant to our work on XG games, although it's in a market that is pretty much completely different in every possible way. I get to collaborate with a bunch of really smart people solving a lot of the same problems the XG group needs to solve, and I get to do some stuff that they need, too. Fun!

Speaking of XG, that's coming right along. We're past the part where everything we do is boring infrastructure, and have begun to work on things that are actually pieces of a game. THe project is called ToyBox, and it's not an MMORPG--not yet. It'll be a web destination for people who want to get together groups to play PnP roleplaying games without having to be physically in the same place. Maps, characters, and art from the old XG PnP games will be part of it. I think it's going to be a lot of fun.

The TCLisper's group invited me to talk about something Lispy later this month, and they said I could do it without traveling. Whew! So I thought up a Laff Riot f Lispy things to talk about and asked them to pick one. They didn't exqctly. They sort of picked about half of my proposed topics, but I think I have a handle on how to talk about them all and make it fun.

Meanwhile, my heroic attempt to have fun in WoW again has failed miserably, but do not despair! Shadowclan has come up with another great RP event for Lord of the Rings Online, and one small part of it is the revival of the Frosthammer Clan! Shadowclan's Gobblemoss is at the helm, and based on past experience, that means it's going to be wonderful.

So I'm really really tired now, but it's a happy tired

Cue Don Byron.

0 comments