Implementing a sentence generator

An early Chomskyan grammar

In 1957 Chomsky proposed that an account of English could be provided through a grammatical description of a set of core sentences (which he called kernel sentences) together with a set of optional rules (transformations) which provided for the derivation from the kernel of other related sentence types. (The treatment of sentences thus followed in very general terms a methodology which you may remember from learning word forms in a language like French. If you know one of the forms of a verb - say the infinitive form parler 'to speak' - and you know the appropriate set of transformations, you can derive the other forms of the verb - like parlez, parlera etc.. For Chomsky Is Babar an elephant? would be handled by a transformational rule which specified how interrogative sentence structures - roughly questions - were to be derived from declarative structures - roughly statements, like Babar is an elephant.) Of course you need to know how to construct kernel sentences, and u nlike words such as parler you don't just learn them like a parrot. But the description of the kernel which Chomsky sketched briefly in an appendix assumes no more than we have done implicitly in Lonely Hearts, or than Nesfield in his account, viz that sentences have a hierarchical structure. It reads like this:

Chomsky's grammar and the procedures of Lonely Hearts

At first glance, this (partial) grammar of Chomsky's looks quite different from a collection of Logo procedures. But recall the techniques we used in Lonely Hearts. There were essentially two types of procedure. One, call it type A, had a form which we can generalise as:

to category
op (se constituent1 constituent2 constituent3)

When you know that --> in Chomsky's notation can be read as is composed of or expands into, and + means no more than concatenate, it is clear that a rule such as

2. VP --> Verb + NP

can be translated directly into a procedure of type A as:

to vp
op (se verb np)

The other type of procedure we used - call it type B - had the general form

to ultimate.constituent
op pickrandom [ultimate.constituent1 ultimate.constituent2 ultimate.constituent3 . . .]

This is directly paralleled in Chomsky's grammar by a rule such as

11. M --> will, can, may, shall must

where commas are used to separate constituents within a list of alternatives.

In Logo we would say:

to m
op pickrandom [will can may shall must]

One novelty, form our point of view, is that Chomsky identifies the constituents of sentence structure by names which appear to refer to their internal form rather than to their function. He talks about noun phrases rather than about subjects and objects like Nesfield or like number or type or someone in our Lonely Hearts environment. But this decision, whatever its merits, is clearly not one which affects the possibility of representing the rules within the framework we have used for Lonely Hearts since all that is provided for by this framework is the expression of constituency relations. It makes no difference to the form of the procedures whether we consider a sentence to be composed of a subject and a predicate or a noun phrase and a verb phrase. As far as LOGO is concerned these are just different sequences of characters used as names.

Problems in implementing Chomsky's rules

On the face of it, then, we could make a start at representing Chomsky's kernel rules using procedures of type A and B. But there is at least one problem which we will quickly bump into. In Lonely Hearts all of the variation in the form of the sentences is at the level of the ultimate constituents (which are 'selected' by the pickrandom in type B procedures). All of our advertisements had as a result exactly the same tree structure. Differentiation was provided for only in the choice of leaves at the end of the branches. Chomsky's notation in rule 3, on the other hand, is intended to represent a choice of two different sub-trees branching from NP. Our present framework cannot cope with this. If we try to handle Chomsky's rule 3 with

to np
op pickrandom [np.sing np.plur]

we will find that np.sing or np.plur will appear (literally) in the output list whether or not procedures called np.sing and np.plur have been defined. The problem is not without solution. You will eventually discover that it can be rectified by passing the output of pickrandom to the primitive run as in the line op run pickrandom [np.sing np.plur]. But we will not follow up this possibility here.

We also find ourselves confronting problems when dealing with optional elements. Chomsky identifies these freely omittable constituents by enclosing them in parentheses - e.g. the (M) in rule 10. As you know, we can, in fact, deal with such situations at a practical level. In this particular case, we can treat M as obligatory and provide an empty list as one of the possible realisations of M in the next rule, rule 11. (Cf. the use of the technique to handle the optionality of the word please in the procedure phone in Lonely Hearts.) But we are, in effect, no longer representing accurately the grammar which we are trying to implement.

Before attempting to tackle these difficulties, however, there is a potentially more serious problem which we should attempt to deal with first. Even if - with suitable modifications - it could be got to work, an implementation of Chomsky's grammar based on procedures of types A and B would be one in which the linguistic rules of the description are inextricably meshed with the mechanism for exploiting them. In other words, if we wished to change the linguistic rules we would need to define new procedures or edit our old ones.

Modularisation - separating 'knowledge' and 'process'

From a programming point of view, it is generally considered good practice to separate the data, the 'facts', from what you want to do with them. You can then change either one without changing the other. From a linguist's point of view, the need is perhaps even more obvious, since it clear that exactly the same mechanism should be employed in interpreting the linguistic description (the 'facts') of any language or sub-language. The ideal implementation of a grammar, then, will be one in which the linguistic rules - making up the linguistic knowledge base - form a self-contained module quite independent from the algorithm which is used to process them. In these circumstances substituting one set of rules for another becomes a straightforward business. (Just as it is a straightforward business for a linguist who is competent in interpreting the conventions of Chomsky's description,to apply the same skills in working with a similarly formulated description of French.) Not only that, it also is much easier for a non-programmer linguist to interact with such a system, since all that is required is an ability to formulate linguistic rules according to some agreed conventions - there is no need for any new procedures to be written.

A first stab at implementing the linguistic rules

Let us suppose, seriously oversimplifying for a moment, that the grammar rules of English allow no variation in the form of a sentence. I.e. they generate only one sentence although they do implicitly assign it a hierarchical structure.

In order to meet the requirement that the grammar is expressed as data statements rather than procedurally, let us next suppose that each rule of the grammar is handled by assigning values to variables, so that in each case the variable's name is a category and the variable's value is the list of the category's constituents. One such possible grammar is thus:

make "s [np vp]
make "np [det n]
make "vp [v np]
make "n [cat]
make "det [the]
make "v [likes]

Notice that if we represent the grammatical rules in this way, we are claiming that there is no need to impose any ordering on these rules. The Logo variables representing them just co-exist in the workspace. Of course, you will discover that the rules do get used in a particular order, but that is in consequence of what they do rather than of the order in which they are presented. It follows from this, too, that you can add new rules and delete old ones easily and independently.

And a first stab at a processing algorithm

Now let us suppose that the algorithm we require to apply to such a set of rules is one which, when provided with a list containing one category as input, will output the list of ultimate constituents into which the category expands (according to the rules). If given [np] the algorithm should return [the cat]. If given [s] it should return [the cat likes the cat]. The mechanism in other words should recursively expand its input list - the initial input category being expanded into its immediate constituents, which in turn are expanded into their immediate constituents . . . until no further expansion is possible.

Given that you are familiar with recursively defined procedures, you can imagine that one possible strategy is as follows:

  1. The input list is processed from left to right.
  2. If the first word in the input is a variable name (i.e. is a category capable of expansion) then it is replaced by the value of the variable (by the expansion of the category)
  3. The whole new list is then reprocessed.
  4. If the first word of the input is not a variable name (and therefore must represent an ultimate constituent) then it is maintained at the head of a list the rest of which is then processed.
  5. When there are no variable names left in the list (when the end of the list is reached), the task is complete.

This technique translates straightforwardly into Logo as a recursive procedure of a reasonably familiar pattern:

to rewrite :string
if empty? :string [op [ ]]
if thing? first :string [op rewrite se thing first :string bf :string]
op se first :string rewrite bf :string

Dealing with choices in high (and low) level rules

We noted earlier that Lonely Hearts would have difficulty when choices are available between categories as well as between words - when the trees generated may vary not only in their leaves but also in their branching structure. By pure good fortune our new modular system already provides us with the means to circumvent that problem.

As far as rewrite is concerned, its job is simply to substitute variable values for variable names. It does this, moreover, with no regard to the level of the linguistic hierarchy (i.e. the stage of derivation) which its input list represents. If, therefore, we reintroduce the possibility of choice into our grammar rules by using lists of alternatives as the values of our variables, and if we introduce pickrandom into rewrite to select from these lists of alternative structures, nothing prevents us from providing for variant expansions at any level in the hierarchical structure of the sentence.

The modified grammar and algorithm now look like this:

make "s [[np vp]]
make "np [[det n] [det adj n]]
make "vp [[v np]]
make "n [cat man elephant]
make "det [the my this]
make "v [likes hates chases fights]

to rewrite :string
if empty? :string [op [ ]]
if thing? first :string [op rewrite se pickrandom thing first :string bf :string]
op se first :string rewrite bf :string

Notice that, since pickrandom is always used in the selection of a category expansion, even those categories which have only one possible expansion - like vp according to the grammar above - must be represented as a list of alternatives. When pickrandom selects from [[v np]] it makes a choice as normal, even though it has no real choice.

And what about optional constituents?

There is still a problem left with the parenthesis notation used for identifying optional constituents. Two obvious tactics are available without necessitating a change in the algorithm.

1. Directly list the set of possible structures for Aux, so that for

Aux --> C(M) (have+en) (be + ing)

we have

make "aux [[c] [c m] [c m have en] [c m have en be ing] [c have en] . . . ].

Clearly, this could be long winded.

2. Throw in the towel and use again the technique of having empty lists as possible expansions of constituents. In this way, to represent Aux --> C(M) (have+en) (be + ing), we would have:

make "aux [[c m have be]]

making it look as though the optional constituents are obligatory.

But we would then also associate each of these 'obligatory' constituents with a list of expansions which includes the empty list:

make "m [[ ] m]
make "have [[ ] [have EN]]
make "be [ [ ] [be ing]]

You might prefer however to try some other approach. For example, you could mark every optional constituent in an expansion in some special way - say with an initial question mark. Then, whenever an expansion is selected by pickrandom it is immediately passed as input to another procedure (you could call this opt) which returns a version of its input in which each marked item is deleted or not (at random) and if not deleted appears without the initial question mark. E.g. opt [c ?m ?have ?be] returns [c m have be] or [c have] or [c be] or [c m have] . . . Opt would clearly be a recursive procedure!

Ron Brasington
Department of Linguistic Science
The University of Reading