
to category
op (se constituent1 constituent2 constituent3)
end
When you know that --> in Chomsky's notation can be read as is composed of or expands into, and + means no more than concatenate, it is clear that a rule such as
2. VP --> Verb + NP
can be translated directly into a procedure of type A as:
to vp
op (se verb np)
end
The other type of procedure we used - call it type B - had the general form
to ultimate.constituent
op
pickrandom [ultimate.constituent1 ultimate.constituent2 ultimate.constituent3 . . .]
end
This is directly paralleled in Chomsky's grammar by a rule such as
11. M --> will, can, may, shall must
where commas are used to separate constituents within a list of alternatives.
In Logo we would say:
to
m
op pickrandom [will can may shall must]
end
One novelty, form our point of view, is that Chomsky identifies the constituents of sentence structure by names which appear to refer to their internal form rather than to their function. He talks about noun phrases rather than about subjects and objects like Nesfield or like number or type or someone in our Lonely Hearts environment. But this decision, whatever its merits, is clearly not one which affects the possibility of representing the rules within the framework we have used for Lonely Hearts since all that is provided for by this framework is the expression of constituency relations. It makes no difference to the form of the procedures whether we consider a sentence to be composed of a subject and a predicate or a noun phrase and a verb phrase. As far as LOGO is concerned these are just different sequences of characters used as names.
to np
op pickrandom [np.sing np.plur]
end
we will find that np.sing or np.plur will appear (literally) in the output list whether or not procedures called np.sing and np.plur have been defined. The problem is not without solution. You will eventually discover that it can be rectified by passing the output of pickrandom to the primitive run as in the line op run pickrandom [np.sing np.plur]. But we will not follow up this possibility here.
We also find ourselves confronting problems when dealing with optional elements. Chomsky identifies these freely omittable constituents by enclosing them in parentheses - e.g. the (M) in rule 10. As you know, we can, in fact, deal with such situations at a practical level. In this particular case, we can treat M as obligatory and provide an empty list as one of the possible realisations of M in the next rule, rule 11. (Cf. the use of the technique to handle the optionality of the word please in the procedure phone in Lonely Hearts.) But we are, in effect, no longer representing accurately the grammar which we are trying to implement.
Before attempting to tackle these difficulties, however, there is a potentially more serious problem which we should attempt to deal with first. Even if - with suitable modifications - it could be got to work, an implementation of Chomsky's grammar based on procedures of types A and B would be one in which the linguistic rules of the description are inextricably meshed with the mechanism for exploiting them. In other words, if we wished to change the linguistic rules we would need to define new procedures or edit our old ones.
In order to meet the requirement that the grammar is expressed as data statements rather than procedurally, let us next suppose that each rule of the grammar is handled by assigning values to variables, so that in each case the variable's name is a category and the variable's value is the list of the category's constituents. One such possible grammar is thus:
make "s [np vp]
make "np [det n]
make "vp [v np]
make "n [cat]
make "det [the]
make "v [likes]
Notice that if we represent the grammatical rules in this way, we are claiming that there is no need to impose any ordering on these rules. The Logo variables representing them just co-exist in the workspace. Of course, you will discover that the rules do get used in a particular order, but that is in consequence of what they do rather than of the order in which they are presented. It follows from this, too, that you can add new rules and delete old ones easily and independently.
Given that you are familiar with recursively defined procedures, you can imagine that one possible strategy is as follows:
This technique translates straightforwardly into Logo as a recursive procedure of a reasonably familiar pattern:
to rewrite :string
if empty? :string [op [ ]]
if thing? first :string [op rewrite se thing first :string bf :string]
op se first :string rewrite bf :string
end
As far as rewrite is concerned, its job is simply to substitute variable values for variable names. It does this, moreover, with no regard to the level of the linguistic hierarchy (i.e. the stage of derivation) which its input list represents. If, therefore, we reintroduce the possibility of choice into our grammar rules by using lists of alternatives as the values of our variables, and if we introduce pickrandom into rewrite to select from these lists of alternative structures, nothing prevents us from providing for variant expansions at any level in the hierarchical structure of the sentence.
The modified grammar and algorithm now look like this:
make "s [[np vp]]
make "np [[det n] [det adj n]]
make "vp [[v np]]
make "n [cat man elephant]
make "det [the my this]
make "v [likes hates chases fights]
to rewrite :string
if empty? :string [op [ ]]
if thing? first :string [op rewrite se pickrandom thing first :string bf :string]
op se first :string rewrite bf :string
end
Notice that, since pickrandom is always used in the selection of a category expansion, even those categories which have only one possible expansion - like vp according to the grammar above - must be represented as a list of alternatives. When pickrandom selects from [[v np]] it makes a choice as normal, even though it has no real choice.
1. Directly list the set of possible structures for Aux, so that for
Aux --> C(M) (have+en) (be + ing)
we have
make "aux [[c] [c m] [c m have en] [c m have en be ing] [c have en] . . . ].
Clearly, this could be long winded.
2. Throw in the towel and use again the technique of having empty lists as possible expansions of constituents. In this way, to represent Aux --> C(M) (have+en) (be + ing), we would have:
make "aux [[c m have be]]
making it look as though the optional constituents are obligatory.
But we would then also associate each of these 'obligatory' constituents with a list of expansions which includes the empty list:
make "m [[ ] m]
make "have [[ ] [have EN]]
make
"be [ [ ] [be ing]]
You might prefer however to try some other approach. For example, you could mark every optional constituent in an expansion in some special way - say with an initial question mark. Then, whenever an expansion is selected by pickrandom it is immediately passed as input to another procedure (you could call this opt) which returns a version of its input in which each marked item is deleted or not (at random) and if not deleted appears without the initial question mark. E.g. opt [c ?m ?have ?be] returns [c m have be] or [c have] or [c be] or [c m have] . . . Opt would clearly be a recursive procedure!
E-mail: ron.brasington@rdg.ac.uk