Choosing a high-level language

An embarrassment of riches

If we are going to use a general-purpose high-level programming language then we are faced with another problem. Although individual programming languages may be more rigid and constrained in their form than natural languages, the range of types of programming language is far more varied. Indeed, any attempt to provide a comprehensive definition of programming language is likely to end up defining the notion in such a way that it inevitably covers things that, at first glance, seem not to look like computer languages at all.

Suppose, for instance, it is argued that the defining characteristic of a programming language is that it allows for the unambiguous expression of instructions through the use of a well-defined vocabulary and syntax. Well then, the conventions used to present knitting patterns or records of games of chess or, indeed, linguistic descriptions are drawn into the net, though the language of recipes may be considered perhaps just a touch too wayward for inclusion.

Even if we restrict our attention to accepted programming languages, the variety is remarkable. Words, for one thing, are not necessarily required. There are languages available which are graphic or pictographic rather than verbal. And all this variety means that you have a serious decision to make before you learn to talk to a computer. What sort of language and which language of that sort should you choose to speak?

Some general features to look for

One important distinction to make is between languages which are extensible and those which are not. An extensible language is one that you can develop by adding new vocabulary for special purposes rather than one which restricts you to some set of predefined functions. Human language is extensible in this sense. You will no doubt recently have learned new words like phoneme and allophone, for example. The function of such new vocabulary is to allow us to talk efficiently and economically about concepts which would otherwise require repeated paraphrase and explanation. For just the same reasons it makes sense in choosing a programming language to opt for one that is extensible.
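To make the idea of adding new vocabulary concrete, here is a minimal sketch. It is written in Python purely as convenient notation, and the names is_vowel and count_vowels are invented for illustration. The point is only that once a new word has been defined, it can be used in defining further words, just like the vocabulary the language started with.

```python
# Hypothetical example: all names here are invented for illustration.

VOWELS = "aeiou"

def is_vowel(letter):
    """A new 'word': true if the letter is one of a, e, i, o, u."""
    return letter.lower() in VOWELS

def count_vowels(word):
    """A second new 'word', defined in terms of the first."""
    return sum(1 for letter in word if is_vowel(letter))

print(count_vowels("phoneme"))  # prints 3
```

Having defined count_vowels once, we can go on using it without repeating the explanation of what a vowel letter is, which is exactly the economy that new vocabulary buys us.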

There are, as it happens, two languages commonly used by people working with natural language which fit the requirement of extensibility - Lisp and Prolog. These two languages have more than extensibility in common. Unlike the majority of programming languages, which are designed for the development of monolithic programs which are simply either activated or not, Lisp and Prolog are more appropriately thought of as allowing the creation of new working environments which you can use on different occasions, maybe for quite different purposes and in quite different ways.

The two languages do, however, offer two distinct ways of talking to a computer - two different styles - that people have called imperative and declarative. In Lisp, in imperative style, you tell the computer WHAT are the relevant facts and HOW to do the task. In Prolog, in declarative style, you tell the computer WHAT are the relevant facts and WHAT are the conditions under which any rules relating facts one to another are true. (A computer running Prolog knows itself how to respond to queries which you make based on the facts.)
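The contrast between the two styles can be roughly illustrated as follows. This is only a sketch, written in Python rather than in Lisp or Prolog, and the facts and function names are invented for the purpose. The first function spells out HOW to find the answer step by step; the second comes closer to stating WHAT condition an answer must satisfy, though it only approximates the flavour of a genuinely declarative language.

```python
# Hypothetical facts: (word, category) pairs invented for illustration.
FACTS = [("cat", "noun"), ("sleep", "verb"), ("dog", "noun")]

def nouns_imperative(facts):
    """Imperative style: spell out HOW to collect the nouns, step by step."""
    result = []
    for word, category in facts:
        if category == "noun":
            result.append(word)
    return result

def nouns_declarative(facts):
    """Closer to declarative style: state WHAT condition a noun must satisfy."""
    return [word for word, category in facts if category == "noun"]

print(nouns_imperative(FACTS))   # prints ['cat', 'dog']
print(nouns_declarative(FACTS))  # prints ['cat', 'dog']
```

Both versions give the same answer; what differs is how much of the procedure the programmer has to spell out.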

There are cases to be made for either type of language when the data of primary interest is human language. Both make the programmer's life easy at times, less easy at others, though not for the same reasons. As the names for the two styles suggest, Prolog is fine when we need to state facts and define relations; Lisp is fine when we need to develop procedures for getting things done in a particular way. Of course, most of the time we need to do both. We are going to be taking the Lisp path not for partisan reasons but as much as anything else just because it is there. You should, when the chance is offered, take a walk along the other way too. Multilingualism is as useful when you need to talk to computers as it is when talking to people.

In fact - as you will have guessed - we will not stick directly to the Lisp path. Lisp is conceptually a beautifully simple language but its simple syntax can result on occasions in some rather nasty-looking program texts. Luckily it has a more friendly, approachable relation, called Logo. Logo is, incidentally, an interpreted language in almost all of its incarnations. It is, in that respect, well suited to on-going experimentation with programming ideas. We are going to learn Logo and see what we can do with it in the field of natural language.

Ron Brasington
Department of Linguistic Science
The University of Reading