I got into computational linguistics almost by accident. It may have been a flight from what were, to me, the increasingly confusing activities of GB scholars. Now that I read that even Ken Church had problems keeping up with them, I feel let off the hook. It may have been a desire to solve puzzles that were manageable. (Boy, was I in for a surprise.) It may have also been just plain more fun.

I learned LISP, immediately got funding, and became a grammar writer in the Department of Computer Science at UT. What a difference – I was rid of the annual struggle to land an Intro to Linguistics teaching job. Eventually, I wound up on the METAL machine translation project, which used the same parser in use at CS, the Earley algorithm, implemented by Jonathan Slocum. The context-free phrase structure grammar rules were enriched with linguistic features for agreement, etc. (This process of adding linguistic features to CFG rules has always reminded me of the semantic routines in compilers for programming languages.) Unlike in theoretical linguistics, everything in CL seemed local and comprehensible.
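The flavor of such feature-enriched rules can be sketched in a few lines of Python; the rule, categories, and feature names below are my own illustration, not METAL's actual notation:

```python
# Toy illustration: a CFG rule enriched with agreement features.
# The parent node is licensed only if the children's features unify.

def unify(f1, f2):
    """Merge two feature dicts; return None on a value clash."""
    merged = dict(f1)
    for key, value in f2.items():
        if key in merged and merged[key] != value:
            return None  # e.g. singular determiner with plural noun
        merged[key] = value
    return merged

# Rule: NP -> Det N, with Det and N required to agree in number
det = {"cat": "Det", "num": "sg"}   # "this"
noun = {"cat": "N", "num": "sg"}    # "book"

agreement = unify({"num": det["num"]}, {"num": noun["num"]})
np = {"cat": "NP", **agreement} if agreement is not None else None
print(np)  # {'cat': 'NP', 'num': 'sg'}
```

A clash – say, "these book" – makes `unify` return None and blocks the parse, which is exactly the work the added features did on top of the bare context-free rules.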

The parser was well implemented, but the system suffered from grave deficiencies. One drawback was the lack of a stable corps of grammar writers: METAL was a portal for graduate students on their way to bigger and better things. This entailed a great deal of sloppiness and many patches in the rules. Another, more inherently intractable problem was
combinatorial explosion: often, as a METAL grammar writer, you'd be looking at hundreds of parses for relatively short sentences, and thousands for longer ones. The problems associated with MT have been discussed plenty elsewhere; I don't have to go into them here. One cannot help but wonder what would have happened to METAL had knowledge of stochastic methods been available locally at the time.

Like many linguists, I was afraid to ask about what I wanted to know about parsing; but unlike many linguists, I took Slocum's, to me then cryptic, LISP code home and would stare at it for hours, as yet unfamiliar with the mathematical parsing literature. Within a few months, reading Winograd suggested itself as a source, and much became clear. Back then, a computational linguist made his own curriculum at most schools, although, in hindsight, quite a few sites were working very hard on some problems in ways that were to remedy some of the problems I had experienced first-hand – IBM, Xerox, BBN, etc.

On the education side, I took two Prolog classes with visiting professors Werner Frey and Dr. M. Pinkal from Germany; the final project was to see Pereira's CHAT-80 at work and analyze and describe it. Prolog is a great language, allowing for declarative grammar writing. I need to mention as an anecdote that despite Prolog's apparent simplicity for a programmer or grammar writer, its mathematical analysis involves a concept named 'Skolemization' to get around a snare involving existential quantifiers when reducing first-order formulas to clausal form. Don't ask. I remember an ex-LexisNexis, Inc. colleague-who-shall-remain-unnamed who'd get apoplectic at the mere mention of the word. Hi Dave!
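For the curious (Dave, look away): a toy Python sketch of what Skolemization does, over a home-made prefix-tuple formula representation of my own devising. An existentially quantified variable is replaced by a Skolem term, a function of the universally quantified variables in whose scope it sits, so that "forall x, exists y, loves(x, y)" becomes "forall x, loves(x, sk_y(x))":

```python
# Toy Skolemization over a home-made prefix-tuple formula representation.
# An existentially quantified variable is replaced by a Skolem term: a
# function of the universally quantified variables in whose scope it sits.

def substitute(formula, var, term):
    """Replace every occurrence of var in formula by term."""
    if formula == var:
        return term
    if isinstance(formula, tuple):
        return tuple(substitute(part, var, term) for part in formula)
    return formula

def skolemize(formula, universals=()):
    """Eliminate 'exists' quantifiers, threading the universals in scope."""
    if not isinstance(formula, tuple):
        return formula
    if formula[0] == "forall":
        _, var, body = formula
        return ("forall", var, skolemize(body, universals + (var,)))
    if formula[0] == "exists":
        _, var, body = formula
        skolem_term = ("sk_" + var,) + universals   # e.g. sk_y(x)
        return skolemize(substitute(body, var, skolem_term), universals)
    return tuple(skolemize(part, universals) for part in formula)

# forall x. exists y. loves(x, y)   ==>   forall x. loves(x, sk_y(x))
print(skolemize(("forall", "x", ("exists", "y", ("loves", "x", "y")))))
```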

CHAT-80 was one of those dissertations people talk about for a long time, among other reasons because it innovatively used a small ontology – see Sowa's discussion; I cannot get the link to work. CHAT-80 used a few properties of Prolog to create a primitive version of a structured database, then had its Horn-clause grammar have at it and parse user questions, using Prolog's extraposition facilities.
Wh-questions were back in my life, linking question words to their gaps. Prolog's little inference engine, at the same time, operated over the same Horn-clause-like rules. Thus it could answer such non-canned questions as "What countries border on Switzerland?" and "What countries does Switzerland border on?" The program used the grammar and the same tiny database to infer an answer. Logic and NLP at work, all rolled into one. I should mention that CHAT-80 did not address question answering from unstructured text, as Kevin Cohen has pointed out to me.
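The core idea can be miniaturized in Python; the facts and inference rule below are my own toy stand-ins for CHAT-80's actual Prolog database and grammar, just enough to show both phrasings of the border question landing on the same query:

```python
# Toy version of the CHAT-80 idea: a tiny structured "database" of facts
# plus one inference rule. The country pairs are illustrative only.

borders = {("switzerland", "france"), ("switzerland", "italy"),
           ("switzerland", "germany"), ("switzerland", "austria")}

def borders_on(a, b):
    # Inference rule, in the spirit of a Horn clause
    # borders(A, B) :- borders(B, A): bordering is symmetric.
    return (a, b) in borders or (b, a) in borders

def countries_bordering(country):
    """Answer both 'What countries border on X?' and
    'What countries does X border on?' with the same query."""
    return sorted({x for pair in borders for x in pair
                   if x != country and borders_on(x, country)})

print(countries_bordering("switzerland"))
```

Both surface forms of the question parse down to the same `countries_bordering` goal, which is the point: the grammar and the database share one inferential vocabulary.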

I was also privileged to take a one-on-one course on formal language theory with mathematical linguist
Dr. Robert Wall. The experience, consisting of working through McNaughton's introductory book on automata, restored my intellectual confidence, which was somewhat frayed after GB and the pressure of having to write a dissertation in it. Dr. Wall, an educator in the classic sense, correctly diagnosed me as a problem solver seeking a sense of mastery. Computational linguistics seemed to offer that.

It was also time to enter the real world, and, new courses under my belt, I accepted a job offer from Mead Data Central in Dayton, OH, to be part of the computational linguistics team. Mead Data Central, now LexisNexis, owned by Reed Elsevier, offered a Boolean document retrieval system, coupled with its own network, Meadnet, far ahead of its time, operating mainly in the legal market. Life was good, if somewhat uneventful, in the cornfields of Ohio. I learned PASCAL and C. Not learning enough from Mark Wasson and Tim Humphrey, practical Midwesterners and statistical NLP-ers avant la lettre, was a strategic career mistake, but one I corrected later on. Almost 20 years ago they worked, respectively, on statistical document categorization and neural networks, without the fame attributed to others who ported these concepts to NLP. I made lists of stopwords for French and Dutch, investigated the influence of thesauri on retrieval, and advocated using morphological variants of search terms to enhance retrieval to incredulous byteheads. All these are minor accomplishments in today's fast-paced world. But the stopword lists are accessed by thousands of users every day. Or perhaps I should say the achievement is that they are not accessed every day.
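A sketch of that morphological-variant idea, with a crude suffix rule standing in for a real morphological analyzer (the rules, the tiny stopword list, and the query terms are all illustrative, not the actual LexisNexis machinery):

```python
# Sketch of query-term expansion with morphological variants, the kind of
# language-based tuning of Boolean retrieval advocated above.

STOPWORDS = {"the", "of", "a", "an", "and", "or"}  # tiny illustrative list

def variants(term):
    """Expand a search term with plausible English inflectional variants."""
    forms = {term}
    if term.endswith("y"):
        forms.add(term[:-1] + "ies")        # company -> companies
    else:
        forms.update({term + "s", term + "es"})
    return forms

def expand_query(query):
    """Drop stopwords, then OR-expand each remaining term."""
    terms = [t for t in query.lower().split() if t not in STOPWORDS]
    return [" OR ".join(sorted(variants(t))) for t in terms]

print(expand_query("liability of the company"))
# e.g. ['liabilities OR liability', 'companies OR company']
```

The stopword lists do their work precisely by never reaching the index – hence the joke about them not being accessed.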

Adversity struck and forced me to go back to the Netherlands. I wandered in the desert for quite a while, computational-linguistics-wise. Given my somewhat cynical attitude after the METAL experience – but it was only somewhat cynical, because I saw early on what small, practical language-based tuning of retrieval systems could do – you may wonder why I am still in the field. My criticism is not flippant. It's based on a great deal of reflection, hands-on experience, and dues-paying. The answer as to why I remain interested in NLP is on the following page. There's much more electronic text now, statistics has entered the field, and I sincerely believe information retrieval and general online text processing can be aided significantly by Natural Language Processing.
