The Next Wave of MT Publicity, by Alex Gross

Published by the ATA Chronicle, July, 1994

The year is 1986, and I am sitting on my floor with my hacker genius friend, whom I'll call Mike. We are discussing the imminent wave of Artificial Intelligence programs which will soon take over the world and make us both vastly rich. In this venture I will provide the practical knowledge, Mike the programming skills. I am eager to put together a medical application, and Mike has some ideas of his own. We both believe there is no limit to the power of AI to harness ideas, learning, knowledge. Mike idly picks up a Roget's Thesaurus lying on my floor and leafs through it.

"You see how easy it is, Alex," he tells me. "All we need is the French equivalent of this book, I link them together with a program, and bingo: perfect Machine Translation!" I am hesitant and attempt to express my doubts. I try to tell Mike that it isn't that simple, but he will have none of it. He is supremely sure that language is a pushover, what programmers call a "trivial task."

Mike never built his MT system, even though he did go on to write an award-winning AI application that came closer than any to passing the Turing Test (more about that test later). So there is no doubt about his programming skills, nor those of many other programmers. What remains in doubt is the capacity of these highly specialized technicians to assess the deepest problems connected with MT, AI, and NLP (Natural Language Processing) applications in general.

Publicity about MT has come in waves. The first wave was launched by Turing, Weaver, Shannon, and other computer pioneers. A later wave emanated from IBM around the time of the 1964 World's Fair. The most recent wave started in the mid-Eighties and has culminated in the various micro and mainframe systems now familiar to us. Each wave has publicized much the same arguments:

1. MT will be faster than human translators.

2. MT will be more accurate than human translators.

3. MT will be cheaper than HT (though more recently this claim has been slurred over).

4. MT will break the language barrier and open the way to true and lasting human understanding (this point has also been deemphasized of late, though early enthusiasts greatly stressed it).

Soon the next wave of MT publicity will burst upon us, and the publicity mills are already gearing up.(1) In a year or two we will be reading about the incredible breakthroughs achieved by the "CYC" project, a unique Natural Language Processing experiment using massive parallel processing to build the supposed eight to ten million links embedded in human language. CYC supposedly comprises an "EnCYClopedia" of what we have all learned about the world around us. Once again all the familiar arguments about MT are likely to resurface. Even though CYC is not an MT system in itself, any success it enjoys will certainly reach out to embrace MT and other branches of AI.

There can be no doubt that the CYC project is an important one worthy of attention by all translators. For this reason—and also because its home base is Austin, Texas—I have asked Peter Krawutschke to determine if it will be possible for a group of computer-oriented ATA members to look in on CYC while we are in Austin this October. Perhaps it could also become possible for representatives of CYC to take part in our conference program.

The arguments for and against MT seem to come and go in an almost cyclical fashion, and some translators have come to view this subject with apprehension. But we need to pay attention to what is happening in MT and AI in general. Two unassailable arguments in its favor remain: 1) no one opposes MT where it really works, and 2) MT works quite well for those tasks where it is suitable. The main questions concern which tasks these may be, whether their number may grow, and how translators will come to be integrated into the overall continuum of MT, Computer Assisted Translation, and traditional techniques.

But as my friend Mike's attitude towards language shows, there are still some larger concerns about MT, which have dogged its development from its very beginnings and remain very much with us. Underlying the basic assumptions of MT are much the same notions often vocalized as "Why don't you just type it out in Spanish?" or "Just look at it and say it in English." What MT shares with such solecisms is the notion that the differences between two languages can easily be predicted and routinized. Noam Chomsky's concepts of "deep structure" or "universal grammar" reflect the same fallacies, in this case beefed up with many layers of academic terminology. Basic to all these approaches is the half-truth that language is inherently reasonable, which must be balanced against the other half-truth, that it is not reasonable at all. It is altogether possible—as I have argued elsewhere (2)—that on an evolutionary plane language may be at least partially an outgrowth of the spray marks used by animals to claim territory, attract mates, or repel rivals.

Similarities between MT and other coequal branches of AI—"voice-writing," text retrieval, robotics, Machine Vision—also cannot be overstressed: all have fallen behind schedule for closely related reasons. Voice-writing—which was originally supposed to catch every nuance of speech automatically—has now settled for asking the speaker to confirm or correct it following every word or phrase. Text retrieval has still not fully recovered from the ambitious claims made surrounding its birth. One thing the computer does best is to match up strings of text, though this was never strictly speaking "AI." But anything less than a perfect match requires what is called a "fuzzy search," which in turn often produces vast quantities of "quasi-results" requiring highly qualified humans to determine their relevance. This means we are still a long way from truly reliable research based on a given text base. This is because searching according to "key-words" is only as accurate as the key-words which have been entered. In other words, a search through a legal data base under the heading "Teenage Abortion" will not find:

JUDGE: Did you have the baby?
REPLY: No, I decided not to.
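
For readers who would like to see this failure in miniature, here is a rough sketch in Python. It is offered only as an illustration, not as a description of any real retrieval product: the three-record "text base" and both function names are invented for the purpose, and the "fuzzy" search simply borrows the word-similarity matcher from Python's standard difflib module.

```python
import difflib
import re

# Hypothetical three-record text base, invented for illustration.
# The second record is the courtroom exchange quoted above.
RECORDS = [
    "Ruling on teenage abortion and parental consent.",
    "JUDGE: Did you have the baby? REPLY: No, I decided not to.",
    "Lease dispute over a teenage tenant's deposit.",
]


def keyword_search(records, query):
    """Return the records that contain every word of the query."""
    words = query.lower().split()
    return [r for r in records if all(w in r.lower() for w in words)]


def fuzzy_search(records, query, cutoff=0.8):
    """Return the records in which any query word has a near match."""
    hits = []
    for record in records:
        tokens = re.findall(r"[a-z']+", record.lower())
        if any(difflib.get_close_matches(w, tokens, n=1, cutoff=cutoff)
               for w in query.lower().split()):
            hits.append(record)
    return hits


if __name__ == "__main__":
    print(keyword_search(RECORDS, "Teenage Abortion"))
    print(fuzzy_search(RECORDS, "Teenage Abortion"))
```

The strict search returns only the first record. The loosened search adds the lease dispute, an irrelevant "quasi-result" that a human must weed out, and it still misses the courtroom exchange, which shares no words with the heading at all.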

Robotics, envisioned as supplying us all with unlimited household servants, did not even succeed completely in taking over the factory floor—rather, the factory floor had to be redesigned from scratch to allow robots to work. And one still hears the story of the two Japanese welding robots who during a lull have been known to start welding each other. Even Isaac Asimov, the father of Robotics, expressed his disappointment that these machines were not robots as he envisioned them. And as for Machine Vision, how many people are ready to have a computer make the next left turn for them, much less drive them off into the sunset?

Like Asimov, even AI's primary advocate, Marvin Minsky, has taken to writing science fiction to promote his ideas, which begin to sound indeed more and more like SF and less like viable proposals. And even Minsky is now hedging on the future of MT, as this account of instant Japanese-English interpreting from his The Turing Option (co-authored with Harry Harrison) illustrates:

"He touched the phone disconnect button and the voxfax machine behind him instantly sprang to life, humming lightly as it disgorged the printed record of their phone conversation. His words were in black, while Mura's were in red for instant identification. The translation system had been programmed well, and as he glanced through it he saw no more than the usual number of errors.... The staff translator would later verify the correctness of the translation the computer had made." (3)

Of course this wondrous machine had already made a real-time onscreen "rough," though some may wonder what "the usual number of errors" was and how the protagonist recognized them without himself being an expert translator. (And if he were, why would he have needed this device in the first place?)

Thus, even science fiction has partially given up on old-fashioned, red-blooded AI. The whole point of the Turing Test was to make a computer so lifelike that those communicating with it via keyboard from another room would actually believe they were talking to a human. No machine has yet fully passed this test, which has since been subjected to many doubts and objections—even Alan Turing himself never supposed such a ruse could be maintained for more than a few minutes. (4) And so the days of the completely human computer may belong to the past rather than the future, except for those who believe that all sci-fi alternate futures are equally true.

No doubt some readers will regard this fairly accurate description of the state of AI art as a sacrilege, as a certain religious element has come to creep into some discussions about the computer. Those who venture any skepticism at all—critics like Hubert Dreyfus or John Searle or Joseph Weizenbaum—have sometimes been dismissed as simple-minded "Luddites," though recent research has shown that the true Luddites were far from simpletons.(5) Some of our experts seem so wedded to their vision of themselves as the shapers of the future that they are unable to notice any shortcomings in their outlook.

Engineering has often succeeded by breaking down large problems into smaller pieces, finding the solution for each one, and then fitting the pieces back together again. If this approach appears to fail, it can only be that the pieces were not made small enough—or perhaps inadequate funding. If massive parallel processing fails to solve the problem, Neural Nets or Markov Models or DNA-etched microchips or nanotechnological brain implants are sure to take care of it. The notion that language may not be susceptible to these approaches—that its semantic-contextual patterns may still fall through the holes of their mapping methods—is a foreign one for many computer scientists.

As the coming flood of MT, AI, and NLP rhetoric breaks over our heads once again, it is perhaps important that all of us recognize it for what it is: often little more than a form of advertising. There are also some grounds for supposing that translators are not even supposed to hear any of this. The real target for all the shouting is probably a combination of MIS directors (Management Information Services), office managers, and other executives in large corporations.(6) These highly paid officers usually know next to nothing about language or translation and little enough about computers. What many of these MT vendors will be selling is not so much "Translating A Foreign Language" as "The Idea Of Translating A Foreign Language."

NOTES:

(1) CYC-O, Doug Lenat's Quixotic Quest to Create an Artificial Intelligence with Common Sense, Interview by Jeffrey Goldsmith, WIRED, April, 1994, p. 94.

(2) Language and MT: Conflicting Technologies? Sci-tech Translation Journal, October 1993.

(3) Minsky, Marvin & Harrison, Harry: The Turing Option, p. 6, Warner Books, 1993.

(4) Unlike some of his followers, Turing was quite relaxed about this test and foresaw nothing more startling than that "an average interrogator will not have more than 70 per cent chance of making the right identification after five minutes of questioning." Elsewhere he waxed whimsical and proposed that the computer should try to deceive an interrogator by pretending to be a woman, while a real woman would make the same claims from another room. Thus, the so-called Turing Test, intended to identify "true AI," is not—unlike many testing procedures in the physical and even biological sciences—constructed in a rigorous way. (Andrew Hodges: Alan Turing: the Enigma, pp. 415-17, Touchstone Books, Simon and Schuster, 1983).

(5) Whole Earth Review: Robert Rossney, The New Old Luddites, Spring, 1994, pp. 2-12; Reid, R.W: Land of Lost Content: the Luddite Revolt, 1812, Heinemann, 1986.

(6) Perhaps symptomatic of this movement, a magazine called Multilingual Computing has been in existence for over a year, though very few copies have reached translators. Slick and well-written on its own level, it is evidently directed towards a corporate audience from a small town in Idaho. For more information, contact: Multilingual Computing, 111 Cedar Street, Suite 5, Sandpoint, Idaho 83864, Tel: (208) 263-8178, Fax (208) 263-6310.

COPYRIGHT STATEMENT:
This article is Copyright © 1994 by Alexander Gross. It may be reproduced for individuals and for educational purposes only. It may not be used for any commercial (i.e., money-making) purpose without written permission from the author.
