Interesting People mailing list archives
Electronic Chinese Translator
From: David Farber <farber () central cis upenn edu>
Date: Wed, 28 Jul 93 14:14:12 PDT
Subject: Computer translates Chinese into everyday English
Date: 28 Jul 93 18:17:07 GMT

GAINESVILLE, Fla. (UPI) -- An engineering professor has patented a computer system that automatically translates Chinese into idiomatic English, a feat once considered almost impossible, the University of Florida said Wednesday.

The difficulty arises because of the vast differences in the Chinese writing system and the English alphabet, as well as the large cultural gap between the two societies, said Julius Tou, graduate research professor emeritus of electrical engineering and computer sciences.

With more than 50,000 characters and many irregular grammar rules, Chinese is one of the most complex languages in the world. The Chinese writing system has no alphabet because it was developed before the invention of alphabets. Words are not represented by combinations of letters, but by combinations of characters, Tou said.

``In Chinese, words can be created with one character, two characters or whatever,'' he said. ``One word can be spelled with one character or five characters, depending on the context. There are no fixed rules for this.''

In addition, Chinese sentences are strung together with no spaces between separate words. For example, a Chinese sentence would look like this ``hereaddearabby'' which could be broken up to read ``here add ear abby'' while the correct reading is ``he read Dear Abby.''

``If you teach people to learn Chinese, they need to learn to find word sequences in the character strings,'' Tou said. ``Chinese people, because they grew up with the language, see the character string and automatically change it to word sequences. To them, it is intuitive, but machines have to use computer logic and artificial intelligence.''

Over the last decade, Tou has developed software and hardware capable of reorganizing character strings into word sequences and bridging the cultural gap between Chinese and English.
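
The ``hereaddearabby'' example can be made concrete with a small sketch. This is not Tou's patented method, which the article does not describe; it is a generic dictionary-based segmenter, and the toy lexicon and the function names greedy_segment and all_segmentations are assumptions chosen for illustration. It shows that a greedy longest-match pass produces the wrong reading quoted above, and that the correct reading exists only as one of several candidates a later step would have to choose between.

```python
# Toy lexicon (an assumption for illustration, lowercased for simplicity).
LEXICON = {"he", "here", "read", "add", "dear", "ear", "abby"}


def greedy_segment(text, lexicon):
    """Greedy longest-match: at each position take the longest known word."""
    max_len = max(len(word) for word in lexicon)
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # Fall back to a single character if nothing in the lexicon fits.
            if candidate in lexicon or size == 1:
                words.append(candidate)
                i += size
                break
    return words


def all_segmentations(text, lexicon):
    """Enumerate every way to split text into words from the lexicon."""
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in lexicon:
            for tail in all_segmentations(text[end:], lexicon):
                results.append([head] + tail)
    return results


if __name__ == "__main__":
    print(" ".join(greedy_segment("hereaddearabby", LEXICON)))
    for candidate in all_segmentations("hereaddearabby", LEXICON):
        print(" ".join(candidate))
```

With this lexicon the greedy pass yields ``here add ear abby'', while the exhaustive search finds both that split and ``he read dear abby''. Picking the right candidate needs context beyond the dictionary, which is the part the article attributes to ``computer logic and artificial intelligence.''
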
Since structure and sentence composition of a language reflect the cultural background and thought processes of the people who speak it, understanding those aspects is essential in machine translation.

For instance, translating only the characters in a Chinese sentence could lead to this baffling English version: ``Sun school chief blow cow target basis lead very huge big,'' instead of the correct, idiomatic version: ``President Sun's ability to brag is very great.''

To solve the problem created by cultural differences, Tou developed the concepts of linguistic canonical forms and information patterns for structural translation. A linguistic canonical form can best be described as the source language expression (Chinese) using the target language (English) thought processes and sentence structure.

``This computer converts Chinese to a linguistic canonical form and from there translates it to English, completing both tasks almost instantaneously,'' Tou said.

He designed an input system that lets users type on a standard computer keyboard to create Chinese characters on the screen for the translation.

His work has been awarded a U.S. patent and is being produced by CITAC Computers Inc., which has sold some units to companies in Taiwan. Full-scale production should begin in one or two years, once the last few bugs have been ironed out, Tou said.

``It's the difficulty involved in learning Chinese that makes the translation machine necessary,'' Tou said. ``We tested its translation capacity against an American who lived in China for four years and the machine beat him, in terms of accuracy as well as speed.''
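
The gap between the character-by-character output (``Sun school chief blow cow ...'') and the idiomatic sentence can be sketched in code. The article does not say how linguistic canonical forms are built, so the following is only a hypothetical illustration of the first half of the problem: grouping per-character glosses into word-level meanings. The PHRASES table, including the reading of ``target'' as a possessive marker, and the function name group_glosses are assumptions, not part of Tou's system.

```python
# Hypothetical table mapping runs of per-character glosses to word-level
# meanings. The groupings are assumptions made for this illustration.
PHRASES = {
    ("school", "chief"): "president",
    ("blow", "cow"): "brag",
    ("target",): "'s",
    ("basis", "lead"): "ability",
    ("huge", "big"): "great",
}


def group_glosses(glosses, phrases):
    """Replace the longest matching run of glosses with its word meaning."""
    max_len = max(len(key) for key in phrases)
    out, i = [], 0
    while i < len(glosses):
        for size in range(min(max_len, len(glosses) - i), 0, -1):
            key = tuple(glosses[i:i + size])
            if key in phrases:
                out.append(phrases[key])
                i += size
                break
        else:
            # No phrase starts here; keep the single gloss unchanged.
            out.append(glosses[i])
            i += 1
    return out


if __name__ == "__main__":
    literal = "Sun school chief blow cow target basis lead very huge big"
    print(group_glosses(literal.split(), PHRASES))
```

Even with the words grouped, the result still follows Chinese word order. Rearranging it into ``President Sun's ability to brag is very great'' is the structural step the article assigns to the canonical form, and this sketch does not attempt it.
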