Interesting People mailing list archives

Electronic Chinese Translator


From: David Farber <farber () central cis upenn edu>
Date: Wed, 28 Jul 93 14:14:12 PDT



 Subject: Computer translates Chinese into everyday English
 Date: 28 Jul 93 18:17:07 GMT

        GAINESVILLE, Fla. (UPI) -- An engineering professor has patented a
computer system that automatically translates Chinese into idiomatic
English, a feat once considered almost impossible, the University of
Florida said Wednesday.
        The difficulty arises because of the vast differences between the
Chinese writing system and the English alphabet, as well as the large
cultural gap between the two societies, said Julius Tou, graduate research
professor emeritus of electrical engineering and computer sciences.
        With more than 50,000 characters and many irregular grammar rules,
Chinese is one of the most complex languages in the world.
        The Chinese writing system has no alphabet because it was developed
before the invention of alphabets. Words are not represented by
combinations of letters, but by combinations of characters, Tou said.
        ``In Chinese, words can be created with one character, two characters
or whatever,'' he said. ``One word can be spelled with one character or
five characters, depending on the context. There are no fixed rules for
this.''
        In addition, Chinese sentences are strung together with no spaces
between separate words. For example, a Chinese sentence would look like
this: ``hereaddearabby,'' which could be broken up to read ``here add ear
abby'' while the correct reading is ``he read Dear Abby.''
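
        The ambiguity in the ``hereaddearabby'' example can be made concrete
with a small sketch. This is not Tou's patented method, which the article
does not describe in detail; it is a plain dictionary-based enumeration of
every way to split an unspaced string into known words. The word list
WORDS and the function name segmentations are assumptions made for
illustration only.

```python
from functools import lru_cache

# Hypothetical toy dictionary, just large enough for the article's example.
WORDS = frozenset({"he", "here", "read", "add", "ear", "dear", "abby"})


def segmentations(text, words=WORDS):
    """Return every way to split `text` into a sequence of dictionary words."""

    @lru_cache(maxsize=None)
    def split_from(start):
        # Reaching the end of the string means one complete segmentation.
        if start == len(text):
            return [[]]
        results = []
        for end in range(start + 1, len(text) + 1):
            piece = text[start:end]
            if piece in words:
                # Extend each segmentation of the remainder with this word.
                for rest in split_from(end):
                    results.append([piece] + rest)
        return results

    return [" ".join(parts) for parts in split_from(0)]


if __name__ == "__main__":
    for candidate in segmentations("hereaddearabby"):
        print(candidate)
```

        With this toy dictionary the string has two valid splits, ``he read
dear abby'' and ``here add ear abby.'' A dictionary alone cannot choose
between them, so a real system would need grammar or context to rank the
candidates.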
        ``If you teach people to learn Chinese, they need to learn to find
word sequences in the character strings,'' Tou said. ``Chinese people,
because they grew up with the language, see the character string and
automatically change it to word sequences. To them, it is intuitive, but
machines have to use computer logic and artificial intelligence.''
        Over the last decade, Tou has developed software and hardware capable
of reorganizing character strings into word sequences and bridging the
cultural gap between Chinese and English.
        Since structure and sentence composition of a language reflect the
cultural background and thought processes of the people who speak it,
understanding those aspects is essential in machine translation.
        For instance, translating only the characters in a Chinese sentence
could lead to this baffling English version: ``Sun school chief blow cow
target basis lead very huge big,'' instead of the correct, idiomatic
version: ``President Sun's ability to brag is very great.''
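
        One reason the character-by-character version is baffling is that
several adjacent characters form a single word. The sketch below groups
the article's English glosses with a greedy longest-match lookup. The
phrase table PHRASES is hypothetical, inferred only from the two English
versions quoted above, and the approach is a generic illustration rather
than the method covered by Tou's patent.

```python
# Hypothetical phrase table: runs of per-character glosses that together
# form one word. Inferred from the article's example, not from Tou's system.
PHRASES = {
    ("school", "chief"): "president",
    ("blow", "cow"): "brag",
    ("basis", "lead"): "ability",
    ("huge", "big"): "great",
}
MAX_LEN = max(len(key) for key in PHRASES)


def group_glosses(glosses):
    """Greedily merge runs of character glosses into word-level glosses."""
    out = []
    i = 0
    while i < len(glosses):
        # Try the longest possible phrase first, then shorter ones.
        for n in range(MAX_LEN, 1, -1):
            key = tuple(glosses[i:i + n])
            if key in PHRASES:
                out.append(PHRASES[key])
                i += n
                break
        else:
            # No multi-character word starts here; keep the single gloss.
            out.append(glosses[i])
            i += 1
    return out


if __name__ == "__main__":
    raw = "Sun school chief blow cow target basis lead very huge big".split()
    print(" ".join(group_glosses(raw)))
```

        The grouped output, ``Sun president brag target ability very
great,'' is closer to the intended meaning but is still not English. The
particle glossed as ``target'' and the word order still have to be handled,
which is the structural step the article goes on to describe.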
        To solve the problem created by cultural differences, Tou developed
the concepts of linguistic canonical forms and information patterns for
structural translation. A linguistic canonical form can best be
described as the source language expression (Chinese) using the target
language (English) thought processes and sentence structure.
        ``This computer converts Chinese to a linguistic canonical form and
from there translates it to English, completing both tasks almost
instantaneously,'' Tou said.
        He designed an input system that lets users type on a standard
computer keyboard to create Chinese characters on the screen for the
translation.
        His work has been awarded a U.S. patent and is being produced by
CITAC Computers Inc., which has sold some units to companies in Taiwan.
Full-scale production should begin in one or two years, once the last
few bugs have been ironed out, Tou said.
        ``It's the difficulty involved in learning Chinese that makes the
translation machine necessary,'' Tou said. ``We tested its translation
capacity against an American who lived in China for four years and the
machine beat him, in terms of accuracy as well as speed.''

