The Star-Ledger May 20, 2007
An offer that won't easily translate
IBM language device a Pentagon dilemma
By Kevin Coughlin
Paul Rieckhoff served in central Baghdad as a platoon leader for the most technologically advanced army in the world. But talking with local Iraqis often amounted to "pantomime and Pictionary."
"We felt like we had been beamed down from Mars," said Rieckhoff, a former first lieutenant with the U.S. 3rd Infantry Division whose memoir, "Chasing Ghosts," was published last year. "To be able to communicate would have saved lives on both sides."
Last month, IBM offered free software and portable devices to U.S. forces in Iraq -- where interpreters are scarce -- to translate English and Arabic on the fly. The offer has posed a curious challenge for Pentagon officials, who covet the technology but worry about the propriety of accepting freebies from a contractor.
"It is unusual to have someone donate something like this," said Army Lt. Col. Brian Maka, a Pentagon spokesman. "We have to do this legally. It's not as cut and dried as saying we'll take a gift."
The government risks legal fallout if the donation is perceived as giving IBM an edge over rivals vying for military contracts, experts said.
"If the Defense Department accepts all this stuff, there may be some sense there is no longer a level playing field," said military analyst John Pike of GlobalSecurity.org, which tracks security issues.
"Acquisition rules and regulations seem cumbersome. But mostly, they're put in place because of hard-won experience in the contracting process," said Stephen Ellis of Taxpayers for Common Sense, a nonpartisan budget watchdog group in Washington, D.C. "I'm sure this is what the Pentagon is wrestling with."
The software is called MASTOR, short for Multilingual Automatic Speech-to-Speech Translator. While it cannot top Star Trek's universal translator, it seems light years beyond Pictionary.
IBM has offered 10,000 copies of MASTOR, along with 1,000 devices and training. The company said it wanted to make the donation to honor an employee's son who was severely wounded on patrol in Ramadi in February.
Developed partly with military funds, the software runs on laptop computers and Phraselators, walkie-talkie-size machines built by VoxTec International Inc. of Maryland. When the Phraselators rolled out in 2004 they sold for between $2,000 and $2,300 each. MASTOR software is not yet commercially available.
The Pentagon will spend more than $75 million this year on translation research. The need is urgent. Last December, the Iraq Study Group reported that only 33 of 1,000 U.S. embassy staffers in Baghdad spoke Arabic -- and just six were fluent.
Troops now tote more than 2,000 Phraselators. But these only utter pre-programmed Arabic phrases. Former soldiers said they often had no clue what Iraqis were saying in response.
Despite "impressive" progress on two-way translation by companies such as IBM, BBN Technologies and SRI International, more work is needed, according to Andre Van Tilborg, deputy undersecretary of defense for science and technology.
"I doubt you'd find anybody to say any of the devices are totally adequate for use by the military so far," said Van Tilborg. Some field tests of MASTOR are under way, he said.
During a recent demonstration for The Star-Ledger, researchers at IBM's Watson Research Center in Yorktown Heights, N.Y., spoke basic English phrases into a Phraselator running MASTOR.
Within seconds the device spat out translations, in a clipped Baghdad dialect of Arabic, after displaying text in both languages. It also worked the other way: Arabic in, English out. In a quiet room with calm speech and pre-defined topics, MASTOR achieves accuracy of at least 80 percent, IBM claims.
It translates English better than Arabic, researchers acknowledged. IBM had fewer examples of Arabic speech and text for computer analysis, for one thing. And Baghdadi Arabic consists of colloquial expressions and spoken variations of standard Arabic that are not written down. So IBM had to invent a text version.
MASTOR also translates standard Arabic, which has tricky noun genders. The word "black" takes a different form when it describes a car, which is grammatically feminine, than when it describes a shirt, which is masculine.
When a visitor asked "Where are the AK-47s?" in English, MASTOR smartly rendered the question in Arabic as, "Where are the Kalashnikovs?" That is how Iraqis refer to the assault rifle. An inquiry about a radiologist, however, came out as "Where is the employee?"
Glitches aside, it's hard to imagine any soldier in Iraq -- or any tourist to the Middle East -- not wanting this gear.
In fact, IBM's Yuqing Gao had business travelers in mind when she proposed MASTOR in 2000. At the time, the Beijing native envisioned translations between Mandarin Chinese and English. Arabic became the focus after the terror attacks of Sept. 11, 2001.
MASTOR's vocabulary includes about 50,000 English words and 100,000 Arabic words specific to "domains," or scenarios, such as vehicle checkpoints, explosives and medical situations.
"Speech recognition will never be perfect," Gao cautioned, citing a host of variables. It boils down to mathematical processes she compared to forecasting the stock market. "It's a guess, a statistical prediction," she said.
MASTOR makes educated guesses about speech based on analysis of sound frequencies that make up spoken words, and the frequency with which words occur in combination with other words.
"It looks at a lot of spoken and written conversations, takes word sequences, and counts how many times a word occurred after a given word," explained David Nahamoo, IBM's chief technologist for speech. "We call this N-grams. You look at the frequency of words in isolation, following another word, and being followed by another word."
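The counting Nahamoo describes can be illustrated with a short sketch. This is not IBM's code; it is a minimal two-word ("bigram") example in Python, and the function names and the three-sentence sample corpus are invented for illustration.

```python
from collections import Counter, defaultdict


def count_bigrams(sentences):
    """Count how often each word follows each preceding word."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        # Pair every word with the word that comes right after it.
        for prev, nxt in zip(words, words[1:]):
            follows[prev][nxt] += 1
    return follows


def next_word_probability(follows, prev, nxt):
    """Relative frequency of `nxt` after `prev` (0.0 if `prev` was never seen)."""
    counts = follows.get(prev, Counter())
    total = sum(counts.values())
    return counts[nxt] / total if total else 0.0


# Hypothetical sample corpus, not taken from MASTOR's training data.
corpus = [
    "where are the weapons",
    "where are the keys",
    "where is the doctor",
]
follows = count_bigrams(corpus)
p_are = next_word_probability(follows, "where", "are")
```

In this toy corpus "are" follows "where" in two of three sentences, so a recognizer unsure whether it heard "are" or a similar-sounding word would lean toward "are." A real system would count over far more text and would need a way to handle word pairs it has never seen.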
Because one person's "would you" is another's "wouldja," MASTOR also makes informed guesses about slight acoustical deviations, using formulas called hidden Markov models.
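A hidden Markov model treats the intended words as hidden states and the sounds actually heard as noisy observations. The sketch below uses the standard Viterbi algorithm to pick the most likely word sequence for a slurred "wouldja." It is a toy illustration, not MASTOR's method: the sound labels ("wud," "ja," "yoo") and every probability in the tables are made-up assumptions.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path for the observations."""
    # best[t][s] = (probability of the best path ending in s at step t,
    #               the state that path came from)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s][obs], p)
                for p in states
            )
        best.append(column)
    # Start from the likeliest final state and follow the back-pointers.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path


# All values below are invented for illustration.
states = ["would", "you"]
start_p = {"would": 0.6, "you": 0.4}
trans_p = {
    "would": {"would": 0.1, "you": 0.9},
    "you": {"would": 0.5, "you": 0.5},
}
# Chance of hearing each sound when a speaker intends each word.
emit_p = {
    "would": {"wud": 0.7, "ja": 0.1, "yoo": 0.2},
    "you": {"wud": 0.1, "ja": 0.4, "yoo": 0.5},
}

decoded = viterbi(["wud", "ja"], states, start_p, trans_p, emit_p)
```

With these numbers, both the slurred "wud ja" and the careful "wud yoo" decode to the same two words, which is the kind of tolerance for acoustic variation the models are meant to provide.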
A demo makes this sound easy, but recent advances have been a long time coming. IBM has promised technology like MASTOR since the 1950s.
Computer speed and memory no longer are the holdups. Now, said Van Tilborg, what's needed are better formulas and hands-free devices, to translate broad topics in many dialects and noisy places. "More good ideas need to come forward," he said.
MASTOR's Arabic voice is made up of fragments diced from 7,000 phrases, recorded over two weeks by IBM's Max Tahir. A native of Iraq's Kurdistan region, Tahir served as a U.S. translator in Baghdad from 2003 to 2005.
It was a lonely job.
"There were not enough interpreters," recalled Tahir, a naturalized U.S. citizen. "Hiring Iraqis was risky, they were not always available. A lot of times, soldiers needed something simple to communicate with the locals."
Former Army Staff Sgt. Chris McGurk remembers close calls at checkpoints when Iraqis did not understand U.S. commands.
"Sometimes they thought we were making fun of them," said McGurk. "To me, I felt like we were insulting to the Iraqis. We were using hand and arm signals ... like speaking to a child. And they don't deserve this."
© Copyright 2007, The Star-Ledger