1. In statistical MT we can get P(target language) by using language models. But how do we get P(source|target)? As an example (from the slides), we have English and French:
E = argmax_e P(e) * P(f|e)
If I understand your question, we find the argmax using some kind of stack
search or dynamic programming. The actual algorithm will depend on the translation model and how complex it is. (Also see pp. 486-487 of Manning & Schütze.)
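To make the argmax concrete, here is a minimal sketch. It assumes a tiny, fully enumerated candidate list, and all probabilities are invented for illustration. A real decoder cannot enumerate every English sentence, which is why stack search or dynamic programming is used; the brute-force `max` below only stands in for that search.

```python
import math

# Hypothetical toy probabilities, invented purely for illustration.
# Language model P(e): how plausible each English candidate is.
lm = {
    "the house": 0.020,
    "house the": 0.0001,
    "the home": 0.010,
}
# Translation model P(f|e) for one fixed French input, f = "la maison".
tm = {
    "the house": 0.30,
    "house the": 0.30,
    "the home": 0.15,
}

def decode(candidates, lm, tm):
    """Return the candidate e that maximizes log P(e) + log P(f|e)."""
    return max(candidates, key=lambda e: math.log(lm[e]) + math.log(tm[e]))

best = decode(list(lm), lm, tm)
print(best)
```

Here the translation model alone cannot separate "the house" from "house the"; the language model term is what rules out the badly ordered candidate.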
2. Can you elaborate on the difference between diphone synthesis and unit selection synthesis?
In unit selection the size of the synthesis unit chosen for a particular
system may be one of many choices: half-phones, phones, diphones, syllables,
etc. The key idea is that we get multiple examples of the same unit in
different contexts (where context may be some combination of adjoining
phonemes and maybe prosody features, e.g. emphasized or non-emphasized). We
can cluster the examples to find a representative unit for each acoustically
distinct variant. With diphone synthesis, we only use one example of each
phone-phone transition and do not have different versions of the diphone
depending on context. Unit selection systems are more difficult to build
and require more labelled data, but may (though not necessarily) produce
much better quality than diphone systems.
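The contrast can be sketched in code. This is an illustration under stated assumptions, not how any particular synthesizer works: the `Unit` features, the two cost functions, and the toy inventory are all hypothetical, and the clustering step is left out. Diphone synthesis is modeled as "always take the single stored example", while unit selection searches over several examples per unit for the sequence with the lowest total cost.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Unit:
    name: str        # diphone label, e.g. "h-e"
    start_f0: float  # pitch (Hz) at the left edge of the recording
    end_f0: float    # pitch (Hz) at the right edge of the recording
    stressed: bool   # was the recording taken from an emphasized context?

def target_cost(unit, want_stressed):
    """Hypothetical cost: penalize a unit whose stress does not match the request."""
    return 0.0 if unit.stressed == want_stressed else 1.0

def join_cost(left, right):
    """Hypothetical cost: penalize a pitch jump where two units are concatenated."""
    return abs(left.end_f0 - right.start_f0) / 100.0

def select_units(targets, inventory):
    """Dynamic-programming search for the cheapest sequence of units."""
    name, stress = targets[0]
    prev_cands = inventory[name]
    prev_costs = [target_cost(u, stress) for u in prev_cands]
    history = []  # (candidates at position i, best-predecessor indices for i+1)
    for name, stress in targets[1:]:
        cands = inventory[name]
        costs, ptrs = [], []
        for u in cands:
            totals = [c + join_cost(p, u) for p, c in zip(prev_cands, prev_costs)]
            j = min(range(len(totals)), key=totals.__getitem__)
            costs.append(totals[j] + target_cost(u, stress))
            ptrs.append(j)
        history.append((prev_cands, ptrs))
        prev_cands, prev_costs = cands, costs
    # Trace back from the cheapest final candidate.
    k = min(range(len(prev_costs)), key=prev_costs.__getitem__)
    path = [prev_cands[k]]
    for cands, ptrs in reversed(history):
        k = ptrs[k]
        path.append(cands[k])
    return path[::-1]

# Toy inventory: several recorded examples per diphone (values are made up).
inventory = {
    "h-e": [Unit("h-e", 120.0, 130.0, False), Unit("h-e", 180.0, 200.0, True)],
    "e-l": [Unit("e-l", 128.0, 125.0, False), Unit("e-l", 205.0, 190.0, True)],
    "l-o": [Unit("l-o", 126.0, 110.0, False), Unit("l-o", 150.0, 140.0, False)],
}
targets = [("h-e", True), ("e-l", True), ("l-o", False)]

# Diphone synthesis: one stored example per transition, regardless of context.
diphone_path = [inventory[name][0] for name, _ in targets]
# Unit selection: pick among the examples using context and smoothness.
chosen = select_units(targets, inventory)
print([u.start_f0 for u in chosen])
```

With these made-up numbers the search prefers the stressed recordings for the first two units and then the final unit that joins most smoothly, whereas the diphone path has no choice to make.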
3. What's the difference between co-articulation and reduction?
*Co-articulation: This means moving more than one articulator at once.
When you speak, there are several different types of "articulator" movements
that parts of your speech system can make to form different phonemes:
various tongue movements (dental, velar, etc.), lips, etc. Co-articulation
means that your mouth, while forming part of one phoneme with one type of
articulator, is getting ready to transition to the next sound by setting up
a second articulator at the same time. This may or may not have much effect
on the sound being produced by the first articulator.
For example, in the word "snoozing", when your mouth is forming the "n"
sound, your lips are already rounded to get ready for the upcoming "oo".
Change the word to "sneezing" and you'll see that your lips get ready for
the "ee" during the "n" even though the "n" sound is exactly the same for
both words. This is co-articulating the "n" with the following vowel.
*Reduction: This refers to shortening or reducing the sound of a phoneme.
This often happens because the "correct" sound is more difficult to
articulate in some context. Usually reduction refers to a vowel, but may
also happen for some consonants. A vowel reduction example is the word
"for" in the phrase "What's for dinner?". The word "for" is said
differently in the phrase than when you say it all by itself. In
the phrase, the "o" vowel is shorter and the word sounds more like "fer". This is
not lazy speech; it's a natural adjustment to make your mouth's job easier.
4. What are the Transfer and Interlingua approaches to Knowledge-Based Machine Translation (KBMT)?
These are two different approaches to KBMT. The transfer approach assumes
that translation is better or easier with a customized meaning representation
for each language pair. The interlingua approach assumes that generality
matters more, and uses a single shared representation for all
languages.
5. What is categorization-based CLIR?
For categorization-based Cross-Lingual Information Retrieval, we assign every document (a web page, say) a set of terms from
some controlled set of categories (for example, a BMW page might have the
keywords: German, luxury cars, automobiles).
If we have such categories for languages A and B, with a dictionary to
translate between them, we can find pages in the other language just by
mapping the keywords to that language.
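The keyword mapping can be sketched as follows. Everything here is hypothetical: the category dictionary, the page identifiers, and the overlap-count ranking are assumptions chosen only to show the idea of translating categories rather than full text.

```python
# Hypothetical English-to-German dictionary over the controlled category set.
category_dict = {
    "german": "deutsch",
    "luxury cars": "luxusautos",
    "automobiles": "automobile",
}

# Hypothetical German-language pages, each labelled with category keywords.
pages_de = {
    "page_bmw_de": {"deutsch", "luxusautos", "automobile"},
    "page_brot_de": {"deutsch", "backen"},
}

def retrieve(query_categories, dictionary, pages):
    """Translate the query categories, then rank pages by keyword overlap."""
    translated = {dictionary[c] for c in query_categories if c in dictionary}
    scored = [(len(translated & cats), page) for page, cats in pages.items()]
    return [page for score, page in sorted(scored, reverse=True) if score > 0]

hits = retrieve({"luxury cars", "automobiles"}, category_dict, pages_de)
print(hits)
```

Because only the small controlled vocabulary is translated, the dictionary can stay small; the page text itself is never translated.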
6. Can you clarify the difference between example-based MT (EBMT) and statistical MT? They seem very similar. When is one preferable to the other?
Example-based MT is a framework that relies on the assumption that, to
translate string X from language A to B, we can make use of a bilingual A-B
aligned training corpus. This corpus is assumed to contain some pretty good
approximations (if not identical copies) of X in language A, together with
corresponding target sentences in language B, which we try to combine to
construct a translation of X. These combination methods are generally but
not exclusively statistical.
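As a rough illustration of the first half of that process, the sketch below finds the closest stored example for a new input. The three-sentence corpus is invented, and token-level similarity from Python's standard `difflib.SequenceMatcher` is just one assumed choice of matching method. The harder step, combining fragments of several examples into a new translation, is not shown.

```python
from difflib import SequenceMatcher

# Hypothetical aligned English-French examples, invented for illustration.
examples = [
    ("where is the station", "où est la gare"),
    ("where is the hotel", "où est l'hôtel"),
    ("i like coffee", "j'aime le café"),
]

def closest_example(source, examples):
    """Return the (source, target) pair whose source side best matches the input."""
    src_tokens = source.split()
    return max(
        examples,
        key=lambda pair: SequenceMatcher(None, src_tokens, pair[0].split()).ratio(),
    )

match = closest_example("where is the train station", examples)
print(match)
```

The retrieved target sentence is only a starting point: an EBMT system would still need to account for the unmatched word "train".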
Statistical MT systems assume a broader choice of models that may or may not
involve such direct use of a bilingual aligned corpus. For example, a
statistical MT system might try to learn how syntax trees map from language
A to B, and choose the most likely transformation path given string X. This
would involve single-word dictionaries and perhaps parallel examples of sentence parses of various kinds.
Example-based and statistical methods are both less linguistically general than
hand-crafted rules, in the sense that they rely on particular corpora to extract
their translation rules. However, both have the advantage that they learn
automatically, whereas hand-crafted rules are expensive to create.
Whether you choose EBMT or a statistical MT approach will be partially decided by the amount and nature of the training data available.
Statistical MT models range from the very simple (e.g. unigram) to the very complex. The more complex the model, the more parameters need to be
trained, and correspondingly more training data will be needed. Example-based models assume a bilingual corpus large enough to contain close examples of the future text to be translated.
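To get a feel for how quickly the parameter count grows, consider a back-of-the-envelope sketch. It assumes the model is a plain n-gram probability table, which is only one possible model family, and the vocabulary size of 10,000 is an arbitrary illustrative figure. The count is an upper bound on the number of table entries.

```python
def ngram_table_size(vocab_size, n):
    """Upper bound on the number of entries in a full n-gram probability table."""
    return vocab_size ** n

vocab_size = 10_000  # assumed vocabulary size, for illustration only
for n in (1, 2, 3):
    print(n, ngram_table_size(vocab_size, n))
```

Each step up in n multiplies the table size by the vocabulary size, which is why richer models need so much more training data to estimate reliably.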
Also see http://www.compapp.dcu.ie/~away/EBMT.html