Interlingua MT: Translation of Numbers

Topics:
Number systems
Grammar for numbers
Parsing
Semantic processing
Generation

MT “pyramid” (revisited)
[Figure: MT pyramid between source language and target language; levels from top to bottom]
Interlingua
Transfer: deeper rep.
Transfer: semantic rep.
Transfer: functional structure
Transfer: phrase structure
Direct translation: word-for-word translation
No transfer process is needed for interlingua.

Interlingua MT
[Figure: Interlingua at the centre, linked to Language1, Language2, Language3, and others ...]
Advantage of interlingua: adding a new language needs only one more language pair: new language <--> Interlingua

What is interlingua?
An interlingua is supposed to be a universal representation for ... what?
Meaning, of course.
But what is meaning?
Since there is no clear meaning for “meaning”, we may describe an interlingua as a universal representation for what can be conveyed through human language communication.
Question:
What can be conveyed by our languages?

How to design an interlingua?
Any clear idea about it? No.
What we are sure to know is its universality and versatility.

Think about the following:
ontology of human knowledge
conceptions of what we know and can express via speech
ontology of objects in the world and in our languages
ontology of events
ontology of words, etc.

Any example to help us understand it any better?

Interlingua MT for numbers
[Figure: Interlingua: values at the centre, linked to English numbers, Chinese numbers, Arabic numbers, and others ...]
Arabic numbers: used as the universal representation for values.

Number systems
Decimal numbers
Arabic numbers: yes
Chinese numbers: ?
• <= 10,000: yes
• > 10,000: still?
English numbers: ?
• <= 1,000: yes
• > 1,000: still?
Basically yes, but with quite some variation!
What is the difference?
The distinction between the two can be exemplified by the difficulties in converting or translating between them.

What defines a number system?
Base
the set of digits (or base symbols) used
the cardinality of the digit set (i.e., the number of digits)
decimal numbers: base 10, digits: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
each digit has its own digit value.
Position
the place where a digit shows up, e.g., in 23388
each position has its position value: |Base|^Pos

What value does a digit represent?
Digit:      2        3        3        8        8
Position:   4        3        2        1        0
Value:      2×10^4   3×10^3   3×10^2   8×10^1   8×10^0
A digit represents a different value when it shows up in a different position:
value = digit value × position value = Digit × |Base|^Pos

What is the value of a number?
A number's value = the sum of all its digits' values. E.g.,

23,388
= 2×10^4 + 3×10^3 + 3×10^2 + 8×10^1 + 8×10^0
= 23,388

Hey! So trivial!
What kind of game are you playing?
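The sum-of-digit-values rule can be checked mechanically. The following is a minimal Python sketch; the function names `digit_values` and `number_value` are illustrative and do not come from the lecture.

```python
def digit_values(digits, base=10):
    """Return the value each digit contributes: digit x base^position.

    `digits` lists the digits from most to least significant, so the
    leftmost digit sits at position len(digits) - 1 and the rightmost at 0.
    """
    top = len(digits) - 1
    return [d * base ** (top - i) for i, d in enumerate(digits)]


def number_value(digits, base=10):
    """A number's value is the sum of all its digits' values."""
    return sum(digit_values(digits, base))


print(digit_values([2, 3, 3, 8, 8]))  # [20000, 3000, 300, 80, 8]
print(number_value([2, 3, 3, 8, 8]))  # 23388
```

Only the `base` argument changes when the same rule is applied to a number written in another base.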

Let us play with binary numbers
Base 2
Digits = {0, 1} (i.e., only 0 and 1 appear in a number)
11,111_2 = 1×2^4 + 1×2^3 + 1×2^2 + 1×2^1 + 1×2^0
= 31_10
Still trivial?
All computers play such a game.

How about numbers in other bases?

Octal numbers
Base 8
Digits = {0, 1, 2, 3, 4, 5, 6, 7}

Numbers:
0,1,2,3,4,5,6,7,
10,11,12,13,14,15,16,17,
20,21,22,23,24,25,26,27,
30,31,32,33,34,35,36,37,……
• 357_8 = ?
= 3×8^2 + 5×8^1 + 7×8^0
= 239_10

Hexadecimal numbers
Base 16
Digits = {0, 1, 2, …, 9, A, B, C, D, E, F}

Numbers:
0,1,2,…9,A,B,C,D,E,F
10,11,12,…19,1A,1B,1C,1D,1E,1F
20,21,22,…29,2A,2B,2C,2D,2E,2F
……
• 357_16 = ?
= 3×16^2 + 5×16^1 + 7×16^0 = 855_10
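Going the other way, from a value to its digits in a given base, takes repeated division by the base. This is a small sketch under the assumption that bases 2 to 16 are enough; `to_base` and `SYMBOLS` are made-up names, while `int(text, base)` is Python's built-in conversion, used here only to cross-check the octal and hexadecimal examples.

```python
SYMBOLS = "0123456789ABCDEF"  # digit symbols for bases up to 16


def to_base(value, base):
    """Write a non-negative integer as a digit string in `base`."""
    if not 2 <= base <= len(SYMBOLS):
        raise ValueError("base must be between 2 and 16")
    if value < 0:
        raise ValueError("value must be non-negative")
    if value == 0:
        return "0"
    out = []
    while value > 0:
        value, remainder = divmod(value, base)  # peel off the lowest digit
        out.append(SYMBOLS[remainder])
    return "".join(reversed(out))


print(to_base(239, 8))    # 357
print(to_base(855, 16))   # 357
print(int("357", 8), int("357", 16))  # 239 855
```

The same three digits 3, 5, 7 stand for different values only because the position values differ: powers of 8 in one case, powers of 16 in the other.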

Chinese numbers
Base 10, basically
• Digits = {零, 一, 二, 三, 四, 五, 六, 七, 八, 九}
Another set of digits: {壹, 貳, 叁…, 玖}
Position
Positions in Chinese numbers are explicitly expressed
• Positions: {個}, 十/拾, 百/佰, 千/仟, 万/萬, 亿/億, 兆
• Position values: 1, 10, 10^2, 10^3, 10^4, 10^8, 10^12
• E.g.,
五千六百七十八
= 5×10^3 + 6×10^2 + 7×10^1 + 8×10^0
= 5,678_10
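Because every position is spoken out loud, a Chinese number below 10^4 can be evaluated by a single left-to-right scan: remember the last digit, and multiply it by the position value when a position word arrives. The sketch below assumes well-formed input and does no grammar checking; `small_value` is a hypothetical name, and only the ordinary digit set is covered.

```python
DIGITS = {ch: i for i, ch in enumerate("零一二三四五六七八九")}
POSITIONS = {"十": 10, "拾": 10, "百": 100, "佰": 100, "千": 1000, "仟": 1000}


def small_value(text):
    """Evaluate a Chinese number below 10^4 by scanning digit/position pairs."""
    total = 0
    digit = None  # digit waiting for its position word, if any
    for ch in text:
        if ch in DIGITS:
            digit = DIGITS[ch]
        elif ch in POSITIONS:
            # A bare position word such as the 十 in 十八 counts as 1 x position.
            total += (1 if digit is None else digit) * POSITIONS[ch]
            digit = None
        else:
            raise ValueError(f"unexpected character: {ch!r}")
    # A trailing digit sits at the unit position (個), whose value is 1.
    return total + (digit or 0)


print(small_value("五千六百七十八"))  # 5678
```

A scan like this accepts strings that are not valid numbers, which is one reason a grammar is needed.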

A grammar for Chinese numbers
G --> Digits

S --> {G} 十 {G}

B --> G 百
B --> G 百 S
B --> G 百 零 G

Q --> G 千
Q --> G 千 B
Q --> G 千 零 S
Q --> G 千 零 G

W --> Q/B/S/G 萬
W --> Q/B/S/G 萬 Q
W --> Q/B/S/G 萬 零 B
W --> Q/B/S/G 萬 零 S
W --> Q/B/S/G 萬 零 G

零: conjunction, not zero!
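To see how these rules drive parsing, here is a sketch of a recogniser for the G, S, B and Q rules only (numbers below 10^4); the W rules are left out. The nested-tuple tree format and the name `parse` are assumptions made for illustration, as is restricting G to 一 through 九.

```python
DIGITS = "一二三四五六七八九"
LEVELS = [("Q", "千"), ("B", "百"), ("S", "十")]  # highest position first
NAMES = ["Q", "B", "S", "G"]


def parse(text, lvl=0):
    """Parse `text` as NAMES[lvl] or a lower nonterminal; None if it fails."""
    if lvl == len(LEVELS):  # G --> digit
        return ("G", text) if len(text) == 1 and text in DIGITS else None
    name, pos = LEVELS[lvl]
    head, found, tail = text.partition(pos)
    if not found:  # no position word at this level: try the next lower one
        return parse(text, lvl + 1)
    g = parse(head, len(LEVELS))  # the head must be a single digit G ...
    if g is None and not (name == "S" and head == ""):
        return None  # ... and only S --> {G} 十 {G} may omit it
    kids = ([g] if g else []) + [pos]
    if tail == "":
        return (name, *kids)
    if tail.startswith("零"):
        # After the conjunction, at least one level must have been skipped.
        skipped = lvl + 2
        t = parse(tail[1:], skipped) if skipped <= len(LEVELS) else None
        return (name, *kids, "零", t) if t else None
    t = parse(tail, lvl + 1)
    # Without 零 the tail must be exactly the next lower nonterminal.
    if t and t[0] == NAMES[lvl + 1]:
        return (name, *kids, t)
    return None


print(parse("三千零六"))  # ('Q', ('G', '三'), '千', '零', ('G', '六'))
print(parse("三千六"))    # None: the grammar requires 零 before a bare G
```

Each branch corresponds to one family of rules: no tail, a tail introduced by 零, or a tail that is exactly the next lower nonterminal.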

Large numbers in Chinese
W --> Q/B/S/G 萬
W --> Q/B/S/G 萬 Q
W --> Q/B/S/G 萬 零 B
W --> Q/B/S/G 萬 零 S
W --> Q/B/S/G 萬 零 G

Y --> Q/B/S/G 億
Y --> Q/B/S/G 億 W
Y --> Q/B/S/G 億 零 W
Y --> Q/B/S/G 億 零 Q
Y --> Q/B/S/G 億 零 B
Y --> Q/B/S/G 億 零 G

Z --> Q/B/S/G 兆
Z --> Q/B/S/G 兆 Y
Z --> Q/B/S/G 兆 零 Y
Z --> Q/B/S/G 兆 零 W
Z --> Q/B/S/G 兆 零 Q
Z --> Q/B/S/G 兆 零 B
Z --> Q/B/S/G 兆 零 G

Problem:
Ambiguity in analysis

Solution
W --> B/S/G 萬
W --> B/S/G 萬 Q
W --> B/S/G 萬 零 B
W --> B/S/G 萬 零 S
W --> B/S/G 萬 零 G

WQ --> Q 萬
WQ --> Q 萬 Q
WQ --> Q 萬 零 B
WQ --> Q 萬 零 S
WQ --> Q 萬 零 G

Y --> B/S/G 億
Y --> B/S/G 億 WQ
Y --> B/S/G 億 零 W
Y --> B/S/G 億 零 Q
Y --> B/S/G 億 零 B
Y --> B/S/G 億 零 G

YQ --> Q 億
YQ --> Q 億 WQ
YQ --> Q 億 零 W
YQ --> Q 億 零 Q
YQ --> Q 億 零 B
YQ --> Q 億 零 G

Z --> Q/B/S/G 兆
Z --> Q/B/S/G 兆 YQ
Z --> Q/B/S/G 兆 零 Y
Z --> Q/B/S/G 兆 零 W
Z --> Q/B/S/G 兆 零 Q
Z --> Q/B/S/G 兆 零 B
Z --> Q/B/S/G 兆 零 G

Chinese numbers => values
Two steps:
Syntactic analysis
Parsing: to derive a syntactic tree (called a parse tree) for an input sentence / number.
Result: a phrase structure tree.
Semantic interpretation:
To convert the parse tree into a semantic / meaning representation, namely, a value.

Semantic rules for interpretation
We need to define a semantic rule for each grammar rule to specify
how a phrase structure under the grammar rule is interpreted into a meaning representation, i.e.,
how to convert a syntactic structure into meaning.

Z --> Q/B/S/G 兆 零 Y
sem(Z)
= sem(Q/B/S/G 兆 零 Y)
= sem(Q/B/S/G) × sem(兆) + sem(Y)

Example: parsing
[Figure: parse tree for 三千兆零六百億, with nodes Z, Q, G, Y, B, G]

Example: semantic interpretation
G = 3
千 = 10^3
Q = 3×10^3
兆 = 10^12
Q × 兆 = 3×10^15
G = 6
百 = 10^2
B = 6×10^2
億 = 10^8
Y = 6×10^10
Z = 3×10^15 + 6×10^10 = 3,000,060,000,000,000
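The interpretation above follows one pattern at every node: head × position + tail. Below is a sketch of that pattern applied to a hand-built tree; the nested-tuple tree encoding and the name `sem` as a Python function are assumptions for illustration, and the tree is written out by hand rather than produced by a parser.

```python
DIGIT = {ch: i for i, ch in enumerate("零一二三四五六七八九")}
POS = {"十": 10, "百": 10**2, "千": 10**3, "萬": 10**4, "億": 10**8, "兆": 10**12}


def sem(tree):
    """Interpret a parse tree (label, child, ...) as head x pos + tail."""
    label, *kids = tree
    if label == "G":  # sem(G) is the digit value
        return DIGIT[kids[0]]
    head, pos, tail = 1, None, 0  # an omitted head (as in 十八) counts as 1
    for kid in kids:
        if isinstance(kid, tuple):
            if pos is None:
                head = sem(kid)  # subtree before the position word
            else:
                tail = sem(kid)  # subtree after the position word
        elif kid in POS:
            pos = POS[kid]
        # the conjunction 零 contributes nothing to the value
    return head * pos + tail


# Hand-built tree for 三千兆零六百億: Z --> Q 兆 零 Y, Q --> G 千, Y --> B 億, B --> G 百
tree = ("Z",
        ("Q", ("G", "三"), "千"),
        "兆", "零",
        ("Y", ("B", ("G", "六"), "百"), "億"))
print(sem(tree))  # 3000060000000000
```

Writing one generic rule instead of one per grammar rule works here only because all the rules share the same shape.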

Generation (i): Head
Given an Arabic number, generate its Chinese counterpart

Format: N = head * pos + tail
Denoted as: head(N, pos) and tail(N, pos), respectively

Given an input number X, how do we generate it? Heads and then tails.

head(X, 兆|亿|萬) = X / 10^12|8|4   (integer division!)
tail(X, 兆|亿|萬) = X % 10^12|8|4   (remainder!)
Generate its head:
gen(head(X, 兆|亿|萬)): a Q-number < 10^4
Generate its tail:
gen(tail(X, 兆|亿|萬)): a number < 10^12|8|4

Generation (ii): Tail < 10^4
Generating a Q-number X < 10^4
head(X, 千/百/十) = X / 10^3|2|1
tail(X, 千/百/十) = X % 10^3|2|1
Generate its head:
gen(head(X, 千/百/十)): a Q-number < 10
Generate its tail:
gen(tail(X, 千/百/十)): a number < 10^3|2|1

Generation: example
1. X = 123,456,789,123,456,789
gen(X) = gen(head(X, 兆)) 兆 gen(tail(X, 兆))
= gen(123,456) 兆 gen(789,123,456,789)
2. X = 123,456
gen(X) = gen(head(X, 萬)) 萬 gen(tail(X, 萬))
= gen(12) 萬 gen(3,456)
3. X = 12
gen(X) = gen(head(X, 十)) 十 gen(tail(X, 十))
= gen(1) 十 gen(2)
4. gen(1) = 一
gen(2) = 二

Example: generation of conjunction
gen(3,000,060,000,000,000): head = 3000, tail = 60,000,000,000
gen(3000): head = 3, tail = 0
gen(60,000,000,000): head = 600, tail = 0
gen(600): head = 6 (六), tail = 0
Output: 三千兆零六百億
For Chinese, whenever tail is less than 1/10 of pos, insert a conjunction 零 into the output.

English part?
[Figure: Interlingua: values, linked to English numbers (???), Chinese numbers, Arabic numbers, and others ...]

Grammar for English numbers
Exercise
Design a grammar for English numbers, covering the range [0, 1,000,000,000,000,000-1], and
use it to analyse the English number for 123,456,789,123 (or 123,456,789,123,456).
Design the generation procedure for English numbers and illustrate how it works for a real English number, e.g., 123,456.

Hints
In the lecture on interlingua, the following was given as the starting point for your design of the grammar for English numbers for HW2:
D0 --> {zero}
D --> {one, two, ..., nine}
D’ --> {ten, eleven, ..., nineteen}
T --> {twenty, ..., ninety}
and then five rules: H- --> D0 | D | D’ | T | T D
subsuming the following:
H- --> D0
H- --> D
H- --> D’
H- --> T
H- --> T D
to cover numbers under 100. (Do not add any extra symbol in a rule such as T --> D + D, which is wrong!)
Do not forget N --> H-, for N is our “axiom” (just like S for sentence). The same goes for the rules for Th-, M-, B-, etc.
Following the above fashion, we can have Th- --> D hundred {H-} for numbers in the range [100, 999]. As mentioned in class, people actually say “twenty hundred” and even “ninety nine hundred”, so we can extend this rule into the following by replacing D with H-:
Th- --> H- hundred
Th- --> H- hundred H-
Th- --> H- hundred and H-  (for British English)
Originally, Th- is defined to cover [100, 999]. Given the larger coverage of H- than that of D, the Th- rules overgenerate and also produce numbers beyond 999. But conceptually, simply thinking of Th- as covering numbers under 1000 is fine for the other rules.

You may merge them into one line (NOT one rule!) as:
Th- --> H- hundred {and} {H-}
where {} means optional. Please check whether any number in the range [100, 999] is missing before moving on to the rules for M-, B-, etc.
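One way to check the H- and Th- rules is to attach a value to each of them and try a few word sequences. The sketch below does that for these two nonterminals only; the function names `sem_H` and `sem_Th` and the word tables are illustrative assumptions, and the rules for M-, B-, etc. are deliberately not attempted.

```python
D = {w: i for i, w in enumerate(
    "one two three four five six seven eight nine".split(), start=1)}
D_PRIME = {w: i for i, w in enumerate(
    "ten eleven twelve thirteen fourteen fifteen sixteen seventeen "
    "eighteen nineteen".split(), start=10)}
T = {w: 10 * i for i, w in enumerate(
    "twenty thirty forty fifty sixty seventy eighty ninety".split(), start=2)}


def sem_H(words):
    """H- --> D0 | D | D' | T | T D; return the value, or None if no rule fits."""
    if len(words) == 1:
        if words[0] == "zero":  # D0
            return 0
        for table in (D, D_PRIME, T):
            if words[0] in table:
                return table[words[0]]
        return None
    if len(words) == 2 and words[0] in T and words[1] in D:  # T D
        return T[words[0]] + D[words[1]]
    return None


def sem_Th(words):
    """Th- --> H- hundred | H- hundred H- | H- hundred and H-."""
    if "hundred" not in words:
        return None
    i = words.index("hundred")
    head = sem_H(words[:i])
    rest = words[i + 1:]
    if rest[:1] == ["and"]:
        rest = rest[1:]
        if not rest:  # "and" must be followed by an H-
            return None
    tail = sem_H(rest) if rest else 0
    if head is None or tail is None:
        return None
    return head * 100 + tail


print(sem_Th("one hundred and twenty three".split()))  # 123
print(sem_Th("ninety nine hundred".split()))           # 9900
```

The second call shows the overgeneration discussed above: with H- in head position, the Th- rules also accept values beyond 999.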

Continued from the previous page:

M- --> [] thousand
M- --> Th- thousand Th-
M- --> Th- thousand and H-
M- --> Th- thousand H-
Rearranged as: M- --> Th- billion

…..

[…]billion […]million […]thousand […]

Notes:
Replace the arrows with the standard arrow symbol.
Chinese number: gen(23)
= gen(2,十) + gen(3)
= gen(2) + gen(3)
English number: gen(23)
= gen(2,tens) gen(3)
= twenty gen(3)

gen(19) yields 19 directly.
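The head/tail generation procedure from the Generation slides can be sketched as follows. `gen_zh` follows the slides' recursion, including the rule that 零 is inserted when a non-zero tail is less than 1/10 of pos; `gen_en` covers only English numbers below 100, matching the gen(23) and gen(19) notes. Both function names and the lookup tables are assumptions for illustration, not the lecture's own code.

```python
ZH_DIGITS = "零一二三四五六七八九"
ZH_POSITIONS = [(10**12, "兆"), (10**8, "億"), (10**4, "萬"),
                (10**3, "千"), (10**2, "百"), (10, "十")]  # largest first


def gen_zh(x):
    """Generate the Chinese number for x >= 0: gen(head) pos [零] gen(tail)."""
    if x < 10:
        return ZH_DIGITS[x]
    for value, name in ZH_POSITIONS:
        if x >= value:
            head, tail = divmod(x, value)  # integer division and remainder
            out = gen_zh(head) + name
            if tail:
                if tail * 10 < value:  # tail < pos / 10: insert the conjunction
                    out += "零"
                out += gen_zh(tail)
            return out


EN_ONES = ("zero one two three four five six seven eight nine ten eleven "
           "twelve thirteen fourteen fifteen sixteen seventeen eighteen "
           "nineteen").split()
EN_TENS = {2: "twenty", 3: "thirty", 4: "forty", 5: "fifty",
           6: "sixty", 7: "seventy", 8: "eighty", 9: "ninety"}


def gen_en(x):
    """Generate the English number for 0 <= x < 100."""
    if x < 20:  # zero to nineteen are single words, produced directly
        return EN_ONES[x]
    head, tail = divmod(x, 10)
    return EN_TENS[head] + (" " + EN_ONES[tail] if tail else "")


print(gen_zh(3_000_060_000_000_000))  # 三千兆零六百億
print(gen_zh(23), gen_en(23))         # 二十三 twenty three
```

As in the slides' example, `gen_zh(12)` produces 一十二 rather than the shorter everyday form 十二.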
