CN115662392A - Transliteration method based on phoneme memory, electronic equipment and storage medium - Google Patents

Transliteration method based on phoneme memory, electronic equipment and storage medium

Info

Publication number
CN115662392A
CN115662392A (application CN202211595293.3A)
Authority
CN
China
Prior art keywords
phoneme
letter
layer
vector
formula
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211595293.3A
Other languages
Chinese (zh)
Other versions
CN115662392B (en)
Inventor
宋彦 (Song Yan)
田元贺 (Tian Yuanhe)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC filed Critical University of Science and Technology of China USTC
Priority to CN202211595293.3A priority Critical patent/CN115662392B/en
Publication of CN115662392A publication Critical patent/CN115662392A/en
Application granted granted Critical
Publication of CN115662392B publication Critical patent/CN115662392B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Machine Translation (AREA)

Abstract

The invention discloses a transliteration method based on phoneme memory, an electronic device and a storage medium. The transliteration method comprises the following steps: 1. extracting the transliterated words and splitting them into letters; 2. constructing a phoneme library and extracting the phoneme features associated with each letter; 3. constructing an L-layer encoder and encoding the letters to obtain the letter encoding vector of each letter at each layer; 4. establishing an L-layer phoneme memory network to model the letter encoding vectors together with the phoneme features and obtain a letter encoding matrix; 5. inputting the letter encoding matrix and the target letters output by the classifier before time t into the L-layer decoder, and sending the letter prediction vector output by the decoder at time t into the classifier to obtain the predicted target letter at time t; 6. assigning t+1 to t and repeating step 5 until time T, thereby obtaining the predicted letter sequence. The invention aims to fuse the phoneme features into a standard text generation process, thereby improving the transliteration quality and effect.

Description

Transliteration method based on phoneme memory, electronic equipment and storage medium
Technical Field
The invention belongs to the field of natural language processing, and particularly relates to a transliteration method based on phoneme memory, electronic equipment and a storage medium.
Background
Transliteration refers to translating a person's name in a source language into text in a target language without changing the pronunciation of the name in the source language. For example, the name "Smith" in the source language English is transliterated into "史密斯" in the target language Chinese.
Existing methods mostly regard this task as a sequence-to-sequence generation task and adopt an encoder and a decoder to generate the name transliteration in the target language, but they lack utilization of the phonetic features, especially the phoneme features, of the source and target languages, so that the words generated by transliteration lose the pronunciation features of the source language and the accuracy of transliteration is reduced.
Disclosure of Invention
The present invention is directed to overcoming the above-mentioned shortcomings of the prior art, and provides a transliteration method based on phoneme memory, an electronic device and a storage medium, so as to integrate phoneme features into a standard text generation process, thereby improving the transliteration quality and effect.
In order to achieve the purpose, the invention adopts the following technical scheme:
the invention relates to a transliteration method based on phoneme memory, which is characterized by comprising the following steps:
Step 1, extracting a plurality of transliterated words from a source language corpus, and splitting each word into letters; wherein the letter sequence obtained by splitting the i-th word $X_i$ is recorded as $\{x_{i,1}, \ldots, x_{i,j}, \ldots, x_{i,n_i}\}$, $x_{i,j}$ represents the j-th letter in the i-th word $X_i$, and $n_i$ represents the total number of letters in the i-th word $X_i$;
Step 2, selecting from the phoneme library the m phoneme features associated with the j-th letter $x_{i,j}$ to form the phoneme feature set $S_{i,j} = \{s_{i,j,1}, \ldots, s_{i,j,u}, \ldots, s_{i,j,m}\}$, wherein $s_{i,j,u}$ is the u-th phoneme feature associated with the j-th letter $x_{i,j}$ and m is the total number of associated phoneme features;
Step 3, constructing a transliteration network, comprising: an L-layer encoder, an L-layer phoneme memory network, an L-layer decoder and a classifier;
Step 3.1, processing by the encoder:
The j-th letter $x_{i,j}$ is converted into the j-th letter vector $\mathbf{e}_{i,j}$ and then input into the encoder, where it is processed sequentially by L multi-head self-attention layers to obtain L letter encoding vectors $\{\mathbf{h}^{(l)}_{i,j} \mid l=1,2,\ldots,L\}$; wherein $\mathbf{h}^{(l)}_{i,j}$ denotes the j-th letter encoding vector output by the multi-head self-attention layer of the l-th layer;
Step 3.2, processing by the phoneme memory network:
The phoneme feature set $S_{i,j}$ is converted into the phoneme vector set $\{\mathbf{e}^{s}_{i,j,u} \mid u=1,2,\ldots,m\}$ and then, together with $\{\mathbf{h}^{(l)}_{i,j} \mid l=1,2,\ldots,L\}$, input into the phoneme memory network for processing, so as to obtain the enhanced letter encoding vectors $\{\widetilde{\mathbf{h}}^{(l)}_{i,j} \mid l=1,2,\ldots,L;\ j=1,2,\ldots,n_i\}$ of the $n_i$ letters, which are recorded as the letter encoding matrix $H_i$ of the i-th word $X_i$; wherein $\mathbf{e}^{s}_{i,j,u}$ denotes the phoneme vector of the u-th phoneme $s_{i,j,u}$, and $\widetilde{\mathbf{h}}^{(l)}_{i,j}$ denotes an enhanced j-th letter encoding vector;
Step 3.3, processing by the decoder:
The letter encoding matrix $H_i$ and the target letters output by the classifier before time t are input into the L-layer decoder to obtain the letter prediction vector $h_{i,t}$ output by the decoder at time t; when t=1, the letters output by the classifier before time t are set to be empty;
Step 3.4, processing by the classifier:
The classifier uses a fully connected layer to process the letter prediction vector $h_{i,t}$ output by the decoder at time t, so as to obtain the target letter $y_{i,t}$ predicted for the i-th word $X_i$ at the current time t;
Step 3.5, after assigning t+1 to t, returning to step 3.3 and executing sequentially until time T, thereby obtaining the predicted letter sequence $\{y_{i,1}, \ldots, y_{i,t}, \ldots, y_{i,T}\}$ of the i-th word $X_i$.
The transliteration method based on phoneme memory of the invention is also characterized in that the step 2 comprises the following steps:
Step 2.1, calculating the pointwise mutual information $\mathrm{PMI}(x_{i,j}, s_q)$ between the j-th letter $x_{i,j}$ and the q-th phoneme feature $s_q$ in the phoneme library using formula (1), so as to obtain the pointwise mutual information $\{\mathrm{PMI}(x_{i,j}, s_q) \mid 1 \le q \le M\}$ between the j-th letter $x_{i,j}$ and all M phoneme features, where M represents the number of all phoneme features in the phoneme library:

$$\mathrm{PMI}(x_{i,j}, s_q) = \log \frac{p(x_{i,j}, s_q)}{p(x_{i,j})\, p(s_q)} \tag{1}$$

In formula (1), $p(x_{i,j}, s_q)$ represents the probability that the j-th letter $x_{i,j}$ and the q-th phoneme feature $s_q$ co-occur; $p(x_{i,j})$ represents the probability that the j-th letter $x_{i,j}$ appears in the i-th word $X_i$; $p(s_q)$ represents the probability that the q-th phoneme feature $s_q$ appears in the pronunciation of the i-th word $X_i$;
Step 2.2, selecting from the pointwise mutual information $\{\mathrm{PMI}(x_{i,j}, s_q) \mid 1 \le q \le M\}$ the m phoneme features corresponding to the highest pointwise mutual information, and forming the phoneme feature set $S_{i,j} = \{s_{i,j,1}, \ldots, s_{i,j,u}, \ldots, s_{i,j,m}\}$.
Said step 3.2 comprises:
Step 3.2.1, converting the u-th phoneme $s_{i,j,u}$ into the u-th phoneme vector $\mathbf{e}^{s}_{i,j,u}$ and then inputting it, together with $\mathbf{h}^{(l)}_{i,j}$, into the phoneme memory network of the l-th layer; the phoneme memory network of the l-th layer maps $\mathbf{e}^{s}_{i,j,u}$ using formula (2) and formula (3) to obtain the u-th phoneme key vector $\mathbf{k}^{(l)}_{i,j,u}$ of the l-th layer and the u-th phoneme value vector $\mathbf{v}^{(l)}_{i,j,u}$ of the l-th layer:

$$\mathbf{k}^{(l)}_{i,j,u} = \mathrm{ReLU}\bigl(\mathbf{W}^{(l)}_{K} \cdot \mathbf{e}^{s}_{i,j,u}\bigr) \tag{2}$$

$$\mathbf{v}^{(l)}_{i,j,u} = \mathrm{ReLU}\bigl(\mathbf{W}^{(l)}_{V} \cdot \mathbf{e}^{s}_{i,j,u}\bigr) \tag{3}$$

In formula (2) and formula (3), $\mathbf{W}^{(l)}_{K}$ denotes the key matrix of the l-th layer and $\mathbf{W}^{(l)}_{V}$ denotes the value matrix of the l-th layer; ReLU denotes the activation function; "·" denotes the multiplication of a matrix and a vector;
Step 3.2.2, the phoneme memory network of the l-th layer calculates the u-th phoneme weight $a^{(l)}_{i,j,u}$ of the l-th layer using formula (4):

$$a^{(l)}_{i,j,u} = \frac{\exp\bigl(\mathbf{h}^{(l)}_{i,j} \cdot \mathbf{k}^{(l)}_{i,j,u}\bigr)}{\sum_{u'=1}^{m} \exp\bigl(\mathbf{h}^{(l)}_{i,j} \cdot \mathbf{k}^{(l)}_{i,j,u'}\bigr)} \tag{4}$$

In formula (4), "·" denotes the vector inner product;
Step 3.2.3, the phoneme memory network of the l-th layer calculates the weighted average vector $\mathbf{o}^{(l)}_{i,j}$ using formula (5):

$$\mathbf{o}^{(l)}_{i,j} = \sum_{u=1}^{m} a^{(l)}_{i,j,u}\, \mathbf{v}^{(l)}_{i,j,u} \tag{5}$$

Step 3.2.4, the phoneme memory network of the l-th layer obtains the j-th letter reset vector $\mathbf{g}^{(l)}_{i,j}$ of the l-th layer using formula (6):

$$\mathbf{g}^{(l)}_{i,j} = \mathrm{sigmoid}\bigl(\mathbf{W}^{(l)}_{r,1} \cdot \mathbf{h}^{(l)}_{i,j} + \mathbf{W}^{(l)}_{r,2} \cdot \mathbf{o}^{(l)}_{i,j} + \mathbf{b}^{(l)}_{r}\bigr) \tag{6}$$

In formula (6), sigmoid denotes the activation function, $\mathbf{W}^{(l)}_{r,1}$ and $\mathbf{W}^{(l)}_{r,2}$ respectively denote the first reset matrix and the second reset matrix of the l-th layer, and $\mathbf{b}^{(l)}_{r}$ denotes the reset offset vector of the l-th layer;
Step 3.2.5, the phoneme memory network of the l-th layer obtains the enhanced j-th letter encoding vector $\widetilde{\mathbf{h}}^{(l)}_{i,j}$ of the l-th layer using formula (7), so that the L-layer phoneme memory network outputs the enhanced j-th letter encoding vectors $\{\widetilde{\mathbf{h}}^{(l)}_{i,j} \mid l=1,2,\ldots,L\}$ and thus the enhanced letter encoding vectors $\{\widetilde{\mathbf{h}}^{(l)}_{i,j} \mid l=1,2,\ldots,L;\ j=1,2,\ldots,n_i\}$ of the $n_i$ letters, which are recorded as the letter encoding matrix $H_i$ of the i-th word $X_i$:

$$\widetilde{\mathbf{h}}^{(l)}_{i,j} = \bigl(\mathbf{g}^{(l)}_{i,j} \odot \mathbf{h}^{(l)}_{i,j}\bigr) \oplus \bigl((\mathbf{1} - \mathbf{g}^{(l)}_{i,j}) \odot \mathbf{o}^{(l)}_{i,j}\bigr) \tag{7}$$

In formula (7), $\odot$ denotes the Hadamard product, $\oplus$ denotes vector concatenation, and $\mathbf{1}$ denotes a vector whose every dimension is 1.
The electronic device of the invention comprises a memory and a processor, and is characterized in that the memory is used for storing a program that supports the processor in executing the transliteration method, and the processor is configured to execute the program stored in the memory.
The computer-readable storage medium of the invention has a computer program stored thereon, and is characterized in that the computer program, when executed by a processor, performs the steps of the transliteration method.
Compared with the prior art, the invention has the beneficial effects that:
1. For each letter in the input, the method enhances the representation of that letter with its associated phoneme features through the L-layer phoneme memory network, which strengthens the model's grasp of the pronunciation features of the target language, so that the transliterated text generated by the model retains the phonetic features of the source language as much as possible and the transliteration performance of the model is improved.
2. By weighting the different phoneme features, the method identifies and exploits the importance of each phoneme feature, effectively avoiding the influence of potential noise in the phoneme features on model performance.
Drawings
FIG. 1 is a flow chart of the transliteration method of the present invention.
Detailed Description
In this embodiment, a transliteration method based on phoneme memory is performed as shown in fig. 1, and includes the following steps:
Step 1, extracting a plurality of transliterated words from a source language corpus, and splitting each word into letters; wherein the letter sequence obtained by splitting the i-th word $X_i$ is recorded as $\{x_{i,1}, \ldots, x_{i,j}, \ldots, x_{i,n_i}\}$, $x_{i,j}$ represents the j-th letter in the i-th word $X_i$, and $n_i$ represents the total number of letters in the i-th word $X_i$. For example, if the 4 transliterated words extracted from an English source language corpus are {Tom, Smith, Bob, Cook}, then after splitting, the letter sequence of the 2nd word "Smith" is {"S", "m", "i", "t", "h"}, in which there are 5 letters in total and the 3rd letter is "i".
Step 2, selecting from the phoneme library the m phoneme features associated with the j-th letter $x_{i,j}$ to form the phoneme feature set $S_{i,j} = \{s_{i,j,1}, \ldots, s_{i,j,u}, \ldots, s_{i,j,m}\}$, wherein $s_{i,j,u}$ is the u-th phoneme feature associated with the j-th letter $x_{i,j}$ and m is the total number of associated phoneme features. For example, the phoneme library is the set containing all international phonetic symbols {"a", "e", "o", "t", "g", "k", "i", "i:", "ɪ", …}. When m=3, the phoneme feature set extracted from it and associated with the 3rd letter "i" is {"i:", "ɪ", "i"}, and the 1st phoneme feature associated with the 3rd letter is "i:".
Step 2.1, calculating the pointwise mutual information $\mathrm{PMI}(x_{i,j}, s_q)$ between the j-th letter $x_{i,j}$ and the q-th phoneme feature $s_q$ in the phoneme library using formula (1), so as to obtain the pointwise mutual information $\{\mathrm{PMI}(x_{i,j}, s_q) \mid 1 \le q \le M\}$ between the j-th letter $x_{i,j}$ and all M phoneme features, where M represents the number of all phoneme features in the phoneme library:

$$\mathrm{PMI}(x_{i,j}, s_q) = \log \frac{p(x_{i,j}, s_q)}{p(x_{i,j})\, p(s_q)} \tag{1}$$

In formula (1), $p(x_{i,j}, s_q)$ represents the probability that the j-th letter $x_{i,j}$ and the q-th phoneme feature $s_q$ co-occur; $p(x_{i,j})$ represents the probability that the j-th letter $x_{i,j}$ appears in the i-th word $X_i$; $p(s_q)$ represents the probability that the q-th phoneme feature $s_q$ appears in the pronunciation of the i-th word $X_i$. For example, the calculation for the 3rd letter "i" and the 8th phoneme feature "i:" in the phoneme library proceeds as follows: the probability that the 3rd letter "i" co-occurs with the 8th phoneme feature "i:" is 0.6, the probability that the 3rd letter "i" appears in the 2nd word "Smith" is 0.3, and the probability that the 8th phoneme feature "i:" appears in the pronunciation of "Smith" is 0.5. Using formula (1) with a base-2 logarithm, the pointwise mutual information of the 3rd letter "i" and the 8th phoneme feature "i:" is $\log_2\bigl(0.6/(0.3 \times 0.5)\bigr) = 2$. By the same method, the pointwise mutual information between the 3rd letter "i" and all the phoneme features in the phoneme library can be calculated.
Step 2.2, selecting from the pointwise mutual information $\{\mathrm{PMI}(x_{i,j}, s_q) \mid 1 \le q \le M\}$ the m phoneme features corresponding to the highest pointwise mutual information, and forming the phoneme feature set $S_{i,j} = \{s_{i,j,1}, \ldots, s_{i,j,u}, \ldots, s_{i,j,m}\}$. For example, the 3 highest-scoring phoneme features of the 3rd letter "i" are {"i:", "ɪ", "i"}.
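For illustration only, the following minimal Python sketch computes formula (1) and performs the top-m selection of step 2.2; the probability dictionaries, the helper names and the base-2 logarithm are assumptions made for the sketch (the worked example above is consistent with a base-2 logarithm), not a definitive implementation of the claimed method.

```python
import math

def pmi(p_joint: float, p_letter: float, p_phoneme: float) -> float:
    """Pointwise mutual information, formula (1), using a base-2 logarithm."""
    return math.log2(p_joint / (p_letter * p_phoneme))

def select_phoneme_features(letter: str, phoneme_library: list[str],
                            p_joint: dict, p_letter: dict, p_phoneme: dict,
                            m: int = 3) -> list[str]:
    """Score every phoneme in the library against one letter (step 2.1) and keep
    the m features with the highest PMI (step 2.2)."""
    scores = {
        s: pmi(p_joint[(letter, s)], p_letter[letter], p_phoneme[s])
        for s in phoneme_library
        if p_joint.get((letter, s), 0.0) > 0.0   # skip phonemes that never co-occur with the letter
    }
    return sorted(scores, key=scores.get, reverse=True)[:m]

# Worked example from the text: p("i", "i:") = 0.6, p("i") = 0.3, p("i:") = 0.5 -> PMI = 2
assert abs(pmi(0.6, 0.3, 0.5) - 2.0) < 1e-9
```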
Step 3, building a transliteration network, comprising: an L-layer encoder, an L-layer phoneme memory network, an L-layer decoder and a classifier;
Step 3.1, processing by the encoder:
The j-th letter $x_{i,j}$ is converted into the j-th letter vector $\mathbf{e}_{i,j}$ and then input into the encoder, where it is processed sequentially by L multi-head self-attention layers to obtain L letter encoding vectors $\{\mathbf{h}^{(l)}_{i,j} \mid l=1,2,\ldots,L\}$; wherein $\mathbf{h}^{(l)}_{i,j}$ denotes the j-th letter encoding vector output by the multi-head self-attention layer of the l-th layer. For example, when L=6, the 3rd letter "i" is first converted into a letter vector, and then, after processing by the 6 multi-head self-attention layers, 6 letter encoding vectors of the 3rd letter "i" are obtained.
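For illustration only, the following PyTorch sketch shows how an L-layer stack of multi-head self-attention layers can produce one letter encoding vector per layer, as described in step 3.1; the dimensions, layer count and module names are assumptions made for the sketch, not a definitive implementation of the claimed encoder.

```python
import torch
import torch.nn as nn

class LetterEncoder(nn.Module):
    """L stacked multi-head self-attention layers; the output of every layer is
    kept so that each layer can later be paired with its own phoneme memory
    network (step 3.2). All dimensions below are illustrative assumptions."""
    def __init__(self, vocab_size: int, d_model: int = 256, num_layers: int = 6, num_heads: int = 8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)  # letter -> letter vector
        self.layers = nn.ModuleList(
            [nn.MultiheadAttention(d_model, num_heads, batch_first=True) for _ in range(num_layers)]
        )

    def forward(self, letter_ids: torch.Tensor) -> list[torch.Tensor]:
        h = self.embed(letter_ids)              # (batch, n_i, d_model)
        per_layer = []
        for attn in self.layers:
            h, _ = attn(h, h, h)                # multi-head self-attention: query = key = value
            per_layer.append(h)                 # l-th layer letter encoding vectors
        return per_layer                        # list of length L
```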
Step 3.2, processing by the phoneme memory network:
The phoneme feature set $S_{i,j}$ is converted into the phoneme vector set $\{\mathbf{e}^{s}_{i,j,u} \mid u=1,2,\ldots,m\}$ and then, together with $\{\mathbf{h}^{(l)}_{i,j} \mid l=1,2,\ldots,L\}$, input into the phoneme memory network for processing, so as to obtain the enhanced letter encoding vectors $\{\widetilde{\mathbf{h}}^{(l)}_{i,j} \mid l=1,2,\ldots,L;\ j=1,2,\ldots,n_i\}$ of the $n_i$ letters, which are recorded as the letter encoding matrix $H_i$ of the i-th word $X_i$; wherein $\mathbf{e}^{s}_{i,j,u}$ denotes the phoneme vector of the u-th phoneme $s_{i,j,u}$, and $\widetilde{\mathbf{h}}^{(l)}_{i,j}$ denotes an enhanced j-th letter encoding vector. For example, the phoneme feature set {"i:", "ɪ", "i"} of the 3rd letter "i" is converted into a set of 3 phoneme vectors, and these 3 phoneme vectors are input into the phoneme memory network together with the 6 letter encoding vectors of the 3rd letter "i", resulting in enhanced 3rd letter encoding vectors; the same operation is then carried out on all the letters to obtain the letter encoding matrix $H_2$ of the 2nd word "Smith".
Step 3.2.1, converting the u-th phoneme feature $s_{i,j,u}$ into the u-th phoneme vector $\mathbf{e}^{s}_{i,j,u}$ and then inputting it, together with $\mathbf{h}^{(l)}_{i,j}$, into the phoneme memory network of the l-th layer; the phoneme memory network of the l-th layer maps $\mathbf{e}^{s}_{i,j,u}$ using formula (2) and formula (3) to obtain the u-th phoneme key vector $\mathbf{k}^{(l)}_{i,j,u}$ of the l-th layer and the u-th phoneme value vector $\mathbf{v}^{(l)}_{i,j,u}$ of the l-th layer:

$$\mathbf{k}^{(l)}_{i,j,u} = \mathrm{ReLU}\bigl(\mathbf{W}^{(l)}_{K} \cdot \mathbf{e}^{s}_{i,j,u}\bigr) \tag{2}$$

$$\mathbf{v}^{(l)}_{i,j,u} = \mathrm{ReLU}\bigl(\mathbf{W}^{(l)}_{V} \cdot \mathbf{e}^{s}_{i,j,u}\bigr) \tag{3}$$

In formula (2) and formula (3), $\mathbf{W}^{(l)}_{K}$ denotes the key matrix of the l-th layer and $\mathbf{W}^{(l)}_{V}$ denotes the value matrix of the l-th layer; ReLU denotes the activation function; "·" denotes the multiplication of a matrix and a vector. For example, the 1st phoneme feature "i:" is converted into the 1st phoneme vector, and the 1st phoneme vector is input, together with the 4th-layer letter encoding vector of the 3rd letter "i", into the phoneme memory network of the 4th layer, so as to obtain the 1st phoneme key vector of the 4th layer and the 1st phoneme value vector of the 4th layer.
Step 3.2.2, the phoneme memory network of the l-th layer calculates the u-th phoneme weight $a^{(l)}_{i,j,u}$ of the l-th layer using formula (4):

$$a^{(l)}_{i,j,u} = \frac{\exp\bigl(\mathbf{h}^{(l)}_{i,j} \cdot \mathbf{k}^{(l)}_{i,j,u}\bigr)}{\sum_{u'=1}^{m} \exp\bigl(\mathbf{h}^{(l)}_{i,j} \cdot \mathbf{k}^{(l)}_{i,j,u'}\bigr)} \tag{4}$$

In formula (4), "·" denotes the vector inner product. For example, the 3rd letter "i" has three phoneme weights at the 4th layer, of which the first is 0.5, the second is 0.3, and the third is 0.2.
Step 3.2.3, the phoneme memory network of the l-th layer calculates the weighted average vector $\mathbf{o}^{(l)}_{i,j}$ using formula (5):

$$\mathbf{o}^{(l)}_{i,j} = \sum_{u=1}^{m} a^{(l)}_{i,j,u}\, \mathbf{v}^{(l)}_{i,j,u} \tag{5}$$

For example, the 4th-layer weighted average vector of the 3rd letter "i" is the weighted average of the 4th-layer phoneme value vectors of the 3rd letter "i", with weights 0.5, 0.3 and 0.2 in turn.
Step 3.2.4, the phoneme memory network of the l-th layer obtains the j-th letter reset vector $\mathbf{g}^{(l)}_{i,j}$ of the l-th layer using formula (6):

$$\mathbf{g}^{(l)}_{i,j} = \mathrm{sigmoid}\bigl(\mathbf{W}^{(l)}_{r,1} \cdot \mathbf{h}^{(l)}_{i,j} + \mathbf{W}^{(l)}_{r,2} \cdot \mathbf{o}^{(l)}_{i,j} + \mathbf{b}^{(l)}_{r}\bigr) \tag{6}$$

In formula (6), sigmoid denotes the activation function, $\mathbf{W}^{(l)}_{r,1}$ and $\mathbf{W}^{(l)}_{r,2}$ respectively denote the first reset matrix and the second reset matrix of the l-th layer, and $\mathbf{b}^{(l)}_{r}$ denotes the reset offset vector of the l-th layer. Due to the nature of the sigmoid activation function, the value of each dimension of the l-th layer's j-th letter reset vector $\mathbf{g}^{(l)}_{i,j}$ lies between 0 and 1, representing the reset weight for that dimension of the vector.
Step 3.2.5, the phoneme memory network of the l-th layer obtains the enhanced j-th letter encoding vector $\widetilde{\mathbf{h}}^{(l)}_{i,j}$ of the l-th layer using formula (7), so that the L-layer phoneme memory network outputs the enhanced j-th letter encoding vectors $\{\widetilde{\mathbf{h}}^{(l)}_{i,j} \mid l=1,2,\ldots,L\}$ and thus the enhanced letter encoding vectors $\{\widetilde{\mathbf{h}}^{(l)}_{i,j} \mid l=1,2,\ldots,L;\ j=1,2,\ldots,n_i\}$ of the $n_i$ letters, which are recorded as the letter encoding matrix $H_i$ of the i-th word $X_i$:

$$\widetilde{\mathbf{h}}^{(l)}_{i,j} = \bigl(\mathbf{g}^{(l)}_{i,j} \odot \mathbf{h}^{(l)}_{i,j}\bigr) \oplus \bigl((\mathbf{1} - \mathbf{g}^{(l)}_{i,j}) \odot \mathbf{o}^{(l)}_{i,j}\bigr) \tag{7}$$

In formula (7), $\odot$ denotes the Hadamard product, $\oplus$ denotes vector concatenation, and $\mathbf{1}$ denotes a vector whose every dimension is 1. $\mathbf{g}^{(l)}_{i,j}$ and $\mathbf{1} - \mathbf{g}^{(l)}_{i,j}$ serve to weight, dimension by dimension, the contributions of the j-th letter encoding vector $\mathbf{h}^{(l)}_{i,j}$ of the l-th layer and the j-th weighted average vector $\mathbf{o}^{(l)}_{i,j}$ of the l-th layer, respectively.
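For illustration only, the following PyTorch sketch implements one layer of the phoneme memory network along the lines of formulas (2)-(7); the dimensions, the softmax normalisation of the phoneme weights, and the final projection applied after the concatenation are assumptions made for the sketch, not a definitive implementation of the claimed network.

```python
import torch
import torch.nn as nn

class PhonemeMemoryLayer(nn.Module):
    """One layer of the phoneme memory network, following formulas (2)-(7).
    The softmax in formula (4) and the output projection after the concatenation
    in formula (7) are assumptions made for this sketch."""
    def __init__(self, d_model: int):
        super().__init__()
        self.key_proj = nn.Linear(d_model, d_model, bias=False)    # key matrix W_K^(l), formula (2)
        self.value_proj = nn.Linear(d_model, d_model, bias=False)  # value matrix W_V^(l), formula (3)
        self.reset1 = nn.Linear(d_model, d_model, bias=False)      # first reset matrix, formula (6)
        self.reset2 = nn.Linear(d_model, d_model, bias=False)      # second reset matrix, formula (6)
        self.reset_bias = nn.Parameter(torch.zeros(d_model))       # reset offset vector, formula (6)
        self.out_proj = nn.Linear(2 * d_model, d_model)            # maps the concatenation back to d_model (assumption)

    def forward(self, h: torch.Tensor, phoneme_vecs: torch.Tensor) -> torch.Tensor:
        # h: (n_i, d_model) letter encoding vectors of this layer
        # phoneme_vecs: (n_i, m, d_model) phoneme vectors associated with each letter
        k = torch.relu(self.key_proj(phoneme_vecs))                           # formula (2)
        v = torch.relu(self.value_proj(phoneme_vecs))                         # formula (3)
        weights = torch.softmax(torch.einsum("nd,nmd->nm", h, k), dim=-1)     # formula (4): phoneme weights
        o = torch.einsum("nm,nmd->nd", weights, v)                            # formula (5): weighted average vector
        g = torch.sigmoid(self.reset1(h) + self.reset2(o) + self.reset_bias)  # formula (6): reset vector
        enhanced = torch.cat([g * h, (1.0 - g) * o], dim=-1)                  # formula (7): gated concatenation
        return self.out_proj(enhanced)                                        # enhanced letter encoding vectors
```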
Step 3.3, processing by the decoder:
The letter encoding matrix $H_i$ and the target letters output by the classifier before time t are input into the L-layer decoder to obtain the letter prediction vector $h_{i,t}$ output by the decoder at time t; when t=1, the letters output by the classifier before time t are set to be empty. For example, when t=1, the input of the decoder is the letter encoding matrix $H_i$ and the special letter {"<s>"} indicating an empty letter; when t=3, the input of the decoder is the letter encoding matrix $H_i$ and the target letters {"史", "密"} that the classifier has already output.
Step 3.4, processing by the classifier:
The classifier uses a fully connected layer to process the letter prediction vector $h_{i,t}$ output by the decoder at time t, so as to obtain the target letter $y_{i,t}$ predicted for the i-th word $X_i$ at the current time t. For example, when t=1, the target letter predicted for the 2nd word "Smith" at time t is "史"; when t=3, the target letter predicted for the 2nd word "Smith" at time t is "斯".
Step 3.5, after assigning t+1 to t, returning to step 3.3 and executing sequentially until time T, thereby obtaining the predicted letter sequence $\{y_{i,1}, \ldots, y_{i,t}, \ldots, y_{i,T}\}$ of the i-th word $X_i$. Specifically, time T is determined by the condition that the letter predicted for the i-th word $X_i$ at time T+1 is "</s>". For example, if the letter predicted for the 2nd word at time t=4 is "</s>", then T=3 and the predicted letter sequence {"史", "密", "斯"} of the 2nd word "Smith" is obtained.
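For illustration only, the following Python sketch shows the greedy decoding loop of steps 3.3-3.5; `decoder` and `classifier` stand for the trained L-layer decoder and the fully connected classifier, and the token ids, tensor shapes and maximum step count are assumptions made for the sketch.

```python
import torch

def greedy_transliterate(decoder, classifier, letter_matrix: torch.Tensor,
                         bos_id: int, eos_id: int, max_steps: int = 32) -> list[int]:
    """Greedy decoding for steps 3.3-3.5: feed the letter encoding matrix H_i and
    the previously predicted target letters back into the decoder until the
    end-of-sequence letter "</s>" is produced."""
    predicted = [bos_id]                        # "<s>" stands for the empty history at t = 1
    for _ in range(max_steps):
        prev = torch.tensor([predicted])        # target letters output so far
        h_t = decoder(prev, letter_matrix)      # letter prediction vectors, shape (1, t, d_model) assumed
        y_t = int(classifier(h_t[:, -1]).argmax(dim=-1))   # predicted target letter y_{i,t}
        if y_t == eos_id:                       # time T reached: the next prediction is "</s>"
            break
        predicted.append(y_t)
    return predicted[1:]                        # e.g. the ids of "史", "密", "斯" for "Smith"
```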
In this embodiment, an electronic device includes a memory for storing a program that supports a processor to execute the transliteration method described above, and a processor configured to execute the program stored in the memory.
In this embodiment, a computer-readable storage medium stores a computer program, and the computer program, when executed by a processor, performs the steps of the transliteration method.

Claims (5)

1. A transliteration method based on phoneme memory, characterized by comprising the following steps:
Step 1, extracting a plurality of transliterated words from a source language corpus, and splitting each word into letters; wherein the letter sequence obtained by splitting the i-th word $X_i$ is recorded as $\{x_{i,1}, \ldots, x_{i,j}, \ldots, x_{i,n_i}\}$, $x_{i,j}$ represents the j-th letter in the i-th word $X_i$, and $n_i$ represents the total number of letters in the i-th word $X_i$;
Step 2, selecting from the phoneme library the m phoneme features associated with the j-th letter $x_{i,j}$ to form the phoneme feature set $S_{i,j} = \{s_{i,j,1}, \ldots, s_{i,j,u}, \ldots, s_{i,j,m}\}$, wherein $s_{i,j,u}$ is the u-th phoneme feature associated with the j-th letter $x_{i,j}$ and m is the total number of associated phoneme features;
Step 3, building a transliteration network, comprising: an L-layer encoder, an L-layer phoneme memory network, an L-layer decoder and a classifier;
Step 3.1, processing by the encoder:
The j-th letter $x_{i,j}$ is converted into the j-th letter vector $\mathbf{e}_{i,j}$ and then input into the encoder, where it is processed sequentially by L multi-head self-attention layers to obtain L letter encoding vectors $\{\mathbf{h}^{(l)}_{i,j} \mid l=1,2,\ldots,L\}$; wherein $\mathbf{h}^{(l)}_{i,j}$ denotes the j-th letter encoding vector output by the multi-head self-attention layer of the l-th layer;
Step 3.2, processing by the phoneme memory network:
The phoneme feature set $S_{i,j}$ is converted into the phoneme vector set $\{\mathbf{e}^{s}_{i,j,u} \mid u=1,2,\ldots,m\}$ and then, together with $\{\mathbf{h}^{(l)}_{i,j} \mid l=1,2,\ldots,L\}$, input into the phoneme memory network for processing, so as to obtain the enhanced letter encoding vectors $\{\widetilde{\mathbf{h}}^{(l)}_{i,j} \mid l=1,2,\ldots,L;\ j=1,2,\ldots,n_i\}$ of the $n_i$ letters, which are recorded as the letter encoding matrix $H_i$ of the i-th word $X_i$; wherein $\mathbf{e}^{s}_{i,j,u}$ denotes the phoneme vector of the u-th phoneme $s_{i,j,u}$, and $\widetilde{\mathbf{h}}^{(l)}_{i,j}$ denotes an enhanced j-th letter encoding vector;
Step 3.3, processing by the decoder:
The letter encoding matrix $H_i$ and the target letters output by the classifier before time t are input into the L-layer decoder to obtain the letter prediction vector $h_{i,t}$ output by the decoder at time t; when t=1, the letters output by the classifier before time t are set to be empty;
Step 3.4, processing by the classifier:
The classifier uses a fully connected layer to process the letter prediction vector $h_{i,t}$ output by the decoder at time t, so as to obtain the target letter $y_{i,t}$ predicted for the i-th word $X_i$ at the current time t;
Step 3.5, after assigning t+1 to t, returning to step 3.3 and executing sequentially until time T, thereby obtaining the predicted letter sequence $\{y_{i,1}, \ldots, y_{i,t}, \ldots, y_{i,T}\}$ of the i-th word $X_i$.
2. The transliteration method based on phoneme memory as claimed in claim 1, wherein said step 2 comprises:
Step 2.1, calculating the pointwise mutual information $\mathrm{PMI}(x_{i,j}, s_q)$ between the j-th letter $x_{i,j}$ and the q-th phoneme feature $s_q$ in the phoneme library using formula (1), so as to obtain the pointwise mutual information $\{\mathrm{PMI}(x_{i,j}, s_q) \mid 1 \le q \le M\}$ between the j-th letter $x_{i,j}$ and all M phoneme features, where M represents the number of all phoneme features in the phoneme library:

$$\mathrm{PMI}(x_{i,j}, s_q) = \log \frac{p(x_{i,j}, s_q)}{p(x_{i,j})\, p(s_q)} \tag{1}$$

In formula (1), $p(x_{i,j}, s_q)$ represents the probability that the j-th letter $x_{i,j}$ and the q-th phoneme feature $s_q$ co-occur; $p(x_{i,j})$ represents the probability that the j-th letter $x_{i,j}$ appears in the i-th word $X_i$; $p(s_q)$ represents the probability that the q-th phoneme feature $s_q$ appears in the pronunciation of the i-th word $X_i$;
Step 2.2, selecting from the pointwise mutual information $\{\mathrm{PMI}(x_{i,j}, s_q) \mid 1 \le q \le M\}$ the m phoneme features corresponding to the highest pointwise mutual information, and forming the phoneme feature set $S_{i,j} = \{s_{i,j,1}, \ldots, s_{i,j,u}, \ldots, s_{i,j,m}\}$.
3. The transliteration method based on phoneme memory according to claim 1, characterized in that said step 3.2 comprises:
Step 3.2.1, converting the u-th phoneme $s_{i,j,u}$ into the u-th phoneme vector $\mathbf{e}^{s}_{i,j,u}$ and then inputting it, together with $\mathbf{h}^{(l)}_{i,j}$, into the phoneme memory network of the l-th layer; the phoneme memory network of the l-th layer maps $\mathbf{e}^{s}_{i,j,u}$ using formula (2) and formula (3) to obtain the u-th phoneme key vector $\mathbf{k}^{(l)}_{i,j,u}$ of the l-th layer and the u-th phoneme value vector $\mathbf{v}^{(l)}_{i,j,u}$ of the l-th layer:

$$\mathbf{k}^{(l)}_{i,j,u} = \mathrm{ReLU}\bigl(\mathbf{W}^{(l)}_{K} \cdot \mathbf{e}^{s}_{i,j,u}\bigr) \tag{2}$$

$$\mathbf{v}^{(l)}_{i,j,u} = \mathrm{ReLU}\bigl(\mathbf{W}^{(l)}_{V} \cdot \mathbf{e}^{s}_{i,j,u}\bigr) \tag{3}$$

In formula (2) and formula (3), $\mathbf{W}^{(l)}_{K}$ denotes the key matrix of the l-th layer and $\mathbf{W}^{(l)}_{V}$ denotes the value matrix of the l-th layer; ReLU denotes the activation function; "·" denotes the multiplication of a matrix and a vector;
Step 3.2.2, the phoneme memory network of the l-th layer calculates the u-th phoneme weight $a^{(l)}_{i,j,u}$ of the l-th layer using formula (4):

$$a^{(l)}_{i,j,u} = \frac{\exp\bigl(\mathbf{h}^{(l)}_{i,j} \cdot \mathbf{k}^{(l)}_{i,j,u}\bigr)}{\sum_{u'=1}^{m} \exp\bigl(\mathbf{h}^{(l)}_{i,j} \cdot \mathbf{k}^{(l)}_{i,j,u'}\bigr)} \tag{4}$$

In formula (4), "·" denotes the vector inner product;
Step 3.2.3, the phoneme memory network of the l-th layer calculates the weighted average vector $\mathbf{o}^{(l)}_{i,j}$ using formula (5):

$$\mathbf{o}^{(l)}_{i,j} = \sum_{u=1}^{m} a^{(l)}_{i,j,u}\, \mathbf{v}^{(l)}_{i,j,u} \tag{5}$$

Step 3.2.4, the phoneme memory network of the l-th layer obtains the j-th letter reset vector $\mathbf{g}^{(l)}_{i,j}$ of the l-th layer using formula (6):

$$\mathbf{g}^{(l)}_{i,j} = \mathrm{sigmoid}\bigl(\mathbf{W}^{(l)}_{r,1} \cdot \mathbf{h}^{(l)}_{i,j} + \mathbf{W}^{(l)}_{r,2} \cdot \mathbf{o}^{(l)}_{i,j} + \mathbf{b}^{(l)}_{r}\bigr) \tag{6}$$

In formula (6), sigmoid denotes the activation function, $\mathbf{W}^{(l)}_{r,1}$ and $\mathbf{W}^{(l)}_{r,2}$ respectively denote the first reset matrix and the second reset matrix of the l-th layer, and $\mathbf{b}^{(l)}_{r}$ denotes the reset offset vector of the l-th layer;
Step 3.2.5, the phoneme memory network of the l-th layer obtains the enhanced j-th letter encoding vector $\widetilde{\mathbf{h}}^{(l)}_{i,j}$ of the l-th layer using formula (7), so that the L-layer phoneme memory network outputs the enhanced j-th letter encoding vectors $\{\widetilde{\mathbf{h}}^{(l)}_{i,j} \mid l=1,2,\ldots,L\}$ and thus the enhanced letter encoding vectors $\{\widetilde{\mathbf{h}}^{(l)}_{i,j} \mid l=1,2,\ldots,L;\ j=1,2,\ldots,n_i\}$ of the $n_i$ letters, which are recorded as the letter encoding matrix $H_i$ of the i-th word $X_i$:

$$\widetilde{\mathbf{h}}^{(l)}_{i,j} = \bigl(\mathbf{g}^{(l)}_{i,j} \odot \mathbf{h}^{(l)}_{i,j}\bigr) \oplus \bigl((\mathbf{1} - \mathbf{g}^{(l)}_{i,j}) \odot \mathbf{o}^{(l)}_{i,j}\bigr) \tag{7}$$

In formula (7), $\odot$ denotes the Hadamard product, $\oplus$ denotes vector concatenation, and $\mathbf{1}$ denotes a vector whose every dimension is 1.
4. An electronic device comprising a memory and a processor, wherein the memory is configured to store a program that enables the processor to perform the transliteration method of any of claims 1-3, and wherein the processor is configured to execute the program stored in the memory.
5. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the transliteration method according to any one of claims 1 to 3.
CN202211595293.3A 2022-12-13 2022-12-13 Transliteration method based on phoneme memory, electronic equipment and storage medium Active CN115662392B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211595293.3A CN115662392B (en) 2022-12-13 2022-12-13 Transliteration method based on phoneme memory, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211595293.3A CN115662392B (en) 2022-12-13 2022-12-13 Transliteration method based on phoneme memory, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115662392A true CN115662392A (en) 2023-01-31
CN115662392B CN115662392B (en) 2023-04-25

Family

ID=85019419

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211595293.3A Active CN115662392B (en) 2022-12-13 2022-12-13 Transliteration method based on phoneme memory, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115662392B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB1522851A (en) * 1974-07-17 1978-08-31 Threshold Tech Apparatus and method for recognizing words from among continuous speech
CN103020046A (en) * 2012-12-24 2013-04-03 哈尔滨工业大学 Name transliteration method on the basis of classification of name origin

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB1522851A (en) * 1974-07-17 1978-08-31 Threshold Tech Apparatus and method for recognizing words from among continuous speech
CN103020046A (en) * 2012-12-24 2013-04-03 哈尔滨工业大学 Name transliteration method on the basis of classification of name origin

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Song Yan: "A language identification method based on sample and feature selection" *
Jin Ma et al.: "A language identification system based on convolutional neural networks" *

Also Published As

Publication number Publication date
CN115662392B (en) 2023-04-25

Similar Documents

Publication Publication Date Title
CN111897949B (en) Guided text abstract generation method based on Transformer
CN111368565B (en) Text translation method, text translation device, storage medium and computer equipment
CN110059188B (en) Chinese emotion analysis method based on bidirectional time convolution network
CN111460807B (en) Sequence labeling method, device, computer equipment and storage medium
CN111859978A (en) Emotion text generation method based on deep learning
CN108460028B (en) Domain adaptation method for integrating sentence weight into neural machine translation
CN111291534A (en) Global coding method for automatic summarization of Chinese long text
CN112348911B (en) Semantic constraint-based method and system for generating fine-grained image by stacking texts
CN110362797B (en) Research report generation method and related equipment
CN110619127A (en) Mongolian Chinese machine translation method based on neural network turing machine
WO2023226292A1 (en) Method for extracting relation from text, relation extraction model, and medium
CN112561718A (en) Case microblog evaluation object emotion tendency analysis method based on BilSTM weight sharing
CN113128206A (en) Question generation method based on word importance weighting
CN114168754A (en) Relation extraction method based on syntactic dependency and fusion information
CN111401037A (en) Natural language generation method and device, electronic equipment and storage medium
WO2020040255A1 (en) Word coding device, analysis device, language model learning device, method, and program
CN113191150B (en) Multi-feature fusion Chinese medical text named entity identification method
CN109979461A (en) A kind of voice translation method and device
CN114528944B (en) Medical text coding method, device, equipment and readable storage medium
CN115662392A (en) Transliteration method based on phoneme memory, electronic equipment and storage medium
CN115270792A (en) Medical entity identification method and device
CN114580376A (en) Chinese abstract generating method based on component sentence method analysis
CN111428509B (en) Latin letter-based Uygur language processing method and system
CN112818688A (en) Text processing method, device, equipment and storage medium
CN110825869A (en) Text abstract generating method of variation generation decoder based on copying mechanism

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant