CN109101499B - Artificial intelligence voice learning method based on neural network - Google Patents
- Publication number
- CN109101499B (application CN201810874085.4A)
- Authority
- CN
- China
- Prior art keywords
- words
- voice
- association
- chinese
- foreign language
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/42—Data-driven translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
Abstract
To further improve the efficiency and accuracy of online translation, the invention provides an artificial intelligence voice learning method based on a neural network, comprising the steps of (1) performing artificial intelligence learning of foreign language-Chinese text using an adaptive growing neural network; and (2) performing foreign language-to-Chinese voice translation. Based on a 6th-order deep probability analysis method, the invention matches semantics and context using a big-data foreign language-Chinese comparison dictionary obtained through machine learning, reducing the amount of computation by more than 40% compared with prior-art methods and improving translation efficiency while ensuring translation accuracy.
Description
Technical Field
The invention relates to the technical field of voice control, in particular to an artificial intelligence voice learning method based on a neural network.
Background
With the development of science and technology and the globalization of the economy, demand for online translation has grown steadily, both in daily life and in academic exchange. Although simultaneous interpretation and portable machine-translation devices are already in use, the accuracy of conventional machine-translation devices and the workload of simultaneous interpreters remain problematic in professional settings such as conferences and classrooms. In particular, when one party speaks quickly, machine translation struggles to keep up, and interpreters must ask for repetition of passages they missed, making the experience unsmooth in some scenarios.
To address the need to improve both the efficiency and the accuracy of online translation, Chinese patent application CN201710203439.8 discloses a multilingual intelligent-preprocessing real-time statistical machine translation system comprising a receiving module, a preprocessing module, a machine translation module, and a post-processing module. The receiving module comprises a text receiving module and a speech-recognition-result receiving module; the preprocessing module comprises a text preprocessing module and a speech-recognition-result preprocessing module; the machine translation module learns phrase-by-phrase translations, finds corresponding translation phrases for the phrases output by the preprocessing module, and connects them into complete sentences; and the post-processing module normalizes punctuation, case, and format so that the translation result better matches the expression habits of the target language before outputting it as the final result. However, such systems resolve the above drawbacks of the prior art only to a limited extent.
Disclosure of Invention
In order to further improve the efficiency and the accuracy in online translation, the invention provides an artificial intelligence voice learning method based on a neural network, which comprises the following steps:
(1) Performing artificial intelligence learning of foreign language-Chinese text using an adaptive growing neural network, which comprises the following steps:
(10) Establishing a word library;
(20) Establishing a voice prediction model;
(2) Performing foreign language-to-Chinese voice translation.
Further, the step (2) comprises:
(30) Converting the input voice into characters;
(40) And determining the translated text according to the word library and the voice prediction model.
Further, the step (10) comprises: establishing a first association between foreign words and words with Chinese meanings corresponding to the foreign words according to a dictionary, wherein the translations of the Chinese words are a plurality of Chinese translation words identified by a first sequence position in the dictionary as primary Chinese translation words, and the Chinese translation words at the subsequent sequence positions as secondary Chinese translation words.
Further, the step (20) comprises:
(201) Segmenting foreign language articles into foreign words, and establishing, according to the Chinese translations of the foreign language articles, a second association among the foreign words, the Chinese translation words, and the secondary words that follow the Chinese translation words;
(202) Indexing the first association and the second association;
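A minimal sketch of the second association of step (201), under the assumption that word segmentation has already been performed: count which word follows each Chinese translation word in the translated articles and normalize the counts into the successor probabilities Pm. Names and data are illustrative:

```python
from collections import Counter, defaultdict

def build_second_association(segmented_sentences):
    """segmented_sentences: list of word lists, e.g. [["我", "去", "银行"], ...].

    Returns {Cj: {Sm: Pm}}, where Pm is the empirical probability that
    word Sm follows word Cj (the set {Sm, Pm} of step 2021).
    """
    counts = defaultdict(Counter)
    for words in segmented_sentences:
        # count each adjacent (Cj, Sm) pair
        for cj, sm in zip(words, words[1:]):
            counts[cj][sm] += 1
    assoc = {}
    for cj, succ in counts.items():
        total = sum(succ.values())
        assoc[cj] = {sm: n / total for sm, n in succ.items()}
    return assoc

sentences = [["我", "去", "银行"], ["他", "去", "学校"]]
second = build_second_association(sentences)
print(second["去"])  # {'银行': 0.5, '学校': 0.5}
```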
further, the step (201) comprises: machine learning is performed in an unsupervised learning manner according to the foreign language articles.
Further, the step (201) comprises: and (3) performing machine learning on the foreign language articles and the translations thereof by adopting a random gradient descent method.
Further, the step (202) comprises:
the first association is used as a primary key, and information related to the first association appearing in the second association is indexed.
Further, indexing, with the first association as the primary key, the information related to the first association that appears in the second association comprises:
(2021) Determining the primary key information: in the first association, the English word Ei corresponds to the primary Chinese translation word Cj; and, according to the second association, the secondary words following the word Cj form a set {Sm, Pm}; the word Cj is then taken as the primary key, where Pm is the probability that the word Sm appears after Cj as the succeeding secondary word, and i, j and m are natural numbers starting from 1;
(2022) Defining the probability of occurrence of the word Cj:
p(Sm | Cj) = χgh(pj),
where m = 1, 2, 3, 4, 5, 6, and χgh is distributed with the given mean and with Ξm, an m-th order diagonal matrix, as the variance;
(2023) Determining, according to the probability p(Sm | Cj), the degree of match of the word Cj with the context when Cj takes its current meaning: computing whether the value is less than a first preset threshold; if it is, determining that the position represented by j in Cj conforms to the context corresponding to Ei; otherwise letting j = j + 1 and jumping to step (2022); if j has reached its maximum after traversal, letting j = 1 and continuing to step (2024), where u and v are both natural numbers;
(2024) Correcting the degree of match with the context of Sm as the secondary word following Cj: computing whether the value is less than a second preset threshold; if it is, determining that Sm conforms to the context as the secondary word following Cj; otherwise letting m = m + 1 and jumping to step (2022); if m has traversed to its maximum, letting m = 1.
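The sense-selection loop of steps (2021)-(2024) can be sketched as follows. The patent's exact score χgh survives only as an image in the original, so a placeholder score based on the successor probability p(Sm | Cj) is assumed here; only the control flow (try each sense j against a first threshold, then fall back and test the primary sense's successors against a second threshold) follows the text:

```python
def choose_sense(senses, successors, next_word, t1=0.5, t2=0.5):
    """senses: candidate Chinese translations Cj, primary first.
    successors: {Cj: {Sm: Pm}} from the second association.
    next_word: the word observed after the translation slot.
    Returns the sense judged to fit the context, defaulting to the primary.
    """
    # step 2023: traverse candidate senses j, primary first
    for cj in senses:
        p = successors.get(cj, {}).get(next_word, 0.0)
        if 1.0 - p < t1:                 # placeholder for the image-only score
            return cj
    # step 2024: no sense passed; correct against the primary sense's successors
    cj = senses[0]
    for sm, pm in successors.get(cj, {}).items():
        if 1.0 - pm < t2:
            return cj
    return cj

senses = ["银行", "河岸"]                      # candidate translations of "bank"
succ = {"银行": {"存款": 0.9}, "河岸": {}}
print(choose_sense(senses, succ, "存款"))      # 银行
```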
Further, the step (30) comprises:
(301) Carrying out linear analysis on the original voice signal to obtain weighted cepstral coefficients as the voice feature parameters;
(302) Obtaining a voice model from the voice feature parameters;
(303) Matching the voice models against the voice to be recognized: using a frame-synchronous network search, determining an output probability value for each frame of voice under each model, retaining multiple paths during matching, and finally backtracking to obtain the matching result;
(304) Judging the matching result using the state-duration distribution and the optimal-path probability distribution, so as to reject voice outside the recognition range and obtain a correct recognition result.
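Steps (301)-(304) describe a recognize-and-reject pipeline. The toy sketch below scores each frame against per-word Gaussian models and rejects low-scoring input; a real implementation would use cepstral feature vectors and HMM state sequences with frame-synchronous search and path backtracking as the patent describes, so the models and threshold here are illustrative assumptions:

```python
import math

def log_gauss(x, mean, var):
    """Log-likelihood of a scalar frame feature under a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def recognise(frames, models, reject_threshold=-10.0):
    """frames: one feature value per frame; models: {word: (mean, var)}.

    Scores every model frame-by-frame (stand-in for frame-synchronous
    search), picks the best-scoring word, and rejects out-of-range
    input per step (304). Returns None on rejection.
    """
    scores = {
        word: sum(log_gauss(f, mean, var) for f in frames) / len(frames)
        for word, (mean, var) in models.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] < reject_threshold:   # step 304: rejection
        return None
    return best

models = {"yes": (1.0, 0.25), "no": (-1.0, 0.25)}
print(recognise([0.9, 1.1, 1.0], models))  # yes
```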
Further, the step (40) comprises:
based on STT technology, the chinese translation words are used to generate speech.
The beneficial effects of the invention include: based on a 6th-order deep probability analysis method, the big-data foreign language-Chinese comparison dictionary obtained through machine learning is matched against semantics and context, reducing the amount of computation by more than 40% compared with prior-art methods and improving translation efficiency while ensuring translation accuracy.
Drawings
Fig. 1 shows a flow chart of the method of the invention.
Detailed Description
As shown in Fig. 1, according to a preferred embodiment, the present invention provides an artificial intelligence voice learning method based on a neural network, comprising:
(1) Performing artificial intelligence learning of foreign language-Chinese text using an adaptive growing neural network, which comprises the following steps:
(10) Establishing a word library;
(20) Establishing a voice prediction model;
(2) Performing foreign language-to-Chinese voice translation.
Preferably, the step (2) includes:
(30) Converting the input voice into characters;
(40) And determining the translated text according to the word library and the voice prediction model.
Preferably, the step (10) comprises: establishing, according to a dictionary, a first association between each foreign word and the Chinese words expressing its meanings, wherein, among the several Chinese translation words of a foreign word, the one listed at the first sequence position in the dictionary is taken as the primary Chinese translation word and those at later sequence positions as secondary Chinese translation words.
Preferably, the step (20) comprises:
(201) Segmenting foreign language articles into foreign words, and establishing, according to the Chinese translations of the foreign language articles, a second association among the foreign words, the Chinese translation words, and the secondary words that follow the Chinese translation words;
(202) Indexing the first association and the second association;
preferably, the step (201) comprises: machine learning is performed in an unsupervised learning manner according to the foreign language articles.
Preferably, the step (201) comprises: and (4) performing machine learning on the foreign language articles and the translations thereof by adopting a random gradient descent method.
Preferably, said step (202) comprises:
the first association is used as a primary key, and information related to the first association appearing in the second association is indexed. The primary key is a database primary key representing the corresponding relationship between foreign language and Chinese characters.
Preferably, indexing, with the first association as the primary key, the information related to the first association that appears in the second association comprises:
(2021) Determining the primary key information: in the first association, the English word Ei corresponds to the primary Chinese translation word Cj; and, according to the second association, the secondary words following the word Cj form a set {Sm, Pm}; the word Cj is then taken as the primary key, where Pm is the probability that the word Sm appears after Cj as the succeeding secondary word, and i, j and m are natural numbers starting from 1;
(2022) Defining the probability of occurrence of the word Cj:
p(Sm | Cj) = χgh(pj),
where m = 1, 2, 3, 4, 5, 6, and χgh is distributed with the given mean and with Ξm, an m-th order diagonal matrix, as the variance;
(2023) Determining, according to the probability p(Sm | Cj), the degree of match of the word Cj with the context when Cj takes its current meaning: computing whether the value is less than a first preset threshold; if it is, determining that the position represented by j in Cj conforms to the context corresponding to Ei; otherwise letting j = j + 1 and jumping to step (2022); if j has reached its maximum after traversal, letting j = 1 and continuing to step (2024), where u and v are both natural numbers;
(2024) Correcting the degree of match with the context of Sm as the secondary word following Cj: computing whether the value is less than a second preset threshold; if it is, determining that Sm conforms to the context as the secondary word following Cj; otherwise letting m = m + 1 and jumping to step (2022); if m has traversed to its maximum, letting m = 1.
Preferably, the step (30) comprises:
(301) Carrying out linear analysis on the original voice signal to obtain weighted cepstral coefficients as the voice feature parameters;
(302) Obtaining a voice model from the voice feature parameters;
(303) Matching the voice models against the voice to be recognized: using a frame-synchronous network search, determining an output probability value for each frame of voice under each model, retaining multiple paths during matching, and finally backtracking to obtain the matching result;
(304) Judging the matching result using the state-duration distribution and the optimal-path probability distribution, so as to reject voice outside the recognition range and obtain a correct recognition result.
Preferably, the step (40) comprises:
based on STT technology, namely Speech to Text technology, chinese translation words are utilized to generate voice.
The foregoing embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Any person skilled in the art may modify or change the above embodiments without departing from the spirit and scope of the present invention. Accordingly, all equivalent modifications or changes made by those skilled in the art without departing from the spirit and scope of the present invention shall be covered by the appended claims.
Claims (3)
1. An artificial intelligence voice learning method based on a neural network comprises the following steps:
(1) Performing artificial intelligence learning of foreign language-Chinese text using an adaptive growing neural network, which comprises the following steps:
(10) Establishing a word library;
(20) Establishing a voice prediction model;
(2) Performing foreign language-to-Chinese voice translation;
the step (2) comprises the following steps:
(30) Converting the input voice into characters;
(40) Determining translation characters according to the word bank and the voice prediction model;
the step (10) comprises: establishing a first association between foreign language words and words with Chinese meanings corresponding to the foreign language words according to a dictionary, wherein translations of the Chinese words are a plurality of Chinese translation words marked by a first sequence position in the dictionary and serve as main Chinese translation words, and Chinese translation words at a later sequence position serve as secondary Chinese translation words;
the step (20) comprises:
(201) Segmenting foreign language articles into foreign words, and establishing, according to the Chinese translations of the foreign language articles, a second association among the foreign words, the Chinese translation words, and the secondary words that follow the Chinese translation words;
(202) Indexing the first association and the second association;
the step (201) comprises: performing machine learning on foreign articles and translations thereof by adopting a random gradient descent method;
the step (202) comprises:
indexing information related to the first association appearing in the second association with the first association as a primary key;
the indexing, with the first association as a primary key, information related to the first association appearing from the second association includes:
(2021) Primary key information determination: assume that in the first association, the English term Ei corresponds to the primary Chinese translation term C j (ii) a And according to the second association, word C j The subsequent second-level words form a set { S } m ,p m }, the word C is used j Is a primary bond, wherein p m Is the word S m Appears at C j Then, as the probability of the successive secondary words, i, j and m are natural numbers starting from 1;
(2022) Definition of term C j Probability of occurrence:
p(S m |C j )=χ gh (p j ),
wherein
And is provided withTo be composed ofIs mean value, xi m Is an m-th order diagonal matrix of variance,
(2023) According to the probability p (S) m |C j ) Determining word C j Matching degree with context when taking current meaning:
computingWhether the value is less than a first preset threshold value: when less than, determine C j If j reaches the maximum value after traversal, j =1 and the step (2024) is continued, and u and v are both natural numbers;
(2024) Correction of S m As C j The matching degree of the continuous secondary words and the context:
2. The method according to claim 1, wherein said step (30) comprises:
(301) Carrying out linear analysis on the original voice signal to obtain weighted cepstral coefficients as the voice feature parameters;
(302) Obtaining a voice model from the voice feature parameters;
(303) Matching the voice models against the voice to be recognized: using a frame-synchronous network search, determining an output probability value for each frame of voice under each model, retaining multiple paths during matching, and finally backtracking to obtain the matching result;
(304) Judging the matching result using the state-duration distribution and the optimal-path probability distribution, so as to reject voice outside the recognition range and obtain a correct recognition result.
3. The method according to claim 1, wherein said step (40) comprises:
Speech is generated from the Chinese translation words based on STT technology.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810874085.4A CN109101499B (en) | 2018-08-02 | 2018-08-02 | Artificial intelligence voice learning method based on neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109101499A CN109101499A (en) | 2018-12-28 |
CN109101499B (en) | 2022-12-16
Family
ID=64848278
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810874085.4A Active CN109101499B (en) | 2018-08-02 | 2018-08-02 | Artificial intelligence voice learning method based on neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109101499B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6789057B1 (en) * | 1997-01-07 | 2004-09-07 | Hitachi, Ltd. | Dictionary management method and apparatus |
CN105183720A (en) * | 2015-08-05 | 2015-12-23 | 百度在线网络技术(北京)有限公司 | Machine translation method and apparatus based on RNN model |
CN107102990A (en) * | 2016-02-19 | 2017-08-29 | 株式会社东芝 | The method and apparatus translated to voice |
CN107315741A (en) * | 2017-05-24 | 2017-11-03 | 清华大学 | Bilingual dictionary construction method and equipment |
Also Published As
Publication number | Publication date |
---|---|
CN109101499A (en) | 2018-12-28 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
TA01 | Transfer of patent application right | Effective date of registration: 20221129. Address after: 100089 305, Zone 2, Building 9, No. 8, Dongbei Wangxi Road, Haidian District, Beijing. Applicant after: BEIJING ZHONGKE HUILIAN TECHNOLOGY Co.,Ltd. Address before: No. 16, Elbow Group, Fruit Village, Liuxi Miao Township, Yiliang County, Zhaotong City, Yunnan Province 657600. Applicant before: Wang Dajiang |
GR01 | Patent grant | |