CN109101499B - Artificial intelligence voice learning method based on neural network - Google Patents
- Publication number
- CN109101499B (application CN201810874085.4A)
- Authority
- CN
- China
- Prior art keywords
- words
- voice
- association
- chinese
- foreign language
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/42—Data-driven translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/088—Non-supervised learning, e.g. competitive learning
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
Abstract
To further improve the efficiency and accuracy of online translation, the invention provides an artificial intelligence voice learning method based on a neural network, comprising the steps of (1) performing artificial intelligence learning of foreign language-Chinese text using an adaptive growing neural network; and (2) performing foreign language-to-Chinese voice translation. Based on a 6th-order deep probability analysis method, the invention matches semantics and context using a big-data foreign language-Chinese comparison dictionary obtained through machine learning, reducing the amount of computation by more than 40% compared with prior-art methods and improving translation efficiency while ensuring translation accuracy.
Description
Technical Field
The invention relates to the technical field of voice control, in particular to an artificial intelligence voice learning method based on a neural network.
Background
With the development of science and technology and the globalization of the economy, demand for online translation has grown steadily, both in daily life and in academic exchange. Although simultaneous interpretation and portable machine-translation devices are already in use, the accuracy of conventional machine-translation devices and the workload of simultaneous interpreters remain problematic in professional settings such as conferences and classrooms. In particular, when one party speaks quickly, machine translation struggles to keep up, and interpreters must ask for repetition of passages they missed, making the experience unsmooth in some scenarios.
To address the need to improve both the efficiency and the accuracy of online translation, Chinese patent application CN201710203439.8 discloses a multilingual intelligent-preprocessing real-time statistical machine translation system comprising a receiving module, a preprocessing module, a machine translation module, and a post-processing module. The receiving module comprises a text receiving module and a speech-recognition-result receiving module; the preprocessing module comprises a text preprocessing module and a speech-recognition-result preprocessing module; the machine translation module learns phrase-by-phrase translations, finds corresponding translation phrases for the phrases output by the preprocessing module, and connects them into complete sentences; and the post-processing module normalizes punctuation, case, and format so that the translation result better matches the expression habits of the target language before outputting it as the final result. However, such systems resolve the above drawbacks of the prior art only to a limited extent.
Disclosure of Invention
In order to further improve the efficiency and the accuracy in online translation, the invention provides an artificial intelligence voice learning method based on a neural network, which comprises the following steps:
(1) Performing artificial intelligence learning of foreign language-Chinese text using an adaptive growing neural network, which comprises the following steps:
(10) Establishing a word library;
(20) Establishing a voice prediction model;
(2) Performing foreign language-to-Chinese voice translation.
Further, the step (2) comprises:
(30) Converting the input voice into characters;
(40) And determining the translated text according to the word library and the voice prediction model.
Further, the step (10) comprises: establishing a first association between foreign words and words with Chinese meanings corresponding to the foreign words according to a dictionary, wherein the translations of the Chinese words are a plurality of Chinese translation words identified by a first sequence position in the dictionary as primary Chinese translation words, and the Chinese translation words at the subsequent sequence positions as secondary Chinese translation words.
Further, the step (20) comprises:
(201) Segmenting foreign language articles into foreign words, and establishing, according to the Chinese translations of the foreign language articles, a second association among the foreign words, the Chinese translation words, and the secondary words that follow the Chinese translation words;
(202) Indexing the first association and the second association;
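A minimal sketch of the second association of step (201), under the assumption that word segmentation has already been performed: count which word follows each Chinese translation word in the translated articles and normalize the counts into the successor probabilities Pm. Names and data are illustrative:

```python
from collections import Counter, defaultdict

def build_second_association(segmented_sentences):
    """segmented_sentences: list of word lists, e.g. [["我", "去", "银行"], ...].

    Returns {Cj: {Sm: Pm}}, where Pm is the empirical probability that
    word Sm follows word Cj (the set {Sm, Pm} of step 2021).
    """
    counts = defaultdict(Counter)
    for words in segmented_sentences:
        # count each adjacent (Cj, Sm) pair
        for cj, sm in zip(words, words[1:]):
            counts[cj][sm] += 1
    assoc = {}
    for cj, succ in counts.items():
        total = sum(succ.values())
        assoc[cj] = {sm: n / total for sm, n in succ.items()}
    return assoc

sentences = [["我", "去", "银行"], ["他", "去", "学校"]]
second = build_second_association(sentences)
print(second["去"])  # {'银行': 0.5, '学校': 0.5}
```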
further, the step (201) comprises: machine learning is performed in an unsupervised learning manner according to the foreign language articles.
Further, the step (201) comprises: and (3) performing machine learning on the foreign language articles and the translations thereof by adopting a random gradient descent method.
Further, the step (202) comprises:
the first association is used as a primary key, and information related to the first association appearing in the second association is indexed.
Further, indexing, with the first association as the primary key, the information related to the first association that appears in the second association comprises:
(2021) Determining the primary key information: in the first association, the English word Ei corresponds to the primary Chinese translation word Cj; and, according to the second association, the secondary words following the word Cj form a set {Sm, Pm}; the word Cj is then taken as the primary key, where Pm is the probability that the word Sm appears after Cj as the succeeding secondary word, and i, j and m are natural numbers starting from 1;
(2022) Defining the probability of occurrence of the word Cj:
p(Sm | Cj) = χgh(pj),
where m = 1, 2, 3, 4, 5, 6, and χgh is distributed with the given mean and with Ξm, an m-th order diagonal matrix, as the variance;
(2023) Determining, according to the probability p(Sm | Cj), the degree of match of the word Cj with the context when Cj takes its current meaning: computing whether the value is less than a first preset threshold; if it is, determining that the position represented by j in Cj conforms to the context corresponding to Ei; otherwise letting j = j + 1 and jumping to step (2022); if j has reached its maximum after traversal, letting j = 1 and continuing to step (2024), where u and v are both natural numbers;
(2024) Correcting the degree of match with the context of Sm as the secondary word following Cj: computing whether the value is less than a second preset threshold; if it is, determining that Sm conforms to the context as the secondary word following Cj; otherwise letting m = m + 1 and jumping to step (2022); if m has traversed to its maximum, letting m = 1.
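The sense-selection loop of steps (2021)-(2024) can be sketched as follows. The patent's exact score χgh survives only as an image in the original, so a placeholder score based on the successor probability p(Sm | Cj) is assumed here; only the control flow (try each sense j against a first threshold, then fall back and test the primary sense's successors against a second threshold) follows the text:

```python
def choose_sense(senses, successors, next_word, t1=0.5, t2=0.5):
    """senses: candidate Chinese translations Cj, primary first.
    successors: {Cj: {Sm: Pm}} from the second association.
    next_word: the word observed after the translation slot.
    Returns the sense judged to fit the context, defaulting to the primary.
    """
    # step 2023: traverse candidate senses j, primary first
    for cj in senses:
        p = successors.get(cj, {}).get(next_word, 0.0)
        if 1.0 - p < t1:                 # placeholder for the image-only score
            return cj
    # step 2024: no sense passed; correct against the primary sense's successors
    cj = senses[0]
    for sm, pm in successors.get(cj, {}).items():
        if 1.0 - pm < t2:
            return cj
    return cj

senses = ["银行", "河岸"]                      # candidate translations of "bank"
succ = {"银行": {"存款": 0.9}, "河岸": {}}
print(choose_sense(senses, succ, "存款"))      # 银行
```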
Further, the step (30) comprises:
(301) Carrying out linear analysis on the original voice signal to obtain weighted cepstral coefficients as the voice feature parameters;
(302) Obtaining a voice model from the voice feature parameters;
(303) Matching the voice models against the voice to be recognized: using a frame-synchronous network search, determining an output probability value for each frame of voice under each model, retaining multiple paths during matching, and finally backtracking to obtain the matching result;
(304) Judging the matching result using the state-duration distribution and the optimal-path probability distribution, so as to reject voice outside the recognition range and obtain a correct recognition result.
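Steps (301)-(304) describe a recognize-and-reject pipeline. The toy sketch below scores each frame against per-word Gaussian models and rejects low-scoring input; a real implementation would use cepstral feature vectors and HMM state sequences with frame-synchronous search and path backtracking as the patent describes, so the models and threshold here are illustrative assumptions:

```python
import math

def log_gauss(x, mean, var):
    """Log-likelihood of a scalar frame feature under a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def recognise(frames, models, reject_threshold=-10.0):
    """frames: one feature value per frame; models: {word: (mean, var)}.

    Scores every model frame-by-frame (stand-in for frame-synchronous
    search), picks the best-scoring word, and rejects out-of-range
    input per step (304). Returns None on rejection.
    """
    scores = {
        word: sum(log_gauss(f, mean, var) for f in frames) / len(frames)
        for word, (mean, var) in models.items()
    }
    best = max(scores, key=scores.get)
    if scores[best] < reject_threshold:   # step 304: rejection
        return None
    return best

models = {"yes": (1.0, 0.25), "no": (-1.0, 0.25)}
print(recognise([0.9, 1.1, 1.0], models))  # yes
```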
Further, the step (40) comprises:
based on STT technology, the chinese translation words are used to generate speech.
The beneficial effects of the invention include: based on a 6th-order deep probability analysis method, the big-data foreign language-Chinese comparison dictionary obtained through machine learning is matched against semantics and context, reducing the amount of computation by more than 40% compared with prior-art methods and improving translation efficiency while ensuring translation accuracy.
Drawings
Fig. 1 shows a flow chart of the method of the invention.
Detailed Description
As shown in Fig. 1, according to a preferred embodiment, the present invention provides an artificial intelligence voice learning method based on a neural network, comprising:
(1) Performing artificial intelligence learning of foreign language-Chinese text using an adaptive growing neural network, which comprises the following steps:
(10) Establishing a word library;
(20) Establishing a voice prediction model;
(2) Performing foreign language-to-Chinese voice translation.
Preferably, the step (2) includes:
(30) Converting the input voice into characters;
(40) And determining the translated text according to the word library and the voice prediction model.
Preferably, the step (10) comprises: establishing, according to a dictionary, a first association between each foreign word and the Chinese words expressing its meanings, wherein, among the several Chinese translation words of a foreign word, the one listed at the first sequence position in the dictionary is taken as the primary Chinese translation word and those at later sequence positions as secondary Chinese translation words.
Preferably, the step (20) comprises:
(201) Segmenting foreign language articles into foreign words, and establishing, according to the Chinese translations of the foreign language articles, a second association among the foreign words, the Chinese translation words, and the secondary words that follow the Chinese translation words;
(202) Indexing the first association and the second association;
preferably, the step (201) comprises: machine learning is performed in an unsupervised learning manner according to the foreign language articles.
Preferably, the step (201) comprises: and (4) performing machine learning on the foreign language articles and the translations thereof by adopting a random gradient descent method.
Preferably, said step (202) comprises:
the first association is used as a primary key, and information related to the first association appearing in the second association is indexed. The primary key is a database primary key representing the corresponding relationship between foreign language and Chinese characters.
Preferably, indexing, with the first association as the primary key, the information related to the first association that appears in the second association comprises:
(2021) Determining the primary key information: in the first association, the English word Ei corresponds to the primary Chinese translation word Cj; and, according to the second association, the secondary words following the word Cj form a set {Sm, Pm}; the word Cj is then taken as the primary key, where Pm is the probability that the word Sm appears after Cj as the succeeding secondary word, and i, j and m are natural numbers starting from 1;
(2022) Defining the probability of occurrence of the word Cj:
p(Sm | Cj) = χgh(pj),
where m = 1, 2, 3, 4, 5, 6, and χgh is distributed with the given mean and with Ξm, an m-th order diagonal matrix, as the variance;
(2023) Determining, according to the probability p(Sm | Cj), the degree of match of the word Cj with the context when Cj takes its current meaning: computing whether the value is less than a first preset threshold; if it is, determining that the position represented by j in Cj conforms to the context corresponding to Ei; otherwise letting j = j + 1 and jumping to step (2022); if j has reached its maximum after traversal, letting j = 1 and continuing to step (2024), where u and v are both natural numbers;
(2024) Correcting the degree of match with the context of Sm as the secondary word following Cj: computing whether the value is less than a second preset threshold; if it is, determining that Sm conforms to the context as the secondary word following Cj; otherwise letting m = m + 1 and jumping to step (2022); if m has traversed to its maximum, letting m = 1.
Preferably, the step (30) comprises:
(301) Carrying out linear analysis on the original voice signal to obtain weighted cepstral coefficients as the voice feature parameters;
(302) Obtaining a voice model from the voice feature parameters;
(303) Matching the voice models against the voice to be recognized: using a frame-synchronous network search, determining an output probability value for each frame of voice under each model, retaining multiple paths during matching, and finally backtracking to obtain the matching result;
(304) Judging the matching result using the state-duration distribution and the optimal-path probability distribution, so as to reject voice outside the recognition range and obtain a correct recognition result.
Preferably, the step (40) comprises:
based on STT technology, namely Speech to Text technology, chinese translation words are utilized to generate voice.
The foregoing embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Any person skilled in the art may modify or change the above embodiments without departing from the spirit and scope of the present invention. Accordingly, all equivalent modifications or changes made by those skilled in the art without departing from the spirit and scope of the present invention shall be covered by the appended claims.
Claims (3)
1. An artificial intelligence voice learning method based on a neural network comprises the following steps:
(1) Performing artificial intelligence learning of foreign language-Chinese text using an adaptive growing neural network, which comprises the following steps:
(10) Establishing a word library;
(20) Establishing a voice prediction model;
(2) Performing foreign language-to-Chinese voice translation;
the step (2) comprises the following steps:
(30) Converting the input voice into characters;
(40) Determining translation characters according to the word bank and the voice prediction model;
the step (10) comprises: establishing a first association between foreign language words and words with Chinese meanings corresponding to the foreign language words according to a dictionary, wherein translations of the Chinese words are a plurality of Chinese translation words marked by a first sequence position in the dictionary and serve as main Chinese translation words, and Chinese translation words at a later sequence position serve as secondary Chinese translation words;
the step (20) comprises:
(201) Segmenting foreign language articles into foreign words, and establishing, according to the Chinese translations of the foreign language articles, a second association among the foreign words, the Chinese translation words, and the secondary words that follow the Chinese translation words;
(202) Indexing the first association and the second association;
the step (201) comprises: performing machine learning on foreign articles and translations thereof by adopting a random gradient descent method;
the step (202) comprises:
indexing information related to the first association appearing in the second association with the first association as a primary key;
the indexing, with the first association as a primary key, information related to the first association appearing from the second association includes:
(2021) Primary key information determination: assume that in the first association, the English term Ei corresponds to the primary Chinese translation term C j (ii) a And according to the second association, word C j The subsequent second-level words form a set { S } m ,p m }, the word C is used j Is a primary bond, wherein p m Is the word S m Appears at C j Then, as the probability of the successive secondary words, i, j and m are natural numbers starting from 1;
(2022) Definition of term C j Probability of occurrence:
p(S m |C j )=χ gh (p j ),
wherein
And is provided withTo be composed ofIs mean value, xi m Is an m-th order diagonal matrix of variance,
(2023) According to the probability p (S) m |C j ) Determining word C j Matching degree with context when taking current meaning:
computingWhether the value is less than a first preset threshold value: when less than, determine C j If j reaches the maximum value after traversal, j =1 and the step (2024) is continued, and u and v are both natural numbers;
(2024) Correction of S m As C j The matching degree of the continuous secondary words and the context:
2. The method according to claim 1, wherein said step (30) comprises:
(301) Carrying out linear analysis on the original voice signal to obtain weighted cepstral coefficients as the voice feature parameters;
(302) Obtaining a voice model from the voice feature parameters;
(303) Matching the voice models against the voice to be recognized: using a frame-synchronous network search, determining an output probability value for each frame of voice under each model, retaining multiple paths during matching, and finally backtracking to obtain the matching result;
(304) Judging the matching result using the state-duration distribution and the optimal-path probability distribution, so as to reject voice outside the recognition range and obtain a correct recognition result.
3. The method according to claim 1, wherein said step (40) comprises:
Speech is generated from the Chinese translation words based on STT technology.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810874085.4A CN109101499B (en) | 2018-08-02 | 2018-08-02 | Artificial intelligence voice learning method based on neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109101499A CN109101499A (en) | 2018-12-28 |
CN109101499B (en) | 2022-12-16
Family
ID=64848278
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810874085.4A Active CN109101499B (en) | 2018-08-02 | 2018-08-02 | Artificial intelligence voice learning method based on neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109101499B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6789057B1 (en) * | 1997-01-07 | 2004-09-07 | Hitachi, Ltd. | Dictionary management method and apparatus |
CN105183720A (en) * | 2015-08-05 | 2015-12-23 | 百度在线网络技术(北京)有限公司 | Machine translation method and apparatus based on RNN model |
CN107102990A (en) * | 2016-02-19 | 2017-08-29 | 株式会社东芝 | The method and apparatus translated to voice |
CN107315741A (en) * | 2017-05-24 | 2017-11-03 | 清华大学 | Bilingual dictionary construction method and equipment |
Also Published As
Publication number | Publication date |
---|---|
CN109101499A (en) | 2018-12-28 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
TA01 | Transfer of patent application right | Effective date of registration: 20221129. Address after: 100089 305, Zone 2, Building 9, No. 8, Dongbei Wangxi Road, Haidian District, Beijing. Applicant after: BEIJING ZHONGKE HUILIAN TECHNOLOGY Co.,Ltd. Address before: No. 16, Elbow Group, Fruit Village, Liuxi Miao Township, Yiliang County, Zhaotong City, Yunnan Province 657600. Applicant before: Wang Dajiang |
GR01 | Patent grant | |