CN109101499B - Artificial intelligence voice learning method based on neural network - Google Patents

Artificial intelligence voice learning method based on neural network

Info

Publication number
CN109101499B
CN109101499B · CN201810874085.4A
Authority
CN
China
Prior art keywords
words
voice
association
chinese
foreign language
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810874085.4A
Other languages
Chinese (zh)
Other versions
CN109101499A (en)
Inventor
王大江 (Wang Dajiang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhongke Huilian Technology Co ltd
Original Assignee
Beijing Zhongke Huilian Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zhongke Huilian Technology Co ltd filed Critical Beijing Zhongke Huilian Technology Co ltd
Priority to CN201810874085.4A priority Critical patent/CN109101499B/en
Publication of CN109101499A publication Critical patent/CN109101499A/en
Application granted granted Critical
Publication of CN109101499B publication Critical patent/CN109101499B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/40Processing or translation of natural language
    • G06F40/58Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/40Processing or translation of natural language
    • G06F40/42Data-driven translation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)

Abstract

In order to further improve the efficiency and accuracy of online translation, the invention provides an artificial intelligence voice learning method based on a neural network, comprising the steps of (1) performing artificial intelligence learning of foreign language-Chinese text based on an adaptive growth neural network; and (2) performing foreign language-Chinese speech translation. By matching semantics and context against a big-data foreign language-Chinese comparison dictionary obtained through machine learning, based on a 6th-order depth probability analysis method, the invention reduces the amount of computation by more than 40% compared with prior-art methods, improving translation efficiency while ensuring translation accuracy.

Description

Artificial intelligence voice learning method based on neural network
Technical Field
The invention relates to the technical field of voice control, in particular to an artificial intelligence voice learning method based on a neural network.
Background
With the development of science and technology and the globalization of the economy, demand for online translated communication keeps growing, both in daily life and in academic exchange. Although simultaneous interpretation and portable machine translation devices are already in use, the accuracy of conventional machine translation devices and the workload of simultaneous interpreters remain problematic in professional settings such as meetings and classrooms. In particular, when one party speaks quickly, machine translation struggles to keep up, and simultaneous interpreters must ask for repetition of speech they could not follow, which makes some usage scenarios feel disjointed.
In order to meet the requirement of simultaneously improving the efficiency and accuracy of online translation, the Chinese patent application with application number CN201710203439.8 discloses a multilingual intelligent preprocessing real-time statistical machine translation system, which comprises a receiving module, a preprocessing module, a machine translation module and a post-processing module. The receiving module comprises a text language receiving module and a voice recognition result receiving module; the preprocessing module comprises a text preprocessing module and a voice recognition result preprocessing module; the machine translation module learns the translation of phrases phrase by phrase, finds corresponding translation phrases for the phrases processed by the preprocessing module, and connects them into a complete sentence; and the post-processing module performs punctuation standardization, case standardization and format standardization on the translation result so that it is closer to the expression habits of the target language, and outputs it as the final result. However, such systems only partially resolve the drawbacks of the prior art described above.
Disclosure of Invention
In order to further improve the efficiency and the accuracy in online translation, the invention provides an artificial intelligence voice learning method based on a neural network, which comprises the following steps:
(1) Performing artificial intelligence learning of foreign language-Chinese text based on an adaptive growth neural network, comprising the following steps:
(10) Establishing a word library;
(20) Establishing a voice prediction model;
(2) Performing foreign language-Chinese speech translation.
Further, the step (2) comprises:
(30) Converting the input voice into characters;
(40) Determining the translated text according to the word library and the voice prediction model.
Further, the step (10) comprises: establishing a first association between foreign language words and the words with corresponding Chinese meanings according to a dictionary, where, among the several Chinese translation words of a foreign language word, the one identified by the first sequence position in the dictionary is taken as the primary Chinese translation word, and those at subsequent sequence positions as secondary Chinese translation words.
Further, the step (20) comprises:
(201) Cutting words according to the foreign language articles to obtain foreign language words, and establishing second association of the foreign language words, the Chinese translation words and second-level words continuing from the Chinese translation words according to the Chinese translation words of the foreign language articles;
(202) Indexing the first association and the second association;
further, the step (201) comprises: machine learning is performed in an unsupervised learning manner according to the foreign language articles.
Further, the step (201) comprises: performing machine learning on the foreign language articles and their translations using a stochastic gradient descent method.
Further, the step (202) comprises:
the first association is used as a primary key, and information related to the first association appearing in the second association is indexed.
Further, the indexing, with the first association as a primary key, information related to the first association appearing from the second association includes:
(2021) Primary key information determination: in the first association, the English word Ei corresponds to the main Chinese translation word Cj; and according to the second association, the second-level words continuing after the word Cj form a set { Sm, pm }, then the word Cj is taken as a main key, wherein Pm is the probability that the word Sm appears after the Cj as the continuing second-level words, and i, j and m are natural numbers starting from 1;
(2022) Define the probability of occurrence of the word Cj:
p(S_m|C_j) = χ_gh(p_j), with m = 1, 2, 3, 4, 5, 6,
where χ_gh is defined through a distribution whose mean and whose m-th order diagonal variance matrix Ξ_m are given only as formula images in the original document (GDA0003827632110000041 through GDA0003827632110000045) and are not reproduced here;
(2023) According to the probability p(S_m|C_j), determine the degree of match of the word Cj with the context when Cj takes its current meaning:
compute the expression given as formula image GDA0003827632110000046, where p' denotes the derivative of p; then compute the quantity given as formula image GDA0003827632110000047 and test whether it is less than a first preset threshold: when it is, determine that the position represented by j in Cj conforms to the context corresponding to Ei; otherwise let j = j + 1 and jump to step (2022); if j has reached its maximum value after traversal, let j = 1 and continue to step (2024), where u and v are both natural numbers;
(2024) Correct the degree of match with the context of Sm as the successive second-level word of Cj:
compute the quantity given as formula image GDA0003827632110000051 and test whether it is less than a second preset threshold: when it is, Sm is determined to be consistent with the context as the next second-level word of Cj; otherwise let m = m + 1 and jump to step (2022); if m has reached its maximum value after traversal, let m = 1.
Further, the step (30) comprises:
(301) Carrying out linear analysis on an original voice signal to obtain a weighted cepstrum coefficient as a voice characteristic parameter;
(302) Obtaining a voice model according to the voice characteristic parameters;
(303) Matching the voice model for the voice to be recognized, determining an output probability value for each frame of voice aiming at different models by utilizing frame synchronization network search, reserving a plurality of paths in the matching process, and finally backtracking a matching result;
(304) And judging the matching result by using the state duration distribution and the optimal path probability distribution to reject the voice outside the recognition range and obtain a correct recognition result.
Further, the step (40) comprises:
Based on STT technology, the Chinese translation words are used to generate speech.
The beneficial effects of the invention include: by matching semantics and context against the big-data foreign language-Chinese comparison dictionary obtained through machine learning, based on a 6th-order depth probability analysis method, the amount of computation is reduced by more than 40% compared with prior-art methods, improving translation efficiency while ensuring translation accuracy.
Drawings
Fig. 1 shows a flow chart of the method of the invention.
Detailed Description
As shown in fig. 1, according to a preferred embodiment of the present invention, the present invention provides an artificial intelligence speech learning method based on a neural network, comprising:
(1) Performing artificial intelligence learning of foreign language-Chinese text based on an adaptive growth neural network, comprising the following steps:
(10) Establishing a word library;
(20) Establishing a voice prediction model;
(2) Performing foreign language-Chinese speech translation.
Preferably, the step (2) includes:
(30) Converting the input voice into characters;
(40) And determining the translated text according to the word library and the voice prediction model.
Preferably, the step (10) comprises: establishing a first association between foreign language words and the words with corresponding Chinese meanings according to a dictionary, where, among the several Chinese translation words of a foreign language word, the one identified by the first sequence position in the dictionary is taken as the primary Chinese translation word, and those at subsequent sequence positions as secondary Chinese translation words.
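The first association of step (10) can be sketched as a small lookup table. The function name and the sample dictionary entries below are illustrative assumptions, not part of the patent:

```python
# Sketch of the "first association" from step (10): among a foreign word's
# Chinese translation words, the dictionary's first-listed one is the primary
# translation word and the rest are secondary. Sample entries are hypothetical.
def build_first_association(dictionary):
    """dictionary: {foreign_word: [Chinese translations in dictionary order]}"""
    association = {}
    for foreign, translations in dictionary.items():
        association[foreign] = {
            "primary": translations[0],     # first sequence position
            "secondary": translations[1:],  # subsequent sequence positions
        }
    return association

sample = {"bank": ["银行", "河岸"], "run": ["跑", "经营", "运行"]}
assoc = build_first_association(sample)
```

The ordering convention (first-listed sense wins by default) is what later lets step (2023) fall back through secondary translations when the primary one does not fit the context.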
Preferably, the step (20) comprises:
(201) Cutting words according to the foreign language article to obtain foreign language words, and establishing second association of the foreign language words, the Chinese translation words and second-level words continuing behind the Chinese translation words according to the Chinese translation words of the foreign language article;
(202) Indexing the first association and the second association;
preferably, the step (201) comprises: machine learning is performed in an unsupervised learning manner according to the foreign language articles.
Preferably, the step (201) comprises: performing machine learning on the foreign language articles and their translations using a stochastic gradient descent method.
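The patent names stochastic gradient descent for step (201) without specifying a loss function or parameterization. The following is a minimal illustrative sketch under an assumed squared-error loss between fitted follow-probabilities and observed frequencies; all names and values are hypothetical:

```python
import random

# Minimal stochastic gradient descent sketch for step (201). The patent does
# not specify the loss or parameterization, so this assumes a squared-error
# loss between a fitted follow-probability and its observed frequency.
def sgd_fit(observed, lr=0.1, epochs=200, seed=0):
    """observed: {word: empirical follow probability}; returns fitted values."""
    rng = random.Random(seed)
    params = {w: rng.random() for w in observed}
    for _ in range(epochs):
        w = rng.choice(list(observed))          # one random example per step
        grad = 2.0 * (params[w] - observed[w])  # d/dx of (x - target)^2
        params[w] -= lr * grad
    return params

fitted = sgd_fit({"账户": 0.7})
```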
Preferably, said step (202) comprises:
the first association is used as a primary key, and information related to the first association appearing in the second association is indexed. The primary key is a database primary key representing the corresponding relationship between foreign language and Chinese characters.
Preferably, said taking the first association as a primary key, indexing information related to the first association appearing from the second association comprises:
(2021) Primary key information determination: in the first association, the English word Ei corresponds to the main Chinese translation word Cj; and according to the second association, the second-level words continuing after the word Cj form a set { Sm, pm }, then the word Cj is taken as a main key, wherein Pm is the probability that the word Sm appears after the Cj as the continuing second-level words, and i, j and m are natural numbers starting from 1;
(2022) Define the probability of occurrence of the word Cj:
p(S_m|C_j) = χ_gh(p_j), with m = 1, 2, 3, 4, 5, 6,
where χ_gh is defined through a distribution whose mean and whose m-th order diagonal variance matrix Ξ_m are given only as formula images in the original document (GDA0003827632110000081 through GDA0003827632110000085) and are not reproduced here;
(2023) According to the probability p(S_m|C_j), determine the degree of match of the word Cj with the context when Cj takes its current meaning:
compute the expression given as formula image GDA0003827632110000086, where p' denotes the derivative of p; then compute the quantity given as formula image GDA0003827632110000087 and test whether it is less than a first preset threshold: when it is, determine that the position represented by j in Cj conforms to the context corresponding to Ei; otherwise let j = j + 1 and jump to step (2022); if j has reached its maximum value after traversal, let j = 1 and continue to step (2024), where u and v are both natural numbers;
(2024) Correct the degree of match with the context of Sm as the successive second-level word of Cj:
compute the quantity given as formula image GDA0003827632110000091 and test whether it is less than a second preset threshold: when it is, Sm is determined to be consistent with the context as the next second-level word of Cj; otherwise let m = m + 1 and jump to step (2022); if m has reached its maximum value after traversal, let m = 1.
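The traversal logic of steps (2022) through (2024) can be outlined as follows. Because the scoring function χ_gh and both thresholds appear only as formula images in the patent, the `score` callable and `thr1` value are hypothetical stand-ins, as are all names:

```python
# Outline of the traversal in steps (2022)-(2024): walk the translation
# candidates C_j in dictionary order, score their successor words S_m, and
# accept the first C_j whose best successor fits the context tightly enough.
# `score` and `thr1` stand in for the patent's formula-image definitions.
def pick_translation(candidates, followers, score, thr1):
    """candidates: Chinese translation words C_j in dictionary order;
    followers: {C_j: {S_m: p_m}} from the second association;
    score(c, s, p): stand-in for p(S_m|C_j); thr1: first preset threshold."""
    for c in candidates:                       # traverse j = 1, 2, ...
        best = max(score(c, s, p) for s, p in followers[c].items())
        if 1.0 - best < thr1:                  # smaller statistic = better fit
            return c                           # position j fits the context
    return candidates[0]                       # traversal exhausted: j = 1

followers = {"银行": {"账户": 0.9}, "河岸": {"岸边": 0.1}}
best = pick_translation(["银行", "河岸"], followers, lambda c, s, p: p, 0.2)
```

The fallback to `candidates[0]` mirrors the patent's "let j = 1" after an exhausted traversal.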
Preferably, the step (30) comprises:
(301) Carrying out linear analysis on an original voice signal to obtain a weighted cepstrum coefficient as a voice characteristic parameter;
(302) Obtaining a voice model according to the voice characteristic parameters;
(303) Matching the voice model for the voice to be recognized, determining an output probability value for each frame of voice aiming at different models by utilizing frame synchronization network search, reserving a plurality of paths in the matching process, and finally backtracking a matching result;
(304) And judging the matching result by using the state duration distribution and the optimal path probability distribution to reject the voice outside the recognition range and obtain a correct recognition result.
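The frame-synchronous network search of step (303), which keeps several paths per frame and finally backtracks the best one, resembles a beam search over model states. The toy states, additive scores, and beam width below are illustrative assumptions, not the patent's models:

```python
# Sketch of the frame-synchronous search of step (303): score every model
# state for each frame, keep several best paths (a beam), and return the
# best full path at the end. States, scores, and beam width are toy values.
def beam_search(frames, states, trans, emit, beam=3):
    """frames: observation symbols; trans[s][t], emit[s][o]: additive scores."""
    paths = [((s,), emit[s][frames[0]]) for s in states]
    paths = sorted(paths, key=lambda sp: -sp[1])[:beam]
    for obs in frames[1:]:
        expanded = [(seq + (s,), lp + trans[seq[-1]][s] + emit[s][obs])
                    for seq, lp in paths for s in states]
        paths = sorted(expanded, key=lambda sp: -sp[1])[:beam]  # keep a few paths
    return max(paths, key=lambda sp: sp[1])[0]  # backtrack the winning path

states = ["a", "b"]
trans = {"a": {"a": 0.0, "b": 0.0}, "b": {"a": 0.0, "b": 0.0}}
emit = {"a": {"x": 0.0, "y": -5.0}, "b": {"x": -5.0, "y": 0.0}}
path = beam_search(["x", "y", "x"], states, trans, emit)
```

The rejection test of step (304) would sit on top of this: compare the winning path's score and state durations against distributions and discard out-of-range inputs.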
Preferably, the step (40) comprises:
Based on STT technology, namely Speech to Text technology, the Chinese translation words are used to generate speech.
The foregoing embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Any person skilled in the art may modify or change the above embodiments without departing from the spirit and scope of the present invention. Accordingly, all equivalent modifications or changes made by those skilled in the art without departing from the spirit and scope of the present invention shall be covered by the appended claims.

Claims (3)

1. An artificial intelligence voice learning method based on a neural network comprises the following steps:
(1) Performing artificial intelligence learning of foreign language-Chinese text based on an adaptive growth neural network, comprising the following steps:
(10) Establishing a word library;
(20) Establishing a voice prediction model;
(2) Performing foreign language-Chinese language voice translation;
the step (2) comprises the following steps:
(30) Converting the input voice into characters;
(40) Determining translation characters according to the word bank and the voice prediction model;
the step (10) comprises: establishing a first association between foreign language words and words with Chinese meanings corresponding to the foreign language words according to a dictionary, wherein translations of the Chinese words are a plurality of Chinese translation words marked by a first sequence position in the dictionary and serve as main Chinese translation words, and Chinese translation words at a later sequence position serve as secondary Chinese translation words;
the step (20) comprises:
(201) Cutting words according to the foreign language articles to obtain foreign language words, and establishing second association of the foreign language words, the Chinese translation words and second-level words continuing from the Chinese translation words according to the Chinese translation words of the foreign language articles;
(202) Indexing the first association and the second association;
the step (201) comprises: performing machine learning on foreign articles and translations thereof by adopting a random gradient descent method;
the step (202) comprises:
indexing information related to the first association appearing in the second association with the first association as a primary key;
the indexing, with the first association as a primary key, information related to the first association appearing from the second association includes:
(2021) Primary key information determination: assume that in the first association, the English word Ei corresponds to the primary Chinese translation word Cj; and according to the second association, the second-level words following the word Cj form a set {Sm, pm}; then the word Cj is taken as the primary key, where pm is the probability that the word Sm appears after Cj as a successive second-level word, and i, j and m are natural numbers starting from 1;
(2022) Define the probability of occurrence of the word Cj:
p(S_m|C_j) = χ_gh(p_j),
where χ_gh is defined through a distribution whose mean and whose m-th order diagonal variance matrix Ξ_m are given only as formula images in the original document (FDA0003860044330000021 through FDA0003860044330000025) and are not reproduced here;
(2023) According to the probability p(S_m|C_j), determine the degree of match of the word Cj with the context when Cj takes its current meaning:
compute the expression given as formula image FDA0003860044330000031, where p' denotes the derivative of p; then compute the quantity given as formula image FDA0003860044330000032 and test whether it is less than a first preset threshold: when it is, determine that the position represented by j in Cj conforms to the context corresponding to Ei; otherwise let j = j + 1 and jump to step (2022); if j has reached its maximum value after traversal, let j = 1 and continue to step (2024), where u and v are both natural numbers;
(2024) Correct the degree of match with the context of Sm as the successive second-level word of Cj:
compute the quantity given as formula image FDA0003860044330000033 and test whether it is less than a second preset threshold: when it is, Sm is determined to be consistent with the context as the next second-level word of Cj; otherwise let m = m + 1 and jump to step (2022); if m has reached its maximum value after traversal, let m = 1.
2. The method according to claim 1, wherein said step (30) comprises:
(301) Carrying out linear analysis on an original voice signal to obtain a weighted cepstrum coefficient as a voice characteristic parameter;
(302) Obtaining a voice model according to the voice characteristic parameters;
(303) Matching the voice model for the voice to be recognized, determining an output probability value for each frame of voice aiming at different models by utilizing frame synchronization network search, reserving a plurality of paths in the matching process, and finally backtracking a matching result;
(304) And judging the matching result by using the state duration distribution and the optimal path probability distribution to reject the voice outside the recognition range and obtain a correct recognition result.
3. The method according to claim 1, wherein said step (40) comprises:
speech is generated using the chinese translation words based on STT technology.
CN201810874085.4A 2018-08-02 2018-08-02 Artificial intelligence voice learning method based on neural network Active CN109101499B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810874085.4A CN109101499B (en) 2018-08-02 2018-08-02 Artificial intelligence voice learning method based on neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810874085.4A CN109101499B (en) 2018-08-02 2018-08-02 Artificial intelligence voice learning method based on neural network

Publications (2)

Publication Number Publication Date
CN109101499A CN109101499A (en) 2018-12-28
CN109101499B true CN109101499B (en) 2022-12-16

Family

ID=64848278

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810874085.4A Active CN109101499B (en) 2018-08-02 2018-08-02 Artificial intelligence voice learning method based on neural network

Country Status (1)

Country Link
CN (1) CN109101499B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6789057B1 (en) * 1997-01-07 2004-09-07 Hitachi, Ltd. Dictionary management method and apparatus
CN105183720A (en) * 2015-08-05 2015-12-23 百度在线网络技术(北京)有限公司 Machine translation method and apparatus based on RNN model
CN107102990A (en) * 2016-02-19 2017-08-29 株式会社东芝 The method and apparatus translated to voice
CN107315741A (en) * 2017-05-24 2017-11-03 清华大学 Bilingual dictionary construction method and equipment


Also Published As

Publication number Publication date
CN109101499A (en) 2018-12-28

Similar Documents

Publication Publication Date Title
CN109146610B (en) Intelligent insurance recommendation method and device and intelligent insurance robot equipment
CN108304372B (en) Entity extraction method and device, computer equipment and storage medium
CN109145276A (en) A kind of text correction method after speech-to-text based on phonetic
CN110134946B (en) Machine reading understanding method for complex data
CN112100349A (en) Multi-turn dialogue method and device, electronic equipment and storage medium
CN110197279B (en) Transformation model training method, device, equipment and storage medium
CN111709242B (en) Chinese punctuation mark adding method based on named entity recognition
CN114722839B (en) Man-machine cooperative dialogue interaction system and method
CN110909144A (en) Question-answer dialogue method and device, electronic equipment and computer readable storage medium
CN114676255A (en) Text processing method, device, equipment, storage medium and computer program product
CN111191442A (en) Similar problem generation method, device, equipment and medium
CN109033073B (en) Text inclusion recognition method and device based on vocabulary dependency triple
CN112200664A (en) Repayment prediction method based on ERNIE model and DCNN model
CN114020906A (en) Chinese medical text information matching method and system based on twin neural network
CN111984780A (en) Multi-intention recognition model training method, multi-intention recognition method and related device
CN114153971A (en) Error-containing Chinese text error correction, identification and classification equipment
CN112988970A (en) Text matching algorithm serving intelligent question-answering system
CN113886562A (en) AI resume screening method, system, equipment and storage medium
CN115935959A (en) Method for labeling low-resource glue word sequence
CN115064154A (en) Method and device for generating mixed language voice recognition model
CN113486174B (en) Model training, reading understanding method and device, electronic equipment and storage medium
CN114722822A (en) Named entity recognition method, device, equipment and computer readable storage medium
CN112949284B (en) Text semantic similarity prediction method based on Transformer model
CN112528653A (en) Short text entity identification method and system
CN112084788A (en) Automatic marking method and system for implicit emotional tendency of image captions

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20221129

Address after: 100089 305, Zone 2, Building 9, No. 8, Dongbei Wangxi Road, Haidian District, Beijing

Applicant after: BEIJING ZHONGKE HUILIAN TECHNOLOGY Co.,Ltd.

Address before: No. 16, Elbow Group, Fruit Village, Liuxi Miao Township, Yiliang County, Zhaotong City, Yunnan Province 657600

Applicant before: Wang Dajiang

GR01 Patent grant
GR01 Patent grant