CN110516265A

CN110516265A - Single-speaker recognition real-time translation system based on intelligent voice

Info

Publication number: CN110516265A
Application number: CN201910819189.XA
Authority: CN
Inventors: 张磊
Original assignee: Qingdao Guli Internet Technology Co Ltd
Current assignee: Qingdao Guli Internet Technology Co Ltd
Priority date: 2019-08-31
Filing date: 2019-08-31
Publication date: 2019-11-29

Abstract

The invention discloses a single-speaker recognition real-time translation system based on intelligent voice. It relates to the communications field and comprises a sound acquisition module, a sound processing module, an archive writing module and an output module. The sound acquisition module collects and transmits the voice signal. The sound processing module receives and processes the voice signal: depending on whether a vibration sensor senses vibration of the user's throat, it establishes a user voice library and a contrast voice library, performs feature recognition on the user's voiceprint to obtain the user's voice signal, and compares it with the contrast voice library to further reject environmental noise and other people's voices, obtaining the text of what the user said. The archive writing module translates the text into the target language and stores it as a manuscript in meeting-minutes format. The output module displays or plays the manuscript. The invention accurately recognizes the voice of a single user, ignores the voices of other speakers and filters out environmental noise, so that single-speaker recognition and real-time translation are more accurate and more effective.

Description

Single-speaker recognition real-time translation system based on intelligent voice

Technical field

The present invention relates to the communications field, and in particular to a single-speaker recognition real-time translation system based on intelligent voice.

Background technique

With the development of science and technology, international exchanges have become increasingly varied and the volume of information has grown sharply. Information barriers caused by differences in language increasingly affect the efficiency of communication. Human interpreters are needed to translate the speech of participants in real time, which is not only costly in labor but also prone to interruptions while the interpreter thinks. In the prior art, machine translation is used to translate speech in meetings in real time. However, the sound sources in a meeting room are complex and noisy, and when sound is captured directly with a microphone it is impossible to determine accurately whether the captured sound is the user's own speech. The captured sound is therefore mixed with many unrelated voices, which causes speech recognition and translation errors and hinders real-time translation. Processing the sound information directly, without cleaning and noise reduction, makes recognition errors even more likely.

Summary of the invention

To overcome the above drawbacks of the prior art, the present invention provides a single-speaker recognition real-time translation system based on intelligent voice. The technical solution adopted by the present invention to solve the technical problem is a single-speaker recognition real-time translation system based on intelligent voice, comprising a sound acquisition module, a sound processing module, an archive writing module and an output module:

S1. The sound acquisition module collects the voice signal and sends it to the sound processing module;

S2. The sound processing module receives the voice signal from the sound acquisition module, identifies the user's voice and language type, establishes a user voice library and a contrast voice library, rejects noise and other people's voices, processes the voice signal that needs to be translated, obtains the text corresponding to what the user said, and sends the text information to the archive writing module;

S3. The archive writing module translates the text into the target language and stores it as a manuscript in meeting-minutes format;

S4. The output module presents the manuscript on a display or plays it through a speech synthesis player.
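The hand-off between the four modules in S1 to S4 can be sketched in Python. This is a minimal illustration, not the patented implementation: the `Pipeline` class, its field names and the stand-in `recognize` and `translate` callables are all hypothetical, since the text does not specify any programming interface.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Pipeline:
    """Hypothetical wiring of modules S1-S4; every name here is illustrative."""
    recognize: Callable[[bytes], str]   # S2: voice signal -> source-language text
    translate: Callable[[str], str]     # S3: source text -> target-language text
    minutes: List[str] = field(default_factory=list)  # S3: stored manuscript

    def on_audio(self, audio: bytes) -> str:
        """Receive one voice signal from the acquisition module (S1)."""
        text = self.recognize(audio)        # S2: sound processing module
        translated = self.translate(text)   # S3: archive writing module translates
        self.minutes.append(translated)     # S3: store in the minutes manuscript
        return translated                   # S4: handed to display or speech synthesis


# Stand-in callables so the sketch runs without real recognition or translation.
pipeline = Pipeline(
    recognize=lambda audio: audio.decode("utf-8"),
    translate=lambda text: text.upper(),
)
shown = pipeline.on_audio(b"hello")
```

Passing the recognizer and translator in as callables keeps the sketch independent of any particular speech recognition or machine translation back end.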

In the above single-speaker recognition real-time translation system based on intelligent voice, the sound acquisition module is a miniature headset comprising an earphone inserted into the ear, a microphone and an external vibration sensor that touches the throat.

In the above single-speaker recognition real-time translation system based on intelligent voice, the sound processing module has a voiceprint recognition unit that performs feature recognition on the user's voiceprint.

In the above single-speaker recognition real-time translation system based on intelligent voice, the miniature headset is connected to a control device via Bluetooth or a network, and the control device is connected to and controlled by a computer.

In the above single-speaker recognition real-time translation system based on intelligent voice, the noise includes coughing, sneezing, wind noise and page-turning sounds.

In the above single-speaker recognition real-time translation system based on intelligent voice, the workflow of the system is as follows:

A. Preparation stage: the user, in a relatively quiet environment, wears the miniature headset and speaks a trial passage. The external vibration sensor of the miniature headset, which touches the throat, senses the vibration and records its intensity. The miniature headset passes the collected voice signal to the sound recognition module, which identifies the user's voice and language type; the voiceprint recognition unit performs feature recognition on the user's voiceprint and establishes the user voice library;

B. In formal use, while the wearer is silent, the vibration sensor does not sense the throat vibration that occurs when the user speaks. During this time the miniature headset collects the noise in the surrounding environment and the voices of people other than the user, and the contrast voice library is established by the sound recognition module;

C. When the wearer begins to speak, the external vibration sensor of the miniature headset senses throat vibration of sufficient intensity, which triggers the sound processing module to decide which voice library the voice signal should be stored in. From the voice signal sent by the sound acquisition module, the sound processing module extracts the audio that matches the user's voiceprint features and stores it in the user voice library. It then compares the user voice library with the contrast voice library to further reject and filter noise, obtaining the voice signal that needs to be translated;

D. The sound processing module performs framing and feature extraction on the voice signal that needs to be translated, using linear prediction cepstral coefficients (LPCC) to turn each frame of sound into a multidimensional vector containing acoustic information, which yields the audio data. The audio data after feature extraction is converted to text by means of an acoustic model, a dictionary and a language model, producing the text corresponding to what the user said, and the text information is sent to the archive writing module;

E. The archive writing module regularizes the text into well-formed sentences, uploads the input text units and sequences to the cloud, translates them into the target language required by the user, converts the result into a manuscript in meeting-minutes format and stores it;

F. The output module, as needed, presents the manuscript on a display or plays it through a speech synthesis player.
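The framing and feature extraction of step D can be sketched as follows. This is an illustrative, pure-Python example under stated assumptions: the text names linear prediction cepstral coefficients but gives no frame length, hop size, window or prediction order, so the 400-sample frame, 160-sample hop, Hamming window and order 12 used here are assumptions, as are all function names. The predictor coefficients are computed with the standard autocorrelation method and Levinson-Durbin recursion, then converted to cepstral coefficients with the usual recursion c_n = a_n + sum over k from 1 to n-1 of (k/n) c_k a_(n-k).

```python
import math
from typing import List, Sequence


def split_frames(signal: Sequence[float], frame_len: int = 400,
                 hop: int = 160) -> List[List[float]]:
    """Cut a signal into overlapping frames and apply a Hamming window."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]
    n_frames = max(0, 1 + (len(signal) - frame_len) // hop)
    return [[signal[t * hop + i] * window[i] for i in range(frame_len)]
            for t in range(n_frames)]


def lpc(frame: Sequence[float], order: int) -> List[float]:
    """Predictor coefficients a_1..a_p (autocorrelation + Levinson-Durbin)."""
    r = [sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
         for lag in range(order + 1)]
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 1e-12:  # silent or perfectly predictable frame: stop early
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]


def lpc_to_cepstrum(a: Sequence[float]) -> List[float]:
    """Convert predictor coefficients to cepstral coefficients c_1..c_p."""
    p = len(a)
    c = [0.0] * p
    for n in range(1, p + 1):
        acc = a[n - 1]
        for k in range(1, n):
            acc += (k / n) * c[k - 1] * a[n - k - 1]
        c[n - 1] = acc
    return c


def lpcc_features(signal: Sequence[float], order: int = 12) -> List[List[float]]:
    """One LPCC vector per frame: the multidimensional vectors of step D."""
    return [lpc_to_cepstrum(lpc(frame, order)) for frame in split_frames(signal)]
```

At a 16 kHz sampling rate (also an assumption), these defaults correspond to 25 ms frames with a 10 ms hop, a common choice in speech front ends. The resulting vectors would then be passed to the acoustic model, dictionary and language model, which are outside the scope of this sketch.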

The beneficial effects of the present invention are as follows. The vibration of the throat (Adam's apple) is used to trigger the sound processing module's decision on how to handle the voice signal. When no throat vibration is sensed, the collected voice signal is stored in the contrast voice library for comparison. When throat vibration is sensed, the voiceprint recognition unit extracts the user's voice from the collected voice signal and establishes the user voice library, so that the voice of the single speaker can be recognized more accurately. Comparing the user voice library with the contrast voice library makes it easier to ignore other people's voices and to filter out environmental noise thoroughly. Single-speaker recognition and real-time translation are thus more accurate and more effective.
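The vibration-triggered routing described above can be sketched in Python. This is a minimal illustration with hypothetical names (`VoiceRouter`, `vibration_threshold`, `matches_voiceprint`). The text says only that a vibration of "sufficient intensity" triggers the decision, so the numeric threshold is an assumption, as is discarding audio that is captured during vibration but does not match the user's voiceprint. The voiceprint comparison itself is represented by a boolean supplied by the caller.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VoiceRouter:
    """Hypothetical router deciding which voice library a frame belongs to."""
    vibration_threshold: float  # assumed to be calibrated in the preparation stage
    user_library: List[bytes] = field(default_factory=list)
    contrast_library: List[bytes] = field(default_factory=list)

    def route(self, frame: bytes, vibration: float, matches_voiceprint: bool) -> str:
        # No throat vibration: the wearer is silent, so the frame holds only
        # ambient noise and other people's voices.
        if vibration < self.vibration_threshold:
            self.contrast_library.append(frame)
            return "contrast"
        # Throat vibration sensed: keep only audio matching the user's voiceprint.
        if matches_voiceprint:
            self.user_library.append(frame)
            return "user"
        # Assumption: non-matching audio captured while the user speaks is dropped.
        return "discarded"


router = VoiceRouter(vibration_threshold=0.3)
labels = [
    router.route(b"background", vibration=0.05, matches_voiceprint=False),
    router.route(b"speech", vibration=0.80, matches_voiceprint=True),
    router.route(b"overlap", vibration=0.80, matches_voiceprint=False),
]
```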

Detailed description of the invention

The present invention is further explained below with reference to the drawings and embodiments.

Fig. 1 is a schematic diagram of the present invention;

Fig. 2 is a schematic diagram of the sound collection of the present invention.

Specific embodiment

To illustrate the technical solution of the present invention more clearly, the invention is further described below with reference to the drawings. Evidently, the drawings described below show only one embodiment of the present invention; other embodiments obtained by those of ordinary skill in the art from these drawings and embodiments without creative effort fall within the protection scope of the present invention.

A single-speaker recognition real-time translation system based on intelligent voice comprises a sound acquisition module, a sound processing module, an archive writing module and an output module:

In detail, the sound acquisition module is a miniature headset comprising an earphone inserted into the ear, a microphone and an external vibration sensor that touches the throat. The sound processing module has a voiceprint recognition unit that performs feature recognition on the user's voiceprint. The miniature headset is connected to a control device via Bluetooth or a network, and the control device is connected to and controlled by a computer. The noise includes coughing, sneezing, wind noise and page-turning sounds. The workflow of the system is as follows:

F. The output module, as needed, presents the manuscript on a display or plays it through a speech synthesis player.
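How recognized text might be regularized into sentences, stored as a minutes manuscript and rendered for the display can be sketched in Python. All names here (`MinutesEntry`, `regularize`, `render_minutes`) are hypothetical, and the regularization rules (collapse whitespace, capitalize the first letter, add a closing period) are assumptions suited to English output; the text does not define the minutes format or the regularization procedure.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class MinutesEntry:
    """One line of the minutes manuscript (hypothetical layout)."""
    timestamp: datetime
    speaker: str
    text: str


def regularize(text: str) -> str:
    """Collapse whitespace, capitalize the first letter, end with punctuation."""
    cleaned = " ".join(text.split())
    if not cleaned:
        return ""
    cleaned = cleaned[0].upper() + cleaned[1:]
    if cleaned[-1] not in ".!?":
        cleaned += "."
    return cleaned


def render_minutes(entries: List[MinutesEntry]) -> str:
    """Render the manuscript as plain text, one timestamped line per entry."""
    return "\n".join(
        f"[{e.timestamp:%H:%M:%S}] {e.speaker}: {e.text}" for e in entries
    )


entries = [
    MinutesEntry(datetime(2019, 8, 31, 9, 30, 0), "User",
                 regularize("  welcome   to the meeting ")),
]
manuscript = render_minutes(entries)
```

The same rendered string could be handed either to a display or to a speech synthesis player; which one is used is the "as needed" choice left open in step F.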

The above embodiments are merely exemplary embodiments of the present invention and are not intended to limit it; the protection scope of the present invention is defined by the claims. Those skilled in the art may make various modifications or equivalent replacements to the present invention within its spirit and protection scope, and such modifications or equivalent replacements shall also be regarded as falling within the protection scope of the present invention.

Claims

1. A single-speaker recognition real-time translation system based on intelligent voice, comprising a sound acquisition module, a sound processing module, an archive writing module and an output module, characterized in that:

S2. The sound processing module receives the voice signal from the sound acquisition module, identifies the user's voice and language type, establishes a user voice library and a contrast voice library, rejects noise and other people's voices, processes the voice signal that needs to be translated, obtains the text corresponding to what the user said, and sends the text information to the archive writing module;

S3. The archive writing module translates the obtained text into the target language and stores it as a manuscript in meeting-minutes format;

S4. The output module presents the manuscript on a display or plays it through a speech synthesis player.

2. The single-speaker recognition real-time translation system based on intelligent voice according to claim 1, characterized in that the sound acquisition module is a miniature headset comprising an earphone inserted into the ear, a microphone and an external vibration sensor that touches the throat.

3. The single-speaker recognition real-time translation system based on intelligent voice according to claim 1, characterized in that the sound processing module has a voiceprint recognition unit that performs feature recognition on the user's voiceprint.

4. The single-speaker recognition real-time translation system based on intelligent voice according to claim 2, characterized in that the miniature headset is connected to a control device via Bluetooth or a network, and the control device is connected to and controlled by a computer.

5. The single-speaker recognition real-time translation system based on intelligent voice according to claim 1, characterized in that the noise includes coughing, sneezing, wind noise and page-turning sounds.

6. The single-speaker recognition real-time translation system based on intelligent voice according to claim 1, characterized in that the workflow of the system is as follows:

A. Preparation stage: the user, in a relatively quiet environment, wears the miniature headset and speaks a trial passage; the external vibration sensor of the miniature headset, which touches the throat, senses the throat vibration and records its intensity; the miniature headset passes the collected voice signal to the sound recognition module, which identifies the user's voice and language type; the voiceprint recognition unit performs feature recognition on the user's voiceprint and establishes the user voice library;

B. In formal use, while the wearer is silent, the vibration sensor does not sense the throat vibration that occurs when the user speaks; during this time the miniature headset collects the noise in the surrounding environment and the voices of people other than the user, and the contrast voice library is established by the sound recognition module;

C. When the wearer begins to speak, the external vibration sensor of the miniature headset senses throat vibration of sufficient intensity, which triggers the sound processing module to decide which voice library the voice signal should be stored in; from the voice signal sent by the sound acquisition module, the sound processing module extracts the audio that matches the user's voiceprint features and stores it in the user voice library, and compares the user voice library with the contrast voice library to further reject and filter noise, obtaining the voice signal that needs to be translated;

D. The sound processing module performs framing and feature extraction on the voice signal that needs to be translated, using linear prediction cepstral coefficients to turn each frame of sound into a multidimensional vector containing acoustic information, which yields the audio data; the audio data after feature extraction is converted to text by means of an acoustic model, a dictionary and a language model, producing the text corresponding to what the user said, and the text information is sent to the archive writing module;

E. The archive writing module regularizes the text into well-formed sentences, uploads the input text units and sequences to the cloud, translates them into the target language required by the user, converts the result into a manuscript in meeting-minutes format and stores it;

F. The output module, as needed, presents the manuscript on a display or plays it through a speech synthesis player.