CN110516265A - A kind of single identification real-time translation system based on intelligent sound - Google Patents

A kind of single identification real-time translation system based on intelligent sound

Info

Publication number
CN110516265A
Authority
CN
China
Prior art keywords
sound
user
module
voice signal
system based
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910819189.XA
Other languages
Chinese (zh)
Inventor
张磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Guli Internet Technology Co Ltd
Original Assignee
Qingdao Guli Internet Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Guli Internet Technology Co Ltd
Priority to CN201910819189.XA
Publication of CN110516265A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/005 Language recognition
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G10L17/14 Use of phonemic categorisation or speech recognition prior to speaker recognition or verification
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/22 Interactive procedures; Man-machine interfaces

Abstract

The invention discloses a single identification real-time translation system based on intelligent sound. It relates to the communications field and comprises a sound acquisition module, a sound processing module, an archive writing module, and a putting module. The sound acquisition module collects sound signals and transmits them. The sound processing module receives and processes the sound signals: depending on whether a vibration sensor detects vibration of the user's throat, it builds a user sound bank and a contrast sound library; it performs feature recognition on the user's voiceprint to obtain the user's voice signal, compares that signal against the contrast sound library to further reject environmental noise and other people's voices, and obtains the text of what the user said. The archive writing module translates the text into the target language and stores it as a manuscript in meeting-minutes format. The putting module displays the manuscript or plays it back. The invention accurately recognizes the speech of a single user, ignores other people's speech, and filters out environmental noise, so that single-speaker recognition and real-time translation are more accurate and more effective.

Description

A kind of single identification real-time translation system based on intelligent sound
Technical field
The present invention relates to the communications field, and in particular to a single identification real-time translation system based on intelligent sound.
Background art
With the development of science and technology, international exchange has become increasingly varied and the volume of information has grown sharply. Information barriers caused by differences in language increasingly reduce the efficiency of communication. Relying on human interpreters to translate participants' speech in real time not only incurs high labor costs, but the interpretation is also often interrupted while the interpreter pauses to think. The prior art uses machine translation to translate speech in meetings in real time. However, the sound sources in a meeting room are complex and noisy, and when sound is captured directly with a microphone the system cannot accurately determine whether a given sound is the user's speech. The captured sound is therefore mixed with a large number of unrelated voices, which causes recognition and translation errors and hinders real-time translation. Processing the sound information directly, without cleaning and noise reduction, makes recognition errors even more likely.
Summary of the invention
To overcome the above drawbacks of the prior art, the present invention provides a single identification real-time translation system based on intelligent sound. The technical solution adopted by the present invention to solve the technical problem is: a single identification real-time translation system based on intelligent sound, comprising a sound acquisition module, a sound processing module, an archive writing module, and a putting module:
S1. The sound acquisition module collects sound signals and sends them to the sound processing module;
S2. The sound processing module receives the sound signals from the sound acquisition module, identifies the user's voice and language type, builds the user sound bank and the contrast sound library, rejects noise and other people's voices, processes the sound signal that needs to be translated, obtains the text corresponding to what the user said, and sends the text information to the archive writing module;
S3. The archive writing module translates the text into the target language and stores it as a manuscript in meeting-minutes format;
S4. The putting module presents the manuscript on a display or plays it back using a speech synthesis player.
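The S1 to S4 data flow can be sketched as a small pipeline. This is only an illustrative sketch under stated assumptions: the class name `TranslationPipeline`, the callables `recognize`, `translate`, and `present`, and the stub lambdas are hypothetical placeholders, not components named by the patent.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TranslationPipeline:
    """Hypothetical wiring of the four modules (S1 to S4); all names are assumptions."""
    recognize: Callable[[List[float]], str]  # S2: sound signal -> source-language text
    translate: Callable[[str], str]          # S3: source text -> target-language text
    present: Callable[[str], None]           # S4: show on a display or synthesize speech
    minutes: List[str] = field(default_factory=list)  # S3: stored manuscript lines

    def on_sound(self, samples: List[float]) -> str:
        """S1 hands the captured samples to the rest of the chain."""
        text = self.recognize(samples)
        translated = self.translate(text)
        self.minutes.append(translated)  # archive before presenting
        self.present(translated)
        return translated


# Usage with stand-in stubs; a real recognizer and translator would replace these.
shown: List[str] = []
pipeline = TranslationPipeline(
    recognize=lambda samples: "hello",
    translate=lambda text: text.upper(),
    present=shown.append,
)
result = pipeline.on_sound([0.0, 0.1, -0.1])
```

Passing the stages in as callables keeps the sketch independent of any particular speech recognition or translation engine, which the text does not specify.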
In the above single identification real-time translation system based on intelligent sound, the sound acquisition module is a miniature headset comprising an earphone inserted into the ear, a microphone, and an external vibration sensor that touches the throat.
In the above single identification real-time translation system based on intelligent sound, the sound processing module has a voiceprint recognition unit that performs feature recognition on the user's voiceprint.
In the above single identification real-time translation system based on intelligent sound, the miniature headset is connected to a control device via Bluetooth or a network, and the control device is connected to and controlled by a computer.
In the above single identification real-time translation system based on intelligent sound, the noise includes coughing, sneezing, wind noise, and page-turning sounds.
In the above single identification real-time translation system based on intelligent sound, the system workflow is as follows:
A. Preparation stage: the user, in a relatively quiet environment, puts on the miniature headset and speaks a test utterance. The headset's external vibration sensor, which touches the throat, senses the vibration and records its intensity. The headset passes the collected sound signal to the sound recognition module, which identifies the user's voice and language type; the voiceprint recognition unit performs feature recognition on the user's voiceprint and builds the user sound bank;
B. Formal use: while the wearer is silent, the vibration sensor does not sense the throat vibration that occurs when the user speaks. During this time the miniature headset collects the ambient noise and the voices of people other than the user, and the contrast sound library is built by the sound recognition module;
C. When the wearer starts speaking, the headset's external vibration sensor senses throat vibration of sufficient intensity produced by the user's speech and triggers the sound processing module to decide which sound library the sound signal should be stored in. From the sound signal sent by the sound acquisition module, the sound processing module extracts the audio that matches the user's voiceprint features and stores it in the user sound bank. It then compares the user sound bank with the contrast sound library and further rejects and filters noise to obtain the sound signal that needs to be translated;
D. The sound processing module performs framing and feature extraction on the sound signal that needs to be translated, using linear prediction cepstral coefficients to turn each frame of sound into a multi-dimensional vector containing acoustic information, which yields the audio data. An acoustic model, a dictionary, and a language model are then used to produce text output from the feature-extracted audio data; the text corresponding to what the user said is obtained and sent to the archive writing module;
E. The archive writing module normalizes the text into well-formed sentences, uploads the input character units and sequences to the cloud, translates them into the target language the user requires, and converts the result into a manuscript in meeting-minutes format, which is stored;
F. The putting module, as required, presents the manuscript on a display or plays it back using a speech synthesis player.
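The vibration-gated routing in steps A to C can be sketched as follows. This is a minimal sketch, not the patented implementation: the function names, the idea of setting the trigger level to a fraction of the mean calibration intensity, and the sample values are all assumptions, since the text only says that the vibration intensity is recorded and that vibration of "sufficient intensity" triggers the decision.

```python
from typing import List, Tuple

Frame = List[float]


def calibrate_threshold(calibration_intensities: List[float], fraction: float = 0.5) -> float:
    """Step A: derive a trigger level from intensities recorded during the test utterance.

    Using a fraction of the mean is an assumption made for this sketch.
    """
    if not calibration_intensities:
        raise ValueError("at least one calibration intensity is required")
    mean = sum(calibration_intensities) / len(calibration_intensities)
    return fraction * mean


def route_frames(
    frames: List[Tuple[Frame, float]], threshold: float
) -> Tuple[List[Frame], List[Frame]]:
    """Steps B and C: split (audio frame, vibration intensity) pairs by throat vibration."""
    user_candidates: List[Frame] = []   # captured while the wearer is speaking
    contrast_library: List[Frame] = []  # ambient noise and other people's voices
    for audio, vibration in frames:
        if vibration >= threshold:
            user_candidates.append(audio)
        else:
            contrast_library.append(audio)
    return user_candidates, contrast_library


threshold = calibrate_threshold([0.8, 1.0, 1.2])
speaking, background = route_frames(
    [([0.1, 0.2], 0.9), ([0.3, 0.1], 0.05), ([0.0, 0.4], 0.6)], threshold
)
```

Frames routed to `speaking` are only candidates: per step C, they would still pass through voiceprint matching before being stored as the user's voice.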
The beneficial effects of the invention are as follows. The present invention uses vibration of the Adam's apple to trigger the sound processing module's decision on how to process the sound signal. Sound collected while no Adam's apple vibration is sensed is stored in the contrast sound library for comparison. From sound collected while Adam's apple vibration is sensed, the voiceprint recognition unit extracts the user's voice and builds the user sound bank, so the voice of the single speaker can be recognized more accurately. Comparing the user sound bank with the contrast sound library makes it easier to ignore other people's speech and to filter out environmental noise thoroughly, so that single-speaker recognition and real-time translation are more accurate and more effective.
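One way to picture the comparison between the user's sound bank and the contrast library is a similarity test on voiceprint feature vectors. The text does not say how the comparison is computed, so everything here is an assumption: the use of cosine similarity, the centroid of each library as its representative, the `margin` parameter, and the toy two-dimensional vectors.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors: List[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def is_user_voice(
    segment: Sequence[float],
    user_bank: List[Sequence[float]],
    contrast_library: List[Sequence[float]],
    margin: float = 0.0,
) -> bool:
    """Keep a segment only if it is closer to the user's bank than to the contrast library."""
    user_score = cosine_similarity(segment, centroid(user_bank))
    other_score = cosine_similarity(segment, centroid(contrast_library))
    return user_score - other_score > margin


# Toy vectors standing in for voiceprint features.
user_bank = [[1.0, 0.0], [0.9, 0.1]]
contrast_library = [[0.0, 1.0], [0.1, 0.9]]
keep = is_user_voice([0.8, 0.2], user_bank, contrast_library)
drop = is_user_voice([0.2, 0.8], user_bank, contrast_library)
```

A production system would more likely use a trained speaker-verification model; the sketch only shows the shape of a two-library decision.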
Brief description of the drawings
The present invention is further described below with reference to the drawings and embodiments.
Fig. 1 is a schematic diagram of the present invention;
Fig. 2 is a schematic diagram of sound collection in the present invention.
Detailed description of the embodiments
To explain the technical solution of the present invention more clearly, the present invention is further described below with reference to the drawings. Evidently, the drawings described below show only one embodiment of the present invention; other embodiments that a person of ordinary skill in the art obtains from these drawings and embodiments without creative effort fall within the protection scope of the present invention.
A single identification real-time translation system based on intelligent sound comprises a sound acquisition module, a sound processing module, an archive writing module, and a putting module:
S1. The sound acquisition module collects sound signals and sends them to the sound processing module;
S2. The sound processing module receives the sound signals from the sound acquisition module, identifies the user's voice and language type, builds the user sound bank and the contrast sound library, rejects noise and other people's voices, processes the sound signal that needs to be translated, obtains the text corresponding to what the user said, and sends the text information to the archive writing module;
S3. The archive writing module translates the text into the target language and stores it as a manuscript in meeting-minutes format;
S4. The putting module presents the manuscript on a display or plays it back using a speech synthesis player.
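The work of the archive writing module in S3 (tidying recognized text, translating it, and storing it as minutes) might look roughly like the sketch below. The function names, the timestamped line format, and the `translate` callable are hypothetical; the text does not define a minutes layout or name a translation service, so a stub stands in for the translator.

```python
from typing import Callable, List, Tuple


def normalize_sentence(raw: str) -> str:
    """Collapse whitespace, capitalize the first letter, and ensure end punctuation."""
    text = " ".join(raw.split())
    if not text:
        return ""
    text = text[0].upper() + text[1:]
    if text[-1] not in ".?!":
        text += "."
    return text


def build_minutes(
    utterances: List[Tuple[str, str]], translate: Callable[[str], str]
) -> List[str]:
    """Turn (timestamp, recognized text) pairs into translated minutes lines."""
    lines: List[str] = []
    for timestamp, raw in utterances:
        sentence = normalize_sentence(raw)
        if sentence:  # skip utterances that contain only whitespace
            lines.append(f"[{timestamp}] {translate(sentence)}")
    return lines


# Stub translator that just tags the sentence; a real engine would replace it.
minutes = build_minutes(
    [("09:00:01", "  good   morning everyone"), ("09:00:05", "   ")],
    translate=lambda sentence: "ZH:" + sentence,
)
```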
In detail, the sound acquisition module is a miniature headset comprising an earphone inserted into the ear, a microphone, and an external vibration sensor that touches the throat. The sound processing module has a voiceprint recognition unit that performs feature recognition on the user's voiceprint. The miniature headset is connected to a control device via Bluetooth or a network, and the control device is connected to and controlled by a computer. The noise includes coughing, sneezing, wind noise, and page-turning sounds. The system workflow is as follows:
A. Preparation stage: the user, in a relatively quiet environment, puts on the miniature headset and speaks a test utterance. The headset's external vibration sensor, which touches the throat, senses the vibration and records its intensity. The headset passes the collected sound signal to the sound recognition module, which identifies the user's voice and language type; the voiceprint recognition unit performs feature recognition on the user's voiceprint and builds the user sound bank;
B. Formal use: while the wearer is silent, the vibration sensor does not sense the throat vibration that occurs when the user speaks. During this time the miniature headset collects the ambient noise and the voices of people other than the user, and the contrast sound library is built by the sound recognition module;
C. When the wearer starts speaking, the headset's external vibration sensor senses throat vibration of sufficient intensity produced by the user's speech and triggers the sound processing module to decide which sound library the sound signal should be stored in. From the sound signal sent by the sound acquisition module, the sound processing module extracts the audio that matches the user's voiceprint features and stores it in the user sound bank. It then compares the user sound bank with the contrast sound library and further rejects and filters noise to obtain the sound signal that needs to be translated;
D. The sound processing module performs framing and feature extraction on the sound signal that needs to be translated, using linear prediction cepstral coefficients to turn each frame of sound into a multi-dimensional vector containing acoustic information, which yields the audio data. An acoustic model, a dictionary, and a language model are then used to produce text output from the feature-extracted audio data; the text corresponding to what the user said is obtained and sent to the archive writing module;
E. The archive writing module normalizes the text into well-formed sentences, uploads the input character units and sequences to the cloud, translates them into the target language the user requires, and converts the result into a manuscript in meeting-minutes format, which is stored;
F. The putting module, as required, presents the manuscript on a display or plays it back using a speech synthesis player.
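The framing and linear prediction cepstral coefficient (LPCC) extraction in step D can be illustrated with a standard textbook procedure: split the signal into overlapping frames, compute each frame's autocorrelation, solve for the linear prediction coefficients with the Levinson-Durbin recursion, and convert them to cepstral coefficients. The frame length, hop size, Hamming window, prediction order, and number of coefficients below are assumptions chosen for illustration; the text specifies none of them, and the acoustic model, dictionary, and language model are not sketched.

```python
import math
from typing import List


def frame_signal(signal: List[float], frame_len: int, hop: int) -> List[List[float]]:
    """Split a signal into overlapping frames; an incomplete trailing frame is dropped."""
    return [signal[i:i + frame_len] for i in range(0, len(signal) - frame_len + 1, hop)]


def autocorrelation(frame: List[float], max_lag: int) -> List[float]:
    """r[lag] = sum over t of frame[t] * frame[t - lag], for lag = 0..max_lag."""
    n = len(frame)
    return [sum(frame[t] * frame[t - lag] for t in range(lag, n)) for lag in range(max_lag + 1)]


def levinson_durbin(r: List[float], order: int) -> List[float]:
    """Predictor coefficients a_1..a_p such that s[n] is approximated by sum_k a_k * s[n-k]."""
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate frame: stop early
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]


def lpc_to_cepstrum(a: List[float], n_ceps: int) -> List[float]:
    """Cepstral coefficients c_1..c_n of the all-pole model 1 / (1 - sum_k a_k z^-k)."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] (the gain term) is not computed
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


def lpcc_features(signal: List[float], frame_len: int = 240, hop: int = 80,
                  order: int = 10, n_ceps: int = 12) -> List[List[float]]:
    """One LPCC vector per frame; all default sizes are illustrative assumptions."""
    features = []
    for frame in frame_signal(signal, frame_len, hop):
        # Hamming window to reduce edge effects before the autocorrelation.
        windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * t / (frame_len - 1)))
                    for t, x in enumerate(frame)]
        r = autocorrelation(windowed, order)
        features.append(lpc_to_cepstrum(levinson_durbin(r, order), n_ceps))
    return features
```

Each returned vector is the "multi-dimensional vector containing acoustic information" for one frame; in a full recognizer these vectors would be the input to the acoustic model.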
The above embodiments are merely exemplary embodiments of the present invention and are not intended to limit it; the protection scope of the present invention is defined by the claims. Those skilled in the art may make various modifications or equivalent replacements to the present invention within its spirit and protection scope, and such modifications or equivalent replacements shall also be regarded as falling within the protection scope of the present invention.

Claims (6)

1. A single identification real-time translation system based on intelligent sound, comprising a sound acquisition module, a sound processing module, an archive writing module, and a putting module, characterized in that:
S1. The sound acquisition module collects sound signals and sends them to the sound processing module;
S2. The sound processing module receives the sound signals from the sound acquisition module, identifies the user's voice and language type, builds the user sound bank and the contrast sound library, rejects noise and other people's voices, processes the sound signal that needs to be translated, obtains the text corresponding to what the user said, and sends the text information to the archive writing module;
S3. The archive writing module translates the acquired text into the target language and stores it as a manuscript in meeting-minutes format;
S4. The putting module presents the manuscript on a display or plays it back using a speech synthesis player.
2. The single identification real-time translation system based on intelligent sound according to claim 1, characterized in that the sound acquisition module is a miniature headset comprising an earphone inserted into the ear, a microphone, and an external vibration sensor that touches the throat.
3. The single identification real-time translation system based on intelligent sound according to claim 1, characterized in that the sound processing module has a voiceprint recognition unit that performs feature recognition on the user's voiceprint.
4. The single identification real-time translation system based on intelligent sound according to claim 2, characterized in that the miniature headset is connected to a control device via Bluetooth or a network, and the control device is connected to and controlled by a computer.
5. The single identification real-time translation system based on intelligent sound according to claim 1, characterized in that the noise includes coughing, sneezing, wind noise, and page-turning sounds.
6. The single identification real-time translation system based on intelligent sound according to claim 1, characterized in that the system workflow is as follows:
A. Preparation stage: the user, in a relatively quiet environment, puts on the miniature headset and speaks a test utterance; the headset's external vibration sensor, which touches the throat, senses the throat vibration and records its intensity; the headset passes the collected sound signal to the sound recognition module, which identifies the user's voice and language type; and the voiceprint recognition unit performs feature recognition on the user's voiceprint and builds the user sound bank;
B. Formal use: while the wearer is silent, the vibration sensor does not sense the throat vibration that occurs when the user speaks; during this time the miniature headset collects the ambient noise and the voices of people other than the user, and the contrast sound library is built by the sound recognition module;
C. When the wearer starts speaking, the headset's external vibration sensor senses throat vibration of sufficient intensity produced by the user's speech and triggers the sound processing module to decide which sound library the sound signal should be stored in; from the sound signal sent by the sound acquisition module, the sound processing module extracts the audio that matches the user's voiceprint features and stores it in the user sound bank; it then compares the user sound bank with the contrast sound library and further rejects and filters noise to obtain the sound signal that needs to be translated;
D. The sound processing module performs framing and feature extraction on the sound signal that needs to be translated, using linear prediction cepstral coefficients to turn each frame of sound into a multi-dimensional vector containing acoustic information, which yields the audio data; an acoustic model, a dictionary, and a language model are then used to produce text output from the feature-extracted audio data; and the text corresponding to what the user said is obtained and sent to the archive writing module;
E. The archive writing module normalizes the text into well-formed sentences, uploads the input character units and sequences to the cloud, translates them into the target language the user requires, and converts the result into a manuscript in meeting-minutes format, which is stored;
F. The putting module, as required, presents the manuscript on a display or plays it back using a speech synthesis player.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910819189.XA CN110516265A (en) 2019-08-31 2019-08-31 A kind of single identification real-time translation system based on intelligent sound

Publications (1)

Publication Number Publication Date
CN110516265A true CN110516265A (en) 2019-11-29

Family

ID=68629934

Country Status (1)

Country Link
CN (1) CN110516265A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111601208A (en) * 2020-06-24 2020-08-28 佛山科学技术学院 Noise reduction translation earphone and translation method thereof
CN112967723A (en) * 2021-02-01 2021-06-15 珠海格力电器股份有限公司 Identity confirmation method and control device, and sleep parameter detection method and control device
WO2021134284A1 (en) * 2019-12-30 2021-07-08 深圳市欢太科技有限公司 Voice information processing method, hub device, control terminal and storage medium
CN113413613A (en) * 2021-06-17 2021-09-21 网易(杭州)网络有限公司 Method and device for optimizing voice chat in game, electronic equipment and medium

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002204489A (en) * 2000-12-28 2002-07-19 Nec Saitama Ltd Earphone microphone
CN201532762U (en) * 2009-06-04 2010-07-21 成都信息工程学院 Simultaneous interpretation device special for individuals
CN101950564A (en) * 2010-10-13 2011-01-19 镇江华扬信息科技有限公司 Remote digital voice acquisition, analysis and identification system
CN106454605A (en) * 2016-11-30 2017-02-22 南京小脚印网络科技有限公司 Intelligent translation earphone system
CN106486125A (en) * 2016-09-29 2017-03-08 安徽声讯信息技术有限公司 A kind of simultaneous interpretation system based on speech recognition technology
CN108010524A (en) * 2017-12-04 2018-05-08 深圳市沃特沃德股份有限公司 Speech translation system and method
CN108447497A (en) * 2018-03-07 2018-08-24 陈勇 A method of independently going out oneself sounding in noisy environment
CN109033092A (en) * 2018-06-13 2018-12-18 深圳市思创达塑胶模具有限公司 A kind of real-time translation system, method and interpreting equipment
CN109376363A (en) * 2018-09-04 2019-02-22 出门问问信息科技有限公司 A kind of real-time voice interpretation method and device based on earphone
CN109545216A (en) * 2018-12-28 2019-03-29 合肥凯捷技术有限公司 A kind of audio recognition method and speech recognition system
CN109686363A (en) * 2019-02-26 2019-04-26 深圳市合言信息科技有限公司 A kind of on-the-spot meeting artificial intelligence simultaneous interpretation equipment
CN109977427A (en) * 2019-03-12 2019-07-05 东华大学 A kind of miniature wearable real time translator

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191129