CN109389970A

CN109389970A - Speech analysis and recognition method

Info

Publication number: CN109389970A
Application number: CN201811616352.4A
Authority: CN
Inventors: 孙昊; 程庚; 陈雅芹; 黄健; 殷国龙
Original assignee: Hefei Kaijie Technology Co Ltd
Current assignee: Hefei Kaijie Technology Co Ltd
Priority date: 2018-12-28
Filing date: 2018-12-28
Publication date: 2019-02-26

Abstract

The invention discloses a speech analysis and recognition method comprising the following operating steps. S1: voice training, in which a speech signal input in a quiet environment at first use is recognized, and its sound characteristic parameters are analyzed and recorded as the target sound parameters. S2: the user's voice information is acquired and recorded as the original speech signal. S3: the original speech signal is processed for extraction, the sound characteristic parameters are identified, and the sound information matching the target sound characteristic parameters is extracted. S4: the extracted sound information is recognized. The invention captures the ambient sound signal in the period closest to the voice input and treats it as the background noise of that input, then removes this ambient noise signal during subsequent speech analysis, which improves the accuracy of the target signal. Storing the target speech signal in segments facilitates deferred processing of large amounts of target speech and avoids the failure of analysis and recognition caused by inputting an overly long utterance at once.

Description

A speech analysis and recognition method

Technical field

The invention belongs to the technical field of speech recognition, and in particular relates to a speech analysis and recognition method.

Background art

Speech recognition is an interdisciplinary field. Over the past 20 years, speech recognition technology has made marked progress and has begun to move from the laboratory to the market. It is expected that within the next 10 years speech recognition technology will enter fields such as industry, household appliances, communications, automotive electronics, medical care, home services, and consumer electronics. When processing speech signals in complex environments, existing speech recognition methods acquire the target speech signal inaccurately. In particular, when a third party is speaking at the same time, they cannot distinguish the target speech signal, so its extraction accuracy is poor. In addition, the ambient noise present at each voice input varies from one period to another, which also interferes with acquisition of the target speech signal.

Summary of the invention

The purpose of the present invention is to provide a speech analysis and recognition method that solves the problems set out in the background art above.

To achieve the above purpose, the invention provides the following technical solution: a speech analysis and recognition method comprising the following operating steps:

S1: voice training: at first use, a speech signal input in a quiet environment is recognized, and its sound characteristic parameters are analyzed and recorded as the target sound parameters;

S2: the user's voice information is acquired, recorded as the original speech signal, and stored as a WAV file;

S3: the original speech signal is processed for extraction: the sound frequencies are analyzed, the sound characteristic parameters are identified, and the sound information matching the target sound characteristic parameters is extracted;

S4: the extracted sound information is recognized, and the recognition result is compared against a speech database and a grammar database to obtain the speech recognition result.
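The storage part of step S2 can be illustrated with a minimal sketch that uses only Python's standard `wave` and `struct` modules. The sample rate of 16 kHz, the 16-bit mono PCM layout, and the helper names `write_wav` and `read_wav` are assumptions made for illustration; the patent states only that the original speech signal is stored as a WAV file.

```python
import struct
import wave

SAMPLE_RATE = 16000  # assumed; the patent does not specify a sample rate


def write_wav(samples, fileobj, sample_rate=SAMPLE_RATE):
    """Store float samples in [-1.0, 1.0] as 16-bit mono PCM WAV data."""
    frames = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
    )
    with wave.open(fileobj, "wb") as wav_file:
        wav_file.setnchannels(1)       # mono (assumed)
        wav_file.setsampwidth(2)       # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(frames)


def read_wav(fileobj):
    """Read 16-bit mono PCM WAV data back as floats plus the sample rate."""
    with wave.open(fileobj, "rb") as wav_file:
        rate = wav_file.getframerate()
        raw = wav_file.readframes(wav_file.getnframes())
    samples = [value / 32767 for (value,) in struct.iter_unpack("<h", raw)]
    return samples, rate
```

Both helpers accept either a file path or a seekable binary file object, so the same code can write to disk or to an in-memory `io.BytesIO` buffer.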

Preferably, the original speech signal in S2 comprises the target speech signal, an ambient noise signal, and a third-party speech signal, where the characteristic parameters of the ambient noise signal are extracted from the ambient noise in the 0.3-0.5 seconds preceding the target speech signal.

Preferably, processing of the original speech signal in S3 comprises analyzing the frequency waveform of the original speech signal and extracting the acoustic parameters of the sound, covering the ambient noise signal features, computation of the original speech signal features, identification of the target speech signal features, and identification of the third-party speech features. The influence of the background noise is filtered out and the target speech signal is enhanced, which improves its recognizability and accuracy.

Preferably, when the target speech signal is recognized, the speech signal is first divided into segments and the resulting speech segments are saved in order; the segments are then recognized in the order in which they were saved, and the recognition results are output in that same order.

Technical effects and advantages of the invention:

Before each voice input by the user, the invention captures the ambient sound signal in the period closest to the voice input time and treats it as the background noise of that input, then removes this ambient noise signal during subsequent speech analysis. This effectively improves the accuracy with which the target signal is acquired and reduces interference from environmental noise. The target speech signal is obtained by analyzing the sound parameter features of the sound signals in the original signal. Storing the target speech signal in segments facilitates deferred processing of large amounts of target speech, increases speech recognition capacity, and avoids the failure of analysis and recognition caused by inputting an overly long utterance at once.

Specific embodiment

The technical solutions in the embodiments of the present invention are described clearly and completely below. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the invention.

A speech analysis and recognition method comprising operating steps S1 to S4 as set out in the Summary of the invention above.

The original speech signal in S2 comprises the target speech signal, an ambient noise signal, and a third-party speech signal, where the characteristic parameters of the ambient noise signal are extracted from the ambient noise in the 0.3-0.5 seconds preceding the target speech signal. When the user uses the speech recognition system outdoors, some of the environmental noise present is input together with the user's voice information. Therefore, before each voice input by the user, the ambient sound signal in the 0.3-0.5 seconds closest to the voice input time is captured and treated as the background noise of that input, and this ambient noise signal is removed during subsequent speech analysis. This effectively improves the accuracy with which the target signal is acquired and reduces interference from environmental noise.
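The noise-capture idea can be sketched roughly as follows. The window length of 0.4 s is one value chosen from the 0.3-0.5 s range given above. Everything else is an assumption made for illustration: the 16 kHz sample rate, the use of RMS energy as the noise feature, and the simple frame-wise energy gate that stands in for the removal step, which the patent does not specify.

```python
import math

SAMPLE_RATE = 16000    # assumed; not given in the patent
NOISE_WINDOW_S = 0.4   # chosen from the 0.3-0.5 s range in the text


def rms(samples):
    """Root-mean-square energy of a list of samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def noise_window(signal, speech_start, window_s=NOISE_WINDOW_S, rate=SAMPLE_RATE):
    """Return the samples immediately preceding the speech onset index."""
    length = round(window_s * rate)
    return signal[max(0, speech_start - length):speech_start]


def suppress_noise(signal, noise_rms, frame_len=160, margin=2.0):
    """Zero every frame whose RMS does not exceed margin * noise_rms.

    This energy gate is a stand-in technique, not the patent's method.
    """
    cleaned = []
    for start in range(0, len(signal), frame_len):
        frame = signal[start:start + frame_len]
        if rms(frame) > margin * noise_rms:
            cleaned.extend(frame)
        else:
            cleaned.extend([0.0] * len(frame))
    return cleaned
```

In this sketch the speech onset index is supplied by the caller; how the onset is detected is outside what the text describes.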

Processing of the original speech signal in S3 comprises analyzing the frequency waveform of the original speech signal and extracting the acoustic parameters of the sound, covering the ambient noise signal features, computation of the original speech signal features, identification of the target speech signal features, and identification of the third-party speech features. The influence of the background noise is filtered out and the target speech signal is enhanced, which improves its recognizability and accuracy. Each original speech signal is analyzed and extracted, and the speech signal whose sound parameter features are close to those of the pre-stored target speech signal is taken as the target speech signal of the current input and is then analyzed and processed. Because a third party's speech signal differs to some degree from the pre-stored target speech signal, the third-party speech signal is automatically ignored, which improves the precision of the target speech signal.
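One way to picture the selection of the target speech signal is a similarity check between each candidate's feature vector and the pre-stored target features. The sketch below is illustrative only: the patent does not name a similarity measure, so cosine similarity, the threshold of 0.9, the feature vectors themselves, and the function names are all assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def pick_target(candidates, target_features, threshold=0.9):
    """Return the name of the candidate closest to the stored target.

    Candidates scoring below the threshold (for example a third-party
    speaker) are ignored; None is returned if no candidate qualifies.
    """
    best_name, best_score = None, threshold
    for name, features in candidates.items():
        score = cosine_similarity(features, target_features)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name
```

A candidate whose features differ markedly from the stored ones scores low and is dropped, which mirrors how the third-party speech signal is ignored in the description above.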

When the target speech signal is recognized, the speech signal is first divided into segments and the resulting speech segments are saved in order; the segments are then recognized in the order in which they were saved, and the recognition results are output in that same order. Storing the target speech signal in segments facilitates deferred processing of large amounts of target speech, increases speech recognition capacity, and avoids the failure of analysis and recognition caused by inputting an overly long utterance at once.
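The segment-and-recognize-in-order behavior can be sketched as a simple queue. The fixed segment length, the function names, and the `recognize` callable passed in by the caller are hypothetical; the patent does not say how segment boundaries are chosen or which recognizer is used.

```python
from collections import deque


def split_into_segments(samples, segment_len):
    """Split samples into consecutive segments, each tagged with its index."""
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    return [
        (index, samples[start:start + segment_len])
        for index, start in enumerate(range(0, len(samples), segment_len))
    ]


def recognize_in_order(segments, recognize):
    """Recognize saved segments one at a time, preserving the saved order."""
    pending = deque(segments)  # segments wait here until they are processed
    results = []
    while pending:
        index, segment = pending.popleft()
        results.append((index, recognize(segment)))
    return results
```

Because each segment carries its index, the results can be emitted in the original order even if the recognizer is slow and segments accumulate in the queue.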

Before each voice input by the user, the invention captures the ambient sound signal in the period closest to the voice input time and treats it as the background noise of that input, then removes this ambient noise signal during subsequent speech analysis. This effectively improves the accuracy with which the target signal is acquired and reduces interference from environmental noise. The target speech signal is obtained by analyzing the sound parameter features of the sound signals in the original signal. Storing the target speech signal in segments facilitates deferred processing of large amounts of target speech, increases speech recognition capacity, and avoids the failure of analysis and recognition caused by inputting an overly long utterance at once.

Finally, it should be noted that the foregoing is only a preferred embodiment of the invention and is not intended to limit it. Although the invention has been described in detail with reference to the foregoing embodiment, a person skilled in the art may still modify the technical solutions described in it or replace some of its technical features with equivalents. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims

1. A speech analysis and recognition method, characterized by comprising the following operating steps:

2. The speech analysis and recognition method according to claim 1, characterized in that the original speech signal in S2 comprises the target speech signal, an ambient noise signal, and a third-party speech signal, where the characteristic parameters of the ambient noise signal are extracted from the ambient noise in the 0.3-0.5 seconds preceding the target speech signal.

3. The speech analysis and recognition method according to claim 1, characterized in that processing of the original speech signal in S3 comprises analyzing the frequency waveform of the original speech signal and extracting the acoustic parameters of the sound, covering the ambient noise signal features, computation of the original speech signal features, identification of the target speech signal features, and identification of the third-party speech features, filtering out the influence of the background noise, and enhancing the target speech signal.

4. The speech analysis and recognition method according to claim 3, characterized in that, when the target speech signal is recognized, the speech signal is first divided into segments and the resulting speech segments are saved in order, the segments are then recognized in the order in which they were saved, and the recognition results are output in order.