CN107170444A - Aviation cockpit environment self-adaption phonetic feature model training method - Google Patents

Aviation cockpit environment self-adaption phonetic feature model training method

Info

Publication number
CN107170444A
CN107170444A CN201710450397.8A
Authority
CN
China
Prior art keywords
adaptive
model
personal
voice
speech
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710450397.8A
Other languages
Chinese (zh)
Inventor
温泉
姚竞
黄梅娇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Aviation Electric Co Ltd
Original Assignee
Shanghai Aviation Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Aviation Electric Co Ltd filed Critical Shanghai Aviation Electric Co Ltd
Priority to CN201710450397.8A priority Critical patent/CN107170444A/en
Publication of CN107170444A publication Critical patent/CN107170444A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L 15/063 - Training
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L 15/065 - Adaptation
    • G10L 15/07 - Adaptation to the speaker
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/08 - Speech classification or search
    • G10L 15/16 - Speech classification or search using artificial neural networks

Abstract

The present invention discloses an adaptive speech feature model training method for the aviation cockpit environment, comprising: step S1, collecting personal adaptive speech features; step S2, providing personal adaptive speech labels; step S3, providing a base feature model; step S4, applying a deep neural network (DNN) adaptation algorithm that combines the personal adaptive speech features with their corresponding personal adaptive speech labels to update the base feature model and generate an adaptive model; step S5, running a recognition test to verify the adaptive model's improved ability to recognize the individual's speech; and step S6, packing the model to generate a personal feature library. The advantage of the invention is that updating the base feature model with personal adaptive speech features yields an adaptive model with higher recognition capability, which markedly improves the recognition rate of avionics voice products.

Description

Aviation cockpit environment self-adaption phonetic feature model training method
Technical field
The present invention relates to the field of speech recognition, and in particular to an adaptive speech feature model training method for the aviation cockpit environment, which markedly improves the recognition rate of avionics voice products.
Background art
With the rapid development of electronic and aircraft technology, human-machine systems based on cognition and perception have become one of the ten key technology areas of future avionics, and speech recognition is one of the most important technologies within such cognition/perception-based human-machine systems. At present, existing speech recognition is designed primarily for standard pronunciation; if a pilot's speech deviates from the standard or carries personal characteristics, the recognition rate is often low. Enabling speech recognition to genuinely help pilots control the aircraft has therefore become the key to putting the technology into practical use.
Summary of the invention
The present invention aims to overcome the low recognition rate of the prior art by providing a new adaptive speech feature model training method for the aviation cockpit environment.
To achieve this purpose, the technical scheme is as follows. The adaptive speech feature model training method for the aviation cockpit environment comprises:
Step S1, collecting personal adaptive speech features:
Step S11, simulating the aviation cockpit environment and inputting personal adaptive speech data;
Step S12, formatting the personal adaptive speech data as 16 kHz, 16-bit speech data and collecting it in the simulated aviation cockpit environment;
Step S13, extracting personal adaptive speech features:
Step S131, obtaining framed speech data;
Step S132, framing the speech data at 400 samples per frame;
Step S133, using 75-dimensional Mel frequency coefficients (MFC) as the speech feature parameters, so that the feature parameters of each frame occupy 75 × 16 bits;
Step S2, providing personal adaptive speech labels:
Step S21, inputting personal adaptive text data;
Step S22, converting the text content of the personal adaptive text data, according to the standard of a pronunciation dictionary and a phoneme state list, into phoneme labels with a triphone structure;
Step S3, providing a base feature model:
the base feature model is a multilayer deep neural network (DNN) model whose input layer corresponds to the speech features of step S1 and whose output layer corresponds to the speech labels of step S2;
Step S4, applying a deep neural network (DNN) adaptation algorithm that combines the personal adaptive speech features with their corresponding personal adaptive speech labels to update the base feature model and generate an adaptive model; and
Step S6, packing the model to generate a personal feature library.
In a preferred scheme of the adaptive speech feature model training method for the aviation cockpit environment, the base feature model is trained on large-scale speech data, preferably more than 3000 hours of training speech.
In a preferred scheme of the adaptive speech feature model training method for the aviation cockpit environment, a step S5 is further included between step S4 and step S6:
Step S5, running a recognition test to verify the adaptive model's improved ability to recognize the individual's speech;
Step S51, simulating the aviation cockpit environment and inputting test speech together with its corresponding speech labels, the test speech consisting of command words;
Step S52, converting the test speech into the same Mel frequency coefficient (MFC) features as in step S133;
Step S53, performing a Viterbi search in the adaptive model, taking the highest-scoring match as the recognition result to obtain test text data, and comparing it with the speech labels of step S51 to obtain the recognition performance of the adaptive model.
Compared with the prior art, the advantage of the present invention is at least the following: updating the base feature model with personal adaptive speech features yields an adaptive model with higher recognition capability, which markedly improves the recognition rate of avionics voice products.
Brief description of the drawings
Fig. 1 is a schematic flow chart of one embodiment of the invention.
Detailed description of the embodiments
The present invention is described in further detail below through a specific embodiment in combination with the accompanying drawing.
Referring to Fig. 1, the figure shows the adaptive speech feature model training method for the aviation cockpit environment. The method executes the following steps in sequence:
Step S1, collecting personal adaptive speech features.
Step S11, simulating the aviation cockpit environment and inputting personal adaptive speech data.
Step S12, formatting the personal adaptive speech data as 16 kHz, 16-bit speech data and collecting it in the simulated aviation cockpit environment.
Step S13, extracting personal adaptive speech features:
Step S131, obtaining framed speech data.
Step S132, framing the speech data at 400 samples per frame.
Step S133, using 75-dimensional Mel frequency coefficients (MFC) as the speech feature parameters, so that the feature parameters of each frame occupy 75 × 16 bits.
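To make steps S12 through S133 concrete, the following is a minimal Python sketch, not the patented front end: it loads a 16 kHz, 16-bit recording, cuts it into non-overlapping 400-sample frames, and computes 75 feature values per frame. The file name is hypothetical, and the 75-dimensional "Mel frequency coefficients (MFC)" of step S133 are approximated here with log-mel filterbank energies computed with librosa, since the patent does not specify the exact transform.

```python
import wave

import numpy as np
import librosa


def load_pcm16(path):
    """Load a 16 kHz, 16-bit mono WAV recording as an int16 sample array."""
    with wave.open(path, "rb") as wav:
        assert wav.getframerate() == 16000 and wav.getsampwidth() == 2
        pcm = wav.readframes(wav.getnframes())
    return np.frombuffer(pcm, dtype=np.int16)


def extract_features(samples, sr=16000):
    """Return 75-dimensional mel-scale features, one vector per 400-sample frame."""
    y = samples.astype(np.float32) / 32768.0            # int16 PCM -> float in [-1, 1)
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr,
        n_fft=400, hop_length=400, center=False,        # non-overlapping 400-sample frames
        n_mels=75)                                       # 75 mel bands per frame
    return librosa.power_to_db(mel).T                    # shape: (num_frames, 75)


# Hypothetical recording captured in the simulated cockpit environment.
feats = extract_features(load_pcm16("pilot_adaptation_001.wav"))
```

At 16 kHz, a 400-sample frame spans 25 ms, a common frame length in speech front ends.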
Step S2, providing personal adaptive speech labels.
Step S21, inputting personal adaptive text data.
Step S22, converting the text content of the personal adaptive text data, according to the standard of a pronunciation dictionary and a phoneme state list, into phoneme labels with a triphone structure.
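A toy Python sketch of the triphone expansion in step S22 is given below. The two-entry pronunciation dictionary and the phone symbols are purely illustrative stand-ins for whatever lexicon and phoneme state list the method actually uses; only the left-context, center, right-context (l-c+r) labeling pattern follows from the text.

```python
# Hypothetical pronunciation dictionary: command word -> phoneme sequence.
LEXICON = {
    "climb": ["k", "l", "ay", "m"],
    "descend": ["d", "ih", "s", "eh", "n", "d"],
}


def text_to_triphones(words):
    """Expand a word sequence into context-dependent triphone labels (l-c+r)."""
    phones = ["sil"]                                   # leading silence
    for word in words:
        phones.extend(LEXICON[word])
    phones.append("sil")                               # trailing silence
    return [f"{phones[i - 1]}-{phones[i]}+{phones[i + 1]}"
            for i in range(1, len(phones) - 1)]


print(text_to_triphones(["climb"]))
# ['sil-k+l', 'k-l+ay', 'l-ay+m', 'ay-m+sil']
```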
Step S3, providing a base feature model.
The base feature model is a multilayer deep neural network (DNN) model whose input layer corresponds to the speech features of step S1 and whose output layer corresponds to the speech labels of step S2. The base feature model is trained on large-scale speech data (more than 3000 hours).
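A minimal PyTorch sketch of such a base feature model is shown below: a stack of fully connected layers mapping one 75-dimensional feature frame to scores over triphone (or tied-state) labels. The depth, hidden-layer width, and number of output targets are assumptions made for illustration; the patent fixes only the roles of the input and output layers.

```python
import torch
import torch.nn as nn

NUM_TARGETS = 3000   # assumed number of tied triphone-state targets


class BaseFeatureModel(nn.Module):
    """Multilayer DNN: one 75-dim feature frame in, triphone-state scores out."""

    def __init__(self, feat_dim=75, hidden=1024, num_targets=NUM_TARGETS):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_targets),
        )

    def forward(self, frames):          # frames: float tensor of shape (batch, 75)
        return self.net(frames)         # unnormalized scores over the label set


base_model = BaseFeatureModel()
```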
Step S4, applying a deep neural network (DNN) adaptation algorithm that combines the personal adaptive speech features with their corresponding personal adaptive speech labels to update the base feature model and generate an adaptive model.
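The patent does not name a specific DNN adaptation algorithm, so the sketch below assumes the simplest variant: supervised fine-tuning of a copy of the base model on the small personal data set with a low learning rate. This matches the later remark that a small amount of the speaker's own speech is used to adapt the standard DNN model, but the epoch count and learning rate are illustrative assumptions.

```python
import copy

import torch
import torch.nn as nn


def adapt(base_model, personal_frames, personal_labels, epochs=5, lr=1e-4):
    """Fine-tune a copy of the base model on personal data to obtain the adaptive model."""
    adaptive_model = copy.deepcopy(base_model)          # leave the base model untouched
    optimizer = torch.optim.SGD(adaptive_model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(adaptive_model(personal_frames), personal_labels)
        loss.backward()
        optimizer.step()
    return adaptive_model


# personal_frames: float tensor (N, 75); personal_labels: long tensor (N,)
# adaptive_model = adapt(base_model, personal_frames, personal_labels)
```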
Step S5, running a recognition test to verify the adaptive model's improved ability to recognize the individual's speech.
Step S51, simulating the aviation cockpit environment and inputting test speech together with its corresponding speech labels, the test speech consisting of command words.
Step S52, converting the test speech into the same Mel frequency coefficient (MFC) features as in step S133.
Step S53, performing a Viterbi search in the adaptive model, taking the highest-scoring match as the recognition result to obtain test text data, and comparing it with the speech labels of step S51 to obtain the recognition performance of the adaptive model.
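The recognition test can be pictured with the sketch below: per-frame label scores from the adaptive model are aligned against each command word's state sequence by a simple left-to-right Viterbi pass with self-loops, and the best-scoring command is taken as the recognition result, to be compared with the reference labels. The state graph and score combination are assumptions for illustration; a full decoder would also need transition probabilities and a lexicon.

```python
import numpy as np


def viterbi_score(frame_logprobs, state_ids):
    """Best left-to-right alignment score of the frames against one command's states."""
    T, K = len(frame_logprobs), len(state_ids)
    score = np.full((T, K), -np.inf)
    score[0, 0] = frame_logprobs[0, state_ids[0]]
    for t in range(1, T):
        for k in range(K):
            stay = score[t - 1, k]                               # self-loop in state k
            move = score[t - 1, k - 1] if k > 0 else -np.inf     # advance to state k
            score[t, k] = max(stay, move) + frame_logprobs[t, state_ids[k]]
    return score[T - 1, K - 1]


def recognize(frame_logprobs, command_states):
    """Return the command word whose state sequence matches the frames best."""
    return max(command_states,
               key=lambda word: viterbi_score(frame_logprobs, command_states[word]))


# frame_logprobs: (T, num_states) log-probabilities from the adaptive model;
# command_states: dict mapping each command word to its state-id sequence (assumed).
```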
Step S6, packing the model to generate a personal feature library.
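As an illustration of the packing step, the sketch below bundles the adapted weights and a little metadata into a single per-pilot library file with torch.save. The file layout, field names, and naming convention are assumptions, since the patent does not describe the library format.

```python
import torch


def pack_personal_library(adaptive_model, pilot_id, path):
    """Bundle the adapted weights and metadata into one per-pilot library file."""
    torch.save({
        "pilot_id": pilot_id,
        "feature_dim": 75,                       # matches the 75-dim features of step S133
        "state_dict": adaptive_model.state_dict(),
    }, path)


# pack_personal_library(adaptive_model, "pilot_001", "pilot_001_library.pt")
```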
The multilayer deep neural network (DNN) model uses a multilayer neural network to perform a nonlinear fit to the speaker's pronunciation characteristics; compared with conventional models it is more robust, more resistant to noise, and achieves a higher recognition rate. Adapting the standard DNN model with a small amount of the speaker's own speech makes the DNN model match the speaker's characteristics more closely.
Only specific embodiments of the present invention are described above; although the description is relatively specific and detailed, it should not therefore be construed as limiting the scope of the invention patent. It should be noted that a person of ordinary skill in the art may make various modifications and improvements without departing from the concept of the present invention, and all of these fall within the protection scope of the present invention. Therefore, the protection scope of this invention patent shall be determined by the appended claims.

Claims (3)

1. An adaptive speech feature model training method for the aviation cockpit environment, characterized by comprising:
Step S1, collecting personal adaptive speech features:
Step S11, simulating the aviation cockpit environment and inputting personal adaptive speech data;
Step S12, formatting the personal adaptive speech data as 16 kHz, 16-bit speech data and collecting it in the simulated aviation cockpit environment;
Step S13, extracting personal adaptive speech features:
Step S131, obtaining framed speech data;
Step S132, framing the speech data at 400 samples per frame;
Step S133, using 75-dimensional Mel frequency coefficients (MFC) as the speech feature parameters, so that the feature parameters of each frame occupy 75 × 16 bits;
Step S2, providing personal adaptive speech labels:
Step S21, inputting personal adaptive text data;
Step S22, converting the text content of the personal adaptive text data, according to the standard of a pronunciation dictionary and a phoneme state list, into phoneme labels with a triphone structure;
Step S3, providing a base feature model:
the base feature model being a multilayer deep neural network (DNN) model whose input layer corresponds to the speech features of step S1 and whose output layer corresponds to the speech labels of step S2;
Step S4, applying a deep neural network (DNN) adaptation algorithm that combines the personal adaptive speech features with their corresponding personal adaptive speech labels to update the base feature model and generate an adaptive model; and
Step S6, packing the model to generate a personal feature library.
2. The adaptive speech feature model training method for the aviation cockpit environment according to claim 1, characterized in that the base feature model is trained on large-scale speech data, preferably more than 3000 hours of training speech.
3. The adaptive speech feature model training method for the aviation cockpit environment according to claim 1, characterized in that a step S5 is further included between step S4 and step S6:
Step S5, running a recognition test to verify the adaptive model's improved ability to recognize the individual's speech;
Step S51, simulating the aviation cockpit environment and inputting test speech together with its corresponding speech labels, the test speech consisting of command words;
Step S52, converting the test speech into the same Mel frequency coefficient (MFC) features as in step S133;
Step S53, performing a Viterbi search in the adaptive model, taking the highest-scoring match as the recognition result to obtain test text data, and comparing it with the speech labels of step S51 to obtain the recognition performance of the adaptive model.
CN201710450397.8A 2017-06-15 2017-06-15 Aviation cockpit environment self-adaption phonetic feature model training method Pending CN107170444A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710450397.8A CN107170444A (en) 2017-06-15 2017-06-15 Aviation cockpit environment self-adaption phonetic feature model training method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710450397.8A CN107170444A (en) 2017-06-15 2017-06-15 Aviation cockpit environment self-adaption phonetic feature model training method

Publications (1)

Publication Number Publication Date
CN107170444A true CN107170444A (en) 2017-09-15

Family

ID=59818558

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710450397.8A Pending CN107170444A (en) 2017-06-15 2017-06-15 Aviation cockpit environment self-adaption phonetic feature model training method

Country Status (1)

Country Link
CN (1) CN107170444A (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105529026A (en) * 2014-10-17 2016-04-27 现代自动车株式会社 Speech recognition device and speech recognition method
CN105096940A (en) * 2015-06-30 2015-11-25 百度在线网络技术(北京)有限公司 Method and device for voice recognition
CN105869624A (en) * 2016-03-29 2016-08-17 腾讯科技(深圳)有限公司 Method and apparatus for constructing speech decoding network in digital speech recognition
CN105976812A (en) * 2016-04-28 2016-09-28 腾讯科技(深圳)有限公司 Voice identification method and equipment thereof
CN106228980A (en) * 2016-07-21 2016-12-14 百度在线网络技术(北京)有限公司 Data processing method and device

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112331207A (en) * 2020-09-30 2021-02-05 音数汇元(上海)智能科技有限公司 Service content monitoring method and device, electronic equipment and storage medium
CN112365883A (en) * 2020-10-29 2021-02-12 安徽江淮汽车集团股份有限公司 Cabin system voice recognition test method, device, equipment and storage medium
CN112365883B (en) * 2020-10-29 2023-12-26 安徽江淮汽车集团股份有限公司 Cabin system voice recognition test method, device, equipment and storage medium
CN112634692A (en) * 2020-12-15 2021-04-09 成都职业技术学院 Emergency evacuation deduction training system for crew cabins

Similar Documents

Publication Publication Date Title
EP3857543B1 (en) Conversational agent pipeline trained on synthetic data
WO2018153213A1 (en) Multi-language hybrid speech recognition method
CN107195296B (en) Voice recognition method, device, terminal and system
CN106251859B (en) Voice recognition processing method and apparatus
Matrouk et al. Speech fingerprint to identify isolated word person
CN103578471B (en) Speech identifying method and its electronic installation
CN110827805B (en) Speech recognition model training method, speech recognition method and device
CN106875942A (en) Acoustic model adaptive approach based on accent bottleneck characteristic
CN104575497B (en) A kind of acoustic model method for building up and the tone decoding method based on the model
Zuluaga-Gomez et al. Automatic speech recognition benchmark for air-traffic communications
CN114416934B (en) Multi-modal dialog generation model training method and device and electronic equipment
CN107170444A (en) Aviation cockpit environment self-adaption phonetic feature model training method
CN112289299A (en) Training method and device of speech synthesis model, storage medium and electronic equipment
Yağanoğlu Real time wearable speech recognition system for deaf persons
Caballero-Morales et al. Evolutionary approach for integration of multiple pronunciation patterns for enhancement of dysarthric speech recognition
CN110019741A (en) Request-answer system answer matching process, device, equipment and readable storage medium storing program for executing
US9805740B2 (en) Language analysis based on word-selection, and language analysis apparatus
CN113393828A (en) Training method of voice synthesis model, and voice synthesis method and device
Rawat et al. Digital life assistant using automated speech recognition
Šmídl et al. Semi-supervised training of DNN-based acoustic model for ATC speech recognition
CN108269574A (en) Voice signal processing method and device, storage medium and electronic equipment
TWI659411B (en) Multilingual mixed speech recognition method
CN111862961A (en) Method and device for recognizing voice
Akita et al. Generalized statistical modeling of pronunciation variations using variable-length phone context
Sreejith et al. Automatic prosodic labeling and broad class Phonetic Engine for Malayalam

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170915