CN107170444A - Aviation cockpit environment self-adaption phonetic feature model training method - Google Patents

Aviation cockpit environment self-adaption phonetic feature model training method

Info

Publication number
CN107170444A
CN107170444A CN201710450397.8A
Authority
CN
China
Prior art keywords
adaptive
model
personal
voice
speech
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710450397.8A
Other languages
Chinese (zh)
Inventor
温泉
姚竞
黄梅娇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Aviation Electric Co Ltd
Original Assignee
Shanghai Aviation Electric Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Aviation Electric Co Ltd filed Critical Shanghai Aviation Electric Co Ltd
Priority to CN201710450397.8A priority Critical patent/CN107170444A/en
Publication of CN107170444A publication Critical patent/CN107170444A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L 15/063 - Training
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L 15/065 - Adaptation
    • G10L 15/07 - Adaptation to the speaker
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/08 - Speech classification or search
    • G10L 15/16 - Speech classification or search using artificial neural networks

Abstract

The present invention discloses an adaptive speech feature model training method for the aviation cockpit environment, comprising: step S1, collecting personal adaptive speech features; step S2, providing personal adaptive speech labels; step S3, providing a base feature model; step S4, applying a deep neural network (DNN) adaptation algorithm that combines the personal adaptive speech features with their corresponding personal adaptive speech labels to update the base feature model and generate an adaptive model; step S5, running a recognition test to verify the adaptive model's improved ability to recognize the individual's speech; and step S6, packing the model to generate a personal feature library. The advantage of the invention is that updating the base feature model with personal adaptive speech features yields an adaptive model with higher recognition capability, which markedly improves the recognition rate of avionics voice products.

Description

Aviation cockpit environment self-adaption phonetic feature model training method
Technical field
The present invention relates to the field of speech recognition, and in particular to an adaptive speech feature model training method for the aviation cockpit environment, which markedly improves the recognition rate of avionics voice products.
Background art
With the rapid development of electronic and aircraft technology, human-machine systems based on cognition and perception have become one of the ten key technology areas of future avionics, and speech recognition is one of the most important technologies within such cognition/perception-based human-machine systems. At present, existing speech recognition is designed primarily for standard pronunciation; if a pilot's speech deviates from the standard or carries personal characteristics, the recognition rate is often low. Enabling speech recognition to genuinely help pilots control the aircraft has therefore become the key to putting the technology into practical use.
Summary of the invention
The present invention aims to overcome the low recognition rate of the prior art by providing a new adaptive speech feature model training method for the aviation cockpit environment.
To achieve this purpose, the technical scheme is as follows. The adaptive speech feature model training method for the aviation cockpit environment comprises:
Step S1, collecting personal adaptive speech features:
Step S11, simulating the aviation cockpit environment and inputting personal adaptive speech data;
Step S12, formatting the personal adaptive speech data as 16 kHz, 16-bit speech data and collecting it in the simulated aviation cockpit environment;
Step S13, extracting personal adaptive speech features:
Step S131, obtaining framed speech data;
Step S132, framing the speech data at 400 samples per frame;
Step S133, using 75-dimensional Mel frequency coefficients (MFC) as the speech feature parameters, so that the feature parameters of each frame occupy 75 × 16 bits;
Step S2, providing personal adaptive speech labels:
Step S21, inputting personal adaptive text data;
Step S22, converting the text content of the personal adaptive text data, according to the standard of a pronunciation dictionary and a phoneme state list, into phoneme labels with a triphone structure;
Step S3, providing a base feature model:
the base feature model is a multilayer deep neural network (DNN) model whose input layer corresponds to the speech features of step S1 and whose output layer corresponds to the speech labels of step S2;
Step S4, applying a deep neural network (DNN) adaptation algorithm that combines the personal adaptive speech features with their corresponding personal adaptive speech labels to update the base feature model and generate an adaptive model; and
Step S6, packing the model to generate a personal feature library.
In a preferred scheme of the adaptive speech feature model training method for the aviation cockpit environment, the base feature model is trained on large-scale speech data, preferably more than 3000 hours of training speech.
In a preferred scheme of the adaptive speech feature model training method for the aviation cockpit environment, a step S5 is further included between step S4 and step S6:
Step S5, running a recognition test to verify the adaptive model's improved ability to recognize the individual's speech;
Step S51, simulating the aviation cockpit environment and inputting test speech together with its corresponding speech labels, the test speech consisting of command words;
Step S52, converting the test speech into the same Mel frequency coefficient (MFC) features as in step S133;
Step S53, performing a Viterbi search in the adaptive model, taking the highest-scoring match as the recognition result to obtain test text data, and comparing it with the speech labels of step S51 to obtain the recognition performance of the adaptive model.
Compared with the prior art, the advantage of the present invention is at least the following: updating the base feature model with personal adaptive speech features yields an adaptive model with higher recognition capability, which markedly improves the recognition rate of avionics voice products.
Brief description of the drawings
Fig. 1 is a schematic flow chart of one embodiment of the invention.
Detailed description of the embodiments
The present invention is described in further detail below through a specific embodiment in combination with the accompanying drawing.
Referring to Fig. 1, the figure shows the adaptive speech feature model training method for the aviation cockpit environment. The method executes the following steps in sequence:
Step S1, collecting personal adaptive speech features.
Step S11, simulating the aviation cockpit environment and inputting personal adaptive speech data.
Step S12, formatting the personal adaptive speech data as 16 kHz, 16-bit speech data and collecting it in the simulated aviation cockpit environment.
Step S13, extracting personal adaptive speech features:
Step S131, obtaining framed speech data.
Step S132, framing the speech data at 400 samples per frame.
Step S133, using 75-dimensional Mel frequency coefficients (MFC) as the speech feature parameters, so that the feature parameters of each frame occupy 75 × 16 bits.
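To make steps S12 through S133 concrete, the following is a minimal Python sketch, not the patented front end: it loads a 16 kHz, 16-bit recording, cuts it into non-overlapping 400-sample frames, and computes 75 feature values per frame. The file name is hypothetical, and the 75-dimensional "Mel frequency coefficients (MFC)" of step S133 are approximated here with log-mel filterbank energies computed with librosa, since the patent does not specify the exact transform.

```python
import wave

import numpy as np
import librosa


def load_pcm16(path):
    """Load a 16 kHz, 16-bit mono WAV recording as an int16 sample array."""
    with wave.open(path, "rb") as wav:
        assert wav.getframerate() == 16000 and wav.getsampwidth() == 2
        pcm = wav.readframes(wav.getnframes())
    return np.frombuffer(pcm, dtype=np.int16)


def extract_features(samples, sr=16000):
    """Return 75-dimensional mel-scale features, one vector per 400-sample frame."""
    y = samples.astype(np.float32) / 32768.0            # int16 PCM -> float in [-1, 1)
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr,
        n_fft=400, hop_length=400, center=False,        # non-overlapping 400-sample frames
        n_mels=75)                                       # 75 mel bands per frame
    return librosa.power_to_db(mel).T                    # shape: (num_frames, 75)


# Hypothetical recording captured in the simulated cockpit environment.
feats = extract_features(load_pcm16("pilot_adaptation_001.wav"))
```

At 16 kHz, a 400-sample frame spans 25 ms, a common frame length in speech front ends.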
Step S2, providing personal adaptive speech labels.
Step S21, inputting personal adaptive text data.
Step S22, converting the text content of the personal adaptive text data, according to the standard of a pronunciation dictionary and a phoneme state list, into phoneme labels with a triphone structure.
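A toy Python sketch of the triphone expansion in step S22 is given below. The two-entry pronunciation dictionary and the phone symbols are purely illustrative stand-ins for whatever lexicon and phoneme state list the method actually uses; only the left-context, center, right-context (l-c+r) labeling pattern follows from the text.

```python
# Hypothetical pronunciation dictionary: command word -> phoneme sequence.
LEXICON = {
    "climb": ["k", "l", "ay", "m"],
    "descend": ["d", "ih", "s", "eh", "n", "d"],
}


def text_to_triphones(words):
    """Expand a word sequence into context-dependent triphone labels (l-c+r)."""
    phones = ["sil"]                                   # leading silence
    for word in words:
        phones.extend(LEXICON[word])
    phones.append("sil")                               # trailing silence
    return [f"{phones[i - 1]}-{phones[i]}+{phones[i + 1]}"
            for i in range(1, len(phones) - 1)]


print(text_to_triphones(["climb"]))
# ['sil-k+l', 'k-l+ay', 'l-ay+m', 'ay-m+sil']
```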
Step S3, providing a base feature model.
The base feature model is a multilayer deep neural network (DNN) model whose input layer corresponds to the speech features of step S1 and whose output layer corresponds to the speech labels of step S2. The base feature model is trained on large-scale speech data (more than 3000 hours).
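A minimal PyTorch sketch of such a base feature model is shown below: a stack of fully connected layers mapping one 75-dimensional feature frame to scores over triphone (or tied-state) labels. The depth, hidden-layer width, and number of output targets are assumptions made for illustration; the patent fixes only the roles of the input and output layers.

```python
import torch
import torch.nn as nn

NUM_TARGETS = 3000   # assumed number of tied triphone-state targets


class BaseFeatureModel(nn.Module):
    """Multilayer DNN: one 75-dim feature frame in, triphone-state scores out."""

    def __init__(self, feat_dim=75, hidden=1024, num_targets=NUM_TARGETS):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_targets),
        )

    def forward(self, frames):          # frames: float tensor of shape (batch, 75)
        return self.net(frames)         # unnormalized scores over the label set


base_model = BaseFeatureModel()
```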
Step S4, applying a deep neural network (DNN) adaptation algorithm that combines the personal adaptive speech features with their corresponding personal adaptive speech labels to update the base feature model and generate an adaptive model.
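The patent does not name a specific DNN adaptation algorithm, so the sketch below assumes the simplest variant: supervised fine-tuning of a copy of the base model on the small personal data set with a low learning rate. This matches the later remark that a small amount of the speaker's own speech is used to adapt the standard DNN model, but the epoch count and learning rate are illustrative assumptions.

```python
import copy

import torch
import torch.nn as nn


def adapt(base_model, personal_frames, personal_labels, epochs=5, lr=1e-4):
    """Fine-tune a copy of the base model on personal data to obtain the adaptive model."""
    adaptive_model = copy.deepcopy(base_model)          # leave the base model untouched
    optimizer = torch.optim.SGD(adaptive_model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(adaptive_model(personal_frames), personal_labels)
        loss.backward()
        optimizer.step()
    return adaptive_model


# personal_frames: float tensor (N, 75); personal_labels: long tensor (N,)
# adaptive_model = adapt(base_model, personal_frames, personal_labels)
```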
Step S5, running a recognition test to verify the adaptive model's improved ability to recognize the individual's speech.
Step S51, simulating the aviation cockpit environment and inputting test speech together with its corresponding speech labels, the test speech consisting of command words.
Step S52, converting the test speech into the same Mel frequency coefficient (MFC) features as in step S133.
Step S53, performing a Viterbi search in the adaptive model, taking the highest-scoring match as the recognition result to obtain test text data, and comparing it with the speech labels of step S51 to obtain the recognition performance of the adaptive model.
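The recognition test can be pictured with the sketch below: per-frame label scores from the adaptive model are aligned against each command word's state sequence by a simple left-to-right Viterbi pass with self-loops, and the best-scoring command is taken as the recognition result, to be compared with the reference labels. The state graph and score combination are assumptions for illustration; a full decoder would also need transition probabilities and a lexicon.

```python
import numpy as np


def viterbi_score(frame_logprobs, state_ids):
    """Best left-to-right alignment score of the frames against one command's states."""
    T, K = len(frame_logprobs), len(state_ids)
    score = np.full((T, K), -np.inf)
    score[0, 0] = frame_logprobs[0, state_ids[0]]
    for t in range(1, T):
        for k in range(K):
            stay = score[t - 1, k]                               # self-loop in state k
            move = score[t - 1, k - 1] if k > 0 else -np.inf     # advance to state k
            score[t, k] = max(stay, move) + frame_logprobs[t, state_ids[k]]
    return score[T - 1, K - 1]


def recognize(frame_logprobs, command_states):
    """Return the command word whose state sequence matches the frames best."""
    return max(command_states,
               key=lambda word: viterbi_score(frame_logprobs, command_states[word]))


# frame_logprobs: (T, num_states) log-probabilities from the adaptive model;
# command_states: dict mapping each command word to its state-id sequence (assumed).
```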
Step S6, packing the model to generate a personal feature library.
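As an illustration of the packing step, the sketch below bundles the adapted weights and a little metadata into a single per-pilot library file with torch.save. The file layout, field names, and naming convention are assumptions, since the patent does not describe the library format.

```python
import torch


def pack_personal_library(adaptive_model, pilot_id, path):
    """Bundle the adapted weights and metadata into one per-pilot library file."""
    torch.save({
        "pilot_id": pilot_id,
        "feature_dim": 75,                       # matches the 75-dim features of step S133
        "state_dict": adaptive_model.state_dict(),
    }, path)


# pack_personal_library(adaptive_model, "pilot_001", "pilot_001_library.pt")
```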
The multilayer deep neural network (DNN) model uses a multilayer neural network to perform a nonlinear fit to the speaker's pronunciation characteristics; compared with conventional models it is more robust, more resistant to noise, and achieves a higher recognition rate. Adapting the standard DNN model with a small amount of the speaker's own speech makes the DNN model match the speaker's characteristics more closely.
Only specific embodiments of the present invention are described above; although the description is relatively specific and detailed, it should not therefore be construed as limiting the scope of the invention patent. It should be noted that a person of ordinary skill in the art may make various modifications and improvements without departing from the concept of the present invention, and all of these fall within the protection scope of the present invention. Therefore, the protection scope of this invention patent shall be determined by the appended claims.

Claims (3)

1. An adaptive speech feature model training method for the aviation cockpit environment, characterized by comprising:
Step S1, collecting personal adaptive speech features:
Step S11, simulating the aviation cockpit environment and inputting personal adaptive speech data;
Step S12, formatting the personal adaptive speech data as 16 kHz, 16-bit speech data and collecting it in the simulated aviation cockpit environment;
Step S13, extracting personal adaptive speech features:
Step S131, obtaining framed speech data;
Step S132, framing the speech data at 400 samples per frame;
Step S133, using 75-dimensional Mel frequency coefficients (MFC) as the speech feature parameters, so that the feature parameters of each frame occupy 75 × 16 bits;
Step S2, providing personal adaptive speech labels:
Step S21, inputting personal adaptive text data;
Step S22, converting the text content of the personal adaptive text data, according to the standard of a pronunciation dictionary and a phoneme state list, into phoneme labels with a triphone structure;
Step S3, providing a base feature model:
the base feature model being a multilayer deep neural network (DNN) model whose input layer corresponds to the speech features of step S1 and whose output layer corresponds to the speech labels of step S2;
Step S4, applying a deep neural network (DNN) adaptation algorithm that combines the personal adaptive speech features with their corresponding personal adaptive speech labels to update the base feature model and generate an adaptive model; and
Step S6, packing the model to generate a personal feature library.
2. The adaptive speech feature model training method for the aviation cockpit environment according to claim 1, characterized in that the base feature model is trained on large-scale speech data, preferably more than 3000 hours of training speech.
3. The adaptive speech feature model training method for the aviation cockpit environment according to claim 1, characterized in that a step S5 is further included between step S4 and step S6:
Step S5, running a recognition test to verify the adaptive model's improved ability to recognize the individual's speech;
Step S51, simulating the aviation cockpit environment and inputting test speech together with its corresponding speech labels, the test speech consisting of command words;
Step S52, converting the test speech into the same Mel frequency coefficient (MFC) features as in step S133;
Step S53, performing a Viterbi search in the adaptive model, taking the highest-scoring match as the recognition result to obtain test text data, and comparing it with the speech labels of step S51 to obtain the recognition performance of the adaptive model.
CN201710450397.8A 2017-06-15 2017-06-15 Aviation cockpit environment self-adaption phonetic feature model training method Pending CN107170444A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710450397.8A CN107170444A (en) 2017-06-15 2017-06-15 Aviation cockpit environment self-adaption phonetic feature model training method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710450397.8A CN107170444A (en) 2017-06-15 2017-06-15 Aviation cockpit environment self-adaption phonetic feature model training method

Publications (1)

Publication Number Publication Date
CN107170444A true CN107170444A (en) 2017-09-15

Family

ID=59818558

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710450397.8A Pending CN107170444A (en) 2017-06-15 2017-06-15 Aviation cockpit environment self-adaption phonetic feature model training method

Country Status (1)

Country Link
CN (1) CN107170444A (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105529026A (en) * 2014-10-17 2016-04-27 现代自动车株式会社 Speech recognition device and speech recognition method
CN105096940A (en) * 2015-06-30 2015-11-25 百度在线网络技术(北京)有限公司 Method and device for voice recognition
CN105869624A (en) * 2016-03-29 2016-08-17 腾讯科技(深圳)有限公司 Method and apparatus for constructing speech decoding network in digital speech recognition
CN105976812A (en) * 2016-04-28 2016-09-28 腾讯科技(深圳)有限公司 Voice identification method and equipment thereof
CN106228980A (en) * 2016-07-21 2016-12-14 百度在线网络技术(北京)有限公司 Data processing method and device

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112331207A (en) * 2020-09-30 2021-02-05 音数汇元(上海)智能科技有限公司 Service content monitoring method and device, electronic equipment and storage medium
CN112365883A (en) * 2020-10-29 2021-02-12 安徽江淮汽车集团股份有限公司 Cabin system voice recognition test method, device, equipment and storage medium
CN112365883B (en) * 2020-10-29 2023-12-26 安徽江淮汽车集团股份有限公司 Cabin system voice recognition test method, device, equipment and storage medium
CN112634692A (en) * 2020-12-15 2021-04-09 成都职业技术学院 Emergency evacuation deduction training system for crew cabins

Similar Documents

Publication Publication Date Title
EP3857543B1 (en) Conversational agent pipeline trained on synthetic data
WO2018153213A1 (en) Multi-language hybrid speech recognition method
CN107195296B (en) Voice recognition method, device, terminal and system
CN106251859B (en) Voice recognition processing method and apparatus
Matrouk et al. Speech fingerprint to identify isolated word person
CN103578471B (en) Speech identifying method and its electronic installation
CN110827805B (en) Speech recognition model training method, speech recognition method and device
CN106875942A (en) Acoustic model adaptive approach based on accent bottleneck characteristic
CN104575497B (en) A kind of acoustic model method for building up and the tone decoding method based on the model
Zuluaga-Gomez et al. Automatic speech recognition benchmark for air-traffic communications
CN114416934B (en) Multi-modal dialog generation model training method and device and electronic equipment
CN107170444A (en) Aviation cockpit environment self-adaption phonetic feature model training method
CN112289299A (en) Training method and device of speech synthesis model, storage medium and electronic equipment
Yağanoğlu Real time wearable speech recognition system for deaf persons
Caballero-Morales et al. Evolutionary approach for integration of multiple pronunciation patterns for enhancement of dysarthric speech recognition
CN110019741A (en) Request-answer system answer matching process, device, equipment and readable storage medium storing program for executing
US9805740B2 (en) Language analysis based on word-selection, and language analysis apparatus
CN113393828A (en) Training method of voice synthesis model, and voice synthesis method and device
Rawat et al. Digital life assistant using automated speech recognition
Šmídl et al. Semi-supervised training of DNN-based acoustic model for ATC speech recognition
CN108269574A (en) Voice signal processing method and device, storage medium and electronic equipment
TWI659411B (en) Multilingual mixed speech recognition method
CN111862961A (en) Method and device for recognizing voice
Akita et al. Generalized statistical modeling of pronunciation variations using variable-length phone context
Sreejith et al. Automatic prosodic labeling and broad class Phonetic Engine for Malayalam

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170915