CN107170444A - Aviation cockpit environment self-adaption phonetic feature model training method
- Publication number: CN107170444A
- Application number: CN201710450397.8A
- Authority: CN (China)
- Prior art keywords: adaptive, model, personal, voice, speech
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G — Physics
- G10 — Musical instruments; Acoustics
- G10L — Speech analysis or synthesis; Speech recognition; Speech or voice processing; Speech or audio coding or decoding
- G10L15/00 — Speech recognition
- G10L15/06 — Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063 — Training
- G10L15/065 — Adaptation
- G10L15/07 — Adaptation to the speaker
- G10L15/08 — Speech classification or search
- G10L15/16 — Speech classification or search using artificial neural networks
Abstract
The present invention discloses an adaptive speech feature model training method for the aviation cockpit environment, comprising: step S1, collecting personal adaptive speech features; step S2, providing personal adaptive speech labels; step S3, providing a base feature model; step S4, using a deep neural network (DNN) adaptation algorithm to update the base feature model with the personal adaptive speech features and their corresponding labels, generating an adaptive model; step S5, running a recognition test to verify the adaptive model's improvement for personal speech recognition; and step S6, packaging the model to generate a personal feature library. The advantage of the invention is that updating the base feature model with personal adaptive speech features produces an adaptive model with higher recognition capability, markedly improving the recognition rate of avionics voice products.
Description
Technical field
The present invention relates to the field of speech recognition, and in particular to an adaptive speech feature model training method for the aviation cockpit environment, which markedly improves the recognition rate of avionics voice products.
Background art
With the rapid development of electronics and aircraft technology, human-machine systems based on cognition/perception will be one of the ten key avionics technology fields of the future, and speech recognition is a critical technology within cognition/perception-based human-machine systems. At present, existing speech recognition is designed primarily for standard speech; if a pilot's speech is non-standard or strongly accented, the recognition rate is often low. Enabling speech recognition to genuinely help pilots control the aircraft is the key to the practical application of this technology.
Summary of the invention
The present invention aims to overcome the low recognition rate of the prior art by providing a new adaptive speech feature model training method for the aviation cockpit environment.
To achieve this purpose, the technical scheme is as follows. The adaptive speech feature model training method for the aviation cockpit environment comprises:
Step S1, collecting personal adaptive speech features:
Step S11, simulating the aviation cockpit environment and inputting personal adaptive speech data;
Step S12, recording the personal adaptive speech data as 16 kHz, 16-bit speech data, collected in the simulated aviation cockpit environment;
Step S13, extracting personal adaptive speech features:
Step S131, obtaining framed speech data;
Step S132, framing the speech data at 400 sampling points per frame;
Step S133, using 75-dimensional Mel-frequency coefficients (MFC) as the speech feature parameters, with 75 × 16 bits of feature parameters per frame;
Step S2, providing personal adaptive speech labels:
Step S21, inputting personal adaptive text data;
Step S22, converting the text content of the personal adaptive text data into triphone-structured phoneme labels according to the phoneme state list, following the standard conventions of a pronunciation dictionary;
Step S3, providing a base feature model:
the base feature model is a multilayer deep neural network (DNN) model whose input layer matches the speech features of step S1 and whose output layer matches the speech labels of step S2;
Step S4, using a DNN adaptation algorithm, updating the base feature model with the personal adaptive speech features and their corresponding labels to generate an adaptive model; and
Step S6, packaging the model to generate a personal feature library.
In a preferred scheme of the adaptive speech feature model training method for the aviation cockpit environment, the base feature model is trained on large-scale speech data, preferably more than 3000 hours of training speech.
In a preferred scheme of the adaptive speech feature model training method for the aviation cockpit environment, step S5 is performed between step S4 and step S6:
Step S5, recognition test, verifying the adaptive model's improvement for personal speech recognition:
Step S51, simulating the aviation cockpit environment and inputting test speech with its corresponding labels, the test speech consisting of command words;
Step S52, converting the test speech into the same Mel-frequency coefficient (MFC) features as in step S133;
Step S53, performing a Viterbi search on the adaptive model and taking the highest-scoring match as the recognition result; the resulting test text is compared against the labels of step S51 to obtain the recognition performance of the adaptive model.
Compared with the prior art, the advantage of the invention is at least that updating the base feature model with personal adaptive speech features generates an adaptive model with higher recognition capability, markedly improving the recognition rate of avionics voice products.
Brief description of the drawings
Fig. 1 is a flow diagram of one embodiment of the invention.
Detailed description of the embodiments
The present invention is described in further detail below through specific embodiments with reference to the accompanying drawing.
Referring to Fig. 1, the figure shows the adaptive speech feature model training method for the aviation cockpit environment. The method performs the following steps in sequence:
Step S1, collecting personal adaptive speech features.
Step S11, simulating the aviation cockpit environment and inputting personal adaptive speech data.
Step S12, recording the personal adaptive speech data as 16 kHz, 16-bit speech data, collected in the simulated aviation cockpit environment.
Step S13, extracting personal adaptive speech features:
Step S131, obtaining framed speech data.
Step S132, framing the speech data at 400 sampling points per frame.
Step S133, using 75-dimensional Mel-frequency coefficients (MFC) as the speech feature parameters, with 75 × 16 bits of feature parameters per frame.
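The framing and feature extraction of steps S131–S133 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the 10 ms hop, Hamming window, 512-point FFT, and triangular log-Mel filterbank are common-practice assumptions the patent does not specify; only the 16 kHz sampling rate, 400-sample frames, and 75 feature dimensions come from the text.

```python
import numpy as np

SAMPLE_RATE = 16_000          # step S12: 16 kHz sampling
FRAME_LEN = 400               # step S132: 400 samples per frame (25 ms)
N_MELS = 75                   # step S133: 75-dimensional Mel features

def frame_signal(signal, frame_len=FRAME_LEN, hop=160):
    """Split a 1-D signal into overlapping frames (10 ms hop assumed)."""
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]

def mel_filterbank(n_mels=N_MELS, n_fft=512, sr=SAMPLE_RATE):
    """Triangular Mel filterbank matrix, shape (n_mels, n_fft // 2 + 1)."""
    def hz_to_mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel_to_hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):
            fb[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[m - 1, k] = (right - k) / max(right - center, 1)
    return fb

def extract_features(signal, n_fft=512):
    """Per-frame log-Mel features: one 75-dimensional vector per frame."""
    frames = frame_signal(signal) * np.hamming(FRAME_LEN)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2
    return np.log(power @ mel_filterbank().T + 1e-10)

# One second of synthetic "cockpit" audio: a tone buried in noise.
rng = np.random.default_rng(0)
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
audio = np.sin(2 * np.pi * 440 * t) + 0.1 * rng.standard_normal(SAMPLE_RATE)
feats = extract_features(audio)   # shape: (frames, 75)
```

With a 400-sample window and 160-sample hop over one second of audio this produces 98 frames of 75 features each, matching the per-frame 75 × 16-bit layout the patent describes.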
Step S2, providing personal adaptive speech labels.
Step S21, inputting personal adaptive text data.
Step S22, converting the text content of the personal adaptive text data into triphone-structured phoneme labels according to the phoneme state list, following the standard conventions of a pronunciation dictionary.
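Step S22's conversion from text to triphone labels can be illustrated with a toy sketch. The two-word pronunciation dictionary, the `left-center+right` triphone notation, and the `sil` boundary padding are all assumptions for illustration; the patent specifies neither its dictionary nor its exact label format.

```python
# Toy pronunciation dictionary (hypothetical entries for illustration).
LEXICON = {
    "gear": ["g", "ih", "r"],
    "down": ["d", "aw", "n"],
}

def text_to_triphones(words, lexicon=LEXICON):
    """Expand a word sequence into context-dependent triphone labels.

    Each phone is written as left-center+right, with 'sil' padding at
    the utterance boundaries (a common convention, assumed here).
    """
    phones = ["sil"]
    for w in words:
        phones.extend(lexicon[w])
    phones.append("sil")
    # Every interior phone becomes one triphone carrying its neighbors.
    return [f"{phones[i-1]}-{phones[i]}+{phones[i+1]}"
            for i in range(1, len(phones) - 1)]

labels = text_to_triphones(["gear", "down"])
# e.g. the first label is "sil-g+ih": phone g with silence on the left
# and ih on the right.
```

In a real system the triphone inventory would then be mapped through the phoneme state list to the DNN's output units; that mapping is omitted here.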
Step S3, providing the base feature model.
The base feature model is a multilayer deep neural network (DNN) model whose input layer matches the speech features of step S1 and whose output layer matches the speech labels of step S2. The base feature model is trained on large-scale speech data (more than 3000 hours).
Step S4, using a DNN adaptation algorithm, updating the base feature model with the personal adaptive speech features and their corresponding labels to generate the adaptive model.
Step S5, recognition test, verifying the adaptive model's improvement for personal speech recognition.
Step S51, simulating the aviation cockpit environment and inputting test speech with its corresponding labels, the test speech consisting of command words.
Step S52, converting the test speech into the same Mel-frequency coefficient (MFC) features as in step S133.
Step S53, performing a Viterbi search on the adaptive model and taking the highest-scoring match as the recognition result; the resulting test text is compared against the labels of step S51 to obtain the recognition performance of the adaptive model.
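The Viterbi search of step S53 can be sketched as follows. The left-to-right HMM topology, the two hypothetical command words, and the synthetic frame log-likelihoods are assumptions for illustration; only the idea of scoring each command hypothesis and taking the highest-scoring match as the recognition result comes from the text.

```python
import numpy as np

def viterbi_log_score(log_obs, log_trans):
    """Best-path log score of an observation sequence through a
    left-to-right HMM (log_obs: T x S frame log-likelihoods,
    log_trans: S x S log transition matrix)."""
    T, S = log_obs.shape
    delta = np.full(S, -np.inf)
    delta[0] = log_obs[0, 0]             # must start in the first state
    for t in range(1, T):
        delta = log_obs[t] + np.max(delta[:, None] + log_trans, axis=0)
    return delta[-1]                     # must end in the last state

# Toy setup: two hypothetical command-word models, three states each,
# with self-loop-or-advance transitions (near-zero entries are floored
# rather than forbidden, to keep the log well defined).
S = 3
log_trans = np.log(np.array([[0.5, 0.5, 0.0],
                             [0.0, 0.5, 0.5],
                             [0.0, 0.0, 1.0]]) + 1e-12)

rng = np.random.default_rng(1)
# Fake per-frame state log-likelihoods for a 10-frame test utterance
# under each command model; model 0 fits clearly better by design.
obs_model0 = -0.1 * rng.random((10, S))
obs_model1 = -5.0 - rng.random((10, S))

scores = {name: viterbi_log_score(obs, log_trans)
          for name, obs in [("gear-down", obs_model0),
                            ("flaps-up", obs_model1)]}
best = max(scores, key=scores.get)       # highest score = recognition result
```

Comparing `best` against the known label of the test utterance, over a set of command-word utterances, yields the recognition rate used to judge the adaptive model's performance.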
Step S6, packaging the model to generate the personal feature library.
Here, the multilayer deep neural network (DNN) model uses a multilayer neural network to fit the speaker's pronunciation characteristics nonlinearly; relative to conventional models it is more robust, more resistant to noise, and achieves a higher recognition rate. Adapting the standard DNN model with a small amount of speaker speech makes the DNN model conform better to the speaker's characteristics.
The above expresses only embodiments of the present invention, and although the description is specific and detailed, it should not therefore be understood as limiting the scope of the invention patent. It should be pointed out that a person of ordinary skill in the art can make various modifications and improvements without departing from the conception of the invention, and these fall within the protection scope of the invention. Therefore, the protection scope of the invention patent shall be determined by the appended claims.
Claims (3)
1. An adaptive speech feature model training method for the aviation cockpit environment, characterized by comprising:
step S1, collecting personal adaptive speech features:
step S11, simulating the aviation cockpit environment and inputting personal adaptive speech data;
step S12, recording the personal adaptive speech data as 16 kHz, 16-bit speech data, collected in the simulated aviation cockpit environment;
step S13, extracting personal adaptive speech features:
step S131, obtaining framed speech data;
step S132, framing the speech data at 400 sampling points per frame;
step S133, using 75-dimensional Mel-frequency coefficients (MFC) as the speech feature parameters, with 75 × 16 bits of feature parameters per frame;
step S2, providing personal adaptive speech labels:
step S21, inputting personal adaptive text data;
step S22, converting the text content of the personal adaptive text data into triphone-structured phoneme labels according to the phoneme state list, following the standard conventions of a pronunciation dictionary;
step S3, providing a base feature model:
the base feature model being a multilayer deep neural network (DNN) model whose input layer matches the speech features of step S1 and whose output layer matches the speech labels of step S2;
step S4, using a DNN adaptation algorithm, updating the base feature model with the personal adaptive speech features and their corresponding labels to generate an adaptive model; and
step S6, packaging the model to generate a personal feature library.
2. The adaptive speech feature model training method for the aviation cockpit environment according to claim 1, characterized in that the base feature model is trained on large-scale speech data, preferably more than 3000 hours of training speech.
3. The adaptive speech feature model training method for the aviation cockpit environment according to claim 1, characterized in that step S5 is performed between step S4 and step S6:
step S5, recognition test, verifying the adaptive model's improvement for personal speech recognition:
step S51, simulating the aviation cockpit environment and inputting test speech with its corresponding labels, the test speech consisting of command words;
step S52, converting the test speech into the same Mel-frequency coefficient (MFC) features as in step S133;
step S53, performing a Viterbi search on the adaptive model and taking the highest-scoring match as the recognition result; the resulting test text is compared against the labels of step S51 to obtain the recognition performance of the adaptive model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN201710450397.8A | 2017-06-15 | 2017-06-15 | Aviation cockpit environment self-adaption phonetic feature model training method
Publications (1)
Publication Number | Publication Date
---|---
CN107170444A | 2017-09-15
Family ID: 59818558
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN105529026A | 2014-10-17 | 2016-04-27 | 现代自动车株式会社 | Speech recognition device and speech recognition method
CN105096940A | 2015-06-30 | 2015-11-25 | 百度在线网络技术(北京)有限公司 | Method and device for voice recognition
CN105869624A | 2016-03-29 | 2016-08-17 | 腾讯科技(深圳)有限公司 | Method and apparatus for constructing speech decoding network in digital speech recognition
CN105976812A | 2016-04-28 | 2016-09-28 | 腾讯科技(深圳)有限公司 | Voice identification method and equipment thereof
CN106228980A | 2016-07-21 | 2016-12-14 | 百度在线网络技术(北京)有限公司 | Data processing method and device
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN112331207A | 2020-09-30 | 2021-02-05 | 音数汇元(上海)智能科技有限公司 | Service content monitoring method and device, electronic equipment and storage medium
CN112365883A | 2020-10-29 | 2021-02-12 | 安徽江淮汽车集团股份有限公司 | Cabin system voice recognition test method, device, equipment and storage medium
CN112365883B | 2020-10-29 | 2023-12-26 | 安徽江淮汽车集团股份有限公司 | Cabin system voice recognition test method, device, equipment and storage medium
CN112634692A | 2020-12-15 | 2021-04-09 | 成都职业技术学院 | Emergency evacuation deduction training system for crew cabins
Legal Events
Code | Title | Description
---|---|---
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170915