CN105702256A

CN105702256A - Numerical string voice identification method based on airborne equipment

Info

Publication number: CN105702256A
Application number: CN201410701895.1A
Authority: CN
Inventors: 李曜
Original assignee: Shanghai Aviation Electric Co Ltd
Current assignee: Shanghai Aviation Electric Co Ltd
Priority date: 2014-11-28
Filing date: 2014-11-28
Publication date: 2016-06-22
Anticipated expiration: 2034-11-28
Also published as: CN105702256B

Abstract

The invention discloses a digit-string speech recognition method based on airborne equipment. In the recognition stage, the method uses a two-pass recognition framework to perform a second-pass confirmation on speech that contains digit strings; in the second pass, a digit-specific model performs the recognition, which improves digit-string recognition accuracy. In the result confirmation stage, a local voice re-input mode is used to correct wrongly recognized digit strings: reducing the content to be recognized raises the probability that the whole input is recognized correctly and shortens the time spent correcting recognition errors. By improving digit-string recognition performance and streamlining the human-machine interaction flow for correcting misrecognized results, the method keeps the time a pilot spends head-down entering digit strings on a touch screen as short as possible, which improves flight safety and reduces the pilot's control workload.

Description

A digit-string speech recognition method based on airborne equipment

Technical field

The invention belongs to the field of speech recognition and relates in particular to a digit-string speech recognition method based on airborne equipment. It is mainly used to enter digit strings quickly on airborne equipment and to correct misrecognized digits quickly.

Background art

An aircraft cockpit is cramped and its controls are complex. Speech recognition can improve the ergonomics of flying the aircraft and reduce the pilot's operating workload. Speech recognition is a technology that converts the digital audio signal of human speech into text representing what was said. In situations that require manually entering information into a machine, voice input can replace manual input, giving people a new mode of human-machine interaction.

Speech recognition belongs to the category of pattern recognition. It relies on pre-trained models and a complete recognition algorithm to convert speech into text. Speech recognition cannot guarantee that every recognition result is correct; its performance is limited by the quality of model training and by the quality of the engineering implementation of the recognition algorithm. When speech recognition is used on airborne equipment, the content to be recognized usually comes from a small text set, so the recognition task is relatively easy and good recognition performance can be obtained.

Digit-string input is a common requirement in the airborne field: information such as flight altitude, longitude and latitude, and communication frequency all involves many digit strings. Digit-string recognition, however, is a difficult point in speech recognition. The ten digits are easily confused with one another, so the proportion of deletion, insertion, and substitution errors rises sharply when a string of digits is recognized. If, after entering a string of digits by voice, the pilot still has to correct the misrecognized digits one by one on a virtual numeric keypad on the touch display, then the more errors there are, the more correction operations are needed and the longer they take. Voice input then fails to achieve its intended effect of replacing manual input to reduce operating workload and flight safety hazards. For speech recognition on airborne equipment, it is therefore necessary to improve digit-string recognition.

Summary of the invention

The object of the invention is to provide a digit-string speech recognition method based on airborne equipment. It proposes a solution to the relatively poor performance of digit-string recognition in speech recognition, with particular attention to how awkward operation is for a pilot in a cramped cockpit. It keeps the time the pilot spends head-down entering digit strings on the touch display as short as possible, thereby improving flight safety and reducing the pilot's control workload.

To achieve this object, the technical solution of the present invention is as follows: a digit-string speech recognition method based on airborne equipment, characterized in that the method comprises the following steps: A. instruction voice input; B. decoding the input voice once with a speech recognition model and judging whether the voice contains a digit string; if not, outputting the final recognition result; if so, proceeding to step C; C. obtaining the boundary information of the digit string and, from the boundary information, obtaining the audio corresponding to the digit string; D. recognizing the audio corresponding to the digit string with a digit-specific model and outputting a second-pass recognition result; E. outputting the final recognition result.
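Steps A to E can be illustrated with a minimal Python sketch. It is not the patented implementation: `Token`, `recognize`, and the three callables `first_pass`, `digit_model`, and `slice_audio` are hypothetical names introduced here, and the sketch assumes the first pass returns tokens with time boundaries and that a token made up only of digits counts as a digit string.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Token:
    text: str      # word or digit string recognized in the first pass
    start: float   # start time in seconds (boundary information)
    end: float     # end time in seconds (boundary information)


def recognize(
    audio: Any,
    first_pass: Callable[[Any], List[Token]],         # hypothetical general model
    digit_model: Callable[[Any], str],                # hypothetical digit-specific model
    slice_audio: Callable[[Any, float, float], Any],  # hypothetical audio cutter
) -> str:
    # Step B: decode once and check whether any digit string is present.
    tokens = first_pass(audio)
    if not any(t.text.isdigit() for t in tokens):
        return " ".join(t.text for t in tokens)  # no digits: first pass is final

    words = []
    for t in tokens:
        if t.text.isdigit():
            # Step C: use the boundary information to cut out the digit audio.
            segment = slice_audio(audio, t.start, t.end)
            # Step D: re-recognize that segment with the digit-specific model.
            words.append(digit_model(segment))
        else:
            words.append(t.text)
    # Step E: output the final recognition result.
    return " ".join(words)
```

Passing the models in as callables keeps the sketch independent of any particular recognition toolkit.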

Step D further includes a result confirmation step, performed when the second-pass recognition result is output. The result confirmation step comprises: a. displaying the second-pass recognition result on the touch screen, with each digit string as one unit; b. judging whether each digit string is correct; if correct, ending the correction; if incorrect, proceeding to c; c. locating the wrong digit string to be replaced and entering that digit string again by voice; d. recognizing this digit string with the digit-specific model and replacing the wrong digit string; e. outputting the final result. When second-pass confirmation is performed on speech containing digit strings, the boundary information of the digit strings can be calibrated against the boundary information of the grammar-rule keywords of the current instruction.
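The confirmation steps a to e amount to a small correction loop, sketched below under stated assumptions. `confirm_digit_strings`, `pick_wrong_unit`, and `record_and_recognize` are hypothetical names: the first callable stands in for the pilot inspecting the touch screen and returning the index of a wrong unit (or `None` when all are correct), the second for recording the re-spoken digit string and recognizing it with the digit-specific model.

```python
from typing import Callable, List, Optional, Sequence


def confirm_digit_strings(
    units: Sequence[str],
    pick_wrong_unit: Callable[[List[str]], Optional[int]],  # hypothetical UI hook
    record_and_recognize: Callable[[], str],                # hypothetical re-input hook
) -> List[str]:
    current = list(units)  # step a: each digit string is one displayed unit
    while True:
        wrong_index = pick_wrong_unit(current)  # step b: is anything wrong?
        if wrong_index is None:
            return current  # step e: all units confirmed, output final result
        # Steps c and d: re-speak only the wrong unit and replace it.
        current[wrong_index] = record_and_recognize()
```

Only the selected unit is re-recognized; the units already confirmed are left untouched.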

The wrong digit string can be located by indicating on the touch screen which digit string is to be replaced. Alternatively, when the digit string is entered again by voice, a boundary information character is appended to the end of the digit string, and the wrong digit string to be replaced is determined from that boundary information character.

The present invention uses a two-pass recognition framework to perform second-pass confirmation on speech containing digit strings, and uses a digit-specific model in the second pass, thereby improving digit-string recognition accuracy. In the result confirmation stage, local voice re-input is used to correct misrecognized digit strings; reducing the content to be recognized raises the probability that the whole input is recognized correctly and shortens the time spent correcting recognition errors. By improving digit-string recognition performance and optimizing the human-machine interaction flow for correcting misrecognized results, the present invention makes the pilot's follow-up correction simpler and keeps the time spent head-down entering digit strings on the touch display as short as possible, thereby improving flight safety and reducing the pilot's control workload.

Brief description of the drawings

Fig. 1 is a flow block diagram of the digit-string second-pass confirmation scheme of the present invention.

Fig. 2 is a flow block diagram of the erroneous-result correction scheme of the present invention.

Fig. 3 is an overall flow chart of the digit-string recognition scheme of the present invention.

The present invention is described in detail below with reference to the drawings and embodiments.

Detailed description of the invention

The present invention improves digit-string speech recognition performance in two respects. First, a two-pass recognition framework performs second-pass confirmation on speech containing digit strings, using a digit-specific model in the second pass to improve digit-string recognition accuracy. Second, local voice re-input is used to correct misrecognized digit strings; reducing the content to be recognized raises the probability that the whole input is recognized correctly and shortens the time spent correcting recognition errors. The scheme is described below with a specific example.

Suppose the pilot wants to set the longitude of the navigation destination by voice. He needs to say the following: "Set destination longitude, east longitude 135 degrees 36 minutes 48 seconds." (For purposes of comparison, we split this into two voice instructions: "set destination longitude" and "east longitude 135 degrees 36 minutes 48 seconds".)

After the pilot says "set destination longitude", the airborne speech recognition system recognizes the content of this instruction and, following the recognition scheme proposed by the present invention, checks whether the instruction contains a digit string. Having confirmed that it does not, the system waits for the next voice instruction.

After the pilot goes on to say "east longitude 135 degrees 36 minutes 48 seconds", the airborne speech recognition system recognizes the content of this instruction and again checks whether it contains digit strings. Having confirmed that it does, the system uses the boundary information of each character, word, and digit in the recognition result to find the audio data corresponding to each digit string. The boundary information can be obtained and saved during recognition, or obtained by performing forced alignment after recognition ends. Once the audio data corresponding to a digit string has been found, a pre-trained digit-specific model recognizes that segment of audio again; the present invention calls this second-pass confirmation. Because it is known in advance that the content is a digit string, the recognition search space is greatly reduced, and because the model was trained specifically on digit strings, recognition accuracy improves over the first pass. The instruction "east longitude 135 degrees 36 minutes 48 seconds" contains three digit strings, so each of the three corresponding audio segments undergoes second-pass confirmation.

In a concrete airborne speech recognition system, once the instruction "set destination longitude" has been obtained, the system has a prior expectation of the grammar of the instruction that follows. It already knows that the following instruction will take the form "east/west longitude ** degrees ** minutes ** seconds"; only the specific digit-string content is unknown, and the degree, minute, and second information may be incomplete. Given this prior knowledge, the recognition system can use preset grammar-rule keywords such as "longitude", "degrees", "minutes", and "seconds" to locate positions in the recognition result, and use the boundary information of these keywords to help confirm the boundaries of the digit strings. This makes the audio data corresponding to each digit string more complete and accurate, which also benefits second-pass recognition performance.
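One possible reading of keyword-assisted boundary calibration is sketched below; the description does not specify the exact rule, so this is an assumption. The sketch treats the audio between the end of one grammar keyword and the start of the next as the span of the digit string lying between them. `Token`, `KEYWORDS`, and `digit_spans` are hypothetical names, and the English keywords stand in for the original Chinese ones.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Token:
    text: str
    start: float  # seconds
    end: float    # seconds


# Assumed English stand-ins for the preset grammar-rule keywords.
KEYWORDS = ("longitude", "degrees", "minutes", "seconds")


def digit_spans(
    tokens: List[Token], keywords: Sequence[str] = KEYWORDS
) -> List[Tuple[float, float]]:
    """Return one (start, end) audio span per gap between consecutive keywords."""
    marks = [t for t in tokens if t.text in keywords]
    spans = []
    for left, right in zip(marks, marks[1:]):
        if right.start > left.end:  # skip keywords that touch or overlap
            spans.append((left.end, right.start))
    return spans
```

Because the spans are anchored on the keywords, they can be wider than the digit boundaries reported by the first pass, which is one way the digit audio could be made more complete before the second pass.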

After this second-pass confirmation, the probability that the instruction "east longitude 135 degrees 36 minutes 48 seconds" is recognized correctly is higher, but recognition errors are still possible. Such errors will mostly fall on the three digit strings "135", "36", and "48", and most likely on just one of them. To correct an error, ordinary airborne equipment offers two options. One is to call up the virtual numeric keypad on the touch display again, delete the wrong digits, and enter the correct ones. The other is to repeat the whole voice instruction "east longitude 135 degrees 36 minutes 48 seconds" in the hope that the second recognition will be correct, although there is some probability that it still will not be entirely correct. Either option makes the pilot feel that the follow-up operations of voice input are too cumbersome, and so inclines the pilot to enter digit strings manually from the outset rather than use voice input.

The present invention proposes correcting misrecognized digit strings by local voice re-input. This rests on the premise that, when digit-string recognition performance is assured, the probability that a digit string is misrecognized is not high. That is, the instruction "east longitude 135 degrees 36 minutes 48 seconds" will most probably be recognized entirely correctly, and if there is a recognition error, it will almost always be confined to a single digit string. In that case there is no need to re-enter the whole instruction; only the digit string in error needs to be entered again. In a concrete application, the pilot can tap the wrong digit string displayed on the touch screen, start recording, and say that digit string again. The recognition system then recognizes this segment of speech directly with the digit-specific model and replaces the original wrong digit string with the result. Alternatively, the pilot need not tap the wrong digit string at all, but can speak the word that follows the digit string together with it when re-entering by voice. For example, if "135" was misrecognized, the pilot simply says "135 degrees" again; from the word "degrees" in the recognition result, the system determines which digit string in the original result the newly recognized digit string should replace. Whichever way is used, local voice re-input gives the second recognition a higher accuracy and also keeps the pilot's follow-up operations simple and convenient.
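The keyword-based replacement in the "135 degrees" example can be sketched as follows. This is an illustration under assumptions, not the described system: `UNIT_SLOTS` and `apply_reinput` are hypothetical names, the re-input is assumed to arrive already recognized as a digit string followed by one keyword, and the English keywords stand in for the original Chinese ones.

```python
from typing import Dict, List, Sequence

# Assumed mapping from the trailing keyword to the position of its digit string
# in a "** degrees ** minutes ** seconds" instruction.
UNIT_SLOTS: Dict[str, int] = {"degrees": 0, "minutes": 1, "seconds": 2}


def apply_reinput(
    units: Sequence[str],
    reinput: Sequence[str],
    unit_slots: Dict[str, int] = UNIT_SLOTS,
) -> List[str]:
    """Replace the digit string selected by the keyword spoken after it."""
    if len(reinput) != 2:
        raise ValueError("expected a digit string followed by one keyword")
    digits, keyword = reinput
    if not digits.isdigit() or keyword not in unit_slots:
        raise ValueError("re-input does not match '<digits> <keyword>'")
    corrected = list(units)            # leave the caller's list untouched
    corrected[unit_slots[keyword]] = digits
    return corrected
```

For instance, with a first result of `["185", "36", "48"]`, the re-input `["135", "degrees"]` replaces only the first unit.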

The speech recognition model and the digit-specific model involved in the present invention are prior art and are not described further here.

Claims

1. A digit-string speech recognition method based on airborne equipment, characterized in that the method comprises the following steps: A. instruction voice input; B. decoding the input voice once with a speech recognition model and judging whether the voice contains a digit string; if not, outputting the final recognition result; if so, proceeding to step C; C. obtaining the boundary information of the digit string and, from the boundary information, obtaining the audio corresponding to the digit string; D. recognizing the audio corresponding to the digit string with a digit-specific model and outputting a second-pass recognition result; E. outputting the final recognition result.

2. The digit-string speech recognition method as claimed in claim 1, characterized in that step D further includes a result confirmation step performed when the second-pass recognition result is output.

3. The digit-string speech recognition method as claimed in claim 2, characterized in that the result confirmation step comprises: a. displaying the second-pass recognition result on the touch screen, with each digit string as one unit; b. judging whether each digit string is correct; if correct, ending the correction; if incorrect, proceeding to c; c. locating the wrong digit string to be replaced and entering that digit string again by voice; d. recognizing this digit string with the digit-specific model and replacing the wrong digit string; e. outputting the final result.

4. The digit-string speech recognition method as claimed in claim 2, characterized in that the wrong digit string is located by indicating on the touch screen the wrong digit string to be replaced, or in that, when the digit string is entered again by voice, a boundary information character is appended to the end of the digit string and the wrong digit string to be replaced is determined from the boundary information character.