CN107274891A

CN107274891A - AR interface interaction method and system based on a speech recognition engine

Info

Publication number: CN107274891A
Application number: CN201710368669.XA
Authority: CN
Inventors: 胡德志; 孙碧亮; 袁超飞
Original assignee: Wuhan Bao Bao Software Co Ltd
Current assignee: Wuhan Bao Bao Software Co Ltd
Priority date: 2017-05-23
Filing date: 2017-05-23
Publication date: 2017-10-20

Abstract

The present invention relates to an AR interface interaction method and system based on a speech recognition engine. The method comprises the following steps: S1, generating a voice instruction using a speech recognition engine; S2, identifying the coordinate position of a virtual object in an AR interface, and causing the virtual object in the AR interface to perform a corresponding interactive behavior at the coordinate position according to the voice instruction. The method solves the problem that existing augmented reality technology cannot recognize the player's voice, and enriches the interaction between people and games in AR applications: the player can not only interact with virtual objects through pictures, but can also control certain behaviors of the virtual objects by voice, which enhances the interest and playability of AR games.

Description

AR interface interaction method and system based on a speech recognition engine

Technical field

The present invention relates to the field of AR technology, and in particular to an AR interface interaction method and system based on a speech recognition engine.

Background art

In existing augmented reality (Augmented Reality, AR for short) games, the player controls a virtual character by operating buttons on the device interface. This style of play does not allow the player to command virtual objects on the AR interface quickly. Meanwhile, existing speech recognition technology cannot achieve very high recognition accuracy on large vocabularies, and voice instructions in an AR game cannot tolerate a high false detection rate, because a high false detection rate reduces the playability of the game.

Summary of the invention

The technical problem to be solved by the present invention is to provide an AR interface interaction method and system based on a speech recognition engine, which solve the problem that existing augmented reality technology cannot recognize the player's voice. The invention not only improves the playability of AR game applications, but also enriches the interaction between people and virtual reality and enhances the interest of AR products.

The technical scheme by which the present invention solves the above technical problem is as follows: an AR interface interaction method based on a speech recognition engine, comprising the following steps,

S1, generating a voice instruction using a speech recognition engine;

S2, identifying the coordinate position of a virtual object in an AR interface, and causing the virtual object in the AR interface to perform a corresponding interactive behavior at the coordinate position according to the voice instruction.

The beneficial effects of the invention are as follows: the AR interface interaction method based on a speech recognition engine generates a voice instruction using the speech recognition engine, identifies the coordinate position of the virtual object in the AR interface, and causes the virtual object to perform a corresponding interactive behavior at that coordinate position according to the voice instruction. The method solves the problem that existing augmented reality technology cannot recognize the player's voice and enriches the interaction between people and games in AR applications: the player can not only interact with virtual objects through pictures, but can also control certain behaviors of the virtual objects by voice, which enhances the interest and playability of AR games.
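The two steps S1 and S2 can be sketched roughly as follows. This is only an illustrative sketch: the names `VirtualObject`, `BEHAVIORS`, `generate_voice_instruction` and `interact` are hypothetical and do not come from the patent, and S1 is reduced to normalizing text that a recognizer is assumed to have already produced.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Coordinate = Tuple[float, float, float]


@dataclass
class VirtualObject:
    name: str
    position: Coordinate  # coordinate position tracked in the AR interface
    last_action: str = "idle"


def make_behavior(action: str) -> Callable[[VirtualObject], str]:
    """Build a behavior that records the action on the object (hypothetical)."""
    def behavior(obj: VirtualObject) -> str:
        obj.last_action = action
        return f"{obj.name} performs '{action}' at {obj.position}"
    return behavior


# Hypothetical table mapping a voice instruction to an interactive behavior.
BEHAVIORS: Dict[str, Callable[[VirtualObject], str]] = {
    "attack forward": make_behavior("attack forward"),
    "defend backward": make_behavior("defend backward"),
}


def generate_voice_instruction(recognized_text: str) -> str:
    """S1 stand-in: normalize recognizer output into a voice instruction."""
    return recognized_text.strip().lower()


def interact(obj: VirtualObject, instruction: str) -> str:
    """S2: make the object act at its coordinate position per the instruction."""
    behavior = BEHAVIORS.get(instruction)
    if behavior is None:
        return f"{obj.name} ignores unknown instruction"
    return behavior(obj)
```

Keeping the instruction table separate from the tracking data means an unrecognized instruction simply leaves the virtual object unchanged, which is one way to limit the effect of false detections.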

On the basis of the above technical scheme, the present invention can be further improved as follows.

Further, S1 specifically comprises,

S11, performing off-line learning on the vocabulary requiring speech recognition to obtain a speech recognition library;

S12, importing the speech recognition library into the speech recognition engine;

S13, inputting a voice signal into the speech recognition engine and matching the voice signal against the data in the speech recognition library, the speech recognition engine generating a corresponding voice instruction according to the matching result.
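The import and matching of S12 and S13 might look like the toy sketch below. It is not how a real acoustic-model decoder works: the library is assumed to hold one feature vector per instruction, matching is a plain nearest-template search, and `RecognitionEngine`, `max_distance` and the feature values are all hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class RecognitionEngine:
    """Toy engine: one template vector per instruction (an assumption)."""

    def __init__(self, max_distance: float) -> None:
        self.max_distance = max_distance
        self.library: Dict[str, List[float]] = {}

    def import_library(self, library: Dict[str, List[float]]) -> None:
        """S12: load the library produced by off-line learning."""
        self.library = dict(library)

    def recognize(self, features: Sequence[float]) -> Optional[str]:
        """S13: return the closest instruction, or None if nothing is close."""
        best_name: Optional[str] = None
        best_dist = float("inf")
        for name, template in self.library.items():
            d = distance(features, template)
            if d < best_dist:
                best_name, best_dist = name, d
        return best_name if best_dist <= self.max_distance else None
```

Returning `None` when no template is close enough is a deliberate choice in this sketch: rejecting an uncertain input is preferable to issuing a wrong instruction in a game that cannot tolerate a high false detection rate.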

Further, S11 specifically comprises,

S111, recording standard pronunciations of the vocabulary requiring speech recognition to generate instruction audio files;

S112, training on the instruction audio files with the off-line learning tool of the speech recognition engine, extracting the feature information in the instruction audio files, and generating instruction text files;

S113, performing statistics on the instruction text files and extracting the statistical relationships of successive occurrence between different instruction texts;

S114, building a speech recognition model according to the statistical relationships, and outputting initial precision parameters of the speech recognition model;

S115, repeatedly testing the speech recognition model using the precision parameters, and adjusting the initial precision parameters according to the test results to obtain final precision parameters;

S116, combining the final precision parameters with the speech recognition model to generate the speech recognition library.

The beneficial effect of the above further scheme is as follows: the method is based on a speech recognition engine; the audio files of the voice instructions requiring off-line learning are first trained on repeatedly and the recognition parameters are tuned repeatedly, so as to obtain a high-accuracy speech recognition library and improve the precision of speech recognition.
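Two of the off-line learning steps lend themselves to a small sketch: the successive-occurrence statistics of S113 and the test-and-adjust loop of S115. The sketch assumes that the statistics are adjacent-word counts and that tuning means picking the best of a fixed set of candidate values; the patent does not specify either, and `successive_counts`, `tune_parameter` and the `evaluate` callback are hypothetical names.

```python
from collections import Counter
from typing import Callable, Iterable, Sequence, Tuple


def successive_counts(transcripts: Iterable[str]) -> Counter:
    """S113 stand-in: count how often one word directly follows another."""
    counts: Counter = Counter()
    for line in transcripts:
        words = line.lower().split()
        counts.update(zip(words, words[1:]))
    return counts


def tune_parameter(
    candidates: Sequence[float],
    evaluate: Callable[[float], float],
) -> Tuple[float, float]:
    """S115 stand-in: test every candidate value and keep the most accurate.

    `evaluate` is assumed to run the model on a test set with the given
    parameter value and return an accuracy between 0 and 1.
    """
    best = max(candidates, key=evaluate)
    return best, evaluate(best)
```

For example, `successive_counts(["attack forward", "attack left"])` records that "forward" and "left" each follow "attack" once, which is the kind of relationship a small-vocabulary model could be built from.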

Further, the speech recognition engine is developed on the basis of the PocketSphinx speech recognition system.

The beneficial effect of the above further scheme is as follows: a speech recognition engine developed on the basis of the PocketSphinx speech recognition system is small in both computational load and size; its recognition accuracy on small vocabularies is very high, its impact on performance is small, and it responds quickly, which further solves the problem that existing augmented reality technology cannot recognize the player's voice.
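One common way to give PocketSphinx a small command vocabulary is a keyphrase list: a text file with one phrase per line followed by a detection threshold between slashes. The patent does not say whether this mode is used, so the sketch below is an assumption; the threshold values are placeholders that would normally be found by repeated testing, and `COMMAND_THRESHOLDS` and `render_keyphrase_list` are hypothetical names.

```python
from typing import Dict

# Placeholder thresholds: real values would come out of repeated testing.
COMMAND_THRESHOLDS: Dict[str, str] = {
    "attack forward": "1e-20",
    "attack left": "1e-20",
    "attack right": "1e-20",
    "defend backward": "1e-20",
}


def render_keyphrase_list(thresholds: Dict[str, str]) -> str:
    """Render one 'phrase /threshold/' line per instruction."""
    lines = [f"{phrase} /{value}/" for phrase, value in thresholds.items()]
    return "\n".join(lines) + "\n"
```

Under this assumption, a per-phrase threshold is one concrete reading of a tunable recognition parameter: lowering it makes a phrase easier to trigger, while raising it reduces false detections.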

Further, in S13, before the voice signal is matched against the data in the speech recognition library, the method further comprises: filtering the voice signal.

The beneficial effect of the above further scheme is as follows: filtering the voice signal before matching it against the data in the speech recognition library removes noise from the voice signal, avoids interference, and improves the matching accuracy.
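The patent does not name a particular filter. As a minimal illustration of filtering before matching, the sketch below smooths the samples with a trailing moving average, which is a simple low-pass filter; a production system would more likely use a dedicated noise-suppression stage. The function name and the default window are assumptions.

```python
from typing import List, Sequence


def moving_average(samples: Sequence[float], window: int = 3) -> List[float]:
    """Smooth the signal with a trailing moving average of the given width."""
    if window < 1:
        raise ValueError("window must be at least 1")
    out: List[float] = []
    running = 0.0
    for i, sample in enumerate(samples):
        running += sample
        if i >= window:
            # Drop the sample that has just left the window.
            running -= samples[i - window]
        # Early samples are averaged over however many have been seen so far.
        out.append(running / min(i + 1, window))
    return out
```

Applied to `[3.0, 3.0, 3.0, 9.0]` with a window of 3, the isolated spike at the end is reduced from 9.0 to 5.0 while the steady part of the signal is left unchanged.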

Based on the above AR interface interaction method based on a speech recognition engine, the present invention also provides an AR interface interaction system based on a speech recognition engine.

An AR interface interaction system based on a speech recognition engine comprises a speech recognition engine and an AR engine,

the speech recognition engine being used to generate a voice instruction;

the AR engine being used to identify the coordinate position of a virtual object in an AR interface, and to cause the virtual object in the AR interface to perform a corresponding interactive behavior at the coordinate position according to the voice instruction.

The beneficial effects of the invention are as follows: in the AR interface interaction system based on a speech recognition engine, the speech recognition engine generates a voice instruction from the voice signal and sends it to the AR engine; the AR engine obtains the coordinate position of the virtual object in the AR interface through recognition and tracking, and the corresponding behavior control is applied to the virtual object in the AR interface. This solves the problem that existing augmented reality technology cannot recognize the player's voice and enriches the interaction between people and games in AR applications: the player can not only interact with virtual objects through pictures, but can also control certain behaviors of the virtual objects by voice, which enhances the interest and playability of AR games.

Further, the speech recognition engine is specifically used to,

perform off-line learning on the vocabulary requiring speech recognition to obtain a speech recognition library;

import the speech recognition library into the speech recognition engine;

receive a voice signal as input and match the voice signal against the data in the speech recognition library, the speech recognition engine generating a corresponding voice instruction according to the matching result.

Further, the speech recognition engine is specifically used to,

record standard pronunciations of the vocabulary requiring speech recognition to generate instruction audio files;

train on the instruction audio files with the off-line learning tool of the speech recognition engine, extract the feature information in the instruction audio files, and generate instruction text files;

perform statistics on the instruction text files and extract the statistical relationships of successive occurrence between different instruction texts;

build a speech recognition model according to the statistical relationships, and output initial precision parameters of the speech recognition model;

repeatedly test the speech recognition model using the precision parameters, and adjust the initial precision parameters according to the test results to obtain final precision parameters;

combine the final precision parameters with the speech recognition model to generate the speech recognition library.

The beneficial effect of the above further scheme is as follows: the system is based on a speech recognition engine; the audio files of the voice instructions requiring off-line learning are first trained on repeatedly and the recognition parameters are tuned repeatedly, so as to obtain a high-accuracy speech recognition library and improve the precision of speech recognition.

Further, in the speech recognition engine, before the voice signal is matched against the data in the speech recognition library, the voice signal is also filtered.

Brief description of the drawings

Fig. 1 is an overall flow chart of the AR interface interaction method based on a speech recognition engine according to the present invention;

Fig. 2 is a flow chart of generating a voice instruction in the AR interface interaction method based on a speech recognition engine according to the present invention;

Fig. 3 is a flow chart of performing off-line learning on the vocabulary requiring speech recognition to obtain the speech recognition library in the AR interface interaction method based on a speech recognition engine according to the present invention;

Fig. 4 is a structural block diagram of the AR interface interaction system based on a speech recognition engine according to the present invention.

Embodiment

The principles and features of the present invention are described below with reference to the accompanying drawings. The examples given serve only to explain the present invention and are not intended to limit its scope.

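As shown in Fig. 1, an AR interface interaction method based on a speech recognition engine comprises the following steps,

S1, generating a voice instruction using a speech recognition engine;

S2, identifying the coordinate position of a virtual object in an AR interface, and causing the virtual object in the AR interface to perform a corresponding interactive behavior at the coordinate position according to the voice instruction.

In this specific embodiment, S1 specifically comprises, as shown in Fig. 2,

S11, performing off-line learning on the vocabulary requiring speech recognition to obtain a speech recognition library;

S12, importing the speech recognition library into the speech recognition engine;

S13, inputting a voice signal into the speech recognition engine and matching the voice signal against the data in the speech recognition library, the speech recognition engine generating a corresponding voice instruction according to the matching result.

In this specific embodiment, the specific steps in S11 of performing off-line learning on the vocabulary requiring speech recognition to obtain the speech recognition library are, as shown in Fig. 3,

S111, recording standard pronunciations of the vocabulary requiring speech recognition to generate instruction audio files;

S112, training on the instruction audio files with the off-line learning tool of the speech recognition engine, extracting the feature information in the instruction audio files, and generating instruction text files;

S113, performing statistics on the instruction text files and extracting the statistical relationships of successive occurrence between different instruction texts;

S114, building a speech recognition model according to the statistical relationships, and outputting initial precision parameters of the speech recognition model;

S115, repeatedly testing the speech recognition model using the precision parameters, and adjusting the initial precision parameters according to the test results to obtain final precision parameters;

S116, combining the final precision parameters with the speech recognition model to generate the speech recognition library.

In this specific embodiment, the speech recognition engine is developed on the basis of the PocketSphinx speech recognition system.

In this specific embodiment, in S13, the voice signal is also filtered before it is matched against the data in the speech recognition library.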

The AR interface interaction method based on a speech recognition engine of the present invention is illustrated below, taking an AR game as an example.

A specific example:

In an AR game application developed using the method of the present invention, the player can issue simple commands to his or her soldiers. There are four instructions, in order: preferentially attack the enemy in front, preferentially attack the enemy on the left, preferentially attack the enemy on the right, and defend backward. A player who is holding an AR prop (such as a purpose-built toy gun) to experience the AR game cannot quickly command the soldiers by manually operating the corresponding buttons on the mobile terminal. A speech recognition engine developed using the method of the present invention can respond quickly and accurately to the four simple voice instructions "attack forward", "attack left", "attack right" and "defend backward". To achieve this, a tester first records standard pronunciations of the vocabulary of the voice instructions to generate instruction audio files. The instruction audio files are trained on with the off-line learning tool of the speech recognition engine, the feature information in the instruction audio files is extracted, and instruction text files are generated. Statistics are performed on the instruction text files, and the statistical relationships of successive occurrence between different instruction text files are extracted. A speech recognition model is built according to the statistical relationships, and the initial precision parameters of the speech recognition model are output. The speech recognition model is tested repeatedly using the precision parameters, and the initial precision parameters are adjusted according to the test results to obtain the final precision parameters. Finally, the final precision parameters are combined with the speech recognition model to generate the speech recognition library.

The speech recognition library obtained by learning is imported into the speech recognition engine, thereby enabling recognition of the voice instructions issued by a person. Specifically, the player speaks the corresponding voice instruction with standard pronunciation, for example "attack forward"; the mobile device captures the corresponding audio data and inputs it into the speech recognition engine; after corresponding filtering such as denoising, the audio data is matched against the data in the speech recognition library; the matching result is then sent to the AR engine, which responds to it.

The AR system stably obtains the position information of the virtual object in the real world by recognizing and tracking a designated picture. The speech recognition result passed in by the speech recognition engine is analyzed and evaluated, and then converted into a command instruction for the soldier in the game.
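How a recognition result might be turned into a command for the soldier can be sketched as a target-selection routine over tracked coordinates. The patent does not describe the selection rule, so everything here is an assumption: positions are reduced to a ground plane where +z is "front" and +x is "right", the nearest enemy in the commanded direction is chosen, and `DIRECTION_TESTS` and `choose_target` are hypothetical names.

```python
from typing import Callable, Dict, Optional, Tuple

Point = Tuple[float, float]  # (x, z) on the ground plane; +z front, +x right

# Hypothetical direction tests for the three attack instructions.
DIRECTION_TESTS: Dict[str, Callable[[float, float], bool]] = {
    "attack forward": lambda dx, dz: dz > abs(dx),
    "attack left": lambda dx, dz: -dx > abs(dz),
    "attack right": lambda dx, dz: dx > abs(dz),
}


def choose_target(
    instruction: str, soldier: Point, enemies: Dict[str, Point]
) -> Optional[str]:
    """Return the nearest enemy in the commanded direction, if there is one."""
    test = DIRECTION_TESTS.get(instruction)
    if test is None:
        return None  # e.g. "defend backward" selects no target
    best: Optional[str] = None
    best_d2 = float("inf")
    for name, (ex, ez) in enemies.items():
        dx, dz = ex - soldier[0], ez - soldier[1]
        d2 = dx * dx + dz * dz
        if test(dx, dz) and d2 < best_d2:
            best, best_d2 = name, d2
    return best
```

Because the enemy positions are passed in on every call, the same instruction can pick a different target as the tracked coordinates change from frame to frame.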

The method of the present invention gives augmented reality technology a speech recognition capability, strengthens the interaction between people and virtual reality, and enhances the interest and playability of AR games. When a player holds a toy gun to experience an AR game, it is no longer necessary, as in traditional games, to control the virtual character by operating buttons on the interface of the mobile terminal; instead, the virtual character in the AR game is controlled through the player's own movement and voice instructions.

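As shown in Fig. 4, an AR interface interaction system based on a speech recognition engine comprises a speech recognition engine and an AR engine,

the speech recognition engine being used to generate a voice instruction;

the AR engine being used to identify the coordinate position of a virtual object in an AR interface, and to cause the virtual object in the AR interface to perform a corresponding interactive behavior at the coordinate position according to the voice instruction.

In this specific embodiment, the speech recognition engine is specifically used to,

perform off-line learning on the vocabulary requiring speech recognition to obtain a speech recognition library;

import the speech recognition library into the speech recognition engine;

receive a voice signal as input and match the voice signal against the data in the speech recognition library, the speech recognition engine generating a corresponding voice instruction according to the matching result.

In this specific embodiment, the speech recognition engine is specifically used to,

record standard pronunciations of the vocabulary requiring speech recognition to generate instruction audio files;

train on the instruction audio files with the off-line learning tool of the speech recognition engine, extract the feature information in the instruction audio files, and generate instruction text files;

perform statistics on the instruction text files and extract the statistical relationships of successive occurrence between different instruction texts;

build a speech recognition model according to the statistical relationships, and output initial precision parameters of the speech recognition model;

repeatedly test the speech recognition model using the precision parameters, and adjust the initial precision parameters according to the test results to obtain final precision parameters;

combine the final precision parameters with the speech recognition model to generate the speech recognition library.

In this specific embodiment, in the speech recognition engine, the voice signal is also filtered before it is matched against the data in the speech recognition library.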

In the AR interface interaction system based on a speech recognition engine of the present invention, the speech recognition engine generates a voice instruction from the voice signal and sends it to the AR engine; the AR engine obtains the coordinate position of the virtual object in the AR interface through recognition and tracking, and the corresponding behavior control is applied to the virtual object in the AR interface. This solves the problem that existing augmented reality technology cannot recognize the player's voice and enriches the interaction between people and games in AR applications: the player can not only interact with virtual objects through pictures, but can also control certain behaviors of the virtual objects by voice, which enhances the interest and playability of AR games.

The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

1. An AR interface interaction method based on a speech recognition engine, characterized by comprising the following steps,

S1, generating a voice instruction using a speech recognition engine;

S2, identifying the coordinate position of a virtual object in an AR interface, and causing the virtual object in the AR interface to perform a corresponding interactive behavior at the coordinate position according to the voice instruction.

2. The AR interface interaction method based on a speech recognition engine according to claim 1, characterized in that S1 specifically comprises,

S11, performing off-line learning on the vocabulary requiring speech recognition to obtain a speech recognition library;

S12, importing the speech recognition library into the speech recognition engine;

S13, inputting a voice signal into the speech recognition engine and matching the voice signal against the data in the speech recognition library, the speech recognition engine generating a corresponding voice instruction according to the matching result.

3. The AR interface interaction method based on a speech recognition engine according to claim 2, characterized in that S11 specifically comprises,

S111, recording standard pronunciations of the vocabulary requiring speech recognition to generate instruction audio files;

S112, training on the instruction audio files with the off-line learning tool of the speech recognition engine, extracting the feature information in the instruction audio files, and generating instruction text files;

S113, performing statistics on the instruction text files and extracting the statistical relationships of successive occurrence between different instruction texts;

S114, building a speech recognition model according to the statistical relationships, and outputting initial precision parameters of the speech recognition model;

S115, repeatedly testing the speech recognition model using the precision parameters, and adjusting the initial precision parameters according to the test results to obtain final precision parameters;

S116, combining the final precision parameters with the speech recognition model to generate the speech recognition library.

4. The AR interface interaction method based on a speech recognition engine according to any one of claims 1 to 3, characterized in that the speech recognition engine is developed on the basis of the PocketSphinx speech recognition system.

5. The AR interface interaction method based on a speech recognition engine according to claim 2 or 3, characterized in that, in S13, before the voice signal is matched against the data in the speech recognition library, the method further comprises: filtering the voice signal.

6. An AR interface interaction system based on a speech recognition engine, characterized by comprising a speech recognition engine and an AR engine,

the speech recognition engine being used to generate a voice instruction;

the AR engine being used to identify the coordinate position of a virtual object in an AR interface, and to cause the virtual object in the AR interface to perform a corresponding interactive behavior at the coordinate position according to the voice instruction.

7. The AR interface interaction system based on a speech recognition engine according to claim 6, characterized in that the speech recognition engine is specifically used to,

perform off-line learning on the vocabulary requiring speech recognition to obtain a speech recognition library;

import the speech recognition library into the speech recognition engine;

receive a voice signal as input and match the voice signal against the data in the speech recognition library, the speech recognition engine generating a corresponding voice instruction according to the matching result.

8. The AR interface interaction system based on a speech recognition engine according to claim 7, characterized in that the speech recognition engine is specifically used to,

record standard pronunciations of the vocabulary requiring speech recognition to generate instruction audio files;

train on the instruction audio files with the off-line learning tool of the speech recognition engine, extract the feature information in the instruction audio files, and generate instruction text files;

perform statistics on the instruction text files and extract the statistical relationships of successive occurrence between different instruction texts;

build a speech recognition model according to the statistical relationships, and output initial precision parameters of the speech recognition model;

repeatedly test the speech recognition model using the precision parameters, and adjust the initial precision parameters according to the test results to obtain final precision parameters;

combine the final precision parameters with the speech recognition model to generate the speech recognition library.

9. The AR interface interaction system based on a speech recognition engine according to any one of claims 6 to 8, characterized in that the speech recognition engine is developed on the basis of the PocketSphinx speech recognition system.

10. The AR interface interaction system based on a speech recognition engine according to claim 7 or 8, characterized in that, in the speech recognition engine, before the voice signal is matched against the data in the speech recognition library, the voice signal is also filtered.