CN108363556A - Method and system for interacting with an augmented reality environment by voice - Google Patents
- Publication number: CN108363556A
- Application number: CN201810090559.6A
- Authority: CN (China)
- Prior art keywords: augmented reality, operation instruction, voice data, scene, sub-environment
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F3/167: Audio in a user interface, e.g. using voice commands for navigating, audio feedback
- G06F3/011: Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
- G06T19/003: Navigation within 3D models or images
- G06T19/006: Mixed reality
- G10L15/26: Speech to text systems
Abstract
This application provides a method and system for interacting with an augmented reality environment by voice. The method includes: obtaining voice data of a user, and obtaining an operation instruction corresponding to the voice data; and processing the augmented reality environment according to the operation instruction, and displaying the augmented reality processing result. Interacting with the augmented reality environment by voice can improve its interaction efficiency.
Description
【Technical field】
This application relates to the field of automation, and in particular to a method and system for interacting with an augmented reality environment by voice.
【Background technology】
Augmented reality (AR) is a technology that calculates the position and angle of a camera image in real time and superimposes the corresponding images, videos, or 3D models. The goal of augmented reality is to overlay the virtual world on the real world on a screen and let the two interact.
With the popularization of mobile phones and other handheld mobile devices, augmented reality environments (AR environments) based on mobile devices are increasingly accepted by users.
However, the interaction means of mobile-device-based augmented reality environments are limited: only gesture interaction, or the GPS and attitude sensors built into the device, are supported. Interacting through gestures or device posture introduces unnecessary actions and reduces interaction efficiency.
【Invention content】
Aspects of this application provide a method and system for interacting with an augmented reality environment by voice, so as to improve the interaction efficiency of the augmented reality environment.
One aspect of this application provides a method for interacting with an augmented reality environment by voice, including:
obtaining voice data of a user, and obtaining an operation instruction corresponding to the voice data;
processing the augmented reality environment according to the operation instruction, and displaying the augmented reality processing result.
In the above aspect and any possible implementation, an implementation is further provided in which obtaining the voice data of the user and obtaining the operation instruction corresponding to the voice data includes:
starting an audio monitoring service and monitoring the voice data of the user;
performing speech recognition on the voice data to obtain a recognition text corresponding to the voice data;
performing semantic analysis on the recognition text to obtain an operation instruction corresponding to the recognition text.
In the above aspect and any possible implementation, an implementation is further provided in which performing semantic analysis on the recognition text to obtain the operation instruction corresponding to the recognition text includes:
exactly matching the recognition text against preset operation instructions to find the corresponding operation instruction; and/or
performing word segmentation on the recognition text to generate keywords, and finding the operation instruction matching the keywords.
In the above aspect and any possible implementation, an implementation is further provided in which, when the keywords successfully match at least two operation instructions, the corresponding operation instruction is obtained according to a further selection by the user.
In the above aspect and any possible implementation, an implementation is further provided in which the augmented reality environment includes: a preset augmented reality sub-environment scene; or an augmented reality sub-environment scene obtained by performing feature analysis on a real scene captured by a camera.
In the above aspect and any possible implementation, an implementation is further provided in which processing the augmented reality environment according to the operation instruction includes:
performing, according to the operation instruction, a corresponding augmented reality control operation on the augmented reality information in the augmented reality sub-environment scene.
Another aspect of this application provides a system for interacting with an augmented reality environment by voice, including:
an operation instruction acquisition module, configured to obtain voice data of a user and obtain an operation instruction corresponding to the voice data;
an augmented reality processing module, configured to process an augmented reality environment according to the operation instruction and display the augmented reality processing result.
In the above aspect and any possible implementation, an implementation is further provided in which the operation instruction acquisition module specifically includes:
a voice acquisition submodule, configured to start an audio monitoring service and monitor the voice data of the user;
a speech recognition submodule, configured to perform speech recognition on the voice data to obtain the recognition text corresponding to the voice data;
a semantic analysis submodule, configured to perform semantic analysis on the recognition text to obtain the operation instruction corresponding to the recognition text.
In the above aspect and any possible implementation, an implementation is further provided in which the semantic analysis submodule is specifically configured to:
exactly match the recognition text against preset operation instructions to find the corresponding operation instruction; and/or
perform word segmentation on the recognition text to generate keywords, and find the operation instruction matching the keywords.
In the above aspect and any possible implementation, an implementation is further provided in which the semantic analysis submodule is specifically configured to:
when the keywords successfully match at least two operation instructions, obtain the corresponding operation instruction according to a further selection by the user.
In the above aspect and any possible implementation, an implementation is further provided in which the augmented reality environment includes: a preset augmented reality sub-environment scene; or an augmented reality sub-environment scene obtained by performing feature analysis on a real scene captured by a camera.
In the above aspect and any possible implementation, an implementation is further provided in which the augmented reality processing module is specifically configured to:
perform, according to the operation instruction, a corresponding augmented reality control operation on the augmented reality information in the augmented reality sub-environment scene.
Another aspect of the present invention provides a computer device, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, where the processor implements the method described above when executing the program.
Another aspect of the present invention provides a computer-readable storage medium on which a computer program is stored, where the program implements the method described above when executed by a processor.
It can be seen from the above technical solutions that the embodiments of this application can improve the interaction efficiency of the augmented reality environment.
【Description of the drawings】
In order to explain the technical solutions in the embodiments of this application more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show some embodiments of this application; those of ordinary skill in the art can obtain other drawings from them without creative labor.
Fig. 1 is a schematic flowchart of a method for interacting with an augmented reality environment by voice provided by an embodiment of this application;
Fig. 2 is a schematic structural diagram of a system for interacting with an augmented reality environment by voice provided by an embodiment of this application;
Fig. 3 is a block diagram of an exemplary computer system/server 012 suitable for implementing an embodiment of the present invention.
【Specific implementation mode】
To make the purposes, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are some, but not all, of the embodiments of this application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of this application without creative effort shall fall within the protection scope of this application.
Fig. 1 is a schematic diagram of a method for interacting with an augmented reality environment by voice provided by an embodiment of this application. As shown in Fig. 1, the method includes the following steps:
Step S11: obtain voice data of a user, and obtain an operation instruction corresponding to the voice data;
Step S12: process an augmented reality environment according to the operation instruction, and display the augmented reality processing result.
The method of this embodiment can be executed by a control device for augmented reality. The device can be implemented by software and/or hardware and is integrated in a mobile terminal with an augmented reality function. Mobile terminals include, but are not limited to, user-held devices such as mobile phones and tablet computers.
In a preferred implementation of step S11, obtaining the voice data of the user and obtaining the operation instruction corresponding to the voice data preferably includes the following sub-steps:
Sub-step S111: start an audio monitoring service and monitor the voice data of the user.
Preferably, the audio collection device can be a handheld device, such as the microphone (MIC) of a mobile phone or tablet computer, which monitors the voice data of the user. Monitoring the voice data of the user can mean monitoring it in real time, or monitoring it after the previous operation is completed; for example, monitoring the user's voice data after the augmented reality function is turned on, or after the display of augmented reality content is completed.
Preferably, if the current scene is a preset augmented reality sub-environment scene, the user can be guided to input preset voice operation instructions. For example, if the augmented reality sub-environment scene is a 3D car model sub-environment scene, prompts such as "rotate model", "enlarge model", and "shrink model" are displayed in the scene; the user can input voice in the fixed format of these prompts, and the recognition accuracy is higher. A preset augmented reality sub-environment scene is entered through a specific entry of the control device for augmented reality. For example, multiple entries, such as 3D car models and 3D character models, are preset in the APP of the control device; when the user clicks a specific entry, the preset augmented reality sub-environment scene is entered and the 3D car model is displayed in it.
Sub-step S112: perform speech recognition on the voice data to obtain the recognition text corresponding to the voice data.
Preferably, an automatic speech recognition (ASR) service is called to parse the voice data of the user and obtain the speech recognition result corresponding to the voice, where the speech recognition result is the recognition text corresponding to the voice.
Some existing speech recognition technologies may be used for the speech recognition process, which mainly includes: performing feature extraction on the voice data, and then decoding with the extracted feature data and a pre-trained acoustic model and language model. During decoding, the syntactic units corresponding to the voice data, such as phonemes or syllables, can be determined, and the recognition text corresponding to the current speech is obtained from the decoding result.
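The staged recognition process described above (feature extraction, then decoding against an acoustic model and a language model) can be sketched roughly as follows. This is a toy illustration, not the patent's implementation: the "models" here are stand-in lookup tables, whereas real acoustic and language models are statistical.

```python
# Toy sketch of the ASR stages: features -> syntactic units -> text.

def extract_features(audio):
    # Stand-in for MFCC-style feature extraction: one "frame" per sample.
    return [round(sample, 1) for sample in audio]

# Toy acoustic model: feature frame -> syntactic unit (phoneme/syllable).
ACOUSTIC_MODEL = {0.1: "xuan", 0.2: "zhuan", 0.3: "mo", 0.4: "xing"}

# Toy language model: unit sequence -> recognition text.
LANGUAGE_MODEL = {("xuan", "zhuan", "mo", "xing"): "rotate model"}

def recognize(audio):
    frames = extract_features(audio)
    units = tuple(ACOUSTIC_MODEL[f] for f in frames)  # decode to units
    return LANGUAGE_MODEL.get(units, "")              # units -> text

print(recognize([0.1, 0.2, 0.3, 0.4]))  # -> rotate model
```

The point is only the pipeline shape; any concrete ASR service (as the description says, an existing one) would replace all three stages.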
Sub-step S113: perform semantic analysis on the recognition text to obtain the operation instruction corresponding to the recognition text.
Preferably, since in a preset augmented reality sub-environment scene the user inputs fixed-format voice according to the guidance, the recognition text can be exactly matched against the preset operation instructions to find the corresponding operation instruction.
Preferably, for augmented reality sub-environment scenes other than the preset ones, the user can also input fixed-format voice, so the recognition text can likewise be exactly matched against the preset operation instructions to find the corresponding operation instruction.
If no operation instruction exactly matching the recognition text is found, word segmentation is performed on the recognition text to generate keywords; according to the keywords, the operation instruction matching the keywords is searched for among the preset operation instructions.
Preferably, the recognition text can be matched against the preset operation instructions based on semantic recognition technology. For example, the recognition text and a preset operation instruction are processed with semantic recognition technology and the similarity between the two is calculated; if the similarity exceeds a similarity threshold, the match is determined to be successful, and otherwise unsuccessful. The similarity threshold is not specifically limited in this embodiment; for example, it can be 0.8.
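One possible realization of this similarity check, using Python's difflib as a stand-in for the unspecified semantic similarity measure, with the example threshold of 0.8:

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.8  # the example value from the description

def matches(recognized_text, preset_instruction):
    # Character-level similarity; a semantic model would go here instead.
    ratio = SequenceMatcher(None, recognized_text, preset_instruction).ratio()
    return ratio > SIMILARITY_THRESHOLD

assert matches("rotate model", "rotate model")      # identical, ratio 1.0
assert not matches("shrink model", "display sofa")  # unrelated phrases
```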
Preferably, when the keywords successfully match at least two operation instructions, the corresponding operation instruction is obtained according to a further selection by the user. For example, based on the multiple matched operation instructions, several choices are presented in the augmented reality environment, and the corresponding operation instruction is determined from the selection made by the user.
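The disambiguation step can be sketched under the assumption that the user's further selection arrives as an index into the displayed candidates:

```python
def resolve_ambiguity(candidates, user_choice):
    """candidates: operation instructions the keywords matched;
    user_choice: index of the option the user picked from the display."""
    if len(candidates) == 1:
        return candidates[0]          # no ambiguity, nothing to ask
    return candidates[user_choice]    # resolved by the user's selection

matched = ["enlarge model", "enlarge scene"]  # keyword "enlarge" hit both
assert resolve_ambiguity(matched, 1) == "enlarge scene"
```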
In a preferred implementation of step S12, the augmented reality environment is processed according to the operation instruction, and the augmented reality processing result is displayed.
Preferably, the augmented reality environment includes: a preset augmented reality sub-environment scene; or an augmented reality sub-environment scene obtained by performing feature analysis on a real scene captured by the camera.
Preferably, in a preset augmented reality sub-environment scene, preset operations are executed according to the fixed-format operation instructions input by the user. For example, in a preset 3D car model augmented reality sub-environment scene, the displayed 3D car model is rotated, enlarged, shrunk, and so on.
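The fixed instructions in a preset scene can be thought of as a dispatch table from instruction to a transform on the displayed model's state. A rough sketch, where the state fields, rotation step, and scale factor are all illustrative assumptions:

```python
# Current state of the displayed 3D model (illustrative fields).
model_state = {"angle": 0, "scale": 1.0}

# Each fixed operation instruction maps to a pure state transform.
OPERATIONS = {
    "rotate model":  lambda s: {**s, "angle": (s["angle"] + 90) % 360},
    "enlarge model": lambda s: {**s, "scale": s["scale"] * 1.5},
    "shrink model":  lambda s: {**s, "scale": s["scale"] / 1.5},
}

def execute(instruction, state):
    return OPERATIONS[instruction](state)

model_state = execute("rotate model", model_state)
model_state = execute("enlarge model", model_state)
assert model_state == {"angle": 90, "scale": 1.5}
```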
Preferably, feature analysis is performed on the real scene captured by the camera; when the camera captures a specific object, the corresponding augmented reality sub-environment scene is loaded. For example, when the camera captures a specific advertisement space, the corresponding advertisement augmented reality sub-environment scene is loaded. According to the operation instruction, the corresponding augmented reality control operation is performed on the augmented reality information in the augmented reality sub-environment scene. For example, the user can input the control instruction "replay" to replay the advertisement augmented reality information in the advertisement augmented reality sub-environment scene, or input the control instruction "rotate" to rotate the advertisement augmented reality information and choose the most suitable viewing angle for it.
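The object-to-scene loading step can be sketched as a simple mapping from the detected object to a sub-environment scene. The camera feature analysis itself is out of scope here, and all names are illustrative:

```python
# Which sub-environment scene to load for each detected object.
SCENE_BY_OBJECT = {
    "advertisement_board": "ad_ar_scene",
    "car": "car_3d_model_scene",
}
DEFAULT_SCENE = "default_ar_scene"

def load_scene(detected_object):
    # Fall back to the default sub-environment scene when nothing known
    # (or nothing at all) was captured by the camera.
    return SCENE_BY_OBJECT.get(detected_object, DEFAULT_SCENE)

assert load_scene("advertisement_board") == "ad_ar_scene"
assert load_scene(None) == "default_ar_scene"
```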
Preferably, when the camera does not capture a specific object, the default augmented reality sub-environment scene is entered, waiting for the user's operation instruction. For example, the user's voice input is "please recommend me a sofa that matches my home's space and decoration style"; word segmentation is performed on the recognition text to generate the keywords "space", "style", and "sofa"; according to the keywords, the matching operation instruction "display sofa" is found, and the augmented reality information of the sofa is then displayed in the current augmented reality sub-environment scene. The user can adjust the augmented reality information of the sofa through multiple rounds of voice input, e.g. changing the sofa's type, color, size, or angle.
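The multi-round adjustment can be sketched as successive updates to the displayed AR information's attributes, one per voice round. The attribute names are assumptions for illustration:

```python
# Displayed sofa AR information (illustrative attributes).
sofa_ar_info = {"type": "default", "color": "grey", "angle": 0}

def apply_round(ar_info, attribute, value):
    # One round of voice input changes one attribute of the AR info.
    updated = dict(ar_info)
    updated[attribute] = value
    return updated

sofa_ar_info = apply_round(sofa_ar_info, "color", "blue")  # "change color"
sofa_ar_info = apply_round(sofa_ar_info, "angle", 45)      # "rotate sofa"
assert sofa_ar_info == {"type": "default", "color": "blue", "angle": 45}
```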
Preferably, after the augmented reality environment is processed according to the operation instruction, the processed augmented reality information is drawn onto the image frames or video stream captured by the camera.
Specifically, computer graphics processing technology is used to draw the AR information onto the image frames or video stream:
the processed augmented reality information is rendered together with the image frames or video stream to obtain the final image frames or video stream for output;
the rendered image frames or video stream are drawn into the memory used for display;
the image frames or video stream in memory are then shown on the screen of the mobile terminal with the augmented reality function.
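The drawing step above, blending the processed AR information over a camera frame before it is copied to display memory, can be sketched with a toy alpha blend over one-dimensional grayscale "frames". A real implementation composites RGB video frames, typically on the GPU:

```python
def composite(frame, ar_layer, alpha):
    """Blend AR pixels over the camera frame (None = transparent AR pixel)."""
    out = []
    for cam_px, ar_px in zip(frame, ar_layer):
        if ar_px is None:
            out.append(cam_px)  # no AR info here: keep the camera pixel
        else:
            out.append(round(alpha * ar_px + (1 - alpha) * cam_px))
    return out

camera_frame = [10, 20, 30, 40]
ar_overlay   = [None, 200, 200, None]  # AR info covers the middle only
display_buffer = composite(camera_frame, ar_overlay, alpha=1.0)
assert display_buffer == [10, 200, 200, 40]
```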
According to this embodiment, interacting with the augmented reality environment by voice can improve the interaction efficiency of the augmented reality environment.
It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of action combinations. However, those skilled in the art should understand that this application is not limited by the described action sequence, because according to this application certain steps can be performed in other orders or simultaneously. Those skilled in the art should also know that the embodiments described in this specification are preferred embodiments, and the actions and modules involved are not necessarily required by this application.
The above is the introduction to the method embodiments; the solution of the present invention is further explained below through a device embodiment.
Fig. 2 is a schematic structural diagram of a system for interacting with an augmented reality environment by voice provided by an embodiment of this application. As shown in Fig. 2, the system includes:
an operation instruction acquisition module 21, configured to obtain voice data of a user and obtain the operation instruction corresponding to the voice data;
an augmented reality processing module 22, configured to process the augmented reality environment according to the operation instruction and display the augmented reality processing result.
The system of this embodiment can be executed by a control device for augmented reality. The device can be implemented by software and/or hardware and is integrated in a mobile terminal with an augmented reality function. Mobile terminals include, but are not limited to, user-held devices such as mobile phones and tablet computers.
In a preferred implementation of the operation instruction acquisition module 21, obtaining the voice data of the user and obtaining the operation instruction corresponding to the voice data preferably involves the following submodules:
a voice acquisition submodule 211, configured to start an audio monitoring service and monitor the voice data of the user;
Preferably, the audio collection device can be a handheld device, such as the microphone (MIC) of a mobile phone or tablet computer, which monitors the voice data of the user. Monitoring the voice data of the user can mean monitoring it in real time, or monitoring it after the previous operation is completed; for example, monitoring the user's voice data after the augmented reality function is turned on, or after the display of augmented reality content is completed.
Preferably, if the current scene is a preset augmented reality sub-environment scene, the user can be guided to input preset voice operation instructions. For example, if the augmented reality sub-environment scene is a 3D car model sub-environment scene, prompts such as "rotate model", "enlarge model", and "shrink model" are displayed in the scene; the user can input voice in the fixed format of these prompts, and the recognition accuracy is higher. A preset augmented reality sub-environment scene is entered through a specific entry of the control device for augmented reality. For example, multiple entries, such as 3D car models and 3D character models, are preset in the APP of the control device; when the user clicks a specific entry, the preset augmented reality sub-environment scene is entered and the 3D car model is displayed in it.
a speech recognition submodule 212, configured to perform speech recognition on the voice data to obtain the recognition text corresponding to the voice data;
Preferably, an automatic speech recognition (ASR) service is called to parse the voice data of the user and obtain the speech recognition result corresponding to the voice, where the speech recognition result is the recognition text corresponding to the voice.
Some existing speech recognition technologies may be used for the speech recognition process, which mainly includes: performing feature extraction on the voice data, and then decoding with the extracted feature data and a pre-trained acoustic model and language model. During decoding, the syntactic units corresponding to the voice data, such as phonemes or syllables, can be determined, and the recognition text corresponding to the current speech is obtained from the decoding result.
a semantic analysis submodule 213, configured to perform semantic analysis on the recognition text to obtain the operation instruction corresponding to the recognition text.
Preferably, since in a preset augmented reality sub-environment scene the user inputs fixed-format voice according to the guidance, the recognition text can be exactly matched against the preset operation instructions to find the corresponding operation instruction.
Preferably, for augmented reality sub-environment scenes other than the preset ones, the user can also input fixed-format voice, so the recognition text can likewise be exactly matched against the preset operation instructions to find the corresponding operation instruction.
If no operation instruction exactly matching the recognition text is found, word segmentation is performed on the recognition text to generate keywords; according to the keywords, the operation instruction matching the keywords is searched for among the preset operation instructions.
Preferably, the recognition text can be matched against the preset operation instructions based on semantic recognition technology. For example, the recognition text and a preset operation instruction are processed with semantic recognition technology and the similarity between the two is calculated; if the similarity exceeds a similarity threshold, the match is determined to be successful, and otherwise unsuccessful. The similarity threshold is not specifically limited in this embodiment; for example, it can be 0.8.
Preferably, when the keywords successfully match at least two operation instructions, the corresponding operation instruction is obtained according to a further selection by the user. For example, based on the multiple matched operation instructions, several choices are presented in the augmented reality environment, and the corresponding operation instruction is determined from the selection made by the user.
In a preferred implementation of the augmented reality processing module 22, the augmented reality processing module 22 processes the augmented reality environment according to the operation instruction and displays the augmented reality processing result.
Preferably, the augmented reality environment includes: a preset augmented reality sub-environment scene; or an augmented reality sub-environment scene obtained by performing feature analysis on a real scene captured by the camera.
Preferably, in a preset augmented reality sub-environment scene, preset operations are executed according to the fixed-format operation instructions input by the user. For example, in a preset 3D car model augmented reality sub-environment scene, the displayed 3D car model is rotated, enlarged, shrunk, and so on.
Preferably, feature analysis is performed on the real scene captured by the camera; when the camera captures a specific object, the corresponding augmented reality sub-environment scene is loaded. For example, when the camera captures a specific advertisement space, the corresponding advertisement augmented reality sub-environment scene is loaded. According to the operation instruction, the corresponding augmented reality control operation is performed on the augmented reality information in the augmented reality sub-environment scene. For example, the user can input the control instruction "replay" to replay the advertisement augmented reality information in the advertisement augmented reality sub-environment scene, or input the control instruction "rotate" to rotate the advertisement augmented reality information and choose the most suitable viewing angle for it.
Preferably, when the camera does not capture a specific object, the default augmented reality sub-environment scene is entered, waiting for the user's operation instruction. For example, the user's voice input is "please recommend me a sofa that matches my home's space and decoration style"; word segmentation is performed on the recognition text to generate the keywords "space", "style", and "sofa"; according to the keywords, the matching operation instruction "display sofa" is found, and the augmented reality information of the sofa is then displayed in the current augmented reality sub-environment scene. The user can adjust the augmented reality information of the sofa through multiple rounds of voice input, e.g. changing the sofa's type, color, size, or angle.
Preferably, after the augmented reality environment is processed according to the operation instruction, the processed augmented reality information is drawn onto the image frames or video stream captured by the camera.
Specifically, computer graphics processing technology is used to draw the AR information onto the image frames or video stream:
the processed augmented reality information is rendered together with the image frames or video stream to obtain the final image frames or video stream for output;
the rendered image frames or video stream are drawn into the memory used for display;
the image frames or video stream in memory are then shown on the screen of the mobile terminal with the augmented reality function.
According to this embodiment, interacting with the augmented reality environment by voice can improve the interaction efficiency of the augmented reality environment.
In the described embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, reference may be made to the related descriptions of the other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed methods and apparatuses can be implemented in other ways. For example, the apparatus embodiments described above are merely exemplary. The division of the units is only a division by logical function; in actual implementation there may be other division manners, e.g. multiple units or components can be combined or integrated into another system, or some features can be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed can be an indirect coupling or communication connection through some interfaces, devices, or units, and can be electrical, mechanical, or in other forms.
The unit illustrated as separating component may or may not be physically separated, aobvious as unit
The component shown may or may not be physical unit, you can be located at a place, or may be distributed over multiple
In network element.Some or all of unit therein can be selected according to the actual needs to realize the mesh of this embodiment scheme
's.
In addition, each functional unit in each embodiment of the application can be integrated in a processing unit, it can also
It is that each unit physically exists alone, it can also be during two or more units be integrated in one unit.The integrated list
The form that hardware had both may be used in member is realized, can also be realized in the form of hardware adds SFU software functional unit.
Fig. 3 shows a block diagram of an exemplary computer system/server 012 suitable for implementing embodiments of the present invention. The computer system/server 012 shown in Fig. 3 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in Fig. 3, the computer system/server 012 takes the form of a general-purpose computing device. Components of the computer system/server 012 may include, but are not limited to: one or more processors or processing units 016, a system memory 028, and a bus 018 connecting the different system components (including the system memory 028 and the processing unit 016).
The bus 018 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The computer system/server 012 typically comprises a variety of computer-system-readable media. These media may be any available media that can be accessed by the computer system/server 012, including volatile and non-volatile media, and removable and non-removable media.
The system memory 028 may include computer-system-readable media in the form of volatile memory, such as random access memory (RAM) 030 and/or cache memory 032. The computer system/server 012 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, the storage system 034 may be used for reading from and writing to a non-removable, non-volatile magnetic medium (not shown in Fig. 3, commonly referred to as a "hard disk drive"). Although not shown in Fig. 3, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from and writing to a removable non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to the bus 018 through one or more data media interfaces. The memory 028 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of the embodiments of the present invention.
A program/utility 040 having a set of (at least one) program modules 042 may be stored, for example, in the memory 028. Such program modules 042 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each or some combination of these examples may include an implementation of a network environment. The program modules 042 generally carry out the functions and/or methods in the embodiments described in the present invention.
The computer system/server 012 may also communicate with one or more external devices 014 (e.g., a keyboard, a pointing device, a display 024, etc.). In the present invention, the computer system/server 012 communicates with an external radar device; it may also communicate with one or more devices that enable a user to interact with the computer system/server 012, and/or with any device (e.g., a network card, a modem, etc.) that enables the computer system/server 012 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 022. Moreover, the computer system/server 012 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 020. As shown in Fig. 3, the network adapter 020 communicates with the other modules of the computer system/server 012 through the bus 018. It should be understood that, although not shown in Fig. 3, other hardware and/or software modules may be used in conjunction with the computer system/server 012, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
By running the programs stored in the system memory 028, the processing unit 016 executes the functions and/or methods in the embodiments described in the present invention.
The above computer program may be provided in a computer storage medium; that is, the computer storage medium is encoded with a computer program which, when executed by one or more computers, causes the one or more computers to perform the method flows and/or apparatus operations shown in the above embodiments of the present invention.
With the development of time and technology, the meaning of "medium" has become increasingly broad: the propagation path of a computer program is no longer limited to tangible media, and the program may also be downloaded directly from a network. Any combination of one or more computer-readable media may be used. The computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by, or in connection with, an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; that medium can send, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device.
The program code contained on a computer-readable medium may be transmitted by any suitable medium, including, but not limited to, wireless, wire, optical cable, RF, etc., or any suitable combination of the above.
Computer program code for carrying out the operations of the present invention may be written in one or more programming languages or a combination thereof. The programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, persons of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some of the technical features therein; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.
Claims (14)
1. A method for interaction with an augmented reality environment based on voice, characterized in that it comprises the following steps:
obtaining voice data of a user, and obtaining an operational instruction corresponding to the voice data;
processing an augmented reality environment according to the operational instruction, and displaying the augmented reality processing result.
2. The method according to claim 1, characterized in that obtaining the voice data of the user and obtaining the operational instruction corresponding to the voice data comprises:
starting an audio monitoring service, and monitoring the voice data of the user;
performing speech recognition on the voice data to obtain the recognized text corresponding to the voice data;
performing semantic analysis on the recognized text to obtain the operational instruction corresponding to the recognized text.
3. The method according to claim 2, characterized in that performing semantic analysis on the recognized text to obtain the operational instruction corresponding to the recognized text comprises:
exactly matching the recognized text against preset operational instructions, and looking up the corresponding operational instruction; and/or
performing word segmentation on the recognized text to generate keywords, and looking up the operational instruction that matches the keywords.
4. The method according to claim 3, characterized in that:
when the keywords successfully match at least two operational instructions, the corresponding operational instruction is obtained according to a further selection by the user.
5. The method according to claim 1, characterized in that the augmented reality environment comprises: a preset augmented reality sub-environment scene; or an augmented reality sub-environment scene obtained by performing feature analysis on the real scene captured by the camera.
6. The method according to claim 1, characterized in that processing the augmented reality environment according to the operational instruction comprises:
performing, according to the operational instruction, the corresponding augmented reality control operation on the augmented reality information in the augmented reality sub-environment scene.
7. A system for interaction with an augmented reality environment based on voice, characterized in that it comprises:
an operational instruction acquisition module, configured to obtain voice data of a user and obtain an operational instruction corresponding to the voice data;
an augmented reality processing module, configured to perform augmented reality processing on an augmented reality environment according to the operational instruction, and to display the augmented reality processing result.
8. The system according to claim 7, characterized in that the operational instruction acquisition module specifically comprises:
a voice acquisition submodule, configured to start an audio monitoring service and monitor the voice data of the user;
a speech recognition submodule, configured to perform speech recognition on the voice data to obtain the recognized text corresponding to the voice data;
a semantic analysis submodule, configured to perform semantic analysis on the recognized text to obtain the operational instruction corresponding to the recognized text.
9. The system according to claim 8, characterized in that the semantic analysis submodule is specifically configured to:
exactly match the recognized text against preset operational instructions, and look up the corresponding operational instruction; and/or
perform word segmentation on the recognized text to generate keywords, and look up the operational instruction that matches the keywords.
10. The system according to claim 9, characterized in that the semantic analysis submodule is specifically configured to:
when the keywords successfully match at least two operational instructions, obtain the corresponding operational instruction according to a further selection by the user.
11. The system according to claim 7, characterized in that:
the augmented reality environment comprises: a preset augmented reality sub-environment scene; or an augmented reality sub-environment scene obtained by performing feature analysis on the real scene captured by the camera.
12. The system according to claim 7, characterized in that the augmented reality processing module is specifically configured to:
perform, according to the operational instruction, the corresponding augmented reality control operation on the augmented reality information in the augmented reality sub-environment scene.
13. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, implements the method according to any one of claims 1 to 6.
14. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1 to 6.
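The matching strategy of claims 3 and 4 — an exact match against preset operational instructions, falling back to keyword matching after word segmentation, with a further user selection when several instructions match — can be sketched as follows. The instruction table, function names, and whitespace-based segmentation are illustrative assumptions:

```python
# Hypothetical table mapping preset utterances to operational instructions.
PRESET_INSTRUCTIONS = {
    "change sofa color": "OP_SET_COLOR",
    "change sofa size": "OP_SET_SIZE",
    "rotate sofa": "OP_SET_ANGLE",
}

def match_instruction(recognized_text, choose=lambda ops: ops[0]):
    """Resolve recognized text to an operational instruction.

    1) Exact match against the preset instructions (claim 3, first branch).
    2) Keyword match after word segmentation (claim 3, second branch);
       whitespace splitting stands in for a real segmenter.
    3) If at least two instructions match, defer to a further user
       selection via `choose` (claim 4).
    """
    if recognized_text in PRESET_INSTRUCTIONS:
        return PRESET_INSTRUCTIONS[recognized_text]
    keywords = recognized_text.split()
    hits = [op for phrase, op in PRESET_INSTRUCTIONS.items()
            if any(k in phrase for k in keywords)]
    if len(hits) >= 2:
        return choose(hits)
    return hits[0] if hits else None
```

For example, "change sofa color" resolves exactly, "rotate" resolves through its keyword, and an ambiguous utterance such as "change sofa" returns whichever candidate the user's follow-up selection picks.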
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810090559.6A CN108363556A (en) | 2018-01-30 | 2018-01-30 | A kind of method and system based on voice Yu augmented reality environmental interaction |
US16/177,060 US11397559B2 (en) | 2018-01-30 | 2018-10-31 | Method and system based on speech and augmented reality environment interaction |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108363556A true CN108363556A (en) | 2018-08-03 |
Family
ID=63007317
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810090559.6A Pending CN108363556A (en) | 2018-01-30 | 2018-01-30 | A kind of method and system based on voice Yu augmented reality environmental interaction |
Country Status (2)
Country | Link |
---|---|
US (1) | US11397559B2 (en) |
CN (1) | CN108363556A (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11361676B2 (en) * | 2019-06-14 | 2022-06-14 | International Business Machines Corporation, Armonk, Ny | Augmented reality techniques for simultaneously learning multiple languages |
US11798550B2 (en) | 2020-03-26 | 2023-10-24 | Snap Inc. | Speech-based selection of augmented reality content |
CN111583946A (en) * | 2020-04-30 | 2020-08-25 | 厦门快商通科技股份有限公司 | Voice signal enhancement method, device and equipment |
US11769500B2 (en) * | 2020-06-30 | 2023-09-26 | Snap Inc. | Augmented reality-based translation of speech in association with travel |
CN114371804A (en) * | 2021-12-03 | 2022-04-19 | 国家能源集团新能源技术研究院有限公司 | Electronic drawing browsing method and system |
CN114861653B (en) * | 2022-05-17 | 2023-08-22 | 马上消费金融股份有限公司 | Language generation method, device, equipment and storage medium for virtual interaction |
CN116719420B (en) * | 2023-08-09 | 2023-11-21 | 世优(北京)科技有限公司 | User action recognition method and system based on virtual reality |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1410298A (en) * | 2001-09-25 | 2003-04-16 | 公信电子股份有限公司 | Voice control method and device for controlling voice instruction by single key |
CN102520788A (en) * | 2011-11-16 | 2012-06-27 | 歌尔声学股份有限公司 | Voice identification control method |
CN103257703A (en) * | 2012-02-20 | 2013-08-21 | 联想(北京)有限公司 | Augmented reality device and method |
CN103632664A (en) * | 2012-08-20 | 2014-03-12 | 联想(北京)有限公司 | A method for speech recognition and an electronic device |
CN103793063A (en) * | 2014-03-11 | 2014-05-14 | 哈尔滨工业大学 | Multi-channel augmented reality system |
CN105117195A (en) * | 2015-09-09 | 2015-12-02 | 百度在线网络技术(北京)有限公司 | Method and device for guiding voice input |
CN105468142A (en) * | 2015-11-16 | 2016-04-06 | 上海璟世数字科技有限公司 | Interaction method and system based on augmented reality technique, and terminal |
US20160124501A1 (en) * | 2014-10-31 | 2016-05-05 | The United States Of America As Represented By The Secretary Of The Navy | Secured mobile maintenance and operator system including wearable augmented reality interface, voice command interface, and visual recognition systems and related methods |
CN106200930A (en) * | 2016-06-28 | 2016-12-07 | 广东欧珀移动通信有限公司 | The control method of a kind of augmented reality, device and mobile terminal |
CN106558310A (en) * | 2016-10-14 | 2017-04-05 | 北京百度网讯科技有限公司 | Virtual reality sound control method and device |
Family Cites Families (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10824310B2 (en) * | 2012-12-20 | 2020-11-03 | Sri International | Augmented reality virtual personal assistant for external representation |
US10430985B2 (en) * | 2014-03-14 | 2019-10-01 | Magic Leap, Inc. | Augmented reality systems and methods utilizing reflections |
CN104102412B (en) * | 2014-07-24 | 2017-12-12 | 央数文化(上海)股份有限公司 | A kind of hand-held reading device and method thereof based on augmented reality |
KR20160144665A (en) * | 2015-06-09 | 2016-12-19 | 에스케이플래닛 주식회사 | User equipment for recognizing object and displaying database matching result, control method thereof and computer readable medium having computer program recorded therefor |
US20170169611A1 (en) * | 2015-12-09 | 2017-06-15 | Lenovo (Singapore) Pte. Ltd. | Augmented reality workspace transitions based on contextual environment |
US20170337747A1 (en) * | 2016-05-20 | 2017-11-23 | Patrick M. HULL | Systems and methods for using an avatar to market a product |
US10298587B2 (en) * | 2016-06-20 | 2019-05-21 | International Business Machines Corporation | Peer-to-peer augmented reality handlers |
US20190258318A1 (en) * | 2016-06-28 | 2019-08-22 | Huawei Technologies Co., Ltd. | Terminal for controlling electronic device and processing method thereof |
US10042604B2 (en) * | 2016-07-01 | 2018-08-07 | Metrik LLC | Multi-dimensional reference element for mixed reality environments |
US10297085B2 (en) * | 2016-09-28 | 2019-05-21 | Intel Corporation | Augmented reality creations with interactive behavior and modality assignments |
US10297254B2 (en) * | 2016-10-03 | 2019-05-21 | Google Llc | Task initiation using long-tail voice commands by weighting strength of association of the tasks and their respective commands based on user feedback |
US11348475B2 (en) * | 2016-12-09 | 2022-05-31 | The Boeing Company | System and method for interactive cognitive task assistance |
US10360732B2 (en) * | 2017-03-23 | 2019-07-23 | Intel Corporation | Method and system of determining object positions for image processing using wireless network angle of transmission |
US10304239B2 (en) * | 2017-07-20 | 2019-05-28 | Qualcomm Incorporated | Extended reality virtual assistant |
US10553031B2 (en) * | 2017-12-06 | 2020-02-04 | Microsoft Technology Licensing, Llc | Digital project file presentation |
US10937240B2 (en) * | 2018-01-04 | 2021-03-02 | Intel Corporation | Augmented reality bindings of physical objects and virtual objects |
Non-Patent Citations (1)
Title |
---|
Chen Jinhua: "Construction of a Smart Learning Environment" (《智慧学习环境构建》), 1 September 2013 *
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109065055A (en) * | 2018-09-13 | 2018-12-21 | 三星电子(中国)研发中心 | Method, storage medium and the device of AR content are generated based on sound |
CN109065055B (en) * | 2018-09-13 | 2020-12-11 | 三星电子(中国)研发中心 | Method, storage medium, and apparatus for generating AR content based on sound |
CN111966321A (en) * | 2020-08-24 | 2020-11-20 | Oppo广东移动通信有限公司 | Volume adjusting method, AR device and storage medium |
WO2022111282A1 (en) * | 2020-11-24 | 2022-06-02 | International Business Machines Corporation | Ar (augmented reality) based selective sound inclusion from the surrounding while executing any voice command |
GB2616765A (en) * | 2020-11-24 | 2023-09-20 | Ibm | AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command |
US11978444B2 (en) | 2020-11-24 | 2024-05-07 | International Business Machines Corporation | AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command |
CN112735413A (en) * | 2020-12-25 | 2021-04-30 | 浙江大华技术股份有限公司 | Instruction analysis method based on camera device, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
US11397559B2 (en) | 2022-07-26 |
US20190235833A1 (en) | 2019-08-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108363556A (en) | A kind of method and system based on voice Yu augmented reality environmental interaction | |
US11100934B2 (en) | Method and apparatus for voiceprint creation and registration | |
JP7029613B2 (en) | Interfaces Smart interactive control methods, appliances, systems and programs | |
CN108877791B (en) | Voice interaction method, device, server, terminal and medium based on view | |
CN107481720B (en) | Explicit voiceprint recognition method and device | |
CN109036396A (en) | A kind of exchange method and system of third-party application | |
US11164571B2 (en) | Content recognizing method and apparatus, device, and computer storage medium | |
WO2019021088A1 (en) | Navigating video scenes using cognitive insights | |
CN108683937A (en) | Interactive voice feedback method, system and the computer-readable medium of smart television | |
CN110245348A (en) | A kind of intension recognizing method and system | |
CN104282302A (en) | Apparatus and method for recognizing voice and text | |
CN110232340A (en) | Establish the method, apparatus of video classification model and visual classification | |
CN109785829A (en) | A kind of customer service householder method and system based on voice control | |
CN109446907A (en) | A kind of method, apparatus of Video chat, equipment and computer storage medium | |
CN107463929A (en) | Processing method, device, equipment and the computer-readable recording medium of speech data | |
CN108495160A (en) | Intelligent control method, system, equipment and storage medium | |
CN107862035A (en) | Network read method, device, Intelligent flat and the storage medium of minutes | |
CN109800410A (en) | A kind of list generation method and system based on online chatting record | |
CN108268602A (en) | Analyze method, apparatus, equipment and the computer storage media of text topic point | |
CN111341307A (en) | Voice recognition method and device, electronic equipment and storage medium | |
CN113763925B (en) | Speech recognition method, device, computer equipment and storage medium | |
CN115422932A (en) | Word vector training method and device, electronic equipment and storage medium | |
CN107944448A (en) | A kind of image asynchronous edit methods and device | |
JP6944920B2 (en) | Smart interactive processing methods, equipment, equipment and computer storage media | |
CN113655933A (en) | Text labeling method and device, storage medium and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20180803 |