CN108363556A - Method and system for voice-based interaction with an augmented reality environment - Google Patents

Method and system for voice-based interaction with an augmented reality environment

Info

Publication number
CN108363556A
CN108363556A (application CN201810090559.6A)
Authority
CN
China
Prior art keywords
augmented reality
operation instruction
voice data
scene
subenvironment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810090559.6A
Other languages
Chinese (zh)
Inventor
谢高喜
滕禹桥
任大韫
姚淼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201810090559.6A priority Critical patent/CN108363556A/en
Publication of CN108363556A publication Critical patent/CN108363556A/en
Priority to US16/177,060 priority patent/US11397559B2/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/16 Sound input; Sound output
    • G06F 3/167 Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 19/00 Manipulating 3D models or images for computer graphics
    • G06T 19/003 Navigation within 3D models or images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 19/00 Manipulating 3D models or images for computer graphics
    • G06T 19/006 Mixed reality
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/26 Speech to text systems

Abstract

This application provides a method and system for voice-based interaction with an augmented reality environment. The method includes: acquiring voice data from a user and obtaining the operation instruction corresponding to the voice data; and, according to the operation instruction, processing the augmented reality environment and displaying the augmented reality processing result. Interacting with the augmented reality environment by voice improves the interaction efficiency of the augmented reality environment.

Description

Method and system for voice-based interaction with an augmented reality environment
【Technical field】
This application relates to the field of automation, and in particular to a method and system for voice-based interaction with an augmented reality environment.
【Background technology】
Augmented reality (AR) is a technology that calculates the position and angle of a camera image in real time and overlays corresponding images, video, or 3D models onto it. The goal of augmented reality is to superimpose the virtual world onto the real world on a screen and allow the two to interact.

With the popularity of mobile phones and handheld mobile devices, augmented reality environments (AR environments) based on mobile devices are increasingly accepted by users.

However, the interaction means of mobile-device-based augmented reality environments are limited: they support only gesture interaction or the GPS and attitude sensors built into the device. Interacting through gestures or device posture introduces unnecessary actions and reduces interaction efficiency.
【Invention content】
Various aspects of this application provide a method and system for voice-based interaction with an augmented reality environment, intended to improve the interaction efficiency of the augmented reality environment.

One aspect of this application provides a method for voice-based interaction with an augmented reality environment, including:

acquiring voice data from a user and obtaining the operation instruction corresponding to the voice data; and

according to the operation instruction, processing the augmented reality environment and displaying the augmented reality processing result.
In a further implementation of the above aspect and any possible implementation thereof, acquiring the user's voice data and obtaining the corresponding operation instruction includes:

starting an audio monitoring service and monitoring the user's voice data;

performing speech recognition on the voice data to obtain the recognized text corresponding to the voice data; and

performing semantic analysis on the recognized text to obtain the operation instruction corresponding to the recognized text.
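The three sub-steps above can be strung together as a single pipeline. This is a minimal sketch under stated assumptions: `recognize_speech` is a placeholder for a real ASR service call, and the instruction table is invented for the example; neither reflects the patent's actual implementation.

```python
# Minimal sketch of the three sub-steps as one pipeline. The instruction
# table and recognize_speech() are invented placeholders, not the patent's
# actual services.

PRESET_INSTRUCTIONS = {
    "rotate model": "ROTATE",
    "enlarge model": "ENLARGE",
    "shrink model": "SHRINK",
}

def recognize_speech(voice_data: bytes) -> str:
    # Stand-in for an ASR service call (sub-step S112); here the "audio"
    # is just UTF-8 text so the sketch stays self-contained.
    return voice_data.decode("utf-8")

def analyze_semantics(text: str):
    # Sub-step S113: look the recognized text up among preset instructions.
    return PRESET_INSTRUCTIONS.get(text.strip().lower())

def voice_to_instruction(voice_data: bytes):
    text = recognize_speech(voice_data)
    return analyze_semantics(text)

print(voice_to_instruction(b"rotate model"))  # prints: ROTATE
```

An unrecognized phrase simply yields no instruction, which is where the keyword-matching fallback described below would take over.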
In a further implementation, performing semantic analysis on the recognized text to obtain the corresponding operation instruction includes:

exactly matching the recognized text against the preset operation instructions and looking up the corresponding operation instruction; and/or

performing word segmentation on the recognized text to generate keywords, and looking up the operation instruction matching the keywords.
In a further implementation, when a keyword matches at least two operation instructions, the corresponding operation instruction is obtained according to a further selection by the user.

In a further implementation, the augmented reality environment includes: a preset augmented reality sub-environment scene; or an augmented reality sub-environment scene obtained by performing feature analysis on the real scene captured by the camera.

In a further implementation, processing the augmented reality environment according to the operation instruction includes:

performing, according to the operation instruction, the corresponding augmented reality control operation on the augmented reality information in the augmented reality sub-environment scene.
Another aspect of this application provides a system for voice-based interaction with an augmented reality environment, including:

an operation instruction acquisition module, configured to acquire the user's voice data and obtain the operation instruction corresponding to the voice data; and

an augmented reality processing module, configured to process the augmented reality environment according to the operation instruction and display the augmented reality processing result.
In a further implementation, the operation instruction acquisition module specifically includes:

a voice acquisition submodule, configured to start an audio monitoring service and monitor the user's voice data;

a speech recognition submodule, configured to perform speech recognition on the voice data to obtain the recognized text corresponding to the voice data; and

a semantic analysis submodule, configured to perform semantic analysis on the recognized text to obtain the operation instruction corresponding to the recognized text.
In a further implementation, the semantic analysis submodule is specifically configured to:

exactly match the recognized text against the preset operation instructions and look up the corresponding operation instruction; and/or

perform word segmentation on the recognized text to generate keywords, and look up the operation instruction matching the keywords.

In a further implementation, the semantic analysis submodule is specifically configured to:

obtain the corresponding operation instruction according to a further selection by the user when a keyword matches at least two operation instructions.

In a further implementation, the augmented reality environment includes: a preset augmented reality sub-environment scene; or an augmented reality sub-environment scene obtained by performing feature analysis on the real scene captured by the camera.

In a further implementation, the augmented reality processing module is specifically configured to:

perform, according to the operation instruction, the corresponding augmented reality control operation on the augmented reality information in the augmented reality sub-environment scene.
Another aspect of the present invention provides a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the method described above when executing the program.

Another aspect of the present invention provides a computer-readable storage medium on which a computer program is stored, where the program implements the method described above when executed by a processor.

It can be seen from the above technical solutions that the embodiments of this application can improve the interaction efficiency of the augmented reality environment.
【Description of the drawings】
In order to explain the technical solutions in the embodiments of this application more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show some embodiments of this application; those of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a flow diagram of the method for voice-based interaction with an augmented reality environment provided by an embodiment of this application;

Fig. 2 is a structural diagram of the system for voice-based interaction with an augmented reality environment provided by an embodiment of this application;

Fig. 3 is a block diagram of an exemplary computer system/server 012 suitable for implementing embodiments of the present invention.
【Specific implementation mode】
To make the purposes, technical solutions, and advantages of the embodiments of this application clearer, the technical solutions in the embodiments are described below completely and clearly with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
Fig. 1 is a flow diagram of the method for voice-based interaction with an augmented reality environment provided by an embodiment of this application. As shown in Fig. 1, the method includes the following steps:

Step S11: acquire voice data from the user and obtain the operation instruction corresponding to the voice data.

Step S12: according to the operation instruction, process the augmented reality environment and display the augmented reality processing result.

The method of this embodiment may be executed by a control apparatus for augmented reality. The apparatus may be implemented in software and/or hardware and integrated in a mobile terminal with augmented reality capability. Mobile terminals include, but are not limited to, user-held devices such as mobile phones and tablet computers.
In a preferred implementation of step S11, acquiring the user's voice data and obtaining the corresponding operation instruction includes the following sub-steps:

Sub-step S111: start the audio monitoring service and monitor the user's voice data.

Preferably, the audio capture device may be a handheld device, such as the microphone (MIC) of a mobile phone or tablet computer. The user's voice data may be monitored in real time, or monitored after the previous operation is completed, for example after the augmented reality function is turned on or after the display of augmented reality content is completed.

Preferably, if the current scene is a preset augmented reality sub-environment scene, the user may be guided to input preset voice operation instructions. For example, if the sub-environment scene is a car 3D model scene, prompts such as "rotate model", "enlarge model", and "shrink model" are displayed in the scene, and the user can input fixed-format voice according to the prompts, which yields higher recognition accuracy. A preset augmented reality sub-environment scene is entered through a dedicated entry of the augmented reality control apparatus. For example, the control apparatus's app presets multiple entries such as a car 3D model and a character 3D model; clicking a dedicated entry enters the corresponding preset sub-environment scene, in which the car 3D model is displayed.
Sub-step S112: perform speech recognition on the voice data to obtain the recognized text corresponding to the voice data.

Preferably, an automatic speech recognition (ASR) service is called to parse the user's voice data and obtain the corresponding speech recognition result, which is the recognized text corresponding to the voice.

Existing speech recognition techniques may be used. The process mainly includes: extracting features from the voice data, then decoding using the extracted feature data together with pre-trained acoustic and language models. During decoding, the syntactic units (such as phonemes or syllables) corresponding to the voice data can be determined, and the recognized text corresponding to the current speech is obtained from the decoding result.

Sub-step S113: perform semantic analysis on the recognized text to obtain the operation instruction corresponding to the recognized text.

Preferably, since the user inputs fixed-format voice according to the guidance in a preset augmented reality sub-environment scene, the recognized text can be matched exactly against the preset operation instructions to look up the corresponding operation instruction.

Preferably, for augmented reality sub-environment scenes other than the preset ones, the user may also input fixed-format voice, so the recognized text can likewise be matched exactly against the preset operation instructions to look up the corresponding operation instruction.

If no operation instruction exactly matching the recognized text is found, word segmentation is performed on the recognized text to generate keywords; according to the keywords, the operation instruction matching the keywords is looked up among the preset operation instructions.
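The two-stage lookup just described, exact match first and then word segmentation with keyword matching, can be sketched as follows. The instruction table and the whitespace "segmenter" are illustrative assumptions; a real system would use a proper Chinese word-segmentation tool.

```python
# Sketch of the two-stage lookup: exact match first, then keyword matching
# after word segmentation. The instruction table and the whitespace
# "segmenter" are invented for illustration.

PRESET = {
    "display sofa": {"display", "sofa"},
    "rotate model": {"rotate", "model"},
}

def segment(text: str) -> set:
    # Trivial stand-in for a real word-segmentation step.
    return set(text.lower().split())

def find_instructions(recognized: str) -> list:
    key = recognized.strip().lower()
    if key in PRESET:                       # stage 1: exact match
        return [key]
    keywords = segment(recognized)          # stage 2: keyword matching
    return [name for name, kws in PRESET.items() if kws & keywords]

print(find_instructions("rotate model"))         # exact hit
print(find_instructions("show me a nice sofa"))  # keyword hit via "sofa"
```

Returning a list rather than a single instruction keeps the door open for the multi-match case handled below, where the user makes a further selection.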
Preferably, semantic recognition techniques may be used to match the recognized text against the preset operation instructions. For example, the recognized text and a preset operation instruction are processed with a semantic recognition technique and the similarity between the two is computed; if the similarity exceeds a similarity threshold, the match is deemed successful; otherwise, the match fails. The similarity threshold is not specifically limited in this embodiment; for example, it may be 0.8.
Preferably, when a keyword matches at least two operation instructions, the corresponding operation instruction is obtained according to a further selection by the user. For example, multiple options corresponding to the matched operation instructions are presented in the augmented reality environment, and the user's selection determines the corresponding operation instruction.
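A hedged sketch of similarity-based matching with the example threshold of 0.8, using Python's `difflib.SequenceMatcher` ratio as a stand-in for a real semantic-similarity model. Returning every instruction that clears the threshold leaves room for the further user selection described above when more than one candidate matches.

```python
# Similarity matching with the example 0.8 threshold. difflib's ratio is a
# character-level stand-in for a real semantic-similarity measure; the preset
# instruction list is invented for the sketch.

from difflib import SequenceMatcher

PRESET = ["rotate model", "rotate camera", "shrink model"]
SIMILARITY_THRESHOLD = 0.8

def match_candidates(recognized: str) -> list:
    text = recognized.strip().lower()
    return [op for op in PRESET
            if SequenceMatcher(None, text, op).ratio() > SIMILARITY_THRESHOLD]

print(match_candidates("rotate modle"))  # a misspelling still matches
```

If the list comes back with more than one entry, the candidates would be shown in the AR environment for the user to choose from.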
In a preferred implementation of step S12,

according to the operation instruction, the augmented reality environment is processed and the augmented reality processing result is displayed.

Preferably, the augmented reality environment includes: a preset augmented reality sub-environment scene; or an augmented reality sub-environment scene obtained by performing feature analysis on the real scene captured by the camera.

Preferably, in a preset augmented reality sub-environment scene, a preset operation is executed according to the fixed-format operation instruction input by the user. For example, in a preset car 3D model sub-environment scene, the displayed car 3D model is rotated, enlarged, or shrunk.

Preferably, feature analysis is performed on the real scene captured by the camera, and when the camera captures a specific object, the corresponding augmented reality sub-environment scene is loaded. For example, when the camera captures a specific advertisement space, the corresponding advertisement augmented reality sub-environment scene is loaded. According to the operation instruction, the corresponding augmented reality control operation is performed on the augmented reality information in the sub-environment scene. For example, the user can input the control instruction "repeat playing" to make the advertisement augmented reality information in the scene play repeatedly, or input the control instruction "rotate" to rotate the advertisement augmented reality information and choose the most suitable viewing angle for it.
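The mapping from a matched instruction to an augmented reality control operation can be sketched as a simple dispatch table. The operation names "repeat playing" and "rotate" follow the advertisement example above, but the scene class and its fields are invented for illustration.

```python
# Dispatch from a matched operation instruction to an AR control operation on
# the current sub-environment scene. AdScene, its fields, and the handler
# table are invented for this sketch.

class AdScene:
    def __init__(self):
        self.looping = False   # is the advertisement AR info playing on repeat?
        self.angle = 0         # current viewing angle of the AR info

    def repeat_playing(self):
        self.looping = True

    def rotate(self, degrees=90):
        self.angle = (self.angle + degrees) % 360

def apply_instruction(scene: AdScene, instruction: str):
    handlers = {
        "repeat playing": scene.repeat_playing,
        "rotate": scene.rotate,
    }
    if instruction not in handlers:
        raise ValueError("no AR control operation for %r" % instruction)
    handlers[instruction]()

scene = AdScene()
apply_instruction(scene, "repeat playing")
apply_instruction(scene, "rotate")
print(scene.looping, scene.angle)  # prints: True 90
```

Keeping the instruction-to-operation mapping in one table makes it easy for each sub-environment scene to register its own set of control operations.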
Preferably, when the camera does not capture a specific object, the default augmented reality sub-environment scene is entered and the system waits for the user's operation instruction. For example, if the user's voice input is "please recommend a sofa that matches my home's space and decoration style", word segmentation is performed on the recognized text to generate the keywords "space", "style", and "sofa"; according to the keywords, the matching operation instruction "display sofa" is found, and the augmented reality information of a sofa is then displayed in the current sub-environment scene. The user can adjust the sofa's augmented reality information through multiple rounds of voice input, for example changing the sofa's type, color, size, or angle.
Preferably, after the augmented reality environment is processed according to the operation instruction, the processed augmented reality information is drawn onto the image frames or video stream captured by the camera.

Specifically, computer graphics processing techniques are used to draw the AR information onto the image frames or video stream:

the processed augmented reality information and the image frames or video stream are rendered together to obtain the final image frames or video stream for output;

the rendered image frames or video stream are drawn into the memory used for display; and

the image frames or video stream in memory are mapped out and shown on the screen of the mobile terminal with augmented reality capability.
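The drawing path just listed, rendering the AR information onto the camera frames, copying the result into display memory, and mapping it to the screen, can be sketched with plain lists standing in for image buffers; the function names and buffer layout are assumptions for illustration only.

```python
# Sketch of the drawing path: overlay AR info on a camera frame, copy the
# rendered result into display memory, then map it to the screen. Plain 2-D
# lists stand in for image buffers; all names are illustrative.

def render(frame, ar_info):
    # Overlay AR pixels (non-None entries) onto a copy of the camera frame.
    return [[ar if ar is not None else px
             for px, ar in zip(frame_row, ar_row)]
            for frame_row, ar_row in zip(frame, ar_info)]

def draw_to_display_memory(rendered):
    # Copy the rendered frame into the buffer used for display.
    return [row[:] for row in rendered]

def map_to_screen(display_memory):
    # Final stage; a real implementation would hand the buffer to the screen.
    return display_memory

camera_frame = [[0, 0], [0, 0]]
ar_overlay = [[None, 7], [None, None]]   # one AR "pixel" at row 0, column 1
screen = map_to_screen(draw_to_display_memory(render(camera_frame, ar_overlay)))
print(screen)  # prints: [[0, 7], [0, 0]]
```

The same three stages would run once per captured frame when drawing onto a live video stream.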
According to this embodiment, interacting with the augmented reality environment by voice improves the interaction efficiency of the augmented reality environment.

It should be noted that, for brevity of description, each of the foregoing method embodiments is expressed as a series of action combinations. Those skilled in the art should understand that this application is not limited by the described order of actions, since according to this application certain steps may be performed in other orders or simultaneously. Furthermore, those skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and the actions and modules involved are not necessarily required by this application.

The above introduces the method embodiments. The solution of the present invention is further described below through an apparatus embodiment.
Fig. 2 is a structural diagram of the system for voice-based interaction with an augmented reality environment provided by an embodiment of this application. As shown in Fig. 2, the system includes:

an operation instruction acquisition module 21, configured to acquire the user's voice data and obtain the operation instruction corresponding to the voice data; and

an augmented reality processing module 22, configured to process the augmented reality environment according to the operation instruction and display the augmented reality processing result.
The system described in this embodiment may be executed by a control apparatus for augmented reality. The apparatus may be implemented in software and/or hardware and integrated in a mobile terminal with augmented reality capability. Mobile terminals include, but are not limited to, user-held devices such as mobile phones and tablet computers.

In a preferred implementation of the operation instruction acquisition module 21, acquiring the user's voice data and obtaining the corresponding operation instruction involves the following submodules:
Voice acquisition submodule 211, configured to start the audio monitoring service and monitor the user's voice data.

Preferably, the audio capture device may be a handheld device, such as the microphone (MIC) of a mobile phone or tablet computer. The user's voice data may be monitored in real time, or monitored after the previous operation is completed, for example after the augmented reality function is turned on or after the display of augmented reality content is completed.

Preferably, if the current scene is a preset augmented reality sub-environment scene, the user may be guided to input preset voice operation instructions. For example, if the sub-environment scene is a car 3D model scene, prompts such as "rotate model", "enlarge model", and "shrink model" are displayed in the scene, and the user can input fixed-format voice according to the prompts, which yields higher recognition accuracy. A preset augmented reality sub-environment scene is entered through a dedicated entry of the augmented reality control apparatus. For example, the control apparatus's app presets multiple entries such as a car 3D model and a character 3D model; clicking a dedicated entry enters the corresponding preset sub-environment scene, in which the car 3D model is displayed.
Speech recognition submodule 212, configured to perform speech recognition on the voice data to obtain the recognized text corresponding to the voice data.

Preferably, an automatic speech recognition (ASR) service is called to parse the user's voice data and obtain the corresponding speech recognition result, which is the recognized text corresponding to the voice.

Existing speech recognition techniques may be used. The process mainly includes: extracting features from the voice data, then decoding using the extracted feature data together with pre-trained acoustic and language models. During decoding, the syntactic units (such as phonemes or syllables) corresponding to the voice data can be determined, and the recognized text corresponding to the current speech is obtained from the decoding result.

Semantic analysis submodule 213, configured to perform semantic analysis on the recognized text to obtain the operation instruction corresponding to the recognized text.

Preferably, since the user inputs fixed-format voice according to the guidance in a preset augmented reality sub-environment scene, the recognized text can be matched exactly against the preset operation instructions to look up the corresponding operation instruction.

Preferably, for augmented reality sub-environment scenes other than the preset ones, the user may also input fixed-format voice, so the recognized text can likewise be matched exactly against the preset operation instructions to look up the corresponding operation instruction.

If no operation instruction exactly matching the recognized text is found, word segmentation is performed on the recognized text to generate keywords; according to the keywords, the operation instruction matching the keywords is looked up among the preset operation instructions.

Preferably, semantic recognition techniques may be used to match the recognized text against the preset operation instructions. For example, the recognized text and a preset operation instruction are processed with a semantic recognition technique and the similarity between the two is computed; if the similarity exceeds a similarity threshold, the match is deemed successful; otherwise, the match fails. The similarity threshold is not specifically limited in this embodiment; for example, it may be 0.8.

Preferably, when a keyword matches at least two operation instructions, the corresponding operation instruction is obtained according to a further selection by the user. For example, multiple options corresponding to the matched operation instructions are presented in the augmented reality environment, and the user's selection determines the corresponding operation instruction.
In a preferred implementation of the augmented reality processing module 22,

the augmented reality processing module 22 processes the augmented reality environment according to the operation instruction and displays the augmented reality processing result.

Preferably, the augmented reality environment includes: a preset augmented reality sub-environment scene; or an augmented reality sub-environment scene obtained by performing feature analysis on the real scene captured by the camera.

Preferably, in a preset augmented reality sub-environment scene, a preset operation is executed according to the fixed-format operation instruction input by the user. For example, in a preset car 3D model sub-environment scene, the displayed car 3D model is rotated, enlarged, or shrunk.

Preferably, feature analysis is performed on the real scene captured by the camera, and when the camera captures a specific object, the corresponding augmented reality sub-environment scene is loaded. For example, when the camera captures a specific advertisement space, the corresponding advertisement augmented reality sub-environment scene is loaded. According to the operation instruction, the corresponding augmented reality control operation is performed on the augmented reality information in the sub-environment scene. For example, the user can input the control instruction "repeat playing" to make the advertisement augmented reality information in the scene play repeatedly, or input the control instruction "rotate" to rotate the advertisement augmented reality information and choose the most suitable viewing angle for it.

Preferably, when the camera does not capture a specific object, the default augmented reality sub-environment scene is entered and the system waits for the user's operation instruction. For example, if the user's voice input is "please recommend a sofa that matches my home's space and decoration style", word segmentation is performed on the recognized text to generate the keywords "space", "style", and "sofa"; according to the keywords, the matching operation instruction "display sofa" is found, and the augmented reality information of a sofa is then displayed in the current sub-environment scene. The user can adjust the sofa's augmented reality information through multiple rounds of voice input, for example changing the sofa's type, color, size, or angle.

Preferably, after the augmented reality environment is processed according to the operation instruction, the processed augmented reality information is drawn onto the image frames or video stream captured by the camera.

Specifically, computer graphics processing techniques are used to draw the AR information onto the image frames or video stream:

the processed augmented reality information and the image frames or video stream are rendered together to obtain the final image frames or video stream for output;

the rendered image frames or video stream are drawn into the memory used for display; and

the image frames or video stream in memory are mapped out and shown on the screen of the mobile terminal with augmented reality capability.
According to this embodiment, the user can interact with the augmented reality environment by voice, improving the efficiency of interaction with the augmented reality environment.
In the embodiments described above, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed method and apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division of units is only a logical functional division; in actual implementation there may be other divisions, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network elements. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
Fig. 3 shows a block diagram of an exemplary computer system/server 012 suitable for implementing embodiments of the present invention. The computer system/server 012 shown in Fig. 3 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present invention.
As shown in Fig. 3, the computer system/server 012 takes the form of a general-purpose computing device. Components of the computer system/server 012 may include, but are not limited to: one or more processors or processing units 016, a system memory 028, and a bus 018 connecting the different system components (including the system memory 028 and the processing unit 016).
Bus 018 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The computer system/server 012 typically includes a variety of computer-system-readable media. These media may be any available media that can be accessed by the computer system/server 012, including volatile and non-volatile media, and removable and non-removable media.
The system memory 028 may include computer-system-readable media in the form of volatile memory, such as random access memory (RAM) 030 and/or cache memory 032. The computer system/server 012 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, the storage system 034 may be used to read from and write to non-removable, non-volatile magnetic media (not shown in Fig. 3, commonly called a "hard disk drive"). Although not shown in Fig. 3, a magnetic disk drive for reading from and writing to a removable, non-volatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from and writing to a removable, non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to the bus 018 through one or more data media interfaces. The memory 028 may include at least one program product having a set of (e.g., at least one) program modules configured to carry out the functions of the embodiments of the present invention.
A program/utility 040 having a set of (at least one) program modules 042 may be stored, for example, in the memory 028. Such program modules 042 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a networking environment. The program modules 042 generally carry out the functions and/or methods of the embodiments described in the present invention.
The computer system/server 012 may also communicate with one or more external devices 014 (e.g., a keyboard, a pointing device, a display 024, etc.); in the present invention, the computer system/server 012 communicates with an external radar device. It may also communicate with one or more devices that enable a user to interact with the computer system/server 012, and/or with any device (e.g., a network card, a modem, etc.) that enables the computer system/server 012 to communicate with one or more other computing devices. Such communication may occur via input/output (I/O) interfaces 022. Furthermore, the computer system/server 012 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) via a network adapter 020. As shown in Fig. 3, the network adapter 020 communicates with the other modules of the computer system/server 012 via the bus 018. It should be understood that, although not shown in Fig. 3, other hardware and/or software modules may be used in conjunction with the computer system/server 012, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
The processing unit 016 executes the functions and/or methods of the embodiments described in the present invention by running the programs stored in the system memory 028.
The above computer program may be provided in a computer storage medium; that is, the computer storage medium is encoded with a computer program which, when executed by one or more computers, causes the one or more computers to perform the method flows and/or apparatus operations shown in the above embodiments of the present invention.
With the passage of time and the development of technology, the meaning of "medium" has become increasingly broad; the transmission path of a computer program is no longer limited to tangible media and may, for example, be a direct download from a network. Any combination of one or more computer-readable media may be used. The computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take any of a variety of forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination thereof. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code contained on a computer-readable medium may be transmitted using any appropriate medium, including, but not limited to, wireless, wireline, optical cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present invention may be written in one or more programming languages, or combinations thereof. These include object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of the technical features therein may be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application.

Claims (14)

1. A method based on interaction between speech and an augmented reality environment, characterized by comprising the following steps:
acquiring voice data of a user, and obtaining an operational instruction corresponding to the voice data;
processing an augmented reality environment according to the operational instruction, and displaying the augmented reality processing result.
2. The method according to claim 1, characterized in that acquiring the voice data of the user and obtaining the operational instruction corresponding to the voice data comprises:
starting an audio monitoring service, and monitoring the voice data of the user;
performing speech recognition on the voice data to obtain recognized text corresponding to the voice data;
performing semantic analysis on the recognized text to obtain the operational instruction corresponding to the recognized text.
3. The method according to claim 2, characterized in that performing semantic analysis on the recognized text to obtain the operational instruction corresponding to the recognized text comprises:
exactly matching the recognized text against preset operational instructions to find the corresponding operational instruction; and/or
performing word segmentation on the recognized text to generate keywords, and finding the operational instruction matching the keywords.
4. The method according to claim 3, characterized in that:
when the keywords successfully match at least two operational instructions, the corresponding operational instruction is obtained according to a further selection by the user.
5. The method according to claim 1, characterized in that the augmented reality environment comprises: a preset augmented reality sub-environment scene; or an augmented reality sub-environment scene obtained by performing feature analysis on the reality scene captured by a camera.
6. The method according to claim 1, characterized in that processing the augmented reality environment according to the operational instruction comprises:
performing, according to the operational instruction, a corresponding augmented reality control operation on the augmented reality information in the augmented reality sub-environment scene.
7. A system based on interaction between speech and an augmented reality environment, characterized by comprising:
an operational instruction acquisition module, configured to acquire voice data of a user and obtain an operational instruction corresponding to the voice data;
an augmented reality processing module, configured to perform augmented reality processing on an augmented reality environment according to the operational instruction, and to display the augmented reality processing result.
8. The system according to claim 7, characterized in that the operational instruction acquisition module specifically comprises:
a voice acquisition submodule, configured to start an audio monitoring service and monitor the voice data of the user;
a speech recognition submodule, configured to perform speech recognition on the voice data to obtain recognized text corresponding to the voice data;
a semantic analysis submodule, configured to perform semantic analysis on the recognized text to obtain the operational instruction corresponding to the recognized text.
9. The system according to claim 8, characterized in that the semantic analysis submodule is specifically configured to:
exactly match the recognized text against preset operational instructions to find the corresponding operational instruction; and/or
perform word segmentation on the recognized text to generate keywords, and find the operational instruction matching the keywords.
10. The system according to claim 9, characterized in that the semantic analysis submodule is specifically configured to:
when the keywords successfully match at least two operational instructions, obtain the corresponding operational instruction according to a further selection by the user.
11. The system according to claim 7, characterized in that:
the augmented reality environment comprises: a preset augmented reality sub-environment scene; or an augmented reality sub-environment scene obtained by performing feature analysis on the reality scene captured by a camera.
12. The system according to claim 7, characterized in that the augmented reality processing module is specifically configured to:
perform, according to the operational instruction, a corresponding augmented reality control operation on the augmented reality information in the augmented reality sub-environment scene.
13. A computer device, comprising a memory, a processor, and a computer program stored on the memory and runnable on the processor, characterized in that the processor, when executing the program, implements the method according to any one of claims 1 to 6.
14. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1 to 6.
CN201810090559.6A 2018-01-30 2018-01-30 Method and system based on speech and augmented reality environment interaction Pending CN108363556A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201810090559.6A CN108363556A (en) 2018-01-30 2018-01-30 Method and system based on speech and augmented reality environment interaction
US16/177,060 US11397559B2 (en) 2018-01-30 2018-10-31 Method and system based on speech and augmented reality environment interaction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810090559.6A CN108363556A (en) 2018-01-30 2018-01-30 Method and system based on speech and augmented reality environment interaction

Publications (1)

Publication Number Publication Date
CN108363556A true CN108363556A (en) 2018-08-03

Family

ID=63007317

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810090559.6A Pending CN108363556A (en) Method and system based on speech and augmented reality environment interaction

Country Status (2)

Country Link
US (1) US11397559B2 (en)
CN (1) CN108363556A (en)


Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11361676B2 (en) * 2019-06-14 2022-06-14 International Business Machines Corporation, Armonk, Ny Augmented reality techniques for simultaneously learning multiple languages
US11798550B2 (en) 2020-03-26 2023-10-24 Snap Inc. Speech-based selection of augmented reality content
CN111583946A (en) * 2020-04-30 2020-08-25 厦门快商通科技股份有限公司 Voice signal enhancement method, device and equipment
US11769500B2 (en) * 2020-06-30 2023-09-26 Snap Inc. Augmented reality-based translation of speech in association with travel
CN114371804A (en) * 2021-12-03 2022-04-19 国家能源集团新能源技术研究院有限公司 Electronic drawing browsing method and system
CN114861653B (en) * 2022-05-17 2023-08-22 马上消费金融股份有限公司 Language generation method, device, equipment and storage medium for virtual interaction
CN116719420B (en) * 2023-08-09 2023-11-21 世优(北京)科技有限公司 User action recognition method and system based on virtual reality

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1410298A (en) * 2001-09-25 2003-04-16 公信电子股份有限公司 Voice control method and device for controlling voice instruction by single key
CN102520788A (en) * 2011-11-16 2012-06-27 歌尔声学股份有限公司 Voice identification control method
CN103257703A (en) * 2012-02-20 2013-08-21 联想(北京)有限公司 Augmented reality device and method
CN103632664A (en) * 2012-08-20 2014-03-12 联想(北京)有限公司 A method for speech recognition and an electronic device
CN103793063A (en) * 2014-03-11 2014-05-14 哈尔滨工业大学 Multi-channel augmented reality system
CN105117195A (en) * 2015-09-09 2015-12-02 百度在线网络技术(北京)有限公司 Method and device for guiding voice input
CN105468142A (en) * 2015-11-16 2016-04-06 上海璟世数字科技有限公司 Interaction method and system based on augmented reality technique, and terminal
US20160124501A1 (en) * 2014-10-31 2016-05-05 The United States Of America As Represented By The Secretary Of The Navy Secured mobile maintenance and operator system including wearable augmented reality interface, voice command interface, and visual recognition systems and related methods
CN106200930A (en) * 2016-06-28 2016-12-07 广东欧珀移动通信有限公司 The control method of a kind of augmented reality, device and mobile terminal
CN106558310A (en) * 2016-10-14 2017-04-05 北京百度网讯科技有限公司 Virtual reality sound control method and device

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10824310B2 (en) * 2012-12-20 2020-11-03 Sri International Augmented reality virtual personal assistant for external representation
US10430985B2 (en) * 2014-03-14 2019-10-01 Magic Leap, Inc. Augmented reality systems and methods utilizing reflections
CN104102412B (en) * 2014-07-24 2017-12-12 央数文化(上海)股份有限公司 A kind of hand-held reading device and method thereof based on augmented reality
KR20160144665A (en) * 2015-06-09 2016-12-19 에스케이플래닛 주식회사 User equipment for recognizing object and displaying database matching result, control method thereof and computer readable medium having computer program recorded therefor
US20170169611A1 (en) * 2015-12-09 2017-06-15 Lenovo (Singapore) Pte. Ltd. Augmented reality workspace transitions based on contextual environment
US20170337747A1 (en) * 2016-05-20 2017-11-23 Patrick M. HULL Systems and methods for using an avatar to market a product
US10298587B2 (en) * 2016-06-20 2019-05-21 International Business Machines Corporation Peer-to-peer augmented reality handlers
US20190258318A1 (en) * 2016-06-28 2019-08-22 Huawei Technologies Co., Ltd. Terminal for controlling electronic device and processing method thereof
US10042604B2 (en) * 2016-07-01 2018-08-07 Metrik LLC Multi-dimensional reference element for mixed reality environments
US10297085B2 (en) * 2016-09-28 2019-05-21 Intel Corporation Augmented reality creations with interactive behavior and modality assignments
US10297254B2 (en) * 2016-10-03 2019-05-21 Google Llc Task initiation using long-tail voice commands by weighting strength of association of the tasks and their respective commands based on user feedback
US11348475B2 (en) * 2016-12-09 2022-05-31 The Boeing Company System and method for interactive cognitive task assistance
US10360732B2 (en) * 2017-03-23 2019-07-23 Intel Corporation Method and system of determining object positions for image processing using wireless network angle of transmission
US10304239B2 (en) * 2017-07-20 2019-05-28 Qualcomm Incorporated Extended reality virtual assistant
US10553031B2 (en) * 2017-12-06 2020-02-04 Microsoft Technology Licensing, Llc Digital project file presentation
US10937240B2 (en) * 2018-01-04 2021-03-02 Intel Corporation Augmented reality bindings of physical objects and virtual objects


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陈金华 (Chen Jinhua): "《智慧学习环境构建》" (Construction of a Smart Learning Environment), 1 September 2013 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109065055A (en) * 2018-09-13 2018-12-21 三星电子(中国)研发中心 Method, storage medium and the device of AR content are generated based on sound
CN109065055B (en) * 2018-09-13 2020-12-11 三星电子(中国)研发中心 Method, storage medium, and apparatus for generating AR content based on sound
CN111966321A (en) * 2020-08-24 2020-11-20 Oppo广东移动通信有限公司 Volume adjusting method, AR device and storage medium
WO2022111282A1 (en) * 2020-11-24 2022-06-02 International Business Machines Corporation Ar (augmented reality) based selective sound inclusion from the surrounding while executing any voice command
GB2616765A (en) * 2020-11-24 2023-09-20 Ibm AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command
US11978444B2 (en) 2020-11-24 2024-05-07 International Business Machines Corporation AR (augmented reality) based selective sound inclusion from the surrounding while executing any voice command
CN112735413A (en) * 2020-12-25 2021-04-30 浙江大华技术股份有限公司 Instruction analysis method based on camera device, electronic equipment and storage medium

Also Published As

Publication number Publication date
US11397559B2 (en) 2022-07-26
US20190235833A1 (en) 2019-08-01

Similar Documents

Publication Publication Date Title
CN108363556A (en) Method and system based on speech and augmented reality environment interaction
US11100934B2 (en) Method and apparatus for voiceprint creation and registration
JP7029613B2 (en) Interfaces Smart interactive control methods, appliances, systems and programs
CN108877791B (en) Voice interaction method, device, server, terminal and medium based on view
CN107481720B (en) Explicit voiceprint recognition method and device
CN109036396A (en) A kind of exchange method and system of third-party application
US11164571B2 (en) Content recognizing method and apparatus, device, and computer storage medium
WO2019021088A1 (en) Navigating video scenes using cognitive insights
CN108683937A (en) Interactive voice feedback method, system and the computer-readable medium of smart television
CN110245348A (en) A kind of intension recognizing method and system
CN104282302A (en) Apparatus and method for recognizing voice and text
CN110232340A (en) Establish the method, apparatus of video classification model and visual classification
CN109785829A (en) A kind of customer service householder method and system based on voice control
CN109446907A (en) A kind of method, apparatus of Video chat, equipment and computer storage medium
CN107463929A (en) Processing method, device, equipment and the computer-readable recording medium of speech data
CN108495160A (en) Intelligent control method, system, equipment and storage medium
CN107862035A (en) Network read method, device, Intelligent flat and the storage medium of minutes
CN109800410A (en) A kind of list generation method and system based on online chatting record
CN108268602A (en) Analyze method, apparatus, equipment and the computer storage media of text topic point
CN111341307A (en) Voice recognition method and device, electronic equipment and storage medium
CN113763925B (en) Speech recognition method, device, computer equipment and storage medium
CN115422932A (en) Word vector training method and device, electronic equipment and storage medium
CN107944448A (en) A kind of image asynchronous edit methods and device
JP6944920B2 (en) Smart interactive processing methods, equipment, equipment and computer storage media
CN113655933A (en) Text labeling method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20180803