CN112905007A - Virtual reality equipment and voice-assisted interaction method - Google Patents

Virtual reality equipment and voice-assisted interaction method

Info

Publication number
CN112905007A
Authority
CN
China
Prior art keywords: virtual, virtual object, current, user, attribute
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110120704.2A
Other languages
Chinese (zh)
Inventor
王冰
张大钊
桑伟
董逸晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hisense Visual Technology Co Ltd
Original Assignee
Hisense Visual Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hisense Visual Technology Co Ltd filed Critical Hisense Visual Technology Co Ltd
Priority to CN202110120704.2A priority Critical patent/CN112905007A/en
Publication of CN112905007A publication Critical patent/CN112905007A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The application provides virtual reality equipment and a voice-assisted interaction method, which can parse a user instruction from voice interaction information input by a user and acquire a virtual object list of the current scene, so that after the user instruction is executed on a virtual object, the virtual object list is updated in real time according to the execution result of the user instruction. The virtual object list comprises the names of the virtual objects in the current virtual scene and the current attribute information of each virtual object. With this method, interaction actions can be performed on objects in the virtual scene through voice input and the virtual object list is updated in real time, so that the user can operate objects in the virtual space without using an external device, which improves the interaction flexibility of the virtual reality equipment.

Description

Virtual reality equipment and voice-assisted interaction method
Technical Field
The application relates to the technical field of virtual reality, in particular to virtual reality equipment and a voice-assisted interaction method.
Background
Virtual Reality (VR) technology is a display technology that simulates a virtual environment by a computer, thereby giving a person a sense of immersion in the environment. A virtual reality device is a device that uses virtual reality technology to present a virtual picture to a user. Generally, a virtual reality device includes two display screens for presenting virtual picture contents, corresponding respectively to the left and right eyes of the user. When the contents displayed by the two display screens are images of the same object from different visual angles, a stereoscopic viewing experience can be brought to the user.
The virtual reality equipment can interact with the user, and the user can input control commands in different ways according to the different interaction modes supported by the virtual reality equipment. For example, a user may employ a keyboard, a three-dimensional mouse, force feedback gloves, a depth camera, or the like to achieve human-machine interaction with a virtual reality device.
However, because the interactive actions these devices can support are limited, their use imposes strict restrictions on the user interface layout of the virtual reality device during human-computer interaction, and the operation is not flexible. In addition, devices such as keyboards have drawbacks such as a complicated connection with the virtual reality device and inconvenience in carrying, so they are difficult to apply widely to virtual reality devices.
Disclosure of Invention
The application provides virtual reality equipment and a voice-assisted interaction method, and aims to solve the problem that traditional interaction modes are inflexible in operation.
In one aspect, the present application provides a virtual reality device, comprising: a display, a voice input device, and a controller. Wherein the display is used for displaying a user interface; the voice input device is used for detecting voice interaction information; the controller is configured to perform the following program steps:
acquiring voice interaction information input by a user;
responding to the voice interaction information, and acquiring a virtual object list of the current scene, wherein the virtual object list comprises virtual object names and current attribute information of virtual objects;
analyzing a user instruction from the voice interaction information;
and executing the user instruction on the virtual object in the current scene, and updating the attribute information according to the execution result of the user instruction.
On the other hand, the application also provides a voice-assisted interaction method, which is applied to the virtual reality equipment and comprises the following steps:
acquiring voice interaction information input by a user;
responding to the voice interaction information, and acquiring a virtual object list of the current scene, wherein the virtual object list comprises virtual object names and current attribute information of virtual objects;
analyzing a user instruction from the voice interaction information;
and executing the user instruction on the virtual object in the current scene, and updating the attribute information according to the execution result of the user instruction.
According to the technical scheme, the virtual reality equipment and the voice-assisted interaction method provided by the application can parse a user instruction from the voice interaction information input by the user and acquire a virtual object list of the current scene, so that after the user instruction is executed on a virtual object, the virtual object list is updated in real time according to the execution result of the user instruction. The virtual object list comprises the names of the virtual objects in the current virtual scene and the current attribute information of each virtual object. With this method, interaction actions can be performed on objects in the virtual scene through voice input and the virtual object list is updated in real time, so that the user can operate objects in the virtual space without using an external device, which improves the interaction flexibility of the virtual reality equipment.
Drawings
In order to explain the technical solution of the present application more clearly, the drawings needed in the embodiments will be briefly described below; it is obvious that those skilled in the art can also obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic structural diagram of a display system including a virtual reality device in an embodiment of the present application;
FIG. 2 is a schematic diagram of a VR scene global interface in an embodiment of the present application;
FIG. 3 is a schematic diagram of a recommended content area of the global interface in an embodiment of the present application;
FIG. 4 is a schematic diagram of an application shortcut operation entry area of the global interface in an embodiment of the present application;
FIG. 5 is a schematic diagram of a suspended matter area of the global interface in an embodiment of the present application;
FIG. 6 is a schematic diagram of a VR picture in an embodiment of the present application;
FIG. 7 is a schematic view of a virtual scene in an embodiment of the present application;
FIG. 8 is a flowchart illustrating a voice-assisted interaction method in an embodiment of the present application;
FIG. 9 is a schematic diagram illustrating a connection relationship between a virtual reality device and a voice input device in an embodiment of the present application;
FIG. 10 is a schematic view of an opened wooden box in an embodiment of the present application;
FIG. 11 is a schematic flowchart illustrating the generation of a user instruction in an embodiment of the present application;
FIG. 12 is a schematic diagram illustrating a voice interaction process in an embodiment of the present application;
FIG. 13 is a flowchart illustrating updating attribute information in an embodiment of the present application;
FIG. 14 is a schematic flowchart of updating a virtual object list during wearing in an embodiment of the present application;
FIG. 15 is a schematic flowchart of updating a virtual object list through an interactive operation in an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the exemplary embodiments of the present application clearer, the technical solutions in the exemplary embodiments of the present application will be clearly and completely described below with reference to the drawings in the exemplary embodiments of the present application, and it is obvious that the described exemplary embodiments are only a part of the embodiments of the present application, but not all the embodiments.
All other embodiments, which can be derived by a person skilled in the art from the exemplary embodiments shown in the present application without inventive effort, shall fall within the scope of protection of the present application. Moreover, while the disclosure herein has been presented in terms of one or more exemplary examples, it is to be understood that each aspect of the disclosure can also be utilized independently and separately from the other aspects described, to form a complete solution.
It should be understood that the terms "first," "second," "third," and the like in the description and in the claims of the present application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used are interchangeable under appropriate circumstances and can be implemented in sequences other than those illustrated or otherwise described herein with respect to the embodiments of the application, for example.
Furthermore, the terms "comprises" and "comprising," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a product or device that comprises a list of elements is not necessarily limited to those elements explicitly listed, but may include other elements not expressly listed or inherent to such product or device.
The term "module," as used herein, refers to any known or later developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware and/or software code that is capable of performing the functionality associated with that element.
Reference throughout this specification to "embodiments," "some embodiments," "one embodiment," or "an embodiment," etc., means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. Thus, appearances of the phrases "in various embodiments," "in some embodiments," "in at least one other embodiment," or "in an embodiment," or the like, throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. Thus, the particular features, structures, or characteristics shown or described in connection with one embodiment may be combined, in whole or in part, with the features, structures, or characteristics of one or more other embodiments, without limitation. Such modifications and variations are intended to be included within the scope of the present application.
In the embodiment of the present application, the virtual reality device 500 generally refers to a display device that can be worn on the face of a user to provide an immersive experience for the user, including but not limited to VR glasses, Augmented Reality (AR) devices, VR game devices, mobile computing devices, other wearable computers, and the like. The virtual reality device 500 may operate independently or may be connected to other intelligent display devices as an external device, where the display devices may be smart televisions, computers, tablet computers, servers, and the like.
The virtual reality device 500 may be worn on the face of the user and display a media picture close to the eyes of the user, so as to provide an immersive experience. To present the media picture, the virtual reality device 500 may include a number of components for displaying pictures and for being worn on the face. Taking VR glasses as an example, the virtual reality device 500 may include a housing, temples, an optical system, a display assembly, a posture detection circuit, an interface circuit, and the like. In practical application, the optical system, the display assembly, the posture detection circuit and the interface circuit may be arranged in the housing to present a specific display picture; the two sides of the housing are connected with the temples so that the device can be worn on the face of the user.
The posture detection circuit incorporates posture detection elements such as a gravity acceleration sensor and a gyroscope. When the head of the user moves or rotates, the posture of the user can be detected, and the detected posture data is transmitted to a processing element such as a controller, so that the processing element can adjust the specific picture content in the display assembly according to the detected posture data.
It should be noted that the manner in which the specific screen content is presented varies according to the type of the virtual reality device 500. For example, as shown in fig. 1, for a part of thin and light VR glasses, a built-in controller generally does not directly participate in a control process of displaying content, but sends gesture data to an external device, such as a computer, and the external device processes the gesture data, determines specific picture content to be displayed in the external device, and then returns the specific picture content to the VR glasses, so as to display a final picture in the VR glasses.
In some embodiments, the virtual reality device 500 may access the display device 200, and a network-based display system is constructed among the virtual reality device 500, the display device 200, and the server 400, so that data interaction may be performed among them in real time. For example, the display device 200 may obtain media data from the server 400 and play it, and transmit specific picture content to the virtual reality device 500 for display.
The display device 200 may be a liquid crystal display, an OLED display, a projection display device, among others. The particular display device type, size, resolution, etc. are not limiting, and those skilled in the art will appreciate that the display device 200 may be modified in performance and configuration as desired. The display apparatus 200 may provide a broadcast receiving television function and may additionally provide an intelligent network television function of a computer support function, including but not limited to a network television, an intelligent television, an Internet Protocol Television (IPTV), and the like.
The display device 200 and the virtual reality device 500 also perform data communication with the server 400 by a plurality of communication methods. The display device 200 and the virtual reality device 500 may be allowed to be communicatively connected through a Local Area Network (LAN), a Wireless Local Area Network (WLAN), and other networks. The server 400 may provide various contents and interactions to the display apparatus 200. Illustratively, the display device 200 receives software program updates, or accesses a remotely stored digital media library, by sending and receiving information, as well as Electronic Program Guide (EPG) interactions. The server 400 may be a cluster or a plurality of clusters, and may include one or more types of servers. Other web service contents such as video on demand and advertisement services are provided through the server 400.
In the course of data interaction, the user may operate the display apparatus 200 through the mobile terminal 100A and the remote controller 100B. The mobile terminal 100A and the remote controller 100B may communicate with the display device 200 in a direct wireless connection manner or in an indirect connection manner. That is, in some embodiments, the mobile terminal 100A and the remote controller 100B may communicate with the display device 200 through a direct connection manner such as bluetooth, infrared, or the like. When transmitting the control instruction, the mobile terminal 100A and the remote controller 100B may directly transmit the control instruction data to the display device 200 through bluetooth or infrared.
In other embodiments, the mobile terminal 100A and the remote controller 100B may also access the same wireless network with the display apparatus 200 through a wireless router to establish indirect connection communication with the display apparatus 200 through the wireless network. When sending the control command, the mobile terminal 100A and the remote controller 100B may send the control command data to the wireless router first, and then forward the control command data to the display device 200 through the wireless router.
In some embodiments, the user may also use the mobile terminal 100A and the remote controller 100B to directly interact with the virtual reality device 500, for example, the mobile terminal 100A and the remote controller 100B may be used as handles in a virtual reality scene to implement functions such as somatosensory interaction.
In some embodiments, the display components of the virtual reality device 500 include a display screen and drive circuitry associated with the display screen. In order to present a specific picture and bring about a stereoscopic effect, two display screens may be included in the display assembly, corresponding to the left and right eyes of the user, respectively. When a 3D effect is presented, the picture contents displayed on the left screen and the right screen are slightly different; for example, the images captured by the left camera and the right camera of a 3D film source during shooting may be displayed respectively. Because the user observes the picture contents with the left and right eyes, a display picture with a strong stereoscopic impression can be observed when the glasses are worn.
The optical system in the virtual reality device 500 is an optical module consisting of a plurality of lenses. The optical system is arranged between the eyes of the user and the display screen, and can increase the optical path through the refraction of the lenses and the polarization effect of the polarizers on the lenses, so that the content displayed by the display assembly can be clearly presented within the visual field of the user. Meanwhile, in order to adapt to the eyesight of different users, the optical system also supports focusing, that is, the position of one or more lenses is adjusted through a focusing assembly to change the distance between the lenses, thereby changing the optical path and adjusting the definition of the picture.
The interface circuit of the virtual reality device 500 may be configured to transmit interactive data, and in addition to the above-mentioned transmission of the gesture data and the display content data, in practical applications, the virtual reality device 500 may further connect to other display devices or peripherals through the interface circuit, so as to implement more complex functions by performing data interaction with the connection device. For example, the virtual reality device 500 may be connected to a display device through an interface circuit, so as to output a displayed screen to the display device in real time for display. As another example, the virtual reality device 500 may also be connected to a handle via an interface circuit, and the handle may be operated by a user's hand, thereby performing related operations in the VR user interface.
The VR user interface may be presented as a plurality of different types of UI layouts according to user operations. For example, the user interface may include a global UI. As shown in fig. 2, after the AR/VR terminal is started, the global UI may be displayed on the display screen of the AR/VR terminal or on the display of the display device. The global UI may include a recommended content area 1, a business class extension area 2, an application shortcut operation entry area 3, and a suspended matter area 4.
The recommended content area 1 is used for configuring TAB columns of different classifications; media assets, special topics and the like can be selected and configured in the columns. The media assets can include services with media asset contents such as 2D movies, education courses, tourism, 3D, 360-degree panorama, live broadcast, 4K movies, program applications and games. The columns can use different template styles and can support simultaneous recommendation and arrangement of media assets and titles, as shown in FIG. 3.
The business class extension area 2 supports configuring extension classes of different classifications. If a new business type exists, an independent TAB can be configured to display the corresponding page content. The extended classifications in the business class extension area 2 can also be reordered or taken offline. In some embodiments, the business class extension area 2 may include the following content: movie & TV, education, tourism, application, my. In some embodiments, the business class extension area 2 is configured to display large business classification TABs and supports configuring more classifications, as shown in FIG. 3.
The application shortcut operation entry area 3 can specify that pre-installed applications are displayed in front for operation recommendation, and supports configuring a special icon style to replace the default icon; a plurality of pre-installed applications can be specified. In some embodiments, the application shortcut operation entry area 3 further includes a left-hand movement control and a right-hand movement control for moving the option target, so as to select different icons, as shown in FIG. 4.
The suspended matter area 4 may be configured diagonally above the left side or diagonally above the right side of the fixed area, may be configured with alternative images or characters, or may be configured as a jump link. For example, the suspended matter jumps to an application or displays a designated function page after receiving a confirmation operation, as shown in fig. 5. In some embodiments, the suspended matter may not be configured with a jump link and is used solely for image presentation.
In some embodiments, the global UI further comprises a status bar at the top for displaying time, network connection status, power status, and more shortcut entries. When the handle of the AR/VR terminal is used, that is, when an icon is selected with the handheld controller, the icon displays a character prompt that expands to the left and right, and the selected icon is stretched and expanded to the left or right according to its position.
For example, after the search icon is selected, the search icon displays the characters "search" together with the original icon, and after the icon or the characters are further clicked, a search page is opened; for another example, clicking the favorites icon jumps to the favorites TAB, clicking the history icon displays the history page at the default location, clicking the search icon jumps to the global search page, and clicking the message icon jumps to the message page.
In some embodiments, the interaction may be performed through a peripheral, e.g., a handle of the AR/VR terminal may operate the user interface of the AR/VR terminal. The handle includes a return button; a home page key, where a long press of the home page key can realize a reset function; volume up and down buttons; and a touch area, which can realize the functions of clicking, sliding, pressing and holding a focus, and dragging.
The user may enter different scene interfaces through the global interface, for example, as shown in FIG. 6, the user may enter the browse interface at a "browse interface" entry in the global interface, or initiate the browse interface by selecting any of the assets in the global interface. In the browsing interface, the virtual reality device 500 may create a 3D scene through the Unity 3D engine and render specific screen content in the 3D scene.
In the browsing interface, a user can watch specific media asset content, and in order to obtain better viewing experience, different virtual scene controls can be further arranged in the browsing interface so as to cooperate with the media asset content to present specific scenes or realize real-time interaction. For example, in a browsing interface, a panel may be set in a Unity 3D scene to present picture content, and be matched with other home virtual controls to achieve the effect of a cinema screen.
The virtual reality device 500 may present the operation UI content in a browsing interface. For example, a list UI may be displayed in front of the display panel in the Unity 3D scene, a media asset icon stored locally by the current virtual reality device 500 may be displayed in the list UI, or a network media asset icon playable in the virtual reality device 500 may be displayed. The user can select any icon in the list UI, and the selected media assets can be displayed in real time in the display panel.
In some embodiments of the present application, the virtual scene is a virtual space created by the virtual reality device 500 by executing a 3D scene rendering algorithm. The virtual reality device 500 may add various virtual objects, such as a virtual camera, a 3D model, a media playing module, etc., to the virtual scene. Through scene rendering, the virtual reality apparatus 500 can present screen content required in various user interfaces in a virtual scene.
For example, as shown in fig. 7, the virtual reality device 500 may render a virtual cinema environment through the Unity 3D engine, and in the rendered virtual scene, a media playing module such as a screen, and virtual 3D models such as seats and sound devices may be added. In the film watching process, the screen can play media asset pictures, and the seats and the sound equipment can be used for simulating cinema scenes to provide users with the film watching experience in a cinema.
In addition, two virtual cameras may be further provided in the virtual scene. The two virtual cameras shoot the virtual scene according to the virtual objects in it to generate display images, which are output respectively to the two displays of the virtual reality device 500, thereby achieving the effect of simulating a cinema. The virtual cameras can be linked with the pose sensor of the virtual reality device 500, so that when the user wears the virtual reality device 500, the shooting angle of the virtual camera is adjusted according to the rotation and position of the head, which makes it convenient to shoot virtual objects in the virtual scene from different angles and simulates the actual viewing effect.
It should be noted that the virtual scene may provide a specific application interface presented when a certain application is run for the virtual reality device 500, for example, the above-mentioned virtual cinema interface, and may also provide an operation interface for an operating system of the virtual reality device 500, for example, the above-mentioned global UI interface. Different user interfaces need to construct different virtual scenes, and different virtual articles are added in the virtual scenes.
In some virtual scenes, operation actions may also be performed on virtual objects. For example, for the virtual scene constructed by the global UI interface, the user can adjust the positions and sizes of the recommended content area 1, the business class extension area 2, the application shortcut operation entry area 3, and the suspended matter area 4 in the global UI interface through interactive operations.
To enable performing operational actions on the virtual object, the virtual reality device 500 may provide an interactive portal for a user to input interactive actions. The interaction action input by the user can correspond to different operation actions according to the interaction rule set in the operating system. For example, the user controls the focus cursor to move through the 3D mouse, presses the left button of the mouse for a long time when moving to the suspension area, and moves the position of the mouse, so as to drag the suspension area to the target position. Obviously, according to different interaction modes supported by the virtual reality device 500, the mode of the user inputting the interaction action is different, and the corresponding operation action content is also different.
However, since the virtual reality device 500 needs to be worn on the user's face during use, it is inconvenient for the user to perform interactive actions. Moreover, common interactions usually require both key detection and movement-position detection. Therefore, the virtual reality device 500 itself cannot meet the requirements of such interactive actions, and the interactions can be realized only by externally connecting a handle, a keyboard, a three-dimensional mouse, force feedback gloves, a depth camera, or other devices.
Such external devices not only increase the cost of use; because they are of different types and support different interactive actions, the operating system of the virtual reality device 500 cannot be adapted to all of them in advance, and additional function configuration needs to be performed on the virtual reality device 500.
Moreover, since the virtual scene is a three-dimensional space, the virtual objects have not only a planar positional relationship on the interface but also a spatial front-back positional relationship. For example, when a user uses a VR device, a virtual object in the field of view needs to be selected with a handle control. Selecting the virtual object to be controlled in three-dimensional space is relatively difficult, and when virtual object models are repeated, the position or the view needs to be changed to perform the selection. Such an interaction manner is prone to mis-selection, which causes great inconvenience to the user.
Compared with a two-dimensional space, the three-dimensional space has a near-far dimension, which makes the selection operation on a virtual object model close to the virtual camera different from that on a distant one. A model close to the user in the field of view appears large while a distant model appears small, so the cursor may be difficult to land on the target, or the cursor may shake after the user makes a selection, which can cause discomfort for the user.
In order to facilitate the user to perform an interactive operation, in some embodiments of the present application, a virtual reality device 500 is provided, and the virtual reality device 500 may be internally or externally connected with a voice input device, and is configured to receive a voice signal input by the user and convert the voice signal into voice interaction information. The voice interaction information may be used to perform an operation action on a virtual object in a virtual scene to implement voice interaction, as shown in fig. 8, a specific interaction manner is as follows:
s1: and acquiring voice interaction information input by a user.
In the process of executing voice interaction, a user can input various voices through the voice input device, and the voice input device can convert sound signals corresponding to the voices into electric signals which can be processed by a computer and send the electric signals to the controller to execute subsequent processing.
The voice input device may be a microphone built into the virtual reality device 500. For example, a microphone may be directly disposed in the virtual reality device 500; the microphone can detect sounds made by the user in real time during wearing and generate an electrical signal containing the sound content information. The voice input device may also be a microphone externally connected to the virtual reality device 500; an external device interface, such as a 3.5mm earphone/microphone interface, may be provided on the virtual reality device 500. In use, the user may plug a microphone into the interface; the microphone detects the sound signal in real time, converts the detected sound signal into an electrical signal, and sends the electrical signal to the virtual reality device 500.
The voice input device may also be connected to the virtual reality device 500 through a device to which both are connected, thereby being externally connected to the virtual reality device 500 indirectly. For example, as shown in fig. 9, the virtual reality device 500 may be connected to the display device 200, and a microphone may also be connected to the display device 200, so that voice interaction information is detected by the microphone and transmitted to the display device 200, and subsequent processing is performed by the display device 200 or the information is forwarded by the display device 200 to the virtual reality device 500.
S2: and responding to the voice interaction information, and acquiring a virtual object list of the current scene.
After obtaining the voice interaction information input by the user, the controller may obtain a virtual object list of the current virtual scene, where the virtual object list includes virtual object names and current attribute information of the virtual objects. The virtual object name may be a noun phrase uniformly defined according to the application range of the virtual scene, for example, "switch", "link", and the like. The attribute information is the state of the virtual object in the current usage scenario. For example, the attribute information may include "visible (isVisible)", "operable (isOperable)", and the like. Obviously, when the virtual scene needs high-precision control, more attribute information items can be added; for example, the "operable" attribute can be refined into attribute information such as "whether it can rotate", "whether it can move left, right, up, down, forward and backward", "whether it can be zoomed in/out", and the like.
The virtual object list may be automatically established when the virtual scene is built, that is, in some embodiments, the virtual reality device 500 may receive a control instruction input by a user for entering the virtual scene, and traverse virtual object names and current attribute information of all virtual objects in the current virtual scene in response to the control instruction, so as to generate the virtual object list according to the virtual object names and the current attribute information. For example, when a user wears the VR device to enter a virtual scene, the VR device may traverse all virtual objects included in the current virtual scene, and obtain attribute information corresponding to the virtual objects to establish a virtual object list.
The virtual object list can also have its attribute information updated in real time as the user wears the device. For example, as the position of the user moves, part of the virtual objects far from the virtual camera may be blocked by virtual objects near the virtual camera, and thus the "visible" attribute information of the distant virtual object may be updated from visible (isVisible = true) to invisible (isVisible = false).
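A minimal sketch of how such a virtual object list could be organized is shown below. The class and function names (VirtualObject, build_virtual_object_list, refresh_visibility) and the field names are illustrative assumptions, not names from the patent.

```python
from dataclasses import dataclass

@dataclass
class VirtualObject:
    """One entry of the virtual object list described above (names are illustrative)."""
    name: str            # uniformly defined noun phrase, e.g. "switch", "link", "wooden box"
    is_visible: bool     # "visible (isVisible)" state attribute
    is_operable: bool    # "operable (isOperable)" operation attribute

def build_virtual_object_list(scene_objects):
    """Traverse all virtual objects of the current scene when the user enters it
    and record their names and current attribute information."""
    return {obj.name: VirtualObject(obj.name, obj.is_visible, obj.is_operable)
            for obj in scene_objects}

def refresh_visibility(object_list, visibility_test):
    """Update the isVisible attribute in real time while the device is worn,
    e.g. when a nearer object starts to occlude a farther one."""
    for entry in object_list.values():
        entry.is_visible = visibility_test(entry.name)
```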
S3: and analyzing the user instruction from the voice interaction information.
While the virtual object list is being obtained, the virtual reality device 500 may also parse the user instruction from the voice interaction information. When receiving the voice input, the virtual reality device 500 may perform voice recognition on the input voice and obtain a voice recognition result after the recognition succeeds, so as to parse the user instruction. Generally, a user instruction can be divided into two parts, namely an operation action (Action) and an operation object name (Title). The operation action represents the specific interaction behavior the user wants to perform, and the operation object name is the name of the specified interaction target object. For example, as shown in fig. 10, when the user inputs the voice content "open the wooden box", the corresponding operation object name is "wooden box", and the operation action is "open".
Obviously, in order to accurately execute the corresponding user instruction, the voice interaction information may be further processed in the process of analyzing the voice interaction information. The processing procedure may be to screen the voice interaction information and determine the voice information that may be applied to the current virtual scene. For example, after the operation object name is analyzed, matching may be performed in the virtual object list using the operation object name to filter out operation object names that do not exist in the virtual object list.
For a virtual scene constructed from some operation interfaces, an operation performed on a certain virtual object may trigger a jump to an application; for such an interface, the operation object name and the operation action can be determined directly from the interface wording. For example, when the user inputs the speech content "play I Am Not the God of Medicine" in the media asset selection interface, since "I Am Not the God of Medicine" is a movie name, the virtual object list may not use that name but instead use the name "link 1". In this case, the operation object name is determined to be "link 1" from the interface wording, and the operation action is "play".
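As a minimal sketch of the parsed instruction and the screening described above, a user instruction could be represented as an action plus an operation object name, with names absent from the virtual object list filtered out. The names UserInstruction and screen_instruction are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class UserInstruction:
    action: str   # operation action (Action), e.g. "open"
    title: str    # operation object name (Title), e.g. "wooden box"

def screen_instruction(instruction: UserInstruction, object_list) -> Optional[UserInstruction]:
    """Keep only instructions whose operation object name exists in the current
    virtual object list, as described for the screening of voice interaction information."""
    if instruction.title in object_list:
        return instruction
    return None
```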
S4: and executing the user instruction on the virtual object in the current scene, and updating the attribute information according to the execution result of the user instruction.
After the user instruction is parsed, the virtual reality device 500 may execute the user instruction on the virtual object in the current scene, that is, execute the operation action for the operation object. For example, as shown in fig. 10, when the user inputs a voice instruction to "open a wooden box", the virtual reality device 500 may perform a click action on a virtual object named "wooden box" to deform the virtual object into an "open" shape.
For some application scenes, in the process of executing the user instruction on the virtual object, the attribute information in the virtual object list can be updated according to the execution result of the user instruction. For example, when the user inputs a voice command to "lock the position of the suspended object", the virtual reality device 500 locks the position of the "suspended object" virtual object on the one hand, and updates the attribute information of the "suspended object" in the virtual object list on the other hand, that is, modifies the attribute information to isOperable = false.
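A brief sketch of this execute-then-update step follows; execute_on_scene stands in for whatever scene-engine call actually performs the action, and the result fields shown are assumptions used only for illustration.

```python
def execute_user_instruction(instruction, object_list, execute_on_scene):
    """Perform the operation action on the named virtual object, then update the
    virtual object list according to the execution result (e.g. locking an object
    switches its isOperable attribute to false)."""
    result = execute_on_scene(instruction.action, instruction.title)
    entry = object_list[instruction.title]
    if instruction.action == "lock":          # illustrative example from the text above
        entry.is_operable = False
    if result.get("now_hidden"):              # assumed field of the execution result
        entry.is_visible = False
    return result
```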
As can be seen from the above technical solutions, in the above embodiments, by adding a voice input device to the virtual reality apparatus 500, the virtual reality apparatus 500 can detect voice interaction information of a user during use, and perform an interaction action on a virtual object according to the voice interaction information. Meanwhile, updating the attribute information in the virtual object list in real time according to the execution result of the interactive action so as to continuously execute the voice interaction. Through voice interaction, the user can execute operation on the object in the virtual space under the condition of not using complex external equipment, and the interaction flexibility of the virtual reality equipment is improved.
In some embodiments, as shown in fig. 11, in order to be able to parse the user instruction from the voice interaction information, after obtaining the voice interaction information, the virtual reality device 500 may further perform the following steps:
S301: converting the voice interaction information into text information;
S302: inputting the text information into a semantic recognition model;
S303: acquiring an output result of the semantic recognition model for the text information to generate the user instruction.
In this embodiment, a speech-to-text application may be built into the virtual reality device 500. After the voice interaction information is acquired, it can be converted into text form by the speech-to-text application. Obviously, the content of the text information obtained through the speech-to-text application is the same as the content of the speech input; that is, the text information obtained at this stage may be as colloquial as the user's speech. For example, when the user inputs the voice "open the wooden box", the obtained text information is "open the wooden box"; when the user inputs the more colloquial voice information "I want to see what is in the wooden box", the obtained text information is also "I want to see what is in the wooden box".
It can be seen that different forms of voice information may correspond to the same user instruction. Therefore, in order to parse user instructions accurately, after the conversion into text information, the virtual reality device 500 may further input the converted text information into a semantic recognition model, so that the semantic recognition model outputs a specific instruction form including an operation object name and an execution action.
The semantic recognition model is a machine learning model obtained by training on training texts, and can be a neural network model capable of outputting classification probabilities, or a processing model based on such a neural network model together with related processing programs. For the virtual reality device 500, since the interaction modes performed in different application environments are different, training texts may be set for each application environment and input into the neural network model for model training. The training texts may be provided with classification labels so that when the model outputs a result, the model parameters are adjusted by back propagation according to the classification labels. After training on a certain number of samples, the semantic recognition model is obtained.
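The patent does not specify the network architecture, so the following is purely an illustrative sketch of training such a classification-probability model on labelled utterances; the label set, vocabulary size, and randomly generated training batch are all placeholder assumptions.

```python
import torch
import torch.nn as nn

INTENT_LABELS = ["open", "close", "move", "play"]   # assumed label set for one application environment
VOCAB_SIZE, EMBED_DIM = 5000, 64

class IntentClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim, num_intents):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.fc = nn.Linear(embed_dim, num_intents)

    def forward(self, token_ids):
        # average the word embeddings of the utterance, then score each intent class
        return self.fc(self.embed(token_ids).mean(dim=1))

model = IntentClassifier(VOCAB_SIZE, EMBED_DIM, len(INTENT_LABELS))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# training_batches would come from labelled training texts collected per application scene;
# a random placeholder batch is used here so the sketch runs end to end
training_batches = [(torch.randint(0, VOCAB_SIZE, (8, 12)),
                     torch.randint(0, len(INTENT_LABELS), (8,)))]

for token_ids, labels in training_batches:
    optimizer.zero_grad()
    loss = loss_fn(model(token_ids), labels)
    loss.backward()        # back propagation adjusts the parameters according to the labels
    optimizer.step()
```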
After the text information is input into the semantic recognition model, the model outputs the classification probabilities of the keywords corresponding to the voice interaction information according to the text information, and thus the semantics of the voice interaction information are obtained. For example, for the text information "I want to see what is in the wooden box", after input into the semantic recognition model, the text may undergo word segmentation, classification probability calculation, conversion and other processing, and the semantic recognition result "open the wooden box" may be output.
After the semantic recognition result is obtained, a user instruction can be generated according to the semantic recognition result. The output result is a structured text including an operation object name and an operation action, and the corresponding user instruction may be a control instruction composed of the operation object name and the operation action. For example, the user instruction corresponding to "open wooden box" is "/openbox". This user instruction may be executed by the system of the virtual reality device 500 to complete the interactive action of opening the wooden box.
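Putting S301 to S303 together, the overall conversion from speech to an executable command might look like the sketch below; speech_to_text and semantic_model are placeholders for the speech-to-text application and the trained semantic recognition model, and the "/openbox"-style command format simply mirrors the example above.

```python
def parse_voice_interaction(audio, speech_to_text, semantic_model):
    """S301-S303: speech -> text -> (operation action, object name) -> user instruction."""
    text = speech_to_text(audio)                 # e.g. "I want to see what is in the wooden box"
    action, title = semantic_model(text)         # e.g. ("open", "box")
    return f"/{action}{title.replace(' ', '')}"  # e.g. "/openbox"

# Usage, with stub functions standing in for the real components:
command = parse_voice_interaction(
    b"...",                                      # raw audio bytes
    speech_to_text=lambda audio: "open the wooden box",
    semantic_model=lambda text: ("open", "box"),
)
print(command)   # "/openbox"
```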
As can be seen from the above technical solutions, in the embodiment, the voice interaction information input by the user can be processed through the semantic recognition model, so that when the user inputs the spoken voice content, the voice content can be converted into a user instruction in a specific form, so that the virtual reality device 500 can implement an interaction action for more voice contents, and the flexibility of the voice interaction process is improved.
After the user instruction is parsed from the voice interaction information, the virtual reality device 500 may perform the interaction action according to the user instruction. In practical applications, the virtual reality device 500 may use voice interaction in a variety of scenarios, and in different application scenarios it may support different voice interaction modes, so some voice interaction information can be used in the current application scenario but not in another. Therefore, in order to enable the user to input usable voice information, in some embodiments, as shown in fig. 12, the step of executing the user instruction on the virtual object in the current scene further includes:
s411: matching the operation object name in the user instruction in the virtual object list;
s412: if the virtual object list contains the virtual object name which is the same as the name of the operation object, executing the operation action in the user instruction on the current virtual object;
s413: and if the virtual object list does not contain the virtual object name which is the same as the name of the operation object, inputting the voice interaction information into a semantic recognition model to execute semantic analysis.
Virtual scenes constructed by different application scenes are different, and virtual objects contained in the virtual scenes are also different. For example, a virtual scene corresponding to the global UI interface may include virtual objects of display controls such as a recommended content area, a business classification extension area, an application shortcut operation entry area, and a suspension area; the virtual scene corresponding to the virtual cinema application includes virtual objects such as a display control (screen), a 3D model (seat, and sound), and the like. Obviously, under the global UI interface, the input voice interaction information cannot be completely adapted to the virtual cinema application environment.
For this reason, in this embodiment, different virtual object lists may be acquired in different application scenes, and all virtual objects in the current virtual scene may be included in the acquired virtual object lists. After the voice interaction information input by the user is obtained and the user instruction is analyzed, the name of the operation object specified in the user instruction can be matched in the virtual object list to determine whether the current virtual object list contains the name of the virtual object to be controlled.
For example, when the user is in the global UI interface scene, the controller of the virtual reality device 500 may build a virtual object list for the global UI interface. After the user inputs the voice interaction information "move the suspended object", the virtual reality device 500 may parse the user instruction from the voice information and obtain the operation object name "suspended object". At this time, the virtual reality device 500 may check whether an entry with the virtual object name "suspended object" is included in the virtual object list of the global UI interface. If the virtual object list contains a virtual object name identical to the operation object name, the operation action in the user instruction is performed on the current virtual object, that is, the "move" operation action is performed on the virtual object corresponding to "suspended object".
If the virtual object list does not contain the virtual object name which is the same as the name of the operation object, the user can be reminded that the current scene does not contain the operation object through the prompt interface. For example, in a global UI interface scenario, after a user inputs voice interaction information of "open a box", since there is no virtual object named "box" in the virtual object list of the current scenario, a prompt interface may be displayed, and prompt text such as "do not have a box in the current scenario, please re-input" may be included in the prompt interface.
In the matching process, if the virtual object list does not contain the virtual object name identical to the operation object name, it may be that the operation object name recognized by the virtual reality device 500 is different from the virtual object name in the virtual object list because the voice interaction information input by the user is too spoken. Therefore, in order to improve the accuracy of voice recognition, when the virtual object list is determined not to contain the virtual object name identical to the operation object name, the voice interaction information can be input into the semantic recognition model to perform semantic analysis. The semantic parsing process may be the same as the processing process in the above embodiments, and is not described herein again.
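A hedged sketch of this matching step (S411-S413) follows, reusing the illustrative UserInstruction structure from above; semantic_parse and show_prompt are assumed helper functions, not names from the patent.

```python
def handle_user_instruction(instruction, voice_text, object_list, semantic_parse, show_prompt):
    """S411-S413: match the operation object name against the current virtual object list."""
    if instruction.title in object_list:
        # S412: a matching virtual object exists; perform the operation action on it
        return object_list[instruction.title], instruction.action
    # S413: no match, possibly because the utterance was too colloquial; re-run
    # semantic analysis on the original voice interaction information
    refined = semantic_parse(voice_text)
    if refined is not None and refined.title in object_list:
        return object_list[refined.title], refined.action
    show_prompt(f'The current scene does not contain "{instruction.title}", please re-input.')
    return None, None
```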
If the virtual object list contains the virtual object name which is the same as the operation object name, the interactive operation can be executed on the virtual object. In some embodiments, as shown in fig. 12, the virtual reality device 500 may further perform the following procedural steps before performing the interaction:
S421: extracting the state attribute of the current virtual object from the virtual object list;
S422: if the state attribute is the visible state, executing the operation action on the current virtual object;
S423: if the state attribute is the invisible state, controlling the display to display a prompt interface for prompting that the current virtual object is not within the visible range.
Before the interactive action is executed on the virtual object, the state attribute of the current virtual object can be extracted in the virtual object list, so that the operation action is executed according to the state attribute. The state attribute of the virtual object is an attribute for indicating a visible state of the virtual object, and the state attribute of the virtual object may include a visible state and an invisible state according to the current field of view of the virtual reality device 500 and the position of the virtual object in the virtual scene.
The virtual scene is a three-dimensional scene, and the picture presented by the virtual reality device 500 is a virtual object image of the virtual camera within the coverage range of the visual field of the virtual scene, that is, the virtual object within the visual field is visible, and the virtual object outside the visual field is not visible. In addition, in the same view field, a virtual object farther from the virtual camera may be blocked by a virtual object closer to the virtual camera, that is, the blocked virtual object is also in an invisible state.
Obviously, since the user can perform an accurate interactive operation conveniently only when the virtual object is in the visible state, if the state attribute is in the visible state, the operation action is performed on the current virtual object. For example, when the user inputs the voice interaction information "open a box", the virtual reality device 500 may first acquire the state attribute in the current virtual object list, and if the state attribute is true, it is determined that a virtual object "box" exists within the field of view of the current virtual reality device 500, and thus an "open" operation may be performed on the virtual object.
After extracting the state attribute of the current virtual object, if the state attribute is the invisible state, a prompt interface may be displayed on the display of the virtual reality device 500 to prompt the user that the currently operated object is not in the field of view and the interactive operation cannot be performed on the virtual object. Obviously, the prompt page may include text or graphic content for prompting that the current virtual object is not within the visible range. For example, when the user inputs the voice interaction information "open a box", the virtual reality device 500 may first acquire the state attribute in the current virtual object list; if the state attribute is false, it is determined that the virtual object "box" does not exist in the field of view of the current virtual reality device 500, so prompt content such as "The object is not in the field of view, please adjust the angle and try again." may be displayed.
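A short sketch of this visibility check (S421-S423), reusing the illustrative VirtualObject entries and an assumed show_prompt helper:

```python
def perform_if_visible(entry, action, execute_on_scene, show_prompt):
    """S421-S423: only act on objects whose state attribute says they are visible."""
    if entry.is_visible:                      # S422: visible -> execute the operation action
        return execute_on_scene(action, entry.name)
    # S423: invisible -> prompt that the object is not within the visible range
    show_prompt("The object is not in the field of view, please adjust the angle and try again.")
    return None
```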
In some cases, to facilitate operation, some virtual objects in the virtual scene may be locked depending on the type of interface. For example, in the global UI interface, the application shortcut operation entry area may be fixed at a specific position in the interface and cannot be moved or zoomed by the user. Accordingly, when the user inputs a control instruction for moving or zooming this area, the corresponding interactive action cannot be performed. To realize this function, when the extracted state attribute of the virtual object is the visible state, it can further be detected whether the current virtual object can be operated, so that the user is guided to input a correct control instruction.
That is, in some embodiments, as shown in fig. 12, if the state attribute is the visible state, the virtual reality device 500 may further perform the following program steps:
s431: extracting the operation attribute of the current virtual object from the virtual object list;
s432: if the operation attribute supports the operation action, executing the operation action on the current virtual object;
s433: and if the operation attribute does not support the operation action, controlling the display to display a prompt interface for prompting that the current virtual object cannot be controlled.
In this embodiment, after determining that the state attribute of the current operation object is the visible state, the operation attribute of the current virtual object may be extracted from the virtual object list, so as to determine, according to the operation attribute, whether the current virtual object supports the interaction action specified in the voice interaction information.
The operation attribute of the virtual object reflects whether an operation action can be performed on the current virtual object, that is, whether the operation action is supported or not supported. Since the operation action may take various forms depending on the user input, a corresponding operation attribute can be set for each operation form. For example, when the operation action is movement, the operation attribute may include supporting movement and not supporting movement; when the operation action is rotation, the operation attribute may include supporting rotation and not supporting rotation. In practical applications, the virtual reality device 500 may set different operation attributes according to different types of user interfaces.
After extracting the operation attribute, the virtual reality device 500 may determine whether the operation attribute of the virtual object in the current state supports the operation action; if so, the operation action is performed on the current virtual object. For example, when the user inputs the voice content "open the wooden box", if it can be determined from the virtual object list that the operation attribute of the virtual object named "wooden box" is isOperable=true, the current virtual object supports the "open" operation, and the "open" operation is therefore performed on the current virtual object "wooden box".
Similarly, if the judgment result is that the operation attribute of the current virtual object does not support the operation action, the user can be prompted, through the displayed prompt interface, to input correct voice interaction information. For example, the user inputs the voice "rotate the application shortcut operation entry" in the global UI interface; since the virtual object corresponding to the application shortcut operation entry is locked and does not support the rotation operation, the operation attribute extracted from the virtual object list is isOperable=false, and a prompt interface may be displayed on the display at this time. The prompt interface may display text such as "The object cannot be controlled at present, please input another instruction".
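A corresponding sketch of steps S431 to S433 is given below. Storing one flag per operation form (for example isMovable, isRotatable) in a dictionary is an assumption made here for brevity, not the patent's actual data layout.

```python
# Sketch of S431-S433: gate the action on the operation attribute.
# A per-action flag table stands in for attributes such as isMovable/isRotatable.
def execute_with_operability_check(entry, action, perform_action, show_prompt):
    if entry["operable"].get(action, False):     # S432: action supported
        perform_action(entry, action)
    else:                                        # S433: action not supported
        show_prompt("The object cannot be controlled at present, "
                    "please input another instruction.")

shortcut_entry = {"name": "application shortcut operation entry",
                  "operable": {"open": True, "rotate": False, "move": False}}

execute_with_operability_check(shortcut_entry, "rotate",
                               perform_action=lambda e, a: print(a, e["name"]),
                               show_prompt=print)
```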
It should be noted that, for a virtual object that does not support the current operation action, the user may also input voice interaction information for releasing the locked state as needed, so that the current virtual object can support the current operation action. For example, when the user inputs "move the wooden box" while the current "wooden box" is locked and does not support the moving operation, a prompt interface may be displayed with the prompt content "The wooden box is locked and cannot be moved, please unlock it first or input another instruction". The user may then input the voice interaction information "release the locked state of the wooden box", thereby modifying the operation attribute of the wooden box to isMovable=true.
As can be seen from the above technical solutions, in the above embodiments, the virtual reality device 500 may first perform name matching before performing the interactive action; if the name matching succeeds, the state attribute and the operation attribute are matched, that is, whether the current attribute information isVisible and isOperable is true is read from the virtual object list, and if all conditions are met, the operation requested by the user is performed. If only the name matching succeeds but the state attribute isVisible is false, the user is prompted that the current object is invisible; if the name matching succeeds and isVisible is true but the operation attribute isOperable is false, the user is prompted that the current object cannot be controlled; and if the name matching is unsuccessful, semantic analysis is performed directly. Through this judgment of the attribute information, the user can be guided to input correct voice interaction information, and the convenience and flexibility of the interaction process are improved.
In some embodiments, in order to facilitate subsequent interactive operations, the virtual reality device 500 may update the virtual object list in real time. That is, as shown in fig. 13, after the step of performing the operation action on the current virtual object, the virtual reality device 500 may further perform the following steps:
s441: extracting the operation action execution result of the current virtual object;
s442: acquiring current pose information of the virtual reality equipment;
s443: calculating the visible range of the virtual reality equipment in the current virtual scene according to the pose information;
s444: and detecting whether the current virtual object is in the visual range or not so as to update the state attribute of the current virtual object.
In this embodiment, since the attribute information of the virtual object may include the state attribute and the operation attribute, each time an interactive operation is performed on the virtual object, the state attribute and the operation attribute in the virtual object list may be updated according to the execution result of the operation action. For the operation attribute, the updated content can be determined directly from the operation action. For example, when an "unlock" operation is performed on any virtual object, the operation attribute is changed from isOperable=false to isOperable=true. Accordingly, the virtual reality device 500 can directly modify the attribute information in the virtual object list according to the execution result of the operation action.
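For instance, the direct rewrite of the operation attribute after an action could look like the following sketch; the action names and dictionary layout are illustrative assumptions.

```python
# Sketch: update the operation attribute directly from the executed action,
# e.g. an "unlock" action flips isOperable from false to true.
def update_operation_attribute(entry, executed_action):
    if executed_action == "unlock":
        entry["isOperable"] = True
    elif executed_action == "lock":
        entry["isOperable"] = False
    return entry

box = {"name": "wooden box", "isOperable": False}
update_operation_attribute(box, "unlock")
print(box)   # {'name': 'wooden box', 'isOperable': True}
```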
For the state attribute, since whether the virtual object is visible may change after the operation action is performed on it, after the execution result of the operation action on the current virtual object is extracted, the position of the current virtual object in the virtual scene may be obtained from the execution result, and at the same time the current pose information of the virtual reality device 500 is acquired, so as to determine the view range of the virtual camera in the virtual scene in the current state.
The pose information may be acquired by pose detection means such as a gravitational acceleration sensor and a gyroscope built into the virtual reality device 500. Different pose information determines different visible ranges of the virtual camera in the current virtual scene, and the state attribute of a virtual object located within the visible range is isVisible=true. Therefore, the virtual reality device 500 may update the state attribute of the current virtual object by detecting whether the current virtual object is within the visible range.
For example, when the user inputs the voice interaction information "move the wooden box", the virtual reality device 500 may perform a moving operation on the wooden box to change its position in the virtual scene. If the wooden box moves from inside the visible range to outside it as a result of the moving operation, the state attribute of the wooden box in the virtual object list is modified from isVisible=true to isVisible=false, thereby completing the update of the virtual object list.
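Steps S441 to S444 can be sketched with a deliberately simplified field-of-view test; a real device would evaluate the full camera frustum derived from the pose information, and every name below is an assumption rather than the patent's implementation.

```python
# Simplified sketch of S441-S444: recompute the visible range from the pose and
# update isVisible from the object's new position. A horizontal field-of-view
# test replaces the real 3D frustum for brevity.
import math

def in_view_range(obj_xz, cam_xz, cam_yaw_deg, fov_deg=90.0):
    dx, dz = obj_xz[0] - cam_xz[0], obj_xz[1] - cam_xz[1]
    angle_to_obj = math.degrees(math.atan2(dx, dz))
    delta = (angle_to_obj - cam_yaw_deg + 180.0) % 360.0 - 180.0
    return abs(delta) <= fov_deg / 2.0

def update_state_attribute(entry, new_xz, cam_xz, cam_yaw_deg):
    entry["position"] = new_xz                                        # S441
    entry["isVisible"] = in_view_range(new_xz, cam_xz, cam_yaw_deg)   # S443/S444
    return entry

box = {"name": "wooden box", "isVisible": True, "position": (0.0, 2.0)}
update_state_attribute(box, (5.0, -1.0), cam_xz=(0.0, 0.0), cam_yaw_deg=0.0)
print(box["isVisible"])   # False: the box has moved out of the visible range
```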
In some embodiments, in practical applications the attribute information in the virtual object list may be updated not only according to the voice interaction information input by the user, but also in real time according to the wearing state and the operation actions of the user. That is, the virtual reality device 500 may detect the use state of the user in real time and update the state attribute and the operation attribute in the list according to the use state.
For the state attribute, as shown in fig. 14, the virtual reality device 500 may monitor the user's actions while wearing the device by detecting the pose information in real time. If the pose information changes, the visible range of the virtual reality device in the virtual scene under the changed pose information can be obtained, and the virtual objects within the visible range are traversed to update their state attributes.
For example, when the user wears the VR device, this is equivalent to setting, in the virtual space, a virtual Camera associated with the pose detection means; when the user turns around or walks, the virtual camera rotates or moves accordingly while the whole three-dimensional virtual space remains unchanged. Therefore, if a virtual object is within the visible range of the virtual camera and is not occluded after the user's action, the state attribute of the virtual object in the virtual object list is isVisible=true; if the virtual object is not within the visible range of the virtual camera or becomes occluded after the user's action, the state attribute of the object in the virtual object list is changed to isVisible=false.
For the operation attribute, as shown in fig. 15, the virtual reality device 500 may acquire an interactive operation instruction input by the user and, in response to the interactive operation instruction, execute the operation action in the interactive operation instruction on the virtual object; and if the operation action modifies the operation attribute of the virtual object, the operation attribute of the virtual object is updated.
For example, each time an operation is performed on a virtual object, the virtual reality device 500 re-assigns the attribute indicating whether the virtual object can be manipulated, according to whether the movement or execution of the object is restricted, whether the object possesses an operable attribute, whether its controllability is limited, and so on. If the object satisfies the operable condition, the operation attribute is isOperable=true; if the object does not satisfy the operable condition, the operation attribute is isOperable=false.
It should be noted that, in this embodiment, the interactive operation instruction input by the user may be an interactive operation instruction input in a manner other than voice, for example, through an external device such as a handle, a keyboard, or a 3D mouse. Such interactive instructions may also perform operation actions on the virtual objects in the virtual scene and thereby change the attribute information of the virtual objects. Since such an interactive operation instruction may also change the state attribute of a virtual object, the state attribute can be detected after the operation action is executed. The calculation related to the isVisible attribute of a virtual object may be performed by casting a ray from the shooting direction of the virtual camera; and because the OnBecameVisible and OnBecameInvisible interfaces are encapsulated in Unity development, the state attribute can also be obtained directly through these interfaces.
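The ray-detection idea can be sketched independently of any engine. The following fragment is an assumption for illustration only (it is not Unity's API): each object is treated as a bounding sphere, and the target is marked occluded when another object intersects the line of sight from the camera.

```python
# Simplified occlusion test: an object inside the field of view is still
# invisible when another object blocks the ray from the camera to it.
# Bounding spheres stand in for real collider geometry.
import math

def segment_hits_sphere(p0, p1, center, radius):
    """True if the segment p0->p1 passes within `radius` of `center`."""
    seg = tuple(p1[i] - p0[i] for i in range(3))
    rel = tuple(center[i] - p0[i] for i in range(3))
    seg_len_sq = sum(c * c for c in seg) or 1e-9
    t = max(0.0, min(1.0, sum(r * s for r, s in zip(rel, seg)) / seg_len_sq))
    closest = tuple(p0[i] + t * seg[i] for i in range(3))
    return math.dist(closest, center) <= radius

def is_occluded(camera_pos, target, others):
    """target/others: dicts with illustrative 'position' and 'radius' keys."""
    return any(
        other is not target and
        segment_hits_sphere(camera_pos, target["position"],
                            other["position"], other["radius"])
        for other in others
    )
```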
In addition, during use of the VR device, each scene (Scene) uniquely corresponds to one virtual object list. The virtual object list may include the names of all virtual objects in the virtual scene space, and each entry may carry a plurality of attribute parameters, such as isVisible, isOperable, and the like. In some embodiments, an execution parameter may further be set in the virtual object list; the execution parameter triggers different actions according to the type of the virtual object, so as to obtain the execution result.
For example, if the virtual object is of a functional type such as a switch or a jump button, the virtual object may correspond to a link; and if the virtual object is a display function control, it corresponds to a remote-control attribute. This correspondence is unique, and the virtual reality device 500 may modularize all object name relationships to generate the virtual object list, so that matching is performed each time the user inputs voice interaction information and the corresponding interaction is executed.
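One possible, purely illustrative layout of such a per-scene virtual object list, with an execution parameter that differs by object type, is sketched below; the field names, object names and URI scheme are assumptions, not the patent's exact schema.

```python
# Assumed layout of one scene's virtual object list: name, state attribute,
# operation attribute and a type-dependent execution parameter.
scene_object_list = [
    {"name": "wooden box",  "isVisible": True,  "isOperable": True,
     "type": "prop",            "execute": {"action": "open"}},
    {"name": "jump button", "isVisible": True,  "isOperable": True,
     "type": "switch",          "execute": {"link": "scene://living_room"}},
    {"name": "media panel", "isVisible": False, "isOperable": False,
     "type": "display_control", "execute": {"remote": "play_pause"}},
]

def find_entry(name, object_list=scene_object_list):
    """Look up an entry by its virtual object name (None when absent)."""
    return next((e for e in object_list if e["name"] == name), None)

print(find_entry("jump button")["execute"])   # {'link': 'scene://living_room'}
```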
Entries may also be additionally set for some commonly used interactive instructions, such as "forward one step", "backward one step", "turn left 90°", "exit", and "switch", in order to perform such interactions quickly. For example, for operations from the user's perspective or default operations of the virtual reality device 500, a preset instruction entry may be added to the list by setting a fixed instruction, and the corresponding instruction is mapped to the preset instruction entry. This avoids the processes of voice analysis, focusing on the interactive object, and operation response after successful matching, thereby improving response speed and execution efficiency.
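A preset instruction entry can be thought of as a fixed look-up table consulted before the normal pipeline; the command strings, handler names and step size in the sketch below are illustrative assumptions.

```python
# Sketch of preset instruction entries: frequently used commands bypass
# semantic parsing and object matching entirely.
PRESET_COMMANDS = {
    "forward one step":  lambda cam: cam.move(0.5),
    "backward one step": lambda cam: cam.move(-0.5),
    "turn left 90":      lambda cam: cam.rotate(-90),
    "exit":              lambda cam: cam.exit_scene(),
}

def dispatch_preset(voice_text, camera):
    handler = PRESET_COMMANDS.get(voice_text.strip().lower())
    if handler is None:
        return False          # fall through to the normal voice pipeline
    handler(camera)           # fast path: no parsing, matching or focusing
    return True
```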
Based on the virtual reality device 500, some embodiments of the present application further provide a voice-assisted interaction method. The voice-assisted interaction method is configurable in the controller of the virtual reality device 500, so that the virtual reality device 500 can perform interaction through voice. As shown in fig. 8, the voice-assisted interaction method includes the following steps:
s1: acquiring voice interaction information input by a user;
s2: responding to the voice interaction information, and acquiring a virtual object list of the current scene, wherein the virtual object list comprises virtual object names and current attribute information of virtual objects;
s3: analyzing a user instruction from the voice interaction information;
s4: and executing the user instruction on the virtual object in the current scene, and updating the attribute information according to the execution result of the user instruction.
According to the above technical solution, the voice-assisted interaction method provided in this embodiment can parse the user instruction from the voice interaction information input by the user and acquire the virtual object list of the current scene, so that after the user instruction is executed on the virtual object, the virtual object list is updated in real time according to the execution result. The virtual object list includes the names of the virtual objects in the current virtual scene and the current attribute information of each virtual object. With this method, interaction actions can be performed on objects in the virtual scene through voice input and the virtual object list is updated in real time, so that the user can operate objects in the virtual space without using external devices, which improves the interaction flexibility of the virtual reality device.
The embodiments provided in the present application are only a few examples of the general concept of the present application and do not limit the scope of the present application. For a person skilled in the art, any other embodiment extended from the solution of the present application without inventive effort shall fall within the protection scope of the present application.

Claims (10)

1. A virtual reality device, comprising:
a display configured to display a user interface;
a voice input device configured to detect voice interaction information;
a controller configured to:
acquiring voice interaction information input by a user;
responding to the voice interaction information, and acquiring a virtual object list of the current scene, wherein the virtual object list comprises virtual object names and current attribute information of virtual objects;
analyzing a user instruction from the voice interaction information;
and executing the user instruction on the virtual object in the current scene, and updating the attribute information according to the execution result of the user instruction.
2. The virtual reality device of claim 1, wherein in the step of parsing the user instruction from the voice interaction information, the controller is further configured to:
converting the voice interaction information into character information;
inputting the character information into a semantic recognition model, wherein the semantic recognition model is a machine learning model obtained through training of a training text;
and acquiring an output result of the semantic model according to the character information to generate the user instruction, wherein the output result is a structured text comprising an operation object name and an operation action.
3. The virtual reality device of claim 1, wherein in the step of executing the user instruction on a virtual object in a current scene, the controller is further configured to:
matching the operation object name in the user instruction in the virtual object list;
if the virtual object list contains the virtual object name which is the same as the name of the operation object, executing the operation action in the user instruction on the current virtual object;
and if the virtual object list does not contain the virtual object name which is the same as the name of the operation object, inputting the voice interaction information into a semantic recognition model to execute semantic analysis.
4. The virtual reality device of claim 3, wherein if the virtual object list contains a virtual object name that is the same as the operand name, the controller is further configured to:
extracting the state attribute of the current virtual object from the virtual object list;
if the state attribute is a visible state, executing the operation action on the current virtual object;
and if the state attribute is in an invisible state, controlling the display to display a prompt interface for prompting that the current virtual object is not in a visible range.
5. The virtual reality device of claim 4, wherein if the state attribute is a visible state, the controller is further configured to:
extracting the operation attribute of the current virtual object from the virtual object list;
if the operation attribute supports the operation action, executing the operation action on the current virtual object;
and if the operation attribute does not support the operation action, controlling the display to display a prompt interface for prompting that the current virtual object cannot be controlled.
6. The virtual reality device of claim 5, wherein after the step of performing the operational action on the current virtual object, the controller is further configured to:
extracting the operation action execution result of the current virtual object, wherein the execution result comprises the position of the current virtual object in the virtual scene after the operation action is executed;
acquiring current pose information of the virtual reality equipment;
calculating the visible range of the virtual reality equipment in the current virtual scene according to the pose information;
and detecting whether the current virtual object is in the visual range or not so as to update the state attribute of the current virtual object.
7. The virtual reality device of claim 1, wherein the controller is further configured to:
receiving a control instruction which is input by a user and used for entering a virtual scene;
traversing the virtual object names and the current attribute information of all virtual objects in the current virtual scene in response to the control instruction;
and generating a virtual object list according to the virtual object name and the current attribute information.
8. The virtual reality device of claim 7, wherein after the step of generating a list of virtual objects from the virtual object name and current attribute information, the controller is further configured to:
detecting pose information in real time;
if the pose information is changed, acquiring the visible range of the virtual reality equipment in the virtual scene under the changed pose information;
traversing the virtual object within the visual range to update the state attribute of the virtual object within the visual range.
9. The virtual reality device of claim 1, wherein the controller is further configured to:
acquiring an interactive operation instruction input by a user;
responding to the interactive operation instruction, and executing operation actions in the interactive operation instruction on the virtual object;
and if the operation attribute of the virtual object is modified by the operation action, updating the operation attribute of the virtual object.
10. A voice-assisted interaction method is applied to virtual reality equipment, the virtual reality equipment comprises a display, a voice input device and a controller, and the voice-assisted interaction method comprises the following steps:
acquiring voice interaction information input by a user;
responding to the voice interaction information, and acquiring a virtual object list of the current scene, wherein the virtual object list comprises virtual object names and current attribute information of virtual objects;
analyzing a user instruction from the voice interaction information;
and executing the user instruction on the virtual object in the current scene, and updating the attribute information according to the execution result of the user instruction.
CN202110120704.2A 2021-01-28 2021-01-28 Virtual reality equipment and voice-assisted interaction method Pending CN112905007A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110120704.2A CN112905007A (en) 2021-01-28 2021-01-28 Virtual reality equipment and voice-assisted interaction method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110120704.2A CN112905007A (en) 2021-01-28 2021-01-28 Virtual reality equipment and voice-assisted interaction method

Publications (1)

Publication Number Publication Date
CN112905007A true CN112905007A (en) 2021-06-04

Family

ID=76119943

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110120704.2A Pending CN112905007A (en) 2021-01-28 2021-01-28 Virtual reality equipment and voice-assisted interaction method

Country Status (1)

Country Link
CN (1) CN112905007A (en)

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106575156A (en) * 2014-07-25 2017-04-19 微软技术许可有限责任公司 Smart placement of virtual objects to stay in the field of view of a head mounted display
CN108881784A (en) * 2017-05-12 2018-11-23 腾讯科技(深圳)有限公司 Virtual scene implementation method, device, terminal and server
CN107229393A (en) * 2017-06-02 2017-10-03 三星电子(中国)研发中心 Real-time edition method, device, system and the client of virtual reality scenario
CN109725782A (en) * 2017-10-27 2019-05-07 腾讯科技(深圳)有限公司 A kind of method, apparatus that realizing virtual reality and smart machine, storage medium
CN109276887A (en) * 2018-09-21 2019-01-29 腾讯科技(深圳)有限公司 Information display method, device, equipment and the storage medium of virtual objects
CN111508482A (en) * 2019-01-11 2020-08-07 阿里巴巴集团控股有限公司 Semantic understanding and voice interaction method, device, equipment and storage medium
CN110751535A (en) * 2019-09-27 2020-02-04 王小刚 VR-based panoramic shopping system and method
CN111013142A (en) * 2019-11-19 2020-04-17 腾讯科技(深圳)有限公司 Interactive effect display method and device, computer equipment and storage medium
CN111672131A (en) * 2020-06-05 2020-09-18 腾讯科技(深圳)有限公司 Virtual article acquisition method, device, terminal and storage medium
CN111821691A (en) * 2020-07-24 2020-10-27 腾讯科技(深圳)有限公司 Interface display method, device, terminal and storage medium
CN112099628A (en) * 2020-09-08 2020-12-18 平安科技(深圳)有限公司 VR interaction method and device based on artificial intelligence, computer equipment and medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116452786A (en) * 2023-06-08 2023-07-18 北京微应软件科技有限公司 Virtual reality content generation method, system, computer device and storage medium
CN116452786B (en) * 2023-06-08 2023-10-10 北京交通大学 Virtual reality content generation method, system, computer device and storage medium

Similar Documents

Publication Publication Date Title
KR102552551B1 (en) Keyboards for virtual, augmented and mixed reality display systems
CN110636353B (en) Display device
WO2015188614A1 (en) Method and device for operating computer and mobile phone in virtual world, and glasses using same
CN111970456B (en) Shooting control method, device, equipment and storage medium
CN114286142B (en) Virtual reality equipment and VR scene screen capturing method
CN112732089A (en) Virtual reality equipment and quick interaction method
US20220291808A1 (en) Integrating Artificial Reality and Other Computing Devices
CN114302221B (en) Virtual reality equipment and screen-throwing media asset playing method
CN113066189B (en) Augmented reality equipment and virtual and real object shielding display method
CN112073770A (en) Display device and video communication data processing method
CN112905007A (en) Virtual reality equipment and voice-assisted interaction method
WO2022193931A1 (en) Virtual reality device and media resource playback method
WO2022151882A1 (en) Virtual reality device
CN115129280A (en) Virtual reality equipment and screen-casting media asset playing method
CN115509361A (en) Virtual space interaction method, device, equipment and medium
CN111782053B (en) Model editing method, device, equipment and storage medium
CN112732088B (en) Virtual reality equipment and monocular screen capturing method
WO2022111005A1 (en) Virtual reality (vr) device and vr scenario image recognition method
CN116931713A (en) Virtual reality equipment and man-machine interaction method
CN112667079A (en) Virtual reality equipment and reverse prompt picture display method
CN116935084A (en) Virtual reality equipment and data verification method
CN114286077A (en) Virtual reality equipment and VR scene image display method
CN114363705A (en) Augmented reality equipment and interaction enhancement method
CN114327032A (en) Virtual reality equipment and VR (virtual reality) picture display method
CN116266090A (en) Virtual reality equipment and focus operation method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination