CN116931713A - Virtual reality device and human-machine interaction method
- Publication number: CN116931713A
- Application number: CN202210321874.1A
- Authority: CN (China)
- Prior art keywords: hand gesture, gesture data, inter-frame difference, hand
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/017—Gesture based interaction, e.g. based on a set of recognized hand gestures
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
Abstract
The present application provides a virtual reality device and a human-machine interaction method. A camera detects hand gesture data of a user and a hand gesture data set is generated; first-order difference processing is performed on the hand gesture data in the set to obtain a first inter-frame difference set; each first inter-frame difference in the first set that lies within a preset range is replaced with a second inter-frame difference to obtain a second inter-frame difference set; and target hand gesture data are obtained from the inter-frame differences in the second set, so that the control instruction corresponding to the target hand gesture data is executed. In this way, the error of the acquired hand gesture data is reduced, the success rate of effective human-machine interaction is improved, and user experience is improved.
Description
Technical Field
Embodiments of the present application relate to the technical field of virtual reality, and more particularly, to a virtual reality device and a human-machine interaction method.
Background
Virtual Reality (VR) technology is a display technology that uses a computer to simulate a virtual environment, thereby giving the user a sense of immersion. A virtual reality device is a device that presents a virtual picture to a user using virtual display technology. The user can interact with the virtual reality device through specific hand gestures so as to change the virtual picture.
During human-machine interaction, the virtual reality device can identify key points of the user's hand from the video frames captured by its camera and calculate the user's hand gesture data from those key points, thereby determining the instruction input by the user and completing the interaction.
However, when the virtual reality device calculates the user's hand gesture data, the imaging quality of the camera and the speed of the user's hand movement may prevent it from accurately identifying every hand key point. The calculated hand gesture data then contain errors, effective human-machine interaction cannot be achieved, and user experience suffers.
Disclosure of Invention
Exemplary embodiments of the present application provide a virtual reality device and a human-machine interaction method, to solve the prior-art problem that effective human-machine interaction cannot be achieved because of errors in the hand gesture data of the user calculated by the virtual reality device.
In one aspect, the present application provides a virtual reality device comprising:
a display;
a camera configured to detect hand gesture data of a user in real time;
a controller, configured to:
acquire a hand gesture data set in response to a change in the user's hand gesture, wherein the hand gesture data set comprises hand gesture data arranged in a preset order, the hand gesture data being used to characterize the hand gesture of the user at a target moment;
perform first-order difference processing on the hand gesture data in the hand gesture data set to obtain a first inter-frame difference set, wherein the first inter-frame difference set comprises the first inter-frame differences between the hand gesture data of any adjacent frames;
replace each first inter-frame difference within a preset range with a second inter-frame difference to obtain a second inter-frame difference set, wherein the second inter-frame difference is an inter-frame difference adjacent to that first inter-frame difference;
and acquire target hand gesture data according to the inter-frame differences in the second inter-frame difference set, so as to execute the control instruction corresponding to the target hand gesture data.
In another aspect, the present application provides a human-machine interaction method, including:
acquiring a hand gesture data set in response to a change in the user's hand gesture, wherein the hand gesture data set comprises hand gesture data arranged in a preset order, the hand gesture data being used to characterize the hand gesture of the user at a target moment;
performing first-order difference processing on the hand gesture data in the hand gesture data set to obtain a first inter-frame difference set, wherein the first inter-frame difference set comprises the first inter-frame differences between the hand gesture data of any adjacent frames;
replacing each first inter-frame difference within a preset range with a second inter-frame difference to obtain a second inter-frame difference set, wherein the second inter-frame difference is an inter-frame difference adjacent to that first inter-frame difference;
and acquiring target hand gesture data according to the inter-frame differences in the second inter-frame difference set, so as to execute the control instruction corresponding to the target hand gesture data.
According to the virtual reality device and the human-machine interaction method provided by the present application, the camera detects the user's hand gesture data and a hand gesture data set is generated; first-order difference processing is performed on the hand gesture data in the set to obtain a first inter-frame difference set; each first inter-frame difference in the first set within the preset range is replaced with a second inter-frame difference to obtain a second inter-frame difference set; and target hand gesture data are obtained from the inter-frame differences in the second set, so that the control instruction corresponding to the target hand gesture data is executed. The error of the acquired hand gesture data is thereby reduced, the success rate of effective human-machine interaction is improved, and user experience is improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the related art, the drawings required for describing the embodiments or the related art are briefly introduced below. It is apparent that the drawings in the following description are only some embodiments of the present application, and that a person of ordinary skill in the art may obtain other drawings from these drawings without inventive effort.
FIG. 1 illustrates a display system architecture diagram including a virtual reality device, according to some embodiments;
FIG. 2 illustrates a VR scene global interface schematic in accordance with some embodiments;
FIG. 3 illustrates a recommended content region schematic diagram of a global interface, according to some embodiments;
FIG. 4 illustrates an application shortcut entry area schematic for a global interface in accordance with some embodiments;
FIG. 5 illustrates a schematic diagram of a hover element of a global interface, according to some embodiments;
FIG. 6 illustrates an interaction diagram of a user with a virtual number input unit in a global interface, according to some embodiments;
FIG. 7 illustrates an interaction diagram of a user with a virtual music keyboard in a global interface, according to some embodiments;
FIG. 8 illustrates a data processing flow diagram of a human-machine interaction method according to some embodiments;
FIG. 9 illustrates a flow diagram for batch processing hand gesture data, according to some embodiments;
FIG. 10 illustrates a Gaussian distribution diagram of a first set of interframe differences according to some embodiments;
FIG. 11 illustrates a before-and-after comparison of hand gesture data error elimination, according to some embodiments;
FIG. 12 illustrates a schematic flowchart of the human-machine interaction method provided by the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of exemplary embodiments of the present application more apparent, the technical solutions of exemplary embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the exemplary embodiments of the present application, and it is apparent that the described exemplary embodiments are only some embodiments of the present application, not all embodiments.
All other embodiments obtained by a person skilled in the art without inventive effort, based on the exemplary embodiments shown in the present application, fall within the scope of the present application. Furthermore, while the disclosure is presented in terms of one or more exemplary examples, it should be appreciated that each aspect of the disclosure can also individually constitute a complete technical solution.
It should be understood that the terms "first," "second," "third," and the like in the description, in the claims, and in the above figures are used to distinguish between similar objects and not necessarily to describe a particular sequence or chronological order. The data so used may be interchanged where appropriate, so that the embodiments of the application described herein can, for example, be implemented in orders other than those illustrated or described herein.
Furthermore, the terms "comprise" and "have," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a product or apparatus that comprises a list of elements is not necessarily limited to those elements expressly listed, but may include other elements not expressly listed or inherent to such product or apparatus.
The term "module" as used in this disclosure refers to any known or later developed hardware, software, firmware, artificial intelligence, fuzzy logic, or combination of hardware and/or software code that is capable of performing the function associated with that element.
Reference throughout this specification to "multiple embodiments," "some embodiments," "one embodiment," or "an embodiment," etc., means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. Thus, appearances of the phrases "in various embodiments," "in some embodiments," "in at least one other embodiment," or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. Thus, a particular feature, structure, or characteristic shown or described in connection with one embodiment may be combined, in whole or in part, with features, structures, or characteristics of one or more other embodiments without limitation. Such modifications and variations are intended to be included within the scope of the present application.
In the embodiments of the present application, the virtual reality device 500 generally refers to a display device that can be worn on the face of a user to provide an immersive experience, including, but not limited to, VR glasses, augmented reality (AR) devices, VR gaming devices, mobile computing devices, and other wearable computers. Some embodiments of the present application describe the technical solution taking VR glasses as an example, and it should be understood that the solution applies equally to other types of virtual reality devices. The virtual reality device 500 may operate independently, or may be connected to another smart display device as a peripheral, where the display device may be a smart television, a computer, a tablet computer, a server, or the like.
Worn on the user's face, the virtual reality device 500 may display a media asset picture, providing near-eye images for both eyes of the user to deliver an immersive experience. To present the asset picture, the virtual reality device 500 may include a plurality of components for displaying the picture and for wearing on the face. Taking VR glasses as an example, the virtual reality device 500 may include a housing, position fixtures, an optical system, a display assembly, a gesture detection circuit, an interface circuit, and other components. In practice, the optical system, display assembly, gesture detection circuit, and interface circuit may be disposed in the housing to present a specific display picture; the two sides of the housing connect to the position fixtures so that the device can be worn on the user's face.
The gesture detection circuit incorporates attitude detection elements such as a gravity acceleration sensor and a gyroscope. When the user's head moves or rotates, the circuit detects the user's posture and transmits the detected posture data to a processing element such as the controller, and the processing element adjusts the specific picture content in the display assembly according to the detected posture data.
As shown in fig. 1, in some embodiments, the virtual reality device 500 may be connected to the display device 200, and a network-based display system is constructed between the virtual reality device 500, the display device 200, and the server 400, and data interaction may be performed in real time, for example, the display device 200 may obtain media data from the server 400 and play the media data, and transmit specific screen content to the virtual reality device 500 for display.
The display device 200 may be a liquid crystal display, an OLED display, or a projection display device; the particular display device type, size, and resolution are not limited, and those skilled in the art will appreciate that the performance and configuration of the display device 200 may be modified as desired. The display device 200 may provide a broadcast-receiving television function and may additionally provide a smart network television function with computer support, including, but not limited to, network television, smart television, and Internet Protocol Television (IPTV).
The display device 200 and the virtual reality device 500 also communicate data with the server 400 through a variety of communication means. The display device 200 and the virtual reality device 500 may communicate via a local area network (LAN), a wireless local area network (WLAN), and other networks. The server 400 may provide various contents and interactions to the display device 200. By way of example, the display device 200 receives software program updates, or accesses a remotely stored digital media library and Electronic Program Guide (EPG) interactions, by sending and receiving information. The server 400 may be one cluster or a plurality of clusters, and may include one or more types of servers. The server 400 also provides other web service content such as video on demand and advertising services.
In the course of data interaction, the user may operate the display device 200 through the mobile terminal 300 and the remote controller 100. The mobile terminal 300 and the remote controller 100 may communicate with the display device 200 by a direct wireless connection or by a non-direct connection. That is, in some embodiments, the mobile terminal 300 and the remote controller 100 may communicate with the display device 200 through a direct connection manner of bluetooth, infrared, etc. When the control instruction is transmitted, the mobile terminal 300 and the remote controller 100 may directly transmit the control instruction data to the display device 200 through bluetooth or infrared.
In other embodiments, the mobile terminal 300 and the remote controller 100 may also access the same wireless network with the display device 200 through a wireless router to establish indirect connection communication with the display device 200 through the wireless network. When transmitting the control command, the mobile terminal 300 and the remote controller 100 may transmit the control command data to the wireless router first, and then forward the control command data to the display device 200 through the wireless router.
In some embodiments, the user may also use the mobile terminal 300 and the remote controller 100 to interact directly with the virtual reality device 500; for example, the mobile terminal 300 and the remote controller 100 may serve as handles in a virtual reality scene to implement functions such as somatosensory interaction.
In some embodiments, the display components of the virtual reality device 500 include a display screen and drive circuitry associated with the display screen. To present a specific picture and produce a stereoscopic effect, the display assembly may include two display screens, corresponding to the left and right eyes of the user respectively. When a 3D effect is presented, the picture contents displayed in the left and right screens differ slightly, for example by respectively displaying the pictures recorded by the left and right cameras during the shooting of a 3D film source. Because the left and right eyes of the user observe different picture content, a picture with a strong stereoscopic impression is perceived when the device is worn.
The optical system in the virtual reality device 500 is an optical module composed of a plurality of lenses. The optical system is arranged between the eyes of the user and the display screen, and the optical path can be increased through the refraction of the optical signals by the lens and the polarization effect of the polaroid on the lens, so that the content presented by the display component can be clearly presented in the visual field of the user. Meanwhile, in order to adapt to the vision condition of different users, the optical system also supports focusing, namely, the position of one or more of the lenses is adjusted through a focusing assembly, the mutual distance among the lenses is changed, and therefore the optical path is changed, and the picture definition is adjusted.
The interface circuit of the virtual reality device 500 may be used to transfer interaction data, and besides transferring gesture data and displaying content data, in practical application, the virtual reality device 500 may also be connected to other display devices or peripheral devices through the interface circuit, so as to implement more complex functions by performing data interaction with the connection device. For example, the virtual reality device 500 may be connected to a display device through an interface circuit, so that a displayed screen is output to the display device in real time for display. For another example, the virtual reality device 500 may also be connected to a handle via interface circuitry, which may be operated by a user in his hand, to perform related operations in the VR user interface.
Wherein the VR user interface can be presented as a plurality of different types of UI layouts depending on user operation. For example, the user interface may include a global interface, such as the global UI shown in fig. 2 after the AR/VR terminal is started, which may be displayed on a display screen of the AR/VR terminal or may be displayed on a display of the display device. The global UI may include a recommended content area 1, a business class extension area 2, an application shortcut entry area 3, and a hover area 4.
The recommended content area 1 is used to configure TAB columns of different classifications; media assets, topics, and the like can be selectively configured in the columns. The media assets may include services with media asset content such as 2D movies, educational courses, travel, 3D, 360-degree panoramas, live broadcasts, 4K movies, program applications, and games. The columns may select different template styles and may support simultaneous recommended programming of media assets and themes, as shown in FIG. 3.
In some embodiments, the content recommendation area 1 may also include a main interface and a sub-interface. As shown in fig. 3, the portion located in the center of the UI layout is a main interface, and the portions located at both sides of the main interface are sub-interfaces. The main interface and the auxiliary interface can be used for respectively displaying different recommended contents. For example, according to the recommended type of the sheet source, the service of the 3D sheet source may be displayed on the main interface; and the service of the 2D film source is displayed in the left side sub-interface, and the service of the panoramic film source is displayed in the right side sub-interface.
Obviously, the main interface and the sub-interfaces can display different service contents and be presented as different content layouts. Moreover, the user can switch between the main interface and the sub-interfaces through specific interaction actions, for example by controlling the focus mark to move left and right: when the focus mark is at the rightmost side of the main interface and continues to move right, the right sub-interface is displayed at the middle position of the UI layout. At this moment the main interface switches to displaying the panoramic film source service, the left sub-interface switches to displaying the 3D film source service, and the right sub-interface switches to displaying the 2D film source service.
In addition, to facilitate viewing, the main interface and the sub-interfaces can be displayed with different display effects. For example, the transparency of the sub-interfaces can be increased so that they obtain a blurring effect and the main interface is highlighted; or the sub-interfaces can be set to a grayscale effect while the main interface keeps its color effect, again highlighting the main interface.
In some embodiments, a status bar may further be provided at the top of the recommended content area 1, and a plurality of display controls may be provided in the status bar, including common options such as time, network connection status, and power. The content included in the status bar may be user-defined; for example, weather, a user avatar, and the like may be added. The content contained in the status bar may be selected by the user to perform the corresponding function. For example, when the user clicks the time option, the virtual reality device 500 may display a clock window in the current interface or jump to a calendar interface; when the user clicks the network connection status option, the virtual reality device 500 may display a WiFi list on the current interface or jump to the network setup interface.
The content displayed in the status bar may be presented in different content forms according to the setting status of a specific item. For example, the time control may be displayed directly as specific time text information and display different text at different times; the power control may be displayed as different pattern styles according to the current power remaining situation of the virtual reality device 500.
The status bar is used to enable the user to perform a common control operation, so as to implement quick setting of the virtual reality device 500. Since the setup procedure for the virtual reality device 500 includes a number of items, all of the commonly used setup options cannot generally be displayed in the status bar. To this end, in some embodiments, an expansion option may also be provided in the status bar. After the expansion option is selected, an expansion window may be presented in the current interface, and a plurality of setting options may be further provided in the expansion window for implementing other functions of the virtual reality device 500.
For example, in some embodiments, after the expansion option is selected, a "shortcut center" option may be set in the expansion window. After the user clicks the shortcut center option, the virtual reality device 500 may display a shortcut center window. The shortcut center window may include screen capture, screen recording, and screen casting options, used respectively to wake up the corresponding functions.
The service class extension area 2 supports configuring extension classes of different categories. If a new service class exists, an independent TAB can be configured to display the corresponding page content. The service classes in the service class extension area 2 can also be reordered, and services can be taken offline. In some embodiments, the service class extension area 2 may include the content: movie, education, travel, application, my. In some embodiments, the service class extension area 2 is configured to display the large service class TABs and supports configuring more classes, whose icons support configuration as shown in FIG. 3.
The application shortcut entry area 3 may display specified pre-installed applications (a plurality may be specified) in front for operational recommendation, and supports configuring special icon styles to replace the default icons. In some embodiments, the application shortcut entry area 3 further includes left-movement and right-movement controls for moving the option target, used to select different icons, as shown in FIG. 4.
The hover region 4 may be configured above the upper-left or upper-right side of the fixed region, may be configured as an alternate character, or may be configured as a jump link. For example, after receiving a confirmation operation, the hover element jumps to an application or displays a specified function page, as shown in FIG. 5. In some embodiments, the hover element may also be configured without a jump link, purely for visual presentation.
In some embodiments, the global UI further includes a status bar at the top for displaying time, network connection status, power status, and more shortcut entries. When an icon is selected using the handle of the AR/VR terminal, i.e., the handheld controller, the icon displays a text prompt with left-right expansion, and the selected icon is stretched and expanded left and right according to its position.
For example, after the search icon is selected, it expands to show the text "search" together with the original icon; further clicking the icon or the text jumps to the search page. As further examples, clicking the favorites icon jumps to the favorites TAB, clicking the history icon defaults to displaying the history page, clicking the search icon jumps to the global search page, and clicking the message icon jumps to the message page.
In some embodiments, the interaction may be performed through a peripheral device; for example, the handle of the AR/VR terminal may operate the user interface of the AR/VR terminal, and includes a back button; a home key, which performs the reset function when long-pressed; volume up and down buttons; and a touch area, which supports clicking, sliding, and press-and-drag of the focus.
In other embodiments, the virtual reality device 500 may be configured with an image capture device that may capture real scene images in different ways. For example, when the user uses the augmented reality function for the environment in which the user is located, the image pickup device is specifically a camera provided on the virtual reality apparatus 500. In the use process, the virtual reality device 500 can shoot the environment where the current user is located through the camera, so as to obtain a real scene picture corresponding to the current use scene. For another example, when the user uses the augmented reality function for an environment other than the own scene, the image capturing apparatus is specifically a video transmission apparatus to obtain a real scene picture video stream of the specified use environment.
The virtual reality device 500 may also invoke data related to virtual objects while acquiring the real scene picture. For example, the virtual reality device 500 may obtain a virtual object model from a virtual object library in local memory or on a cloud server. Obviously, depending on the AR application and the user's needs, the virtual object data invoked by the virtual reality device 500 may take different forms. For example, the virtual object data may include, but is not limited to, one or a combination of text, images, three-dimensional models, and video pictures.
When the user uses the augmented reality function in the environment where the user is located, i.e., when the image acquisition device is a camera, the camera can capture images of the user's hand to acquire the user's hand gesture, and interaction is performed according to that gesture. For example, the user may send a "screen capture" instruction to the virtual reality device 500 through a specific preset hand gesture: the virtual reality device 500 detects the user's hand gesture through the camera, and when the user makes the preset gesture, the device considers that the user has issued the "screen capture" instruction. For instance, it may be set that when the user is detected drawing a "V" shape, the user is determined to have input a "screen capture" instruction to the virtual reality device 500; or it may be set that when the user's hand gesture is detected changing from five spread fingers to a fist, the user is determined to have input a "screen capture" instruction to the virtual reality device 500.
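As a rough illustration of this gesture-to-instruction binding, the following Python sketch maps recognized gesture labels to device instructions. The label strings, the table, and the dispatch function are illustrative assumptions, not identifiers from the patent.

```python
# Hypothetical gesture-to-instruction table; the labels are assumptions
# made for illustration, not names defined in the patent.
GESTURE_COMMANDS = {
    "draw_v": "screen_capture",                # user draws a "V" shape
    "five_fingers_to_fist": "screen_capture",  # open hand closes to a fist
}

def dispatch(gesture_label):
    """Return the instruction bound to a recognized gesture, or None."""
    return GESTURE_COMMANDS.get(gesture_label)
```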
When the user performs interaction with the virtual reality device 500 through a specific hand gesture, the user may directly perform an operation on the virtual UI interface to control the virtual reality device 500 to perform a corresponding command. As shown in fig. 6, a number input unit 610 and a number display area 620 are displayed on the UI interface, the number input unit 610 includes a plurality of number keys, and a user can move his/her hand to the positions of the number keys and make an action of clicking the number keys so that the number display area 620 displays the corresponding numbers. Alternatively, as shown in fig. 7, a music keyboard 700 is displayed on the UI interface, where the music keyboard 700 includes a plurality of music keys, and a user may place a finger on a specific music key to perform a playing action, and trigger the virtual reality device 500 to play corresponding music.
Interaction through specific hand gestures therefore relies on the virtual reality device 500 recognizing the user's hand gesture and calculating the user's hand gesture data through a 3D gesture algorithm, so that the virtual reality device 500 can perform the corresponding operation.
However, due to the influence of the camera's imaging quality and the speed of the user's hand movement, the user's hand gesture may not be accurately identified, so that the calculated hand gesture data contain a large error. For example, the user has moved a hand to the position of a number key but has not yet performed the clicking action; because of the camera's shooting angle, the movement of the hand to the correct number key cannot be accurately captured, so the calculated hand gesture data correspond to the user performing a clicking action, or to the hand resting on another number key. The user's action is then misjudged as clicking that number key, producing an invalid human-machine interaction. As another example, when the user controls the hand model 630 on the UI through playing actions to play on the music keyboard 700, a playing action that is too fast easily causes the camera to record video with a "ghosting" effect, so that the user's hand gesture cannot be accurately identified and invalid human-machine interaction results.
To solve the above problem, reduce the error of the recognized hand gesture data, and improve the success rate of effective human-machine interaction, some embodiments of the present application provide a virtual reality device 500. The virtual reality device 500 may be a wearable device, a VR game device, an AR game device, or the like, and should include at least a display, a camera, a memory, and a controller. The display comprises a left display and a right display, respectively used to display virtual reality pictures so as to present picture content at different viewing angles to the user's left and right eyes, forming a 3D effect. The camera may be used to capture video of the user's gestures to detect changes in the user's gesture. The memory may be used to store application programs and the video captured by the camera. The controller is configured to execute the application programs stored in the memory to implement the interaction functions.
In some embodiments, as shown in FIG. 8, when the user controls the virtual reality device 500 by changing hand gestures, the virtual reality device 500 acquires a hand gesture data set in response to the change of the user's hand gesture, where the hand gesture data set includes hand gesture data arranged in a preset order and the hand gesture data characterize the user's hand gesture at a target moment. While the virtual reality device 500 is in use, the camera can record, in real time, video that includes the position of the user's hand, and the video is stored in the memory frame by frame in shooting order. When the camera captures a change in the user's hand gesture, the controller reads a first number of video frames stored in the memory, each video frame including an image of the user's hand, and the controller can acquire the user's hand gesture data from those hand images.
In some embodiments, the controller performs image recognition on the video frames stored in the memory and can read the user's hand gesture data from each video frame. The user's hand gesture data may include a preset number of hand joint points; for example, the hand gesture data may include 21 hand joint points identified in the video frame under the camera's reference coordinate system, namely, for each finger, three joint points plus one fingertip point (4 points per finger), and one key point at the wrist. The key point corresponding to each hand joint may have 3 degrees of freedom, so the output dimension is 21 x 3.
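A minimal sketch of this representation, assuming NumPy and the 21-key-point layout described above; the helper name and the validation are illustrative, not part of the patent.

```python
import numpy as np

N_KEYPOINTS = 21  # 5 fingers x (3 joints + 1 fingertip) + 1 wrist point
N_DOF = 3         # each key point has 3 degrees of freedom (x, y, z)

def make_pose(keypoints) -> np.ndarray:
    """Validate one frame of hand gesture data as a (21, 3) array
    expressed in the camera's reference coordinate system."""
    pose = np.asarray(keypoints, dtype=float)
    if pose.shape != (N_KEYPOINTS, N_DOF):
        raise ValueError("expected output dimension 21 x 3")
    return pose
```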
In some embodiments, the acquired hand gesture data of the user are arranged in the order of the corresponding video frames to generate a hand gesture data set. For example, seven video frames a1, a2, a3, a4, a5, a6, a7 arranged in recording order are acquired in sequence and expressed in set form as F = (a1, a2, a3, a4, a5, a6, a7), where F denotes the video frame set. Image recognition is performed on each video frame to acquire the user's hand gesture data in that frame, and the corresponding hand gesture data set F1 = (b1, b2, b3, b4, b5, b6, b7) is generated, where bn denotes the user's hand gesture data identified from video frame an, n being a positive integer; that is, b1, b2, b3, b4, b5, b6, b7 denote the user's hand gesture data identified from video frames a1, a2, a3, a4, a5, a6, a7, respectively.
In some embodiments, first-order difference processing may be performed on each piece of hand gesture data in the hand gesture data set to obtain a first inter-frame difference set, where the first inter-frame difference set includes the first inter-frame differences between the hand gesture data of any adjacent frames. The similarity between the hand gesture data identified from any two adjacent video frames can be obtained from the first inter-frame difference, so as to determine from that similarity whether abnormal data exist.
In some embodiments, the difference between any piece of hand gesture data and the next hand gesture data of its adjacent frame may be determined as the first inter-frame difference of that hand gesture data; the first inter-frame difference of each piece of hand gesture data in the set is acquired to obtain the first inter-frame difference set. For example, performing difference processing on each piece of hand gesture data in the hand gesture data set F1 = (b1, b2, b3, b4, b5, b6, b7) yields the first inter-frame difference set F2 = (c1, c2, c3, c4, c5, c6), where cn = b(n+1) - bn and cn denotes the first inter-frame difference of the user's hand gesture data bn, n being a positive integer; that is, c1 = b2 - b1 is the first inter-frame difference of b1, c2 = b3 - b2 is the first inter-frame difference of b2, and c3, c4, c5, c6 are the first inter-frame differences of b3, b4, b5, b6, respectively, calculated in the same way. Since c7 = b8 - b7 would require b8, which does not exist, the first inter-frame difference of b7 can be ignored; that is, the number of items in the first inter-frame difference set F2 is one less than the number of items in the hand gesture data set F1.
In some embodiments, when each piece of hand gesture data in the set is differenced by taking the difference between any hand gesture data and the next hand gesture data of its adjacent frame as its first inter-frame difference, the last hand gesture data in the set has no next adjacent hand gesture data; the difference between the last hand gesture data and its preceding adjacent hand gesture data may therefore be determined as the first inter-frame difference of the last hand gesture data. For example, performing difference processing on each piece of hand gesture data in F1 = (b1, b2, b3, b4, b5, b6, b7) yields the first inter-frame difference set F2 = (c1, c2, c3, c4, c5, c6, c7), where c7 = b7 - b6 is the first inter-frame difference of b7, and c1, c2, c3, c4, c5, c6 are the first inter-frame differences of b1, b2, b3, b4, b5, b6, respectively, with cn = b(n+1) - bn for n = 1, 2, 3, 4, 5, 6. The number of items in F2 is then the same as the number of items in F1.
In some embodiments, when performing difference processing on each piece of hand gesture data in the set, the difference between any hand gesture data and the previous hand gesture data of its adjacent frame may be determined as the first inter-frame difference of that hand gesture data; the first inter-frame difference of each piece of hand gesture data is acquired to obtain the first inter-frame difference set. For example, performing difference processing on the hand gesture data in F1 = (b1, b2, b3, b4, b5, b6, b7) in this way yields the first inter-frame difference set F2 = (c2, c3, c4, c5, c6, c7), where cn = bn - b(n-1) and cn denotes the first inter-frame difference of the user's hand gesture data bn, n being a positive integer; that is, c2 = b2 - b1 is the first inter-frame difference of b2, c3 = b3 - b2 is the first inter-frame difference of b3, and c4, c5, c6, c7 are the first inter-frame differences of b4, b5, b6, b7, respectively, calculated in the same way. Note that since b1 is the first item in the hand gesture data set and has no preceding adjacent hand gesture data, the first inter-frame difference of b1 can simply be ignored; that is, the number of items in F2 is one less than the number of items in F1.
In some embodiments, when the difference between any hand gesture data and the previous hand gesture data of its adjacent frame is taken as its first inter-frame difference, the leading hand gesture data in the set has no preceding adjacent hand gesture data; the difference between the leading hand gesture data and its next adjacent hand gesture data may therefore be determined as the first inter-frame difference of the leading hand gesture data. For example, performing difference processing on each piece of hand gesture data in F1 = (b1, b2, b3, b4, b5, b6, b7) yields the first inter-frame difference set F2 = (c1, c2, c3, c4, c5, c6, c7), where c1 = b2 - b1 is the first inter-frame difference of b1, and c2, c3, c4, c5, c6, c7 are the first inter-frame differences of b2, b3, b4, b5, b6, b7, respectively, with cn = bn - b(n-1) for n = 2, 3, 4, 5, 6, 7. The number of items in F2 is then the same as the number of items in F1.
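The four difference variants above can be sketched in NumPy as follows, assuming each pose bn is an array; the padded variants simply reuse the difference at the opposite end for the frame that has no neighbour. This is a sketch under those assumptions, not the patent's implementation.

```python
import numpy as np

def forward_differences(poses, pad_last=False):
    """cn = b(n+1) - bn. The last frame has no successor, so it is either
    ignored or, with pad_last=True, assigned the backward difference
    c_last = b_last - b_(last-1), as in the second embodiment above."""
    diffs = [poses[i + 1] - poses[i] for i in range(len(poses) - 1)]
    if pad_last:
        diffs.append(poses[-1] - poses[-2])
    return diffs

def backward_differences(poses, pad_first=False):
    """cn = bn - b(n-1). The first frame has no predecessor, so it is
    either ignored or, with pad_first=True, assigned the forward
    difference c1 = b2 - b1, as in the fourth embodiment above."""
    diffs = [poses[i] - poses[i - 1] for i in range(1, len(poses))]
    if pad_first:
        diffs.insert(0, poses[1] - poses[0])
    return diffs
```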
In some embodiments, if the user's hand gesture changes over a longer duration, the number of relevant video frames that record the change increases, and the number of items in the video frame set increases accordingly. Picture recognition is performed on each video frame in the set to acquire the user's hand gesture data in each frame and generate the hand gesture data set, which includes the hand gesture data identified from each video frame arranged in frame order. Since the number of items in the hand gesture data set is also greatly increased, in order to further improve the accuracy of the acquired target hand gesture data, as shown in FIG. 9, the hand gesture data in the set are processed in batches (see the sketch after the next paragraph). After each pass, it is determined whether the hand gesture data set still contains unselected hand gesture data; if so, a preset number of hand gesture data are selected again, the data selected this time being consecutive hand gesture data in the set and including at least one piece of hand gesture data that has not been selected before. Through this batch processing, the inter-frame difference of each piece of hand gesture data in the set can be acquired to update the first inter-frame difference set.
In some embodiments, with the preset number denoted X, when difference processing is performed on the hand gesture data in the set, the 1st to X-th items may be selected first and the first inter-frame differences of the 1st to (X-1)-th hand gesture data calculated, giving a first inter-frame difference set with (X-1) first inter-frame differences. It is then judged whether unselected data exist in the hand gesture data set; if so, the 2nd to (X+1)-th hand gesture data are selected, the first inter-frame differences of the 2nd to X-th hand gesture data are calculated, and the first inter-frame difference set is updated: the 2nd to (X-1)-th entries are overwritten with the newly calculated first inter-frame differences of the 2nd to (X-1)-th hand gesture data, and the newly calculated first inter-frame difference of the X-th hand gesture data is appended at the end, i.e., the original first inter-frame difference set F2 = (c1, c2, c3, c4, ..., c(X-1)) becomes the updated F2 = (c1, c2, c3, c4, ..., c(X-1), cX). It is again judged whether unselected data exist, and the hand gesture data continue to be processed in this way until the last window of the set is selected, the set containing n hand gesture data in total: the first inter-frame differences of the (n-X+1)-th to (n-1)-th hand gesture data are calculated, the overlapping entries of the first inter-frame difference set are overwritten, and the first inter-frame difference of the (n-1)-th hand gesture data is appended at the end, i.e., the previously calculated F2 = (c1, c2, c3, c4, ..., c(n-2)) becomes the updated F2 = (c1, c2, c3, c4, ..., c(n-2), c(n-1)). At this point all data in the hand gesture data set have been selected and first inter-frame differences have been calculated for the first (n-1) items; since the set contains only n hand gesture data, no first inter-frame difference can be calculated for the n-th item, which can therefore be ignored.
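A sketch of this sliding-window batch update, assuming windows of X consecutive poses that advance one frame at a time; the function and variable names are illustrative assumptions.

```python
def batched_first_differences(poses, window):
    """Update the first inter-frame difference set batch by batch, as in
    FIG. 9: each pass covers `window` consecutive poses, overwrites the
    entries that overlap the previous pass, and appends the one newly
    available difference."""
    diffs = []
    for start in range(len(poses) - window + 1):
        batch = poses[start:start + window]
        batch_diffs = [batch[i + 1] - batch[i] for i in range(window - 1)]
        diffs[start:start + window - 1] = batch_diffs
    return diffs  # n poses yield n - 1 first inter-frame differences
```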
In some embodiments, the first inter-frame differences in the first inter-frame difference set may be analyzed to determine whether abnormal data exist, and the abnormal data are replaced with a second inter-frame difference, the second inter-frame difference being a first inter-frame difference adjacent to the abnormal data. For example, if c1 in the first inter-frame difference set F2 = (c1, c2, c3, c4, c5, c6) is abnormal data, c1 may be replaced with the adjacent c2, and the replaced set is determined as the second inter-frame difference set F3 = (c2, c2, c3, c4, c5, c6). As another example, if c2 and c4 in F2 are abnormal data, c2 may be replaced with the adjacent c3 and c4 with the adjacent c5, giving the second inter-frame difference set F3 = (c1, c3, c3, c5, c5, c6). And if c2 and c3 in F2 are abnormal data, c2 may be replaced with the adjacent c1 and c3 with the adjacent c4, giving F3 = (c1, c1, c4, c4, c5, c6).
In some embodiments, abnormality thresholds may be preset, including a first preset value and a second preset value, which may be obtained from the mean μ and variance σ² of the first inter-frame difference set: the first preset value is the sum μ + σ², and the second preset value is the difference μ - σ². If the difference between a target first inter-frame difference and both of its adjacent first inter-frame differences in the set is greater than the first preset value or less than the second preset value, the target first inter-frame difference is judged to be abnormal data and is replaced with either of its adjacent first inter-frame differences; if the difference between the target first inter-frame difference and only one of its adjacent first inter-frame differences is greater than the first preset value or less than the second preset value, it is further judged whether that adjacent first inter-frame difference is itself abnormal data. For example, for the first inter-frame difference set F2 = (c1, c2, c3, c4, c5, c6), if c3 > μ + σ² or c3 < μ - σ², c3 is judged to be abnormal data and is replaced with c2 or c4, giving the second inter-frame difference set F3 = (c1, c2, c2, c4, c5, c6) or F3 = (c1, c2, c4, c4, c5, c6).
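A simplified sketch of this abnormality test, following the example above in which a difference is flagged when its magnitude exceeds μ + σ² or falls below μ - σ². Scalar magnitudes stand in for the per-joint comparison, an assumption made for brevity; this is not the patent's actual implementation.

```python
import numpy as np

def replace_abnormal(diffs):
    """Replace abnormal first inter-frame differences with an adjacent one.
    The thresholds are the first preset value mu + var and the second
    preset value mu - var, derived from the set itself."""
    mags = np.array([float(np.linalg.norm(d)) for d in diffs])
    mu, var = mags.mean(), mags.var()
    hi, lo = mu + var, mu - var
    out = list(diffs)
    for i, m in enumerate(mags):
        if m > hi or m < lo:               # abnormal data, e.g. c3 > mu + var
            j = i - 1 if i > 0 else i + 1  # an adjacent first difference
            out[i] = diffs[j]
    return out
```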
Referring to FIG. 10, a Gaussian distribution diagram of a first inter-frame difference set is provided for an exemplary embodiment of the present application. As shown in FIG. 10, the abscissas corresponding to the two inflection points in the Gaussian distribution diagram are the first preset value μ + σ² and the second preset value μ - σ². A proportion of 68.26% of the data in the first inter-frame difference set lies between the first preset value and the second preset value; only these 68.26% of the first inter-frame differences are normal, the remainder are abnormal data, and the abnormal data need to be replaced with second inter-frame differences to reduce the error.
In some embodiments, the hand gesture data corresponding to the inter-frame differences in the second inter-frame difference set may be determined as fourth hand gesture data, and the user's hand gesture is obtained from the fourth hand gesture data, thereby eliminating the error in the hand gesture data initially identified from the video frames; FIG. 11 shows a before-and-after comparison of this error elimination. For example, from the hand gesture data set F1 = (b1, b2, b3, b4, b5, b6, b7), the first inter-frame difference set F2 = (c1, c2, c3, c4, c5, c6) is calculated, and the abnormal data in F2 are replaced to obtain the second inter-frame difference set F3 = (c1, c3, c3, c5, c5, c6), in which the second and fourth entries have been replaced. The hand gesture data set corresponding to the inter-frame differences in F3 is therefore F4 = (b1, b3, b3, b5, b5, b6), and b1, b3, b3, b5, b5, b6 in F4 are the fourth hand gesture data. From the fourth hand gesture data b1, b3, b3, b5, b5, b6, the user's hand gesture can be obtained and the control instruction corresponding to that gesture executed.
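The mapping from the second inter-frame difference set back to the fourth hand gesture data can be sketched as follows, assuming a forward difference (cn pairs with bn) and distinct difference values so that replaced entries can be traced by value; real index bookkeeping would be used in practice. A sketch under those assumptions only.

```python
import numpy as np

def fourth_pose_data(poses, first_diffs, second_diffs):
    """Return F4: for each entry of the second inter-frame difference set,
    the pose whose first inter-frame difference it originally was.
    E.g. F3 = (c1, c3, c3, c5, c5, c6) yields F4 = (b1, b3, b3, b5, b5, b6)."""
    fourth = []
    for d in second_diffs:
        # trace the entry back to its original position in F2
        src = next(i for i, d0 in enumerate(first_diffs)
                   if np.array_equal(d0, d))
        fourth.append(poses[src])
    return fourth
```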
In some embodiments, in order to further improve the accuracy of the fourth hand gesture data, weighting coefficients applied to the fourth hand gesture data may be preset, and the fourth hand gesture data are weighted and summed to obtain the target hand gesture data, from which the user's hand gesture is obtained. For example, the fourth hand gesture data may be weighted and summed by the following formula:
b = α1·b̂1 + α2·b̂2 + ... + αn·b̂n

wherein n is a positive integer, b represents the target hand gesture data, b̂1, b̂2, ..., b̂n represent the fourth hand gesture data, and α1, α2, ..., αn are the weighting coefficients of the respective fourth hand gesture data, satisfying α1 + α2 + ... + αn = 1.
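A sketch of this weighted summation; the equal-weight default is an illustrative assumption, since the patent only requires the coefficients to sum to 1.

```python
import numpy as np

def target_pose(fourth_poses, weights=None):
    """b = alpha_1*b1 + ... + alpha_n*bn, with alpha_1 + ... + alpha_n = 1."""
    n = len(fourth_poses)
    if weights is None:
        weights = [1.0 / n] * n  # assumed default, not mandated by the patent
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weighting coefficients must sum to 1")
    return sum(w * np.asarray(p) for w, p in zip(weights, fourth_poses))
```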
In some embodiments, when no abnormal data exist in the first inter-frame difference set, that is, when the value of every first inter-frame difference in the set lies between the first preset value and the second preset value, the previously acquired hand gesture data set is directly determined to be the target hand gesture data set. Weighted summation is performed on each hand gesture data in the target hand gesture data set according to preset proportions, and the result is the target hand gesture data; the hand gesture of the user can then be obtained according to the target hand gesture data, and the control instruction corresponding to the hand gesture is executed.
The application further provides a man-machine interaction method. Referring to fig. 12, which is a flowchart of the man-machine interaction method provided by an embodiment of the application, the method comprises the following steps:
s1: responding to the change of the hand gesture of the user, acquiring a hand gesture data set, wherein the hand gesture data set comprises hand gesture data arranged according to a preset sequence, and the hand gesture data is used for representing the hand gesture of the user at a target moment;
in some embodiments, the acquiring the hand gesture data set in response to a change in the hand gesture of the user further comprises:
responding to the change of the hand gesture of the user, and acquiring a plurality of video frames comprising hand images of the user; performing image recognition on the video frame to obtain hand gesture data of a user in the video frame; and generating a hand gesture data set according to the hand gesture data, wherein the hand gesture data set comprises the hand gesture data arranged according to the sequence of the video frames.
S2: performing first-order difference processing on the hand gesture data in the hand gesture data set to obtain a first inter-frame difference set, wherein the first inter-frame difference set comprises first inter-frame differences between hand gesture data of any adjacent frames;
In some embodiments, the performing first-order difference processing on the hand gesture data in the hand gesture data set to obtain a first inter-frame difference set further includes: determining the difference between the hand gesture data and first hand gesture data as the inter-frame difference of the hand gesture data, wherein the first hand gesture data is the hand gesture data of a frame adjacent to that hand gesture data; and obtaining a first inter-frame difference set according to the inter-frame difference of each hand gesture data.
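The first-order difference itself reduces to one line; the sketch below assumes scalar hand gesture data for readability (real hand gesture data would be vectors of joint coordinates, differenced component-wise):

```python
def first_order_diff(pose_set):
    """F1 = (b1, ..., bn) -> first inter-frame differences (b2-b1, ..., bn-b(n-1))."""
    return [pose_set[i + 1] - pose_set[i] for i in range(len(pose_set) - 1)]

print(first_order_diff([1.0, 1.1, 1.2, 3.3, 3.4]))  # approx. [0.1, 0.1, 2.1, 0.1]
```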
In some embodiments, the obtaining a first set of inter-frame differences according to the inter-frame differences of each hand gesture data further includes: selecting a preset number of second hand gesture data, wherein the second hand gesture data are continuous hand gesture data in the hand gesture data set; and obtaining a first inter-frame difference set according to the inter-frame differences of the second hand gesture data, wherein the first inter-frame difference set comprises the inter-frame differences of the second hand gesture data arranged according to the sequence of the second hand gesture data.
S3: replacing the first inter-frame difference within a preset range with a second inter-frame difference to obtain a second inter-frame difference set, wherein the second inter-frame difference is an inter-frame difference of an adjacent frame of the first inter-frame difference.
In some embodiments, the method further comprises: judging whether unselected hand gesture data exist in the hand gesture data set; if yes, selecting a preset number of third hand gesture data, wherein the third hand gesture data are continuous hand gesture data in the hand gesture data set and comprise at least one hand gesture data that has not been selected as the second hand gesture data; and updating the first inter-frame difference set according to the inter-frame differences of the third hand gesture data.
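A self-contained sketch of this windowed variant is given below; the window size N = 4 is an arbitrary illustrative choice, and advancing the window one step at a time guarantees that each new window takes in at least one not-yet-selected datum (the third hand gesture data):

```python
def window_diffs(pose_set, start, N):
    window = pose_set[start:start + N]            # consecutive pose samples
    return [window[i + 1] - window[i] for i in range(len(window) - 1)]

pose_set = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5]
N = 4
start = 0
diff_set = window_diffs(pose_set, start, N)       # initial first inter-frame set
while start + N < len(pose_set):                  # unselected data remain
    start += 1                                    # take in one new pose sample
    diff_set = window_diffs(pose_set, start, N)   # updated first inter-frame set
```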
In some embodiments, the method further comprises: according to the mean value and the variance of the first inter-frame difference in the first inter-frame difference set, a first preset value and a second preset value are obtained, wherein the first preset value is the sum of the mean value and the variance, the second preset value is the difference of the mean value and the variance, and the first preset value and the second preset value are used for determining the preset range.
In some embodiments, the replacing the first inter-frame difference within the preset range with the second inter-frame difference to obtain a second inter-frame difference set further includes: when a first inter-frame difference in a preset range exists, replacing the first inter-frame difference in the first inter-frame difference set in the preset range with the second inter-frame difference, wherein the first inter-frame difference is an inter-frame difference larger than the first preset value or smaller than the second preset value; and determining the first inter-frame difference set after the replacement operation as a second inter-frame difference set.
In some embodiments, the acquiring target hand gesture data according to the inter-frame differences in the second inter-frame difference set further includes: acquiring fourth hand gesture data, the fourth hand gesture data being the hand gesture data corresponding to the inter-frame differences in the second inter-frame difference set; and performing weighted summation on the fourth hand gesture data according to preset weighting coefficients acting on each fourth hand gesture data to obtain the target hand gesture data.
In some embodiments, the method further includes: when no first inter-frame difference within the preset range exists, executing a control instruction corresponding to the hand gesture data set.
S4: acquiring target hand gesture data according to the inter-frame differences in the second inter-frame difference set, so as to execute a control instruction corresponding to the target hand gesture data.
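Stringing S1 through S4 together gives the compact sketch below, again under the scalar-data assumption used earlier; equal weighting coefficients are chosen purely for illustration, and the final lookup from target hand gesture data to a concrete control instruction is device-specific and therefore omitted:

```python
def target_pose(poses, coeffs):
    raw = [poses[i + 1] - poses[i] for i in range(len(poses) - 1)]  # S2
    mu = sum(raw) / len(raw)
    var = sum((d - mu) ** 2 for d in raw) / len(raw)
    src = list(range(len(raw)))
    for i, d in enumerate(raw):                                     # S3
        if d > mu + var or d < mu - var:
            src[i] = i + 1 if i + 1 < len(raw) else i - 1
    fourth = [poses[j] for j in src]
    return sum(a * b for a, b in zip(coeffs, fourth))               # S4

poses = [1.0, 1.1, 1.2, 3.3, 3.4, 3.5, 3.6]                         # S1
print(target_pose(poses, [1 / 6] * 6))                              # approx. 2.6
```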
In summary, according to the virtual reality device and the human-computer interaction method provided by the application, the camera detects the hand gesture data of the user and a hand gesture data set is generated. First-order difference processing is performed on each hand gesture data in the set to obtain a first inter-frame difference set, the first inter-frame differences within the preset range are replaced with second inter-frame differences to obtain a second inter-frame difference set, and target hand gesture data are obtained according to the inter-frame differences in the second inter-frame difference set so as to execute the control instruction corresponding to the target hand gesture data. In this way, the error of the acquired hand gesture data is reduced, the success rate of effective human-computer interaction is improved, and the user experience is enhanced.
In a specific implementation, the present invention further provides a computer storage medium, where the computer storage medium may store a program, and the program, when executed, may include some or all of the steps in each embodiment of the virtual reality device and the man-machine interaction method provided by the present invention. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
It will be apparent to those skilled in the art that the techniques of the embodiments of the present invention may be implemented by software plus a necessary general-purpose hardware platform. Based on this understanding, the technical solutions in the embodiments of the present invention, or the portions thereof that contribute to the prior art, may be embodied in the form of a software product. The software product may be stored in a storage medium such as a ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments, or in portions of the embodiments, of the present invention.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced equivalently; such modifications and substitutions do not depart from the spirit of the application.
The foregoing description, for purposes of explanation, has been presented in conjunction with specific embodiments. However, the illustrative discussions above are not intended to be exhaustive or to limit the embodiments to the precise forms disclosed above. Many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles and the practical application, to thereby enable others skilled in the art to best utilize the embodiments and various embodiments with various modifications as are suited to the particular use contemplated.
Claims (10)
1. A virtual reality device, comprising:
a display;
a camera configured to acquire hand gesture data of a user in real time;
a controller configured to:
responding to the change of the hand gesture of the user, acquiring a hand gesture data set, wherein the hand gesture data set comprises hand gesture data arranged according to a preset sequence, and the hand gesture data is used for representing the hand gesture of the user at a target moment;
performing first-order difference processing on the hand gesture data in the hand gesture data set to obtain a first inter-frame difference set, wherein the first inter-frame difference set comprises first inter-frame differences between hand gesture data of any adjacent frames;
replacing a first inter-frame difference within a preset range with a second inter-frame difference to obtain a second inter-frame difference set, wherein the second inter-frame difference is one inter-frame difference of adjacent frames of the first inter-frame difference;
and acquiring target hand gesture data according to the inter-frame differences in the second inter-frame difference set so as to execute a control instruction corresponding to the target hand gesture data.
2. The virtual reality device of claim 1, wherein, in the step of acquiring a hand gesture data set in response to a change in the hand gesture of the user, the controller is further configured to:
responding to the change of the hand gesture of the user, acquiring a plurality of video frames comprising hand images of the user;
performing image recognition on the video frame to acquire hand gesture data of a user in the video frame;
and generating a hand gesture data set according to the hand gesture data, wherein the hand gesture data set comprises the hand gesture data arranged according to the sequence of the video frames.
3. The virtual reality device of claim 2, wherein, in the step of performing first-order difference processing on the hand gesture data in the hand gesture data set to obtain a first inter-frame difference set, the controller is further configured to:
determining a difference between the hand gesture data and first hand gesture data as a first inter-frame difference of the hand gesture data, wherein the first hand gesture data is the hand gesture data of a frame adjacent to the hand gesture data;
and obtaining a first inter-frame difference set according to the inter-frame difference of each hand gesture data.
4. The virtual reality device of claim 3, wherein, in the step of obtaining a first inter-frame difference set according to the inter-frame differences of each hand gesture data, the controller is further configured to:
selecting a preset number of second hand gesture data, wherein the second hand gesture data are continuous hand gesture data in the hand gesture data set;
and obtaining a first inter-frame difference set according to the inter-frame differences of the second hand gesture data, wherein the first inter-frame difference set comprises the inter-frame differences of the second hand gesture data arranged according to the sequence of the second hand gesture data.
5. The virtual reality device of claim 4, wherein the controller is further configured to:
judging whether unselected hand gesture data exist in the hand gesture data set;
if yes, selecting a preset number of third hand gesture data, wherein the third hand gesture data are continuous hand gesture data in the hand gesture data set, and the third hand gesture data comprise at least one hand gesture data which is not selected as the second hand gesture data;
and updating the first inter-frame difference set according to the inter-frame difference of the third hand gesture data.
6. The virtual reality device of claim 5, wherein the controller is further configured to:
according to the mean value and the variance of the first inter-frame difference in the first inter-frame difference set, obtaining a first preset value and a second preset value, wherein the first preset value is the sum of the mean value and the variance, the second preset value is the difference of the mean value and the variance, and the first preset value and the second preset value are used for determining the preset range.
7. The virtual reality device of claim 6, wherein, in the step of replacing the first inter-frame difference within the preset range with the second inter-frame difference to obtain a second inter-frame difference set, the controller is further configured to:
when a first inter-frame difference in a preset range exists, replacing the first inter-frame difference in the first inter-frame difference set in the preset range with the second inter-frame difference, wherein the first inter-frame difference is an inter-frame difference larger than the first preset value or smaller than the second preset value;
and determining the first inter-frame difference set after the replacement operation as a second inter-frame difference set.
8. The virtual reality device of claim 7, wherein, in the step of acquiring target hand gesture data according to the inter-frame differences in the second inter-frame difference set, the controller is further configured to:
acquiring fourth hand gesture data, wherein the fourth hand gesture data is the hand gesture data corresponding to the inter-frame differences in the second inter-frame difference set;
and carrying out weighted summation processing on the fourth hand gesture data according to a preset weighting coefficient acting on each fourth hand gesture data to obtain target hand gesture data.
9. The virtual reality device of claim 7, wherein the controller is further configured to:
and executing a control instruction corresponding to the hand gesture data set when the first inter-frame difference within the preset range does not exist.
10. A human-computer interaction method, comprising:
responding to the change of the hand gesture of the user, acquiring a hand gesture data set, wherein the hand gesture data set comprises hand gesture data arranged according to a preset sequence, and the hand gesture data is used for representing the hand gesture of the user at a target moment;
performing first-order difference processing on the hand gesture data in the hand gesture data set to obtain a first inter-frame difference set, wherein the first inter-frame difference set comprises first inter-frame differences between hand gesture data of any adjacent frames;
replacing a first inter-frame difference within a preset range with a second inter-frame difference to obtain a second inter-frame difference set, wherein the second inter-frame difference is one inter-frame difference of adjacent frames of the first inter-frame difference;
and acquiring target hand gesture data according to the inter-frame differences in the second inter-frame difference set so as to execute a control instruction corresponding to the target hand gesture data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210321874.1A CN116931713A (en) | 2022-03-29 | 2022-03-29 | Virtual reality equipment and man-machine interaction method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210321874.1A CN116931713A (en) | 2022-03-29 | 2022-03-29 | Virtual reality equipment and man-machine interaction method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116931713A (en) | 2023-10-24
Family
ID=88375832
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210321874.1A (pending, published as CN116931713A) | Virtual reality equipment and man-machine interaction method | 2022-03-29 | 2022-03-29
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116931713A (en) |
- 2022-03-29: CN application CN202210321874.1A filed (published as CN116931713A); legal status: pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |