CN114332423A - Virtual reality handle tracking method, terminal and computer-readable storage medium - Google Patents

Virtual reality handle tracking method, terminal and computer-readable storage medium

Info

Publication number
CN114332423A
Authority
CN
China
Prior art keywords
image
feature point
handle
feature
coordinate information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111654472.5A
Other languages
Chinese (zh)
Inventor
张毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Skyworth New World Technology Co ltd
Original Assignee
Shenzhen Skyworth New World Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Skyworth New World Technology Co ltd filed Critical Shenzhen Skyworth New World Technology Co ltd
Priority to CN202111654472.5A priority Critical patent/CN114332423A/en
Publication of CN114332423A publication Critical patent/CN114332423A/en
Pending legal-status Critical Current

Landscapes

  • User Interface Of Digital Computer (AREA)

Abstract

The invention discloses a virtual reality handle tracking method, a terminal and a computer-readable storage medium. The method comprises the following steps: acquiring multiple frames of a first image through a first camera device located at the head display end; establishing an SLAM map according to the multi-frame first image; extracting first feature points from the first image; acquiring the space coordinate information of the first feature points in the SLAM map; acquiring a second image through a second camera device located at the handle end; extracting second feature points from the second image and acquiring their space coordinate information in the SLAM map; acquiring the second feature plane coordinate information of the second feature points in the second image; and calculating the current handle pose information of the handle end by a PNP algorithm by combining the second feature plane coordinate information and the space coordinate information corresponding to the second feature points. The invention can effectively track and position the handle so as to optimize the virtual reality experience of the user.

Description

Virtual reality handle tracking method, terminal and computer-readable storage medium
Technical Field
The invention relates to the field of virtual reality, and in particular to a virtual reality handle tracking method, a terminal and a computer-readable storage medium.
Background
Virtual Reality (VR) is an interactive, computer-simulated environment that senses the state and behavior of a user and replaces or augments sensory feedback to one or more sensory systems, thereby giving the user a feeling of immersion in the simulated virtual environment. Virtual reality technology is characterized by high immersion: when a user is in a virtual environment, it feels as if the user were personally on the scene, and when the user's viewing angle changes, the virtual environment changes accordingly.
With the development of virtual reality technology, the function of the handle becomes more and more important. After a user wears a VR head display (head-mounted display device), interaction with the virtual reality scene can be achieved through the handle. Currently, the handle is generally tracked by electromagnetic tracking or ultrasonic tracking.
In electromagnetic tracking, an electromagnetic emitter is embedded in the handle and an electromagnetic receiver is embedded in the VR head display, and the position and attitude information of the handle in three-dimensional space is resolved in real time using the electromagnetic tracking principle. In ultrasonic tracking, an ultrasonic transmitter is embedded in the handle and an ultrasonic receiver is embedded in the VR head display, and the position and attitude information of the handle in three-dimensional space is resolved in real time using the ultrasonic tracking principle.
However, the electromagnetic sensor of the handle is sensitive to electromagnetic signals in the environment and is easily interfered with by complex environmental electromagnetic signals, causing it to produce erroneous electromagnetic tracking data for the handle. For example, when the electromagnetic sensor of the handle is close to the host computer, or in an environment close to a sound box, a television, a refrigerator, or the like, the handle is affected by other electromagnetic signals and its tracking performance deteriorates, so that the virtual reality experience of the user is poor. Handles using electromagnetic sensors are therefore subject to large limitations in use, and handles using ultrasonic sensors are similarly limited.
Disclosure of Invention
The main aim of the invention is to provide a virtual reality handle tracking method, a terminal and a computer-readable storage medium, with the purpose of effectively tracking and positioning the handle so as to optimize the virtual reality experience of the user.
In order to achieve the above object, the present invention provides a virtual reality handle tracking method, which comprises the following steps:
acquiring a plurality of frames of first images through at least one first camera device positioned at a head display end;
establishing an SLAM map according to the multi-frame first image; extracting a first feature point in the first image; acquiring space coordinate information of a first feature point in an SLAM map;
acquiring a second image through at least one second camera device positioned at the handle end;
extracting a second feature point in the second image; matching the second characteristic point with the first characteristic point to acquire space coordinate information of the second characteristic point in the SLAM map; acquiring second feature plane coordinate information of a second feature point in a second image;
and calculating the current handle pose information of the handle end by a PNP algorithm by combining the second characteristic plane coordinate information and the space coordinate information corresponding to the second characteristic point.
Optionally, in the step of matching the second feature point with the first feature point, the method includes the following steps:
generating a corresponding first feature descriptor from the first feature point;
generating a corresponding second feature descriptor from the second feature point;
the second feature descriptor is matched with the first feature descriptor.
Optionally, in the step of acquiring the spatial coordinate information of the first feature point in the SLAM map, the method includes the following steps:
performing BA optimization processing on first feature points in the multi-frame first image;
and acquiring the spatial coordinate information of the first feature point subjected to BA optimization processing in the SLAM map.
Optionally, after the step of acquiring the spatial coordinate information of the first feature point in the SLAM map, the method further includes the following steps:
acquiring first feature plane coordinate information of a first feature point in a first image;
and calculating the current head display position and attitude information of the head display end by a PNP algorithm by combining the first characteristic plane coordinate information and the space coordinate information of the first characteristic point in the SLAM map.
Optionally, extracting a second feature point in the second image; the step of matching the second feature point with the first feature point to obtain the spatial coordinate information of the second feature point in the SLAM map includes the following steps:
performing BA optimization processing on second feature points in the multi-frame second image;
and matching the second characteristic point subjected to BA optimization processing with the first characteristic point to acquire the space coordinate information of the second characteristic point in the SLAM map.
Optionally, after the step of calculating the current handle pose information of the handle end by using the PNP algorithm in combination with the second feature plane coordinate information and the space coordinate information corresponding to the second feature point, the method further includes the following steps:
acquiring second IMU information through a second inertia measurement unit positioned at the handle end;
and carrying out fusion operation processing on the handle pose information and the second IMU information.
Optionally, in the step of extracting the first feature point in the first image and the step of extracting the second feature point in the second image, the following steps are included:
acquiring first timestamp information through a first timestamp unit positioned at a head display end;
acquiring second timestamp information through a second timestamp unit positioned at the handle end;
acquiring a first image and a second image at the same time in the first time stamp information and the second time stamp information;
a first feature point and a second feature point are extracted from a first image and a second image at the same time.
Optionally, the step of extracting the first feature point and the second feature point from the first image and the second image at the same time includes the following steps:
and matching second feature points in the second image by using an optical flow tracking algorithm according to the first feature points in the first image.
In order to achieve the above object, the present invention further provides a virtual reality handle tracking device, configured to execute the virtual reality handle tracking method, including:
the first image acquisition module is used for processing a plurality of frames of first images acquired by a first camera device positioned at the head display end;
the first characteristic point processing module is used for processing the establishment of an SLAM map according to a plurality of frames of first images; extracting a first feature point in the first image; acquiring space coordinate information of a first feature point in an SLAM map;
the second image acquisition module is used for processing a second image acquired by a second camera device positioned at the handle end;
the second feature point processing module is used for processing and extracting second feature points in a second image; acquiring second feature plane coordinate information of a second feature point in a second image; matching the second characteristic points with the first characteristic points to obtain space coordinate information corresponding to the second characteristic points;
and the handle pose information module is used for processing and combining the second characteristic plane coordinate information and the space coordinate information corresponding to the second characteristic point, and calculating the current handle pose information of the handle end through a PNP algorithm.
In order to achieve the above object, the present invention further provides a terminal, including: a processor, a memory, and a virtual reality handle tracking program stored on the memory and executable on the processor, the virtual reality handle tracking program when executed by the processor implementing the steps of the virtual reality handle tracking method described above.
To achieve the above object, the present invention further provides a computer readable storage medium, which stores a virtual reality handle tracking program, and the virtual reality handle tracking program, when executed by a processor, implements the steps of the virtual reality handle tracking method described above.
Compared with the prior art, the invention has the beneficial effects that:
Unlike the electromagnetic tracking and ultrasonic tracking of the background art, which are easily interfered with by other electromagnetic and ultrasonic signals in the environment and are therefore subject to large limitations in use and a poor user experience, the present invention first acquires multiple frames of a first image through a first camera device located at the head display end; then establishes an SLAM map according to the multi-frame first image, extracts first feature points from the first image, and acquires the space coordinate information of the first feature points in the SLAM map; then acquires a second image through a second camera device located at the handle end; then extracts second feature points from the second image, matches the second feature points with the first feature points to acquire the space coordinate information of the second feature points in the SLAM map, and acquires the second feature plane coordinate information of the second feature points in the second image; and finally calculates the current handle pose information of the handle end by a PNP algorithm by combining the second feature plane coordinate information and the space coordinate information corresponding to the second feature points. By using the image acquisition technology of the camera devices, the feature points in the environment are detected directly, and the current pose information of the handle is then calculated through the shared SLAM map, so the handle is not easily interfered with by other factors in the environment and has higher robustness.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art from these drawings without creative effort.
Fig. 1 is a schematic diagram of a hardware structure of an embodiment of a mobile terminal;
FIG. 2 is a flowchart illustrating a virtual reality handle tracking method according to a first embodiment of the present invention;
FIG. 3 is a flowchart illustrating a virtual reality handle tracking method according to a second embodiment of the present invention;
FIG. 4 is a flowchart illustrating a virtual reality handle tracking method according to a third embodiment of the present invention;
FIG. 5 is a flowchart illustrating a virtual reality handle tracking method according to a fourth embodiment of the present invention;
FIG. 6 is a flowchart illustrating a virtual reality handle tracking method according to a fifth embodiment of the present invention;
fig. 7 is a flowchart illustrating a virtual reality handle tracking method according to a sixth embodiment of the invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by a person skilled in the art from the embodiments of the present invention without any inventive step fall within the scope of the present invention.
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In the following description, suffixes such as "module", "component", or "unit" used to denote elements are used only to facilitate the explanation of the present invention and have no specific meaning in themselves. Thus, "module", "component" and "unit" may be used interchangeably.
The terminal may be implemented in various forms. For example, the terminal described in the present invention may include a mobile terminal such as a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a Personal Digital Assistant (PDA), a Portable Media Player (PMP), a navigation device, a wearable device, a smart band, a pedometer, and the like, and a fixed terminal such as a Digital TV, a desktop computer, and the like.
The following description will be given by way of example of a mobile terminal, and it will be understood by those skilled in the art that the construction according to the embodiment of the present invention can be applied to a fixed type terminal, in addition to elements particularly used for mobile purposes.
Referring to fig. 1, which is a schematic diagram of a hardware structure of a mobile terminal for implementing various embodiments of the present invention, the mobile terminal 100 may include: RF (Radio Frequency) unit 101, WiFi module 102, audio output unit 103, a/V (audio/video) input unit 104, sensor 105, display unit 106, user input unit 107, interface unit 108, memory 109, processor 110, and power supply 111. Those skilled in the art will appreciate that the mobile terminal architecture shown in fig. 1 is not intended to be limiting of mobile terminals, which may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
The following describes each component of the mobile terminal in detail with reference to fig. 1:
the radio frequency unit 101 may be configured to receive and transmit signals during information transmission and reception or during a call, and specifically, receive downlink information of a base station and then process the downlink information to the processor 110; in addition, the uplink data is transmitted to the base station. Typically, radio frequency unit 101 includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like. In addition, the radio frequency unit 101 can also communicate with a network and other devices through wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile communications), GPRS (General Packet Radio Service), CDMA2000(Code Division Multiple Access 2000), WCDMA (Wideband Code Division Multiple Access), TD-SCDMA (Time Division-Synchronous Code Division Multiple Access), FDD-LTE (Frequency Division duplex Long Term Evolution), and TDD-LTE (Time Division duplex Long Term Evolution).
WiFi belongs to short-distance wireless transmission technology, and the mobile terminal can help a user to receive and send e-mails, browse webpages, access streaming media and the like through the WiFi module 102, and provides wireless broadband internet access for the user. Although fig. 1 shows the WiFi module 102, it is understood that it does not belong to the essential constitution of the mobile terminal, and may be omitted entirely as needed within the scope not changing the essence of the invention.
The audio output unit 103 may convert audio data received by the radio frequency unit 101 or the WiFi module 102 or stored in the memory 109 into an audio signal and output as sound when the mobile terminal 100 is in a call signal reception mode, a call mode, a recording mode, a voice recognition mode, a broadcast reception mode, or the like. Also, the audio output unit 103 may also provide audio output related to a specific function performed by the mobile terminal 100 (e.g., a call signal reception sound, a message reception sound, etc.). The audio output unit 103 may include a speaker, a buzzer, and the like.
The a/V input unit 104 is used to receive audio or video signals. The a/V input Unit 104 may include a Graphics Processing Unit (GPU) 1041 and a microphone 1042, the Graphics processor 1041 Processing image data of still pictures or video obtained by an image capturing device (e.g., a camera) in a video capturing mode or an image capturing mode. The processed image frames may be displayed on the display unit 106. The image frames processed by the graphic processor 1041 may be stored in the memory 109 (or other storage medium) or transmitted via the radio frequency unit 101 or the WiFi module 102. The microphone 1042 may receive sounds (audio data) via the microphone 1042 in a phone call mode, a recording mode, a voice recognition mode, or the like, and may be capable of processing such sounds into audio data. The processed audio (voice) data may be converted into a format output transmittable to a mobile communication base station via the radio frequency unit 101 in case of a phone call mode. The microphone 1042 may implement various types of noise cancellation (or suppression) algorithms to cancel (or suppress) noise or interference generated in the course of receiving and transmitting audio signals.
The mobile terminal 100 also includes at least one sensor 105, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor includes an ambient light sensor that can adjust the brightness of the display panel 1061 according to the brightness of ambient light, and a proximity sensor that can turn off the display panel 1061 and/or a backlight when the mobile terminal 100 is moved to the ear. As one of the motion sensors, the accelerometer sensor can detect the magnitude of acceleration in each direction (generally, three axes), can detect the magnitude and direction of gravity when stationary, and can be used for applications of recognizing the posture of a mobile phone (such as horizontal and vertical screen switching, related games, magnetometer posture calibration), vibration recognition related functions (such as pedometer and tapping), and the like; as for other sensors such as a fingerprint sensor, a pressure sensor, an iris sensor, a molecular sensor, a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, which can be configured on the mobile phone, further description is omitted here.
The display unit 106 is used to display information input by a user or information provided to the user. The Display unit 106 may include a Display panel 1061, and the Display panel 1061 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like.
The user input unit 107 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the mobile terminal. Specifically, the user input unit 107 may include a touch panel 1071 and other input devices 1072. The touch panel 1071, also referred to as a touch screen, may collect a touch operation performed by a user on or near the touch panel 1071 (e.g., an operation performed by the user on or near the touch panel 1071 using a finger, a stylus, or any other suitable object or accessory), and drive a corresponding connection device according to a predetermined program. The touch panel 1071 may include two parts of a touch detection device and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 110, and can receive and execute commands sent by the processor 110. In addition, the touch panel 1071 may be implemented in various types, such as a resistive type, a capacitive type, an infrared ray, and a surface acoustic wave. In addition to the touch panel 1071, the user input unit 107 may include other input devices 1072. In particular, other input devices 1072 may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like, and are not limited to these specific examples.
Further, the touch panel 1071 may cover the display panel 1061, and when the touch panel 1071 detects a touch operation thereon or nearby, the touch panel 1071 transmits the touch operation to the processor 110 to determine the type of the touch event, and then the processor 110 provides a corresponding visual output on the display panel 1061 according to the type of the touch event. Although the touch panel 1071 and the display panel 1061 are shown in fig. 1 as two separate components to implement the input and output functions of the mobile terminal, in some embodiments, the touch panel 1071 and the display panel 1061 may be integrated to implement the input and output functions of the mobile terminal, and is not limited herein.
The interface unit 108 serves as an interface through which at least one external device is connected to the mobile terminal 100. For example, the external device may include a wired or wireless headset port, an external power supply (or battery charger) port, a wired or wireless data port, a memory card port, a port for connecting a device having an identification module, an audio input/output (I/O) port, a video I/O port, an earphone port, and the like. The interface unit 108 may be used to receive input (e.g., data information, power, etc.) from external devices and transmit the received input to one or more elements within the mobile terminal 100 or may be used to transmit data between the mobile terminal 100 and external devices.
The memory 109 may be used to store software programs and various data; the memory 109 may be a computer storage medium and stores the virtual reality handle tracking program of the present invention. The memory 109 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like, and the data storage area may store data created according to the use of the terminal (such as audio data, a phonebook, etc.), and the like. Further, the memory 109 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
The processor 110 is the control center of the mobile terminal; it connects the various parts of the entire mobile terminal using various interfaces and lines, and performs the various functions of the mobile terminal and processes data by running or executing software programs and/or modules stored in the memory 109 and calling data stored in the memory 109, thereby monitoring the mobile terminal as a whole. For example, the processor 110 executes the virtual reality handle tracking program stored in the memory 109 to implement the steps of the embodiments of the virtual reality handle tracking method of the present invention.
Processor 110 may include one or more processing units; alternatively, the processor 110 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 110.
The mobile terminal 100 may further include a power supply 111 (e.g., a battery) for supplying power to various components, and optionally, the power supply 111 may be logically connected to the processor 110 through a power management system, so as to implement functions of managing charging, discharging, and power consumption through the power management system.
Although not shown in fig. 1, the mobile terminal 100 may further include a bluetooth module or the like, which is not described in detail herein.
Based on the hardware structure of the mobile terminal, the invention provides various embodiments of the method.
The invention provides a virtual reality handle tracking method. Referring to fig. 2, in a first embodiment, the virtual reality handle tracking method comprises the following steps:
step S10: acquiring a plurality of frames of first images through at least one first camera device positioned at a head display end;
step S20: establishing an SLAM map according to the multi-frame first image; extracting first feature points in the first image, and generating corresponding first feature descriptors from the first feature points; acquiring space coordinate information of a first feature point in an SLAM map;
step S30: acquiring a second image through at least one second camera device positioned at the handle end;
step S40: extracting second feature points in the second image, and generating corresponding second feature descriptors from the second feature points; matching the second feature descriptor with the first feature descriptor to acquire space coordinate information of a second feature point in the SLAM map; acquiring second feature plane coordinate information of a second feature point in a second image;
step S50: and calculating the current handle pose information of the handle end by a PNP algorithm by combining the second characteristic plane coordinate information and the space coordinate information corresponding to the second characteristic point.
As described in the background art, the electromagnetic sensor of the handle is sensitive to electromagnetic signals in the environment and is easily interfered with by complex environmental electromagnetic signals, causing it to produce erroneous electromagnetic tracking data for the handle. For example, when the electromagnetic sensor of the handle is close to the host computer, or in an environment close to a sound box, a television, a refrigerator, or the like, the handle is affected by other electromagnetic signals and its tracking performance deteriorates, so that the virtual reality experience of the user is poor. Handles using electromagnetic sensors are therefore subject to large limitations in use, and handles using ultrasonic sensors are similarly limited.
In order to solve the above technical problem, in this embodiment, multiple frames of a first image are first obtained through the first camera device located at the head display end; then an SLAM (Simultaneous Localization And Mapping) map is established according to the multi-frame first image, first feature points are extracted from the first image, corresponding first feature descriptors are generated from the first feature points, and the space coordinate information of the first feature points in the SLAM map is acquired; then a second image is obtained through the second camera device located at the handle end; then second feature points are extracted from the second image and corresponding second feature descriptors are generated from the second feature points; the second feature descriptors are matched with the first feature descriptors to acquire the space coordinate information of the second feature points in the SLAM map, and the second feature plane coordinate information of the second feature points in the second image is acquired; finally, the current handle pose information of the handle end is calculated by a PNP algorithm by combining the second feature plane coordinate information and the space coordinate information corresponding to the second feature points. By using the image acquisition technology of the camera devices, the feature points in the environment are detected directly, and the current pose information of the handle is then calculated through the shared SLAM map, so the handle is not easily interfered with by other factors in the environment and has higher robustness.
The PNP (Perspective-n-Point) algorithm is a method for solving for camera motion from 3D-to-2D point correspondences. It describes how to estimate the pose of the camera when the space coordinates of N feature points and their projection positions are known. Generally speaking, the PNP algorithm calculates the pose of the photographing device from the space coordinate information of N feature points in a known world coordinate system and the projections (plane coordinate information) of these feature points on the image. The space coordinate information of the feature points may be determined by triangulation or by the depth map of an RGB-D camera.
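For illustration only, a minimal sketch of this pose computation using OpenCV's solvePnP is given below; the function name, the assumption of an already calibrated camera matrix and the choice of the iterative solver are our own and are not specified by the method itself.

```python
# Minimal sketch (assumptions noted above): estimating the handle pose with OpenCV's
# solvePnP from matched 3D map points and their 2D projections in the handle image.
import numpy as np
import cv2

def estimate_pose_pnp(map_points_3d, image_points_2d, camera_matrix, dist_coeffs=None):
    """map_points_3d: Nx3 space coordinates of matched feature points in the SLAM map.
    image_points_2d: Nx2 plane coordinates of the same points in the handle-end image."""
    if dist_coeffs is None:
        dist_coeffs = np.zeros(5)  # assume an already undistorted (rectified) image
    ok, rvec, tvec = cv2.solvePnP(
        np.asarray(map_points_3d, dtype=np.float64),
        np.asarray(image_points_2d, dtype=np.float64),
        camera_matrix, dist_coeffs, flags=cv2.SOLVEPNP_ITERATIVE)
    if not ok:
        raise RuntimeError("PNP failed: at least 4 well-distributed points are needed")
    R, _ = cv2.Rodrigues(rvec)  # rotation from the map (world) frame to the camera frame
    return R, tvec              # together these give the pose of the handle-end camera
```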
The information transmission between the head display end and the handle end can be performed in a wired or wireless manner. Specifically, after the second image is acquired by the second camera device at the handle end, the second image can be compressed and transmitted to the head display end, and the subsequent operations are performed by the host module of the head display end. Alternatively, after the second image is acquired by the second camera device at the handle end, the second feature point extraction and the second feature descriptor generation can be performed at the handle end, and the extracted second feature points and second feature descriptors are sent to the head display end, where the host module performs the subsequent operations; this reduces the transmission bandwidth occupied. In addition, when the computing power of the handle end is sufficient, the SLAM map can be shared between the handle end and the head display end, the handle end computes the handle pose information by itself and then sends it to the head display end, which further reduces the transmission bandwidth occupied.
Further, in order to improve the accuracy and speed of building the SLAM map, a plurality of first camera devices may be provided at the head display end; when a plurality of first camera devices are provided, their shooting directions are oriented at different angles. The plurality of first camera devices synchronously acquire first images and share the image information, and the SLAM map is established from the plurality of first images, so that the accuracy and the speed of establishing the SLAM map can be improved.
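As one illustration of how the space coordinate information of first feature points could be obtained (the description above names triangulation as one option), the sketch below triangulates matched points from two synchronized first images using OpenCV; the helper name, the shared intrinsic matrix and the assumption of known camera poses are ours.

```python
# Minimal sketch (assumptions noted above): triangulating matched first feature points from
# two synchronized first images to obtain their space coordinates in the SLAM map frame.
import numpy as np
import cv2

def triangulate_map_points(K, pose1, pose2, pts1, pts2):
    """pose1 / pose2: 3x4 [R|t] matrices of the two views expressed in the map frame.
    pts1 / pts2: Nx2 matched plane coordinates of the same feature points in each image."""
    P1 = K @ pose1  # projection matrix of the first view
    P2 = K @ pose2  # projection matrix of the second view
    points_4d = cv2.triangulatePoints(P1, P2,
                                      np.asarray(pts1, dtype=np.float64).T,
                                      np.asarray(pts2, dtype=np.float64).T)
    points_3d = (points_4d[:3] / points_4d[3]).T  # convert from homogeneous coordinates
    return points_3d  # space coordinate information of the feature points
```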
Further, in order to improve the accuracy of the handle pose information, the number of second camera devices at the handle end can be set according to the actual situation. When detection conditions are good, a single second camera device can be used; when the features detected by a single second camera device are not obvious, a plurality of second camera devices can be used to assist detection, which mitigates handle tracking failure in scenes where features are not obvious. The positions of the plurality of second camera devices are deliberately differentiated, for example by arranging them at the front, back, left and right of the handle end respectively.
Further, the step of matching the second feature point with the first feature point in step S40 includes the following steps: generating a corresponding first feature descriptor from the first feature point; generating a corresponding second feature descriptor from the second feature point; and matching the second feature descriptor with the first feature descriptor. With this arrangement, the first feature points and second feature points are converted into first feature descriptors and second feature descriptors, and the matching is performed on the descriptors.
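Purely as an illustrative sketch (the patent does not specify a descriptor type), ORB feature points with brute-force Hamming matching in OpenCV could realize this descriptor matching step:

```python
# Minimal sketch (the descriptor type is our assumption): extracting ORB feature points and
# descriptors from the head-display and handle images and matching the descriptors.
import cv2

orb = cv2.ORB_create(nfeatures=1000)
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)  # Hamming distance suits binary descriptors

def match_feature_descriptors(first_image, second_image):
    kp1, desc1 = orb.detectAndCompute(first_image, None)   # first feature points / descriptors
    kp2, desc2 = orb.detectAndCompute(second_image, None)  # second feature points / descriptors
    matches = sorted(bf.match(desc2, desc1), key=lambda m: m.distance)  # best matches first
    return kp1, kp2, matches
```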
Further, based on the second embodiment of the virtual reality handle tracking method proposed by the first embodiment, referring to fig. 3, in step S20, the method includes the following steps:
step S21: performing BA optimization processing on first feature points in the multi-frame first image;
step S22: and acquiring the spatial coordinate information of the first feature point subjected to BA optimization processing in the SLAM map.
In this embodiment, BA optimization processing is first performed on the first feature points in the multiple frames of the first image, and the space coordinate information of the BA-optimized first feature points in the SLAM map is then acquired. Optimizing the first feature points with the BA optimization technique yields more accurate space coordinate information of the first feature points in the SLAM map, and thus more accurate handle pose information.
In the BA optimization processing of the first feature points, the normalized space point coordinates corresponding to the plane coordinates in the previous frame of the first image are first calculated from the first camera device and the matched plane coordinates in the previous and subsequent frames of the first image; the plane coordinates re-projected onto the subsequent frame of the first image are then calculated from the coordinates of the space point. The re-projected plane coordinates (estimated values) and the matched plane coordinates (measured values) in the subsequent frame cannot coincide exactly. The aim of the BA optimization processing is to establish an equation for each matched first feature point, combine the equations into an over-determined system, and solve for the optimal space point coordinates of the first feature points. BA (Bundle Adjustment) is a method of extracting the optimal 3D model and imaging device parameters (intrinsic and extrinsic parameters) from a visual reconstruction: after the pose and the spatial positions of the feature points are optimally adjusted, the bundles of light rays reflected from each feature point finally converge at the optical center of the camera, hence the name BA.
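A minimal sketch of this idea, under the simplifying assumptions of known camera poses and of refining a single matched point by least-squares minimization of its reprojection error (full bundle adjustment would also refine the poses), might look as follows:

```python
# Minimal sketch (simplified BA, assumptions noted above): refine the space coordinates of
# one matched feature point by minimizing its reprojection error over several frames.
import numpy as np
from scipy.optimize import least_squares

def project(point_3d, R, t, K):
    p_cam = R @ point_3d + t        # transform the map point into the camera frame
    uv = K @ (p_cam / p_cam[2])     # perspective projection with intrinsic matrix K
    return uv[:2]

def refine_point(initial_point, observations, poses, K):
    """observations: list of measured 2D plane coordinates; poses: list of (R, t) per frame."""
    def residuals(x):
        return np.concatenate([project(x, R, t, K) - uv
                               for (R, t), uv in zip(poses, observations)])
    result = least_squares(residuals, initial_point)  # solve the over-determined system
    return result.x                                   # optimized space point coordinates
```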
Further, based on the third embodiment of the virtual reality handle tracking method proposed by the first embodiment, referring to fig. 4, after step S20, the method includes the following steps:
step S23: acquiring first feature plane coordinate information of a first feature point in a first image;
step S24: and calculating the current head display position and attitude information of the head display end by a PNP algorithm by combining the first characteristic plane coordinate information and the space coordinate information of the first characteristic point in the SLAM map.
When the method is applied, the first feature plane coordinate information of the first feature points in the first image is acquired, and the current head display position and attitude information of the head display end is calculated by a PNP algorithm by combining the first feature plane coordinate information with the space coordinate information of the first feature points in the SLAM map. This takes into account that in some virtual reality applications the poses of both the handle end and the head display end need to be detected, and a more realistic virtual experience is obtained by simultaneously locating the pose information of the handle end and the pose information of the head display end.
Further, based on the fourth embodiment of the virtual reality handle tracking method proposed by the first embodiment, referring to fig. 5, in step S40, the method includes the following steps:
step S41: performing BA optimization processing on second feature points in the multi-frame second image;
step S42: and matching the second characteristic point subjected to BA optimization processing with the first characteristic point to acquire the space coordinate information of the second characteristic point in the SLAM map.
When the method is applied, BA optimization processing is first performed on the second feature points in the multiple frames of the second image; the BA-optimized second feature points are then matched with the first feature points to acquire the space coordinate information of the second feature points in the SLAM map. Optimizing the second feature points with the BA optimization technique yields more accurate second feature points and thus more accurate handle pose information. The principle of the BA optimization processing has been set forth above and is not repeated here.
Further, based on the fifth embodiment of the virtual reality handle tracking method proposed by the first embodiment, referring to fig. 6, after step S50, the method includes the following steps:
step S51: acquiring second IMU information through a second inertia measurement unit positioned at the handle end;
step S52: and carrying out fusion operation processing on the handle pose information and the second IMU information.
When the device is applied, second IMU information is first obtained through the second inertial measurement unit located at the handle end, and fusion operation processing is then performed on the handle pose information and the second IMU information. IMU stands for Inertial Measurement Unit; the main elements in the second inertial measurement unit are a gyroscope, an accelerometer and a magnetometer. The gyroscope obtains the angular velocity about each axis, the accelerometer obtains the acceleration in the x, y and z directions, and the magnetometer obtains information about the surrounding magnetic field. The main work is to fuse the data of these three sensors to obtain more accurate second IMU information, and then to fuse the handle pose information with the second IMU information to obtain more accurate handle pose information.
The fusion of the handle pose information with the second IMU information can be loosely coupled or tightly coupled. In loose coupling, the vision sensor and the IMU are treated as two separate modules, each of which can compute pose information, and the two results are then fused, typically with an EKF. In tight coupling, the intermediate data obtained from the vision sensor and the IMU are processed jointly by an optimization-based filter; the image features must be added to the state vector before the final pose information is obtained. For this reason, the final dimension of the system state vector is very high and the computational load is large.
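As a rough, loosely coupled illustration only (a practical system would typically fuse the two estimates with an EKF, as noted above), the sketch below propagates the handle position with IMU data between camera frames and blends it with the PNP result; the function names and the fixed blending weight are assumptions:

```python
# Minimal sketch (loose coupling, simplified; assumptions noted above): the IMU predicts the
# handle position between frames and the absolute PNP estimate corrects the prediction.
import numpy as np

def propagate_with_imu(position, velocity, accel_world, dt):
    """accel_world: acceleration already rotated into the map frame with gravity removed."""
    velocity = velocity + accel_world * dt
    position = position + velocity * dt + 0.5 * accel_world * dt * dt
    return position, velocity

def fuse_position(pnp_position, imu_position, weight=0.7):
    """Blend the absolute PNP fix with the IMU prediction; the weight is a tuning assumption."""
    return weight * np.asarray(pnp_position) + (1.0 - weight) * np.asarray(imu_position)
```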
Further, based on the technical teaching of the above embodiment, the person skilled in the art may also perform the following steps after step S24:
step S25: acquiring first IMU information through a first inertial measurement unit positioned at a head display end;
step S26: and performing fusion operation processing on the head display attitude information and the first IMU information.
Through the steps, more accurate pose information of the head display is obtained.
Further, based on the sixth embodiment of the virtual reality handle tracking method proposed by the first embodiment, referring to fig. 7, in step S20 and step S40, the following steps are included:
step S61: acquiring first timestamp information through a first timestamp unit positioned at a head display end;
step S62: acquiring second timestamp information through a second timestamp unit positioned at the handle end;
step S63: acquiring a first image and a second image at the same time in the first time stamp information and the second time stamp information;
step S64: a first feature point and a second feature point are extracted from a first image and a second image at the same time.
When the method is applied, first timestamp information is acquired through the first timestamp unit located at the head display end, and second timestamp information is acquired through the second timestamp unit located at the handle end; a first image and a second image taken at the same moment are then obtained from the first timestamp information and the second timestamp information; finally, the first feature points and the second feature points are extracted from the first image and the second image taken at the same moment. This is done to improve the accuracy of the handle pose information and to avoid deviations in the information shared by the head display end and the handle end caused by delay during signal transmission. Matching the first image of the head display end with the second image of the handle end taken at the same moment through the first timestamp unit and the second timestamp unit ensures time synchronization between the head display end and the handle end. At the same time, the timestamps make the matching robust to network delay and simplify the matching algorithm.
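A minimal sketch of such timestamp-based pairing, with an assumed frame representation and skew threshold, might look like this:

```python
# Minimal sketch (assumed interface): pair each head-display frame with the handle frame
# whose timestamp is closest, so feature extraction always uses images of the same moment.
def pair_frames(first_frames, second_frames, max_skew=0.005):
    """first_frames / second_frames: lists of (timestamp_seconds, image), sorted by time."""
    pairs = []
    for t1, img1 in first_frames:
        t2, img2 = min(second_frames, key=lambda f: abs(f[0] - t1))  # nearest handle frame
        if abs(t2 - t1) <= max_skew:   # accept only if within the allowed skew (here 5 ms)
            pairs.append((img1, img2))
    return pairs
```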
Further, in step S64, the method further includes the following steps:
step S641: and matching second feature points in the second image by using an optical flow tracking algorithm according to the first feature points in the first image.
Compared with first generating corresponding feature descriptors from the feature points, matching the feature points seen jointly by the head display end and the handle end with an optical flow tracking algorithm is more efficient and faster, so the pose information of the handle can be calculated more quickly and accurately. Of course, if matching with the optical flow tracking algorithm fails, the feature descriptor matching algorithm is used to find the second feature points in the SLAM map.
The optical flow tracking algorithm works on a continuous sequence of video frames. For each frame of the sequence, a target detection method is used to detect possible foreground targets; if a foreground target appears in a frame, its representative key feature points are found; for any two adjacent frames, the optimal position in the current frame of each key feature point found in the previous frame is searched for, thereby obtaining the position coordinates of the foreground target in the current frame; by iterating this process, the target can be tracked. Applied to this step, the principle of the optical flow tracking algorithm takes the first image from the head display end as the foreground target and the first feature points in the first image as the key feature points, and then searches the second image at the handle end for the positions of the second feature points corresponding to the first feature points, thereby obtaining the position coordinates of the second feature points and realizing the tracking of the feature points.
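As an illustrative sketch only (the patent does not name a specific implementation), pyramidal Lucas-Kanade optical flow in OpenCV could be used to locate the second feature points from the first feature points:

```python
# Minimal sketch (the implementation choice is our assumption): track the first feature points
# into the handle image with pyramidal Lucas-Kanade optical flow to find second feature points.
import numpy as np
import cv2

def track_with_optical_flow(first_image, second_image, first_points):
    """first_points: Nx2 array of first feature point coordinates in the head-display image."""
    pts = np.asarray(first_points, dtype=np.float32).reshape(-1, 1, 2)
    second_points, status, _err = cv2.calcOpticalFlowPyrLK(
        first_image, second_image, pts, None, winSize=(21, 21), maxLevel=3)
    good = status.ravel() == 1  # keep only the points that were tracked successfully
    return pts[good].reshape(-1, 2), second_points[good].reshape(-1, 2)
```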
In addition, an embodiment of the present invention further provides a terminal, where the terminal includes: a processor, a memory, and a virtual reality handle tracking program stored on the memory and executable on the processor, the virtual reality handle tracking program when executed by the processor implementing the steps of the virtual reality handle tracking method as described in the above embodiments.
In addition, an embodiment of the present invention further provides a computer-readable storage medium, where a virtual reality handle tracking program is stored on the computer-readable storage medium, and when being executed by a processor, the virtual reality handle tracking program implements the steps of the virtual reality handle tracking method according to the above embodiment.
It should be noted that other contents of the virtual reality handle tracking method, the terminal and the computer-readable storage medium disclosed by the present invention are prior art and are not described herein again.
It should be noted that, if directional indications (such as up, down, left, right, front, and back … …) are involved in the embodiment of the present invention, the directional indications are only used to explain the relative positional relationship between the components, the motion situation, and the like in a specific posture (as shown in the drawing), and if the specific posture is changed, the directional indications are changed accordingly.
Furthermore, it should be noted that descriptions involving "first", "second", etc. in the present invention are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features concerned. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the various embodiments may be combined with one another, provided that the combination can be realized by a person skilled in the art; when technical solutions are contradictory or cannot be realized, the combination should be considered not to exist and to fall outside the protection scope of the present invention.
The above are only alternative embodiments of the present invention, and not intended to limit the scope of the present invention, and all the applications of the present invention in other related fields are included in the scope of the present invention.

Claims (10)

1. A virtual reality handle tracking method is characterized by comprising the following steps:
acquiring a plurality of frames of first images through at least one first camera device positioned at a head display end;
establishing an SLAM map according to the multi-frame first image; extracting a first feature point in the first image; acquiring space coordinate information of a first feature point in an SLAM map;
acquiring a second image through at least one second camera device positioned at the handle end;
extracting a second feature point in the second image; matching the second characteristic point with the first characteristic point to acquire space coordinate information of the second characteristic point in the SLAM map; acquiring second feature plane coordinate information of a second feature point in a second image;
and calculating the current handle pose information of the handle end by a PNP algorithm by combining the second characteristic plane coordinate information and the space coordinate information corresponding to the second characteristic point.
2. The virtual reality handle tracking method of claim 1, wherein: in the step of matching the second feature point with the first feature point, the method includes the steps of:
generating a corresponding first feature descriptor from the first feature point;
generating a corresponding second feature descriptor from the second feature point;
the second feature descriptor is matched with the first feature descriptor.
3. The virtual reality handle tracking method of claim 1, wherein: in the step of acquiring the spatial coordinate information of the first feature point in the SLAM map, the method includes the following steps:
performing BA optimization processing on first feature points in the multi-frame first image;
and acquiring the spatial coordinate information of the first feature point subjected to BA optimization processing in the SLAM map.
4. The virtual reality handle tracking method of claim 1, wherein: after the step of acquiring the spatial coordinate information of the first feature point in the SLAM map, the method further comprises the following steps:
acquiring first feature plane coordinate information of a first feature point in a first image;
and calculating the current head display position and attitude information of the head display end by a PNP algorithm by combining the first characteristic plane coordinate information and the space coordinate information of the first characteristic point in the SLAM map.
5. The virtual reality handle tracking method of claim 1, wherein: extracting a second feature point in the second image; the step of matching the second feature point with the first feature point to obtain the spatial coordinate information of the second feature point in the SLAM map includes the following steps:
performing BA optimization processing on second feature points in the multi-frame second image;
and matching the second characteristic point subjected to BA optimization processing with the first characteristic point to acquire the space coordinate information of the second characteristic point in the SLAM map.
6. The virtual reality handle tracking method of claim 1, wherein: after the step of calculating the current handle pose information of the handle end by the PNP algorithm by combining the second characteristic plane coordinate information and the space coordinate information corresponding to the second characteristic point, the method further comprises the following steps:
acquiring second IMU information through a second inertia measurement unit positioned at the handle end;
and carrying out fusion operation processing on the handle pose information and the second IMU information.
7. The virtual reality handle tracking method of claim 1, wherein: in the step of extracting the first feature point in the first image and the step of extracting the second feature point in the second image, the method includes the steps of:
acquiring first timestamp information through a first timestamp unit positioned at a head display end;
acquiring second timestamp information through a second timestamp unit positioned at the handle end;
acquiring a first image and a second image at the same time in the first time stamp information and the second time stamp information;
a first feature point and a second feature point are extracted from a first image and a second image at the same time.
8. The virtual reality handle tracking method of claim 7, wherein: the step of extracting the first feature point and the second feature point from the first image and the second image at the same time includes the steps of:
and matching second feature points in the second image by using an optical flow tracking algorithm according to the first feature points in the first image.
9. A terminal, characterized by: the terminal includes: a processor, a memory, and a virtual reality handle tracking program stored on the memory and executable on the processor, the virtual reality handle tracking program when executed by the processor implementing the steps of the virtual reality handle tracking method of any one of claims 1 to 8.
10. A computer-readable storage medium characterized by: the computer readable storage medium having stored thereon a virtual reality handle tracking program which when executed by a processor implements the steps of the virtual reality handle tracking method of any one of claims 1 to 8.
CN202111654472.5A 2021-12-30 2021-12-30 Virtual reality handle tracking method, terminal and computer-readable storage medium Pending CN114332423A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111654472.5A CN114332423A (en) 2021-12-30 2021-12-30 Virtual reality handle tracking method, terminal and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111654472.5A CN114332423A (en) 2021-12-30 2021-12-30 Virtual reality handle tracking method, terminal and computer-readable storage medium

Publications (1)

Publication Number Publication Date
CN114332423A true CN114332423A (en) 2022-04-12

Family

ID=81018296

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111654472.5A Pending CN114332423A (en) 2021-12-30 2021-12-30 Virtual reality handle tracking method, terminal and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN114332423A (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115082520A (en) * 2022-06-14 2022-09-20 歌尔股份有限公司 Positioning tracking method and device, terminal equipment and computer readable storage medium
CN115311353A (en) * 2022-08-29 2022-11-08 上海鱼微阿科技有限公司 Multi-sensor multi-handle controller graph optimization tight coupling tracking method and system
CN115311353B (en) * 2022-08-29 2023-10-10 玩出梦想(上海)科技有限公司 Multi-sensor multi-handle controller graph optimization tight coupling tracking method and system
GB2624929A (en) * 2022-12-01 2024-06-05 Sony Interactive Entertainment Inc Systems and methods for mapping and localisation

Similar Documents

Publication Publication Date Title
US20200334854A1 (en) Position and attitude determining method and apparatus, smart device, and storage medium
WO2019223468A1 (en) Camera orientation tracking method and apparatus, device, and system
WO2019205851A1 (en) Pose determination method and device, intelligent apparatus, and storage medium
CN109947886B (en) Image processing method, image processing device, electronic equipment and storage medium
US20200334836A1 (en) Relocalization method and apparatus in camera pose tracking process, device, and storage medium
CN114332423A (en) Virtual reality handle tracking method, terminal and computer-readable storage medium
CN108153422B (en) Display object control method and mobile terminal
CN111768454B (en) Pose determination method, pose determination device, pose determination equipment and storage medium
WO2019242418A1 (en) Camera localization method and apparatus, and terminal and storage medium
CN107248137B (en) Method for realizing image processing and mobile terminal
WO2020042968A1 (en) Method for acquiring object information, device, and storage medium
CN109241832B (en) Face living body detection method and terminal equipment
CN110602389B (en) Display method and electronic equipment
CN110944139B (en) Display control method and electronic equipment
CN111031234B (en) Image processing method and electronic equipment
WO2021136266A1 (en) Virtual image synchronization method and wearable device
CN113365085B (en) Live video generation method and device
WO2019196837A1 (en) Terminal detection method and terminal
WO2021136495A1 (en) Method and apparatus for controlling page layout of on-board unit display interface
WO2019137535A1 (en) Object distance measurement method and terminal device
CN110602390B (en) Image processing method and electronic equipment
US20220141390A1 (en) Photographing method, device, and system, and computer-readable storage medium
CN108833791B (en) Shooting method and device
CN109104573B (en) Method for determining focusing point and terminal equipment
CN109859265B (en) Measurement method and mobile terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination